Two-stage normalization using background intensities in cDNA microarray data

Background In the microarray experiment, many undesirable systematic variations are commonly observed. Normalization is the process of removing such variation that affects the measured gene expression levels. Normalization plays an important role in the earlier stage of microarray data analysis. The subsequent analysis results are highly dependent on normalization. One major source of variation is the background intensities. Recently, some methods have been employed for correcting the background intensities. However, all these methods focus on defining signal intensities appropriately from foreground and background intensities in the image analysis. Although a number of normalization methods have been proposed, no systematic methods have been proposed using the background intensities in the normalization process. Results In this paper, we propose a two-stage method adjusting for the effect of background intensities in the normalization process. The first stage fits a regression model to adjust for the effect of background intensities and the second stage applies the usual normalization method such as a nonlinear LOWESS method to the background-adjusted intensities. In order to carry out the two-stage normalization method, we consider nine different background measures and investigate their performances in normalization. The performance of two-stage normalization is compared to those of global median normalization as well as intensity dependent nonlinear LOWESS normalization. We use the variability among the replicated slides to compare performance of normalization methods. Conclusions For the selected background measures, the proposed two-stage normalization method performs better than global or intensity dependent nonlinear LOWESS normalization method. Especially, when there is a strong relationship between the background intensity and the signal intensity, the proposed method performs much better. Regardless of background correction methods used in the image analysis, the proposed two-stage normalization method can be applicable as long as both signal intensity and background intensity are available.

Background cDNA microarrays consist of thousands of individual DNA sequences printed in a high density array on a glass slide. After being reverse-transcribed into cDNA and labeled using red (Cy5) and green (Cy3) fluorescent dyes, two target mRNA samples are hybridized with the arrayed DNA sequences or probes. Then, the relative abundance of these spotted DNA sequences can be measured. The ratio of the fluorescence intensity for each spot represents the relative abundance of the corresponding DNA sequence.
In cDNA microarray experiments, there are many sources of systematic variation. Normalization is the process of removing such variation that affects the measured gene expression levels. The main idea of normalization is to adjust for artifact differences in intensity of the two labels. Such differences result from differences in affinity of the two labels for DNA, differences in amounts of sample and label used, differences in photomultiplier tube and laser voltage settings and differences in photon emission response to laser excitation. Although normalization alone cannot control all systematic variations, normalization plays an important role in the earlier stage of microarray data analysis.
Many normalization methods have been proposed by using the statistical regression models. Kerr et al. [1] and Kerr et al. [2] suggested the ANOVA model approach. Wolfinger et al. [3] proposed a mixed effect model for normalization. Schadt et al. [4] proposed smoothing splines with generalized cross-validation (GCVSS). Kepler et al. [5] used a local polynomial regression to estimate the normalized expression levels as well as the expression level dependent error variance. Yang et al. [6] summarized a number of normalization methods for dual labeled microarrays such as global normalization and robust locally weighted scatter plot smoothing (LOWESS, Cleveland [7]). Workman et al. [8] proposed a robust nonlinear method for normalization using array signal distribution analysis and cubic splines. Wang et al. [9] suggested iterative normalization of cDNA microarray data to estimate normalization coefficients and to identify control gene set. Chen et al. [10] presented subset normalization to adjust for location biases combined with global normalization for intensity biases.
After performing two dye cDNA microarray experiments, we get foreground and background intensities from red channel and green channel, respectively. Although a complex modeling approach can be used, the signal intensity is usually computed by subtracting the background intensity from the foreground intensity. Thus, the noise in the background intensity may have a large effect on the signal intensity.
Several approaches have been proposed for decreasing the background noises in image analyses (Yang et al. [11] and Kim et al. [12]). Kim et al. [13] found out the influences of background intensities on signal intensities, and showed that background intensities could play an important role in normalization. Recently, some background correction methods have been proposed using Bayesian method or smoothing function rather than simple subtraction when defining signal intensity (Kooperberg et al. [14] and Edwards [15]. As pointed out by Kim et al. [13], the signal intensities need to be robust to the local background intensity. In general, the signal intensities tend to have some correlations with background intensities (Figure 1). We think it is important to reduce variation in signal intensities caused by the background intensities. However, no systematic methods have been proposed that use the background intensities in normalization. In order to make the effect of background intensities more robust to the signal intensities, we propose a new method so called 'two-stage normalization method' to adjust for the effect of the background intensities. The first stage fits a regression model to adjust for the effect of background, and the second stage applies the usual normalization method such as a nonlinear LOWESS method to the backgroundadjusted intensities obtained from the first stage.
In order to perform the two-stage normalization method, we consider nine different background measures and investigate their performances in normalization. A detailed description on background measures is given in Methods section. Also, Methods section describes the proposed two-stage normalization methods. Results section describes the results from NCI 60 cDNA microarray experiment, which illustrates the effects of background intensities (Zhou et al. [16]). In addition, some comparative results are presented from cDNA microarrays of cortical stem cells of rat (Park et al. [17]) and those from kidney, liver, and testis cells from mice (Pritchard et al. [18]). The performance of two-stage normalization is compared to those of global normalization as well as intensity dependent nonlinear LOWESS normalization. We use the variability among the replicated slides to compare the performance of normalization methods. For certain selected background measures, the proposed twostage normalization performs better than global or intensity dependent nonlinear normalization method. Finally, Conclusion section summarizes the concluding remarks.

Methods
We propose a two-stage normalization method for the cDNA microarray data analysis using background intensities. At the first stage, we adjust for the effect of background intensities on M; at the second stage, we correct bias on M caused by other sources of systematic variation.

Stage 1. Background normalization
Let g fi and g bi be the means(or medians) of the i th foreground and background intensity of green channel, respectively; r fi and r bi be the corresponding means(or medians) of red channel, respectively. Then for each spot, we have two pairs of intensities: (g fi , g bi ) and (r fi , r bi ), i = 1,..., p, where p is the number of spots in a slide. (For simplicity, we omit the subscript and define A and M using notation of Yang et al. [6] as follows: In cDNA microarray experiments, there are red and green background intensities. It would be desirable to consider the background intensities that are more closely related with the signal intensities. We consider nine possible background measures from red channel, green channel, and both channels as follows: (a) Red channel (8) (c) Both channels For each category, there are three types of background measures. The first one is log-transformed background intensities. The second one is the log-transformed ratio of foreground and background intensities. The third one is the ratio of the log-transformed values of foreground and background intensities. Here we used log base 2, but any logarithmic base can be used as desired. Figure 2 shows Example of correlations between background intensity and signal intensity from one of the NCI 60 data of Zhuo et al. [16]   log(background) the relationship between signal intensities and these background measures. At the first stage, we adjust for the effect of background intensities by fitting the usual nonlinear LOWESS curve.
For simplicity, let Y k (k = 1,..., 9) be an appropriate background intensity defined by red channel (a), green channel (b), or two channels (c). Then, fit the nonlinear LOWESS curve as follows:  9 (13) where C (1) (·) represents the LOWESS curve and M k (1) is the residual from the curve. Note that M k (1) is the log-ratio of relative intensities after removing the effect of background intensities. For these ratios, we can perform the usual MA normalization at the second stage.

Stage 2. MA normalization
In the second stage, we perform the normalization process as follows: where C (2) (·) is the LOWESS curve and M k (2) is the residuals from the curve in the second stage.
Note that at the second stage any normalization method can be applied including a simple global normalization method.

Results of NCI 60 data
We first apply the proposed two-stage normalization method to a microarray data set of the NCI 60 cancer cell lines. These cell lines derived from human tumors have been widely used for investigations on various drugs and molecular targets (http://discover.nci.nih.gov). The National Cancer Institute's Developmental Therapeutic Programs (http://dtp.nci.nih.gov) has been studying a large number of anti-cancer drug compounds and molecular targets on the 60 cancer cell lines (Weinstein et al. [19]). In particular, the NCI 60 microarray data have been frequently reanalyzed as an experimental model due to the inaccessibility to human tumor tissues for various studies on cancer. Using HCT116, one of the colon cell lines in the NCI 60 panel, Zhuo et al. [16] performed gene expression profile of dose-and time-dependent effects by the topoisomerase inhibitor I camptothecin compound (CPT). We here use a subset of the array data set consisting of ten slides. These slides were randomly selected to demonstrate the proposed method. Each slide contains 2,208 spotted clone sequences. We also apply global median normalization and intensity dependent nonlinear LOW-ESS normalization to this data set.
From ten slides we choose one slide to illustrate the proposed method. Figure (13) demonstrate some reduction in variability, which can be seen more clearly by comparing Figure 2 with Figure 3. Note that each correlation coefficient between M (1) and Y k have values close to zero. Using M (1) , we carry out the second stage normalization. Figure 4 shows MA plots. The first row shows the MA plots for original (before normalization), after global normalization, and after nonlinear LOWESS normalization, respectively (from left to right), The second row shows M (2) versus A plots after two-stage normalization for Y 1 , Y 2 , and Y 3 , respectively. The third row shows MA plots for Y 4 , Y 5 , and Y 6 respectively, and the bottom row shows MA plots for Y 7 , Y 8 , and Y 9 , respectively. We can see that the two-stage normalization methods using Y 2 and Y 3 have the effect of reducing the variability among Ms and perform better than global and non-linear LOWESS normalization methods.

Comparative studies
The goal of this study is to compare performances of normalization methods. We compare two-stage normalization to global median normalization and intensity dependent nonlinear LOWESS normalization. Following the idea of Park et al. [20], we use the variability among the replicated slides as comparison measures, which can be estimated by the mean square error (MSE). For each gene, we can calculate MSE l (l = 1,..., number of gene) which is the variance estimator for each gene derived from replicated slides. The main idea is that the better the normalization method, the smaller the variation among the replicated observations. Here, we use three different sets of microarray data: colorectal cancer data of NCI 60 (Zhou, et al. [16]), cortical stem cells data (Park,et al. [17]), and mouse gene expression data (Pritchard et al. [18]).
The goal of cortical stem cells study is to identify genes that are associated with neuronal differentiation of cortical stem cells. In this experiment, there are 3,840 genes in each slide from two experimental groups for comparison measured at six different time points (12 hrs, 1 day, 2 days, 3 days, 4 days, 5 days). All experiments were replicated three times, thus we have 36 slides for the analysis.
The objective of mouse gene expression study of Pritchard et al. [18] is to assess natural differences in murine gene expression. A 5406-clone spotted cDNA microarray was used to measure transcript levels in the kidney, liver, and testis from each of 6 normal male C57BL6 mice. Experiments were replicated four times per each mouse organ, two red fluorescent dye sample and two green dye samples. Since there are three organs, we have three sets of microarrays. In each organ, there are 24 slides available for the analysis.
In this comparative study, all five microarray data sets were used: colorectal cancer data set from NCI 60, cortical stem cells data set from Park et al. [17], and three organ data sets from Pritchard et al. [18]. Since results are similar among three organs, we only present the results of kidney. For simplicity, denote CCD for colorectal cancer data, M (1) versus Y k plot after the first stage normalization MA plots before and after normalization Figure 4 MA plots before and after normalization. The first row shows the MA plots for original (before normalization), after global normalization, and after nonlinear LOWESS normalization, respectively (from left to right), The second row shows M (2) versus A plots after two-stage normalization for Y 1 , Y 2 , and Y 3 (from left to right), respectively. The third row shows MA plots for Y 4 , Y 5 , and Y 6 , respectively and the bottom row shows MA plots for Y 7 , Y 8 , and Y 9 , respectively. The two-stage normalization methods using Y 2 and Y 3 perform better than global and non-linear LOWESS normalization methods.
SCD for stem cells data, and KD for kidney data, respectively.
In this study, the performances of two-stage normalization using nine background measures are compared to global normalization and intensity dependent nonlinear LOWESS normalization. Figure 5 shows dot plots of logtransformed variance estimates for (a) CCD, (b) SCD, and (c) KD. Here each dot represents the mean value of the log-transformed MSEs for all genes. For all three different data sets, the global normalization reduces variability of the original data but the nonlinear LOWESS reduces variability much more. In general, these dot plots show that the two-stage normalization method using background intensities and the nonlinear normalization method have much smaller variabilities than those of global normalization. However, if we compare the two-stage normalization methods with the nonlinear normalization, the results differ depending on the background measures. That is, the background measures Y 2 and Y 3 in two-stage normalization methods always yield better performances than the nonlinear normalization method, while the other background measures yield comparable results to those of the nonlinear normalization. Thus, we suggest either Y 2 or Y 3 as background measures in the two-stage normalization.

Conclusions
In microarray studies, many undesirable systematic variations are commonly observed. Normalization becomes a standard process for removing some of the variation that affects the measured gene expression levels. One major source of variation is the background intensities. Recently, some methods have been employed for adjusting for the background intensities. However, all these methods focus on defining signal intensities appropriately from foreground and background intensities during the image analysis (Kooperberg et al. [14], Edwards [15]). Although a number of normalization methods have been proposed, no systematic methods have been proposed using the background intensities in the normalization process.
In this paper, we propose two-stage normalization for adjusting for the effect of background intensities systematically. The motivating idea is that the noise caused by background intensities may increase the variability in signal intensities. Although we use the log-transformed ratio of two channels denoted by M in most subsequent analysis, the noise caused by background intensities may still remain in M even after normalization. The two-stage normalization may be quite effective especially when there is a high correlation between M and background measures such as Y 2 and Y 3 .
Among nine background measures, we recommend two background measures Y 2 and Y 3 based on the results of the comparative studies. For these two background measures, we show that the two-stage normalization method always performs better than the global normalization methods and the nonlinear LOWESS normalization method.
We wondered if the relative good performance of twostage normalization using Y 2 and Y 3 is due to low intensities. We investigated this problem for NCI 60 data after removing spots with low intensities. The spots whose ratio of foreground and background intensities were smaller than 1.5 were removed in the analysis. This new data set also provided quite similar results.
The main reasons why background measures Y 2 and Y 3 perform better than other backgrounds are as follows. The background fluorescence might be relatively strong in the Cy5 channel due to interaction between the slide substrate and the hybridization materials. This effect is weaker in the Cy3 channel. It might be also possible that the background fluorescence in the Cy5 channel inflates the values of r b without correspondingly inflating the values of r f . This means that for weakly-responding spots, the The comparison is based on the variability measures derived from the replicated microarray samples. These variability measures can be easily derived from any replicated microarray experiment. Although we have studied only a limited number of data sets, our findings are quite consistent.
For these data sets, we also conduct a similar analysis to see the effect of print-tips in normalization process. The results were consistent to those in Figure 5 and not reported here. Background measures Y 2 and Y 3 yielded better results than other background measures.
One major concern for normalization is about over-fitting which may cause overcorrection of real biological significance. In fact, every normalization method has a possibility of overcorrecting the data and removing some existing biological significance in the data. For the NCI60 data, for example, it is not easy to find out whether such negatively low expression genes are biologically significant genes or not. We and T3, respectively). The density plot is in Figure 6. The distributions of M for T2 and T3 are quite similar to that of L. This simple empirical investigation shows that the proposed method does not always cause a bigger overcorrection than the LOWESS normalization. However, we must be careful for over-fitting which may overcorrects the biological findings of interest.
In addition, we performed the hypothesis test for M equal to zero and counted the number of genes that have different expression level between two channels. The number of significant gene with different expression value is 909 for the original data before normalization, 326 for the data after global normalization, 302 for the data set after LOW-ESS normalization, and 297 for the data after two-stage normalization. The numbers of significant genes do not differ much among normalization methods.
The proposed method can be applicable when both background and foreground intensities are available after the image analysis. Many image software provides both signal intensity as well as background intensity of spots. In most cases, the signal intensity is defined by simple subtraction such as (r f -r b ) for the red channel. Our method was illustrated using the usual subtraction (r f -r b ). Although our method starts with a most common approach based on the usual subtraction, it can be applied to any other models for defining spot intensity r = f(r f ,r b ), where r is the signal intensity, as long as a background intensity is available. Our method can be applicable as long as the image analysis provides the signal intensity and background intensity. For example, our method using Y 2 builds on the relationship between r = f(r f ,r b ) and r b .
We recommend that experimentalists examine their data carefully and consider applying the two-stage normalization methods using Y 2 and Y 3 . The performance of the two-stage normalization method tends to depend on the correlations between background measures and M. That is, if there is a strong relationship between them, our method has a large effect. Thus, for the experimentalists it might be important to tell when to use the two-stage normalization method.
In order to answer this question, we compute correlations between M and all background Density plots of M for one of the colorectal cancer data set from NCI 60   Figure 7 shows the boxplots of correlation coefficients between M and Y k s for each data set. For example, Figure 7(a) shows the distribution of correlation coefficients from the NCI 60 data set. The correlation coefficients were computed for all ten slides. As expected, the correlations of Y 2 and Y 3 are relatively higher than those from other background measures. We also think that is why Y 2 and Y 3 have better performances than the other background measures. Figure 7 also shows that the median values of correlations of Y 2 and Y 3 are higher than 0.5 for both CCD and SCD, while those for KD are smaller than 0.5. Note that CCD and SCD have more reduction in variability than KD in the two-stage normalization. For lower correlations of Y k s do not reduce variability much compared to the usual LOWESS normalization.