
Normalize intensities across samples using cyclic LOESS normalization
Source: R/normalize.R
The steps the algorithm takes are the following:
log2 transform the intensities
Choose 2 samples to generate an MA-plot from
Fit a LOESS curve
Subtract half of the difference between the predicted value and the true value from the intensity of sample 1 and add the same amount to the intensity of sample 2
Repeat for all unique combinations of samples
Repeat all steps until the model converges or n_iter is reached.
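The steps above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper cyclic_loess_cycle is hypothetical, it performs a single pass over all sample pairs, and it assumes the correction is half of the fitted LOESS value at each point, as in the cyclic LOESS described by Bolstad et al. It also assumes a numeric matrix with one column per sample and no missing values.

```r
# Hypothetical helper, not part of the package: one cycle over all sample pairs.
# x is a matrix of log2 intensities (rows = features, columns = samples).
cyclic_loess_cycle <- function(x, span = 0.7) {
  pairs <- combn(ncol(x), 2)              # all unique combinations of samples
  for (k in seq_len(ncol(pairs))) {
    i <- pairs[1, k]
    j <- pairs[2, k]
    m <- x[, i] - x[, j]                  # M: difference of log2 intensities
    a <- (x[, i] + x[, j]) / 2            # A: mean log2 intensity
    f <- predict(loess(m ~ a, span = span))  # fitted curve on the MA-plot
    x[, i] <- x[, i] - f / 2              # move sample i towards the curve
    x[, j] <- x[, j] + f / 2              # move sample j by the same amount
  }
  x
}

# Simulated raw intensities with sample-specific offsets (illustrative data)
set.seed(1)
signal <- runif(200, min = 4, max = 12)
raw <- cbind(
  s1 = 2^(signal + 0.5 + rnorm(200, sd = 0.1)),
  s2 = 2^(signal + rnorm(200, sd = 0.1)),
  s3 = 2^(signal - 0.3 + rnorm(200, sd = 0.1))
)

log_int <- log2(raw)                      # step 1: log2 transform
normalized <- cyclic_loess_cycle(log_int)

colMeans(log_int)                         # sample means before the cycle
colMeans(normalized)                      # sample means after the cycle
```

Because each pairwise step subtracts from one sample exactly what it adds to the other, the mean intensity of every feature across samples is unchanged; only the differences between samples shrink.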
Convergence is assumed if the confidence intervals of all LOESS smooths include the 0 line. If fixed_iter = TRUE, the algorithm will perform exactly n_iter iterations. If fixed_iter = FALSE, the algorithm will perform a maximum of n_iter iterations.
See the reference section for details.
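One way to picture the convergence criterion is a pointwise check on a single MA-plot smooth. The sketch below is an assumption about how such a check could look, not the package's internal code: the helper smooth_includes_zero is hypothetical, and it builds the interval from the standard errors returned by predict() with a t quantile.

```r
# Hypothetical helper, not the package's internal code: does the confidence
# band of the LOESS smooth of m against a contain 0 at every observed point?
smooth_includes_zero <- function(m, a, span = 0.7, level = 0.95) {
  fit <- loess(m ~ a, span = span)
  pred <- predict(fit, se = TRUE)                 # fit, se.fit, df
  crit <- qt(1 - (1 - level) / 2, df = pred$df)   # two-sided t quantile
  lower <- pred$fit - crit * pred$se.fit
  upper <- pred$fit + crit * pred$se.fit
  all(lower <= 0 & upper >= 0)
}

# Illustrative data: two samples that differ by a constant shift of 0.5
set.seed(1)
a <- runif(200, min = 4, max = 12)
m_shifted <- 0.5 + rnorm(200, sd = 0.1)

smooth_includes_zero(m_shifted, a)   # FALSE: the smooth lies near 0.5, not 0
```

Under this sketch, the algorithm would count as converged once the check returns TRUE for every pair of samples.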
Usage
normalize_cyclic_loess(
data,
n_iter = 3,
fixed_iter = TRUE,
loess_span = 0.7,
level = 0.95,
verbose = FALSE,
...
)
Arguments
- data
A tidy tibble created by read_featuretable.
- n_iter
The number of iterations to perform. If fixed_iter = TRUE, exactly n_iter iterations will be performed. If fixed_iter = FALSE, a maximum of n_iter iterations will be performed, after which the algorithm stops whether or not convergence has been reached.
- fixed_iter
Should a fixed number of iterations be performed?
- loess_span
The span of the LOESS fit. A larger span produces a smoother line.
- level
The confidence level for the convergence criterion. Note that a larger confidence level produces wider confidence intervals, which are more likely to include the 0 line, so the algorithm stops earlier.
- verbose
TRUE or FALSE. Should messages be printed to the console?
- ...
Arguments passed on to loess. For example, degree = 1, family = "symmetric", iterations = 4, surface = "direct" produces a LOWESS fit.
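To see what the LOWESS-style settings mentioned for ... do, they can be tried directly on stats::loess with simulated data. This is only a sketch of the underlying fit under illustrative assumptions (the data and object names are made up); it does not call normalize_cyclic_loess itself.

```r
# Illustrative MA-plot data with a mild intensity-dependent trend
set.seed(42)
a <- runif(100, min = 4, max = 12)
m <- 0.1 * (a - 8) + rnorm(100, sd = 0.2)

# Default-style fit: local quadratic, least squares
fit_default <- loess(m ~ a, span = 0.7)

# LOWESS-style fit: local linear, robust re-weighting, exact computation.
# iterations and surface are forwarded to loess.control().
fit_lowess <- loess(
  m ~ a,
  span = 0.7,
  degree = 1,
  family = "symmetric",
  iterations = 4,
  surface = "direct"
)

# Largest difference between the two fitted curves
max(abs(fitted(fit_default) - fitted(fit_lowess)))
```

The robust family down-weights points with large residuals, which can matter when a few features differ strongly between two samples.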
References
B. M. Bolstad, R. A. Irizarry, M. Åstrand, T. P. Speed, Bioinformatics 2003, 19, 185–193, DOI 10.1093/bioinformatics/19.2.185.
Karla Ballman, Diane Grill, Ann Oberg, Terry Therneau, “Faster cyclic loess: normalizing DNA arrays via linear models” can be found under https://www.mayo.edu/research/documents/biostat-68pdf/doc-10027897, 2004.
K. V. Ballman, D. E. Grill, A. L. Oberg, T. M. Therneau, Bioinformatics 2004, 20, 2778–2786, DOI 10.1093/bioinformatics/bth327.
Examples
toy_metaboscape %>%
impute_lod() %>%
normalize_cyclic_loess()
#> # A tibble: 110 × 8
#> UID Feature Sample Intensity RT `m/z` Name Formula
#> <int> <chr> <chr> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 1 161.10519 Da 26.98 s Sample1 2.85 0.45 162. NA C7H15N…
#> 2 2 276.13647 Da 27.28 s Sample1 4.31 0.45 277. Octyl hyd… C16H22…
#> 3 3 304.24023 Da 32.86 s Sample1 1.77 0.55 305. Arachidon… C20H32…
#> 4 4 417.23236 Da 60.08 s Sample1 3.76 1 418. NA NA
#> 5 5 104.10753 Da 170.31 s Sample1 2.36 2.84 105. NA C5H14NO
#> 6 6 105.04259 Da 199.80 s Sample1 2.46 3.33 106. NA C3H8NO3
#> 7 7 237.09204 Da 313.24 s Sample1 2.09 5.22 238. Ketamine C13H16…
#> 8 8 745.09111 Da 382.23 s Sample1 1.72 6.37 746. NADPH C21H30…
#> 9 9 427.02942 Da 424.84 s Sample1 3.08 7.08 428. ADP C10H15…
#> 10 10 1284.34904 Da 498.94 s Sample1 0.705 8.32 1285. NA NA
#> # ℹ 100 more rows