matchRtMz matches elements (features) in x with elements in table based on the similarity of their m/z and retention time. With parameter duplicates = "closest", the function returns the index of the best match, considering the m/z difference of only those features that are within the acceptable retention time difference (defined by rt_tolerance). With duplicates = "keep", the indices of all matching rows in table are returned.

matchRtMz(
  x,
  table,
  nomatch = NA_integer_,
  rt_tolerance = 2,
  tolerance = 0,
  ppm = 20,
  duplicates = c("closest", "keep"),
  mzcol = "mz",
  rtcol = "rt"
)

Arguments

x

data.frame, matrix or DataFrame with feature definitions (i.e. m/z and retention times) that should be matched against features in table.

table

data.frame, matrix or DataFrame with feature definitions to match features in x against.

nomatch

value that should be returned if no match for a feature is found.

rt_tolerance

numeric(1) with the largest acceptable difference in retention time.

tolerance

numeric(1) with a constant acceptable difference of m/z values for features to be considered matching.

ppm

numeric(1) with a m/z-dependent relative acceptable difference (in parts per million) of m/z values.

duplicates

character(1) defining whether only the best match (duplicates = "closest", default) or all matches (duplicates = "keep") should be returned.

mzcol

character(1) with the name of the column containing the m/z ratios.

rtcol

character(1) with the name of the column containing the retention times.
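
The interplay of tolerance and ppm is easiest to see with numbers. The following is a minimal sketch, assuming that the constant and the m/z-dependent parts are simply added to give the accepted m/z difference; mz_window is a hypothetical helper used for illustration only and is not part of the package.

```r
## Hypothetical helper (not part of the package): largest accepted m/z
## difference for a given m/z, ASSUMING `tolerance` and `ppm` are additive.
mz_window <- function(mz, tolerance = 0, ppm = 20) {
    tolerance + ppm * mz / 1e6
}

## Accepted difference for m/z 76.5 with the default ppm = 20.
mz_window(76.5)

## Lowering ppm narrows the window proportionally.
mz_window(76.5, ppm = 5)

## With ppm = 0 only the constant `tolerance` remains.
mz_window(76.5, tolerance = 0.001, ppm = 0)
```

Under this assumption, leaving both tolerance and ppm at 0 would only accept identical m/z values.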

Value

For duplicates = "closest": integer of length equal to nrow(x) with the index of the row in table that matches each row in x (e.g. c(3, 4) means that the first feature in x matches the 3rd feature in table and the second feature in x matches the 4th). For duplicates = "keep": list of length equal to nrow(x) with the indices of all rows in table that match each row in x.
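
The returned indices can be used directly to look up values in table. The sketch below uses base R only; idx and hits are hand-written stand-ins for results of matchRtMz (with the default nomatch = NA_integer_), and the toy data and the name column are assumptions made for illustration.

```r
## Toy data; `idx` stands in for a result of matchRtMz(x, table).
x <- data.frame(mz = c(23.4, 45.6, 56.9), rt = c(12, 34, 59))
table <- data.frame(mz = c(56.9, 10.0, 23.4), rt = c(60, 5, 13),
                    name = c("C", "D", "A"), stringsAsFactors = FALSE)
idx <- c(3L, NA_integer_, 1L)

## Indexing with NA yields NA, so features without a match are kept.
x$name <- table$name[idx]

## `hits` stands in for a result with duplicates = "keep". Unmatched
## features are represented by NA, so count only the non-missing indices.
hits <- list(c(1L, 2L), NA_integer_, 5L)
n_hits <- vapply(hits, function(i) sum(!is.na(i)), integer(1))
```

Counting with lengths(hits) would report one hit for unmatched features, because their list element has length 1.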

Note

The function first finds the features in table whose retention time differs by no more than rt_tolerance from that of the feature in x, and then matches only these candidates on m/z using the closest() function.
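
The two-step procedure described in this note can be sketched in base R as follows. This is an illustration under stated assumptions and not the package implementation: match_rt_mz_sketch is a hypothetical name, the m/z step is written out by hand instead of calling closest(), and the accepted m/z difference is assumed to be tolerance plus ppm of the m/z in x.

```r
## Hypothetical base-R sketch of the two-step matching, NOT the package code.
match_rt_mz_sketch <- function(x, table, rt_tolerance = 2, tolerance = 0,
                               ppm = 20, mzcol = "mz", rtcol = "rt") {
    vapply(seq_len(nrow(x)), function(i) {
        ## Step 1: candidates within the retention time tolerance.
        cand <- which(abs(table[, rtcol] - x[i, rtcol]) <= rt_tolerance)
        ## Step 2: among the candidates, keep those within the m/z window
        ## (assumed here to be tolerance + ppm of the query m/z) ...
        mzdiff <- abs(table[cand, mzcol] - x[i, mzcol])
        ok <- mzdiff <= tolerance + ppm * x[i, mzcol] / 1e6
        if (!any(ok)) return(NA_integer_)
        ## ... and report the one with the smallest m/z difference.
        cand[ok][which.min(mzdiff[ok])]
    }, integer(1))
}

x <- data.frame(mz = c(100, 200), rt = c(10, 50))
tbl <- data.frame(mz = c(100.0015, 100.0005, 200.5), rt = c(11, 30, 50))
match_rt_mz_sketch(x, tbl)
```

In this toy example the first feature is matched to row 1 of tbl although row 2 is closer in m/z, because row 2 lies outside the retention time tolerance. The second feature has a candidate in retention time, but its m/z difference is too large, so NA is returned.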

Author

Johannes Rainer

Examples

x <- data.frame(mz = c(23.4, 45.6, 56.9, 76.5, 76.5, 76.5, 80.1),
                rt = c(12, 34, 59, 34, 67, 65, 67))
set.seed(123)
y <- rbind(x, x)
y$mz <- y$mz + rnorm(nrow(y), sd = 0.0002)
y$rt[1:nrow(x)] <- x$rt + 2
y <- y[order(y$mz), ]
matchRtMz(x, y)
#> [1] 2 4 5 7 8 9 13
## Keeping all matches
matchRtMz(x, y, duplicates = "keep")
#> [[1]]
#> [1] 1 2
#>
#> [[2]]
#> [1] 3 4
#>
#> [[3]]
#> [1] 5 6
#>
#> [[4]]
#> [1]  7 11
#>
#> [[5]]
#> [1]  8  9 10 12
#>
#> [[6]]
#> [1]  9 10 12
#>
#> [[7]]
#> [1] 13 14
#>
## Lower ppm
matchRtMz(x, y, duplicates = "keep", ppm = 5)
#> [[1]]
#> [1] 2
#>
#> [[2]]
#> [1] 3 4
#>
#> [[3]]
#> [1] 5
#>
#> [[4]]
#> [1]  7 11
#>
#> [[5]]
#> [1]  8  9 10 12
#>
#> [[6]]
#> [1]  9 10 12
#>
#> [[7]]
#> [1] 13 14
#>
## even lower
matchRtMz(x, y, duplicates = "keep", ppm = 2)
#> [[1]]
#> [1] NA
#>
#> [[2]]
#> [1] 4
#>
#> [[3]]
#> [1] 5
#>
#> [[4]]
#> [1] 7
#>
#> [[5]]
#> [1]  8  9 10
#>
#> [[6]]
#> [1]  9 10
#>
#> [[7]]
#> [1] 13 14
#>
matchRtMz(x, y, ppm = 0)
#> [1] NA NA NA NA NA NA NA