This function merges multiple data frames obtained with the `download_bold` and `download_ncbi` functions, or private data loaded with the function `loadBarcodeOre`. By default, it resolves conflicts that originate from internal mining performed by the online databases, which would otherwise result in duplicates in the final object. Set `resolve.conflicts = FALSE` to skip this step.

Usage

mergeBarcodeOres(..., resolve.conflicts = TRUE)

Arguments

...

`data.frame` Any number of refdb-formatted data frames, such as those returned by `download_ncbi`, `download_bold` and `loadBarcodeOre`.

resolve.conflicts

`logical` If `FALSE`, the function merges the refdb objects from different sources (BOLD, NCBI and custom) without any quality control steps. If `TRUE`, it searches for "mining duplicates", i.e. records that appear in both sources due to internal mining, and returns only the original record. Defaults to `TRUE`.
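The effect of `resolve.conflicts` can be illustrated with a small base R sketch on toy data. This is not the package's implementation: the helper `merge_sketch`, the accession values, and the assumption that a BOLD record stores the accession of its NCBI copy in the `NCBI_ID` column are all hypothetical, chosen only to show the idea of dropping a mined duplicate before binding the tables.

```r
# Toy stand-ins for refdb-formatted data frames (real ones have many more columns).
rec_ncbi <- data.frame(
  recordID = c("ACC0001.1", "ACC0002.1", "ACC0003.1"),  # made-up accessions
  source   = "NCBI",
  NCBI_ID  = NA_character_,
  stringsAsFactors = FALSE
)
rec_bold <- data.frame(
  recordID = c("BOLD0001", "BOLD0002"),                 # made-up BOLD IDs
  source   = "BOLD",
  # Assumption: this column points to the NCBI copy of the same record, if any.
  NCBI_ID  = c("ACC0002.1", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of the package.
merge_sketch <- function(ncbi, bold, resolve.conflicts = TRUE) {
  if (resolve.conflicts) {
    # Flag NCBI records that are already represented by a BOLD record.
    mined <- ncbi$recordID %in% bold$NCBI_ID
    ncbi <- ncbi[!mined, , drop = FALSE]
  }
  # Stack the remaining records into a single data frame.
  rbind(ncbi, bold)
}

nrow(merge_sketch(rec_ncbi, rec_bold))                             # 4
nrow(merge_sketch(rec_ncbi, rec_bold, resolve.conflicts = FALSE))  # 5
```

With conflict resolution enabled, the toy NCBI record that is also represented in the BOLD table is dropped and the BOLD record is kept. With `resolve.conflicts = FALSE`, both copies remain in the result.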

Value

`data.frame` A refdb object, including the records and sequences from all refdb objects provided as arguments.

Examples

# search and download Maldane sarsi records:
tax_ncbi <- get_ncbi_taxonomy("Maldane sarsi", ask = FALSE)
tax_bold <- get_bold_taxonomy("Maldane sarsi", ask = FALSE)
rec_ncbi <- download_ncbi(tax_ncbi, ask = FALSE)
rec_bold <- download_bold(tax_bold, ask = FALSE)

# merge all results into one
mergeBarcodeOres(rec_ncbi, rec_bold)
#> '1' records from NCBI were mined from BOLD.
#> If they are already represented by the BOLD barcodeOre they will be removed to avoid duplicates.
#> Duplicated records obtained from the BOLD will be kept.
#> # A tibble: 25 × 30
#>    recordID   markerCode  DNA_seq phylum class order family genus species source
#>    <chr>      <chr>       <DNA>   <chr>  <chr> <chr> <chr>  <chr> <chr>   <chr> 
#>  1 OQ053050.1 COX1        AACCTT… Annel… Poly… NA    Malda… Mald… Maldan… NCBI  
#>  2 OQ071313.1 large subu… GAGGGA… Annel… Poly… NA    Malda… Mald… Maldan… NCBI  
#>  3 OQ071256.1 small subu… CCTTCG… Annel… Poly… NA    Malda… Mald… Maldan… NCBI  
#>  4 KX867346.1 16S riboso… GTATCC… Annel… Poly… NA    Malda… Mald… Maldan… NCBI  
#>  5 KX867345.1 16S riboso… TATCCT… Annel… Poly… NA    Malda… Mald… Maldan… NCBI  
#>  6 AY612628.1 28S riboso… CCAACT… Annel… Poly… NA    Malda… Mald… Maldan… NCBI  
#>  7 AY612617.1 18S riboso… TATCTT… Annel… Poly… NA    Malda… Mald… Maldan… NCBI  
#>  8 AY569681.1 16S riboso… CGCGGT… Annel… Poly… NA    Malda… Mald… Maldan… NCBI  
#>  9 AY569669.1 28S riboso… TGTGCG… Annel… Poly… NA    Malda… Mald… Maldan… NCBI  
#> 10 AY569655.1 18S riboso… TGCCAG… Annel… Poly… NA    Malda… Mald… Maldan… NCBI  
#> # ℹ 15 more rows
#> # ℹ 20 more variables: lat <dbl>, lon <dbl>, lengthGene <int>, sampleID <chr>,
#> #   QueryName <chr>, identified_by <chr>, taxNotes <lgl>, db_xref <chr>,
#> #   sourceID <chr>, NCBI_ID <chr>, institutionStoring <chr>,
#> #   collected_by <chr>, collection_date <chr>, altitude <int>, depth <dbl>,
#> #   country <chr>, directionPrimers <chr>, lengthSource <int>,
#> #   PCR_primers <chr>, note <lgl>