
Load sequences and corresponding records as a refdb object
Source: R/loadBarcodeOres.R
This function builds a custom refdb-formatted data frame from additional records and sequences obtained from private analyses. The input can be supplied either as files (tsv and fasta) or as objects already loaded in R (`data.frame` and `DNAStringSet`).
Arguments
- records
`data.frame` or `character` A character string with the path (or the file name, if the file is in the working directory) to a file containing the record information. Only the tsv format is accepted. A data frame can be supplied instead, provided that it has the fields included in the example data `barcodeMineR::example_record`.
- sequences
`DNAStringSet` or `character` A character string with the path (or the file name, if the file is in the working directory) to a fasta file with the sequences corresponding to each record. A `DNAStringSet` object can be supplied instead, such as the one in the example data `barcodeMineR::example_sequence`. The name of each sequence in the `DNAStringSet` object must be the concatenation of the fields `sourceID` and `markerCode`, separated by a pipe `|`.
- prefix
`character` A character string used to create numbered custom ids for each record, in ascending order. These ids populate the `recordID` field of the final object. Defaults to `NULL`, in which case the `recordID` is derived from the field `sourceID`.
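The naming rule for `sequences` can be checked before calling the function. The following is a minimal sketch in base R, not part of the package: the `records` table and the `sequence_names` vector are hypothetical stand-ins for a real records data frame and for the names of a real `DNAStringSet` object.

```r
# Hypothetical records table: only the two fields used for naming are shown
records <- data.frame(
  sourceID = c("SAMPLE_A", "SAMPLE_B"),
  markerCode = c("COI", "16S"),
  stringsAsFactors = FALSE
)

# Expected sequence names: sourceID and markerCode joined by a pipe
expected_names <- paste(records$sourceID, records$markerCode, sep = "|")

# Hypothetical sequence names, standing in for the names of a DNAStringSet
sequence_names <- c("SAMPLE_A|COI", "SAMPLE_B|16S")

# Records without a matching sequence, and sequences without a matching record
missing_sequences <- setdiff(expected_names, sequence_names)
orphan_sequences <- setdiff(sequence_names, expected_names)
```

If both `missing_sequences` and `orphan_sequences` are empty, every record has a sequence named according to the rule described for the `sequences` argument.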
Examples
if (FALSE) { # \dontrun{
# load from tsv and fasta files
loadBarcodeOre("path/to/table.tsv", "path/to/sequences.fasta")
} # }
# load from data.frame and DNAStringSet loaded objects
loadBarcodeOre(example_record, example_sequence)
#> # A tibble: 1 × 30
#> recordID markerCode DNA_seq phylum class order family genus species source
#> <chr> <chr> <DNA> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 SEQ_01 COI AAACTCAAAG… Chord… Acti… Perc… Notot… Diss… Dissos… ACRON…
#> # ℹ 20 more variables: lat <dbl>, lon <dbl>, lengthGene <int>, sampleID <chr>,
#> # QueryName <chr>, identified_by <chr>, taxNotes <chr>, db_xref <chr>,
#> # sourceID <chr>, NCBI_ID <chr>, institutionStoring <chr>,
#> # collected_by <chr>, collection_date <chr>, altitude <chr>, depth <dbl>,
#> # country <chr>, directionPrimers <chr>, lengthSource <int>,
#> # PCR_primers <chr>, note <chr>