
This function loads a custom refdb-formatted data frame with additional records and sequences obtained from private analyses. The input can be supplied either as files (tsv and fasta) or as objects already loaded in the session (data.frame and DNAStringSet).

Usage

loadBarcodeOre(records, sequences, prefix = NULL)

Arguments

records

`data.frame` or `character` A character string with the path (or the file name, if the file is in the working directory) to a file with the records information. The file must be in tsv format. It can also be a data frame, provided that it has the fields included in the example data `barcodeMineR::example_record`.
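
The field requirement described for `records` can be checked before calling the function. The following is a minimal sketch in base R, not part of the package: `template` is a hypothetical stand-in for `barcodeMineR::example_record` that lists only a few of the column names visible in the example output on this page, and `my_records` is an invented data frame with placeholder values.

```r
# Hypothetical stand-in for barcodeMineR::example_record
# (only a subset of its columns, for illustration)
template <- data.frame(
  sourceID = character(),
  markerCode = character(),
  species = character(),
  lat = numeric(),
  lon = numeric()
)

# Hypothetical private records; the 'lon' field is missing on purpose
my_records <- data.frame(
  sourceID = "sample_A",
  markerCode = "COI",
  species = "Genus species",
  lat = 0
)

# Fields present in the template but absent from the custom records
missing_fields <- setdiff(names(template), names(my_records))
missing_fields
```

In practice, `names(barcodeMineR::example_record)` would take the place of `names(template)`, so that the comparison covers every expected field.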

sequences

`DNAStringSet` or `character` A character string with the path (or the file name, if the file is in the working directory) to a fasta file with the sequences corresponding to each record. It can also be a `DNAStringSet` object, such as the one in the example data `barcodeMineR::example_sequence`. The name of each sequence in the `DNAStringSet` object must be the concatenation of the fields `sourceID` and `markerCode`, separated by a pipe `|`.
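
The naming rule for the sequences can be illustrated with base R. This is only a sketch under assumptions: the `records` data frame below and its values are hypothetical, and only the two fields involved in the name are shown.

```r
# Hypothetical records table with the two fields used to build sequence names
records <- data.frame(
  sourceID = c("sample_A", "sample_B"),
  markerCode = c("COI", "COI"),
  stringsAsFactors = FALSE
)

# Concatenate sourceID and markerCode with a pipe, one name per record
seq_names <- paste(records$sourceID, records$markerCode, sep = "|")
seq_names
#> [1] "sample_A|COI" "sample_B|COI"
```

Assuming the sequences are held in a `DNAStringSet` in the same order as the rows of `records`, the resulting vector could then be assigned with `names(my_sequences) <- seq_names`, where `my_sequences` is a placeholder name.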

prefix

`character` A character string used to create numbered custom ids for each record, in ascending order. These ids populate the `recordID` field of the final object. Defaults to `NULL`, in which case the information extracted from the field `sourceID` is used instead.

Value

`data.frame` A refdb data frame, including the DNA sequence as a field.

Examples

if (FALSE) { # \dontrun{
# load from tsv and fasta files
loadBarcodeOre("path/to/table.tsv", "path/to/sequences.fasta")
} # }

# load from data.frame and DNAStringSet loaded objects
loadBarcodeOre(example_record, example_sequence)
#> # A tibble: 1 × 30
#>   recordID markerCode DNA_seq     phylum class order family genus species source
#>   <chr>    <chr>      <DNA>       <chr>  <chr> <chr> <chr>  <chr> <chr>   <chr> 
#> 1 SEQ_01   COI        AAACTCAAAG… Chord… Acti… Perc… Notot… Diss… Dissos… ACRON…
#> # ℹ 20 more variables: lat <dbl>, lon <dbl>, lengthGene <int>, sampleID <chr>,
#> #   QueryName <chr>, identified_by <chr>, taxNotes <chr>, db_xref <chr>,
#> #   sourceID <chr>, NCBI_ID <chr>, institutionStoring <chr>,
#> #   collected_by <chr>, collection_date <chr>, altitude <chr>, depth <dbl>,
#> #   country <chr>, directionPrimers <chr>, lengthSource <int>,
#> #   PCR_primers <chr>, note <chr>