ResistoXplorer accepts two main types of input: a list of antimicrobial resistance genes (ARGs) or data tables from metagenomic-based AMR studies.
The list input is a list of ARGs with optional abundance or fold-change values. The data table input is a table or matrix in tab-separated text (.txt) or comma-separated values (.csv) format
containing information on features (ARGs or taxa) and samples. Three types of data tables (files) are required: an abundance profile (resistome or microbiome), an annotation file (functional or taxonomic) and a metadata file.
Users can explore ARG-microbe (host) associations by entering or pasting a list of ARGs (names or IDs) of interest, with optional fold-change or abundance values.
Such a list may consist of ARGs found to be significant in differential abundance testing in metagenomic-based AMR studies, or
of ARGs identified through high-throughput qPCR. Currently, ResistoXplorer supports five primary reference databases (ResFinder, CARD,
ARDB, BacMet and the AMP dataset) for network-based exploratory analysis of ARGs. These
databases cover a diverse variety of resistance genes (antibiotic, antifungal, biocide, metal, antimicrobial peptide (AMP) and others) and can be classified by the type of information or ARGs they contain.
The amount and type of information, the gene nomenclature and the annotation scheme also
vary considerably between databases. As a result, the same input list may return a different number of hits, or no hits at all, depending on the database selected for mapping. Please select
the database that matches your input ARG type and naming format. Specific information on each database is given below:
ResFinder database (Download example gene list here || Database link here)
CARD (version: 2.0) (Download example gene list here || Database link here)
consists of 2617 ARGs. Note: microbial host associations and annotation information are collected only for ARGs present in the "protein homolog" model files, which are used for BLAST searches of metagenomic datasets.
BacMet (version: 2.0) (Download gene list here || Database link here)
consists of 772 biocide and metal resistance genes (only experimentally validated).
Note: associations and annotations are collected only for experimentally confirmed biocide and metal resistance genes. Also, the
simple, acyclic and hierarchical classification scheme designed for BacMet in MegaRes 2.0 is used here for functional annotation.
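To see why the choice of database affects the number of hits for an ARG list, the following minimal sketch parses a pasted list and checks it against name sets. It assumes one ARG per line with an optional numeric value separated by whitespace, and exact, case-sensitive name matching; both are assumptions made for illustration, not a description of how ResistoXplorer performs the mapping. The function names and the toy database contents are hypothetical.

```python
def parse_arg_list(text):
    """Parse lines of 'name [value]' into a dict of name -> float or None."""
    genes = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        # The value (fold change or abundance) is optional.
        genes[name] = float(parts[1]) if len(parts) > 1 else None
    return genes


def count_hits(genes, databases):
    """Return, per database, the sorted input names found by exact match."""
    return {
        db: sorted(name for name in genes if name in names)
        for db, names in databases.items()
    }


if __name__ == "__main__":
    pasted = "abeM\t2.5\nacrA\nunknownGene\t-1.2\n"
    # Toy name sets standing in for two databases with different nomenclature.
    toy_databases = {"toy_a": {"abeM", "acrA"}, "toy_b": {"ACRA"}}
    print(count_hits(parse_arg_list(pasted), toy_databases))
```

In this toy case the same list yields two hits in one name set and none in the other, purely because of the naming format.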
A resistome profile derived from whole-genome shotgun metagenomic data can be uploaded.
The resistome profile must be a tab-delimited text (.txt) or comma-separated values (.csv) file.
It is a data table or matrix of abundance values (raw read counts from metagenomic data),
with rows for features (ARGs) and columns for samples.
Such a delimited file can be generated with any spreadsheet or text editor software.
The file must follow the specific format described below:
It should contain sample names or IDs in the first row, beginning with "#NAME" in the first column;
Both sample and feature names must be unique and should consist of common English letters, underscores and numbers. Other special characters (e.g. single (') or double (") quotes) can also be used in feature (ARG) names. Latin/Greek letters are not supported;
Data values (read counts) must be numeric and non-negative. Blank cells and NA values are not allowed and should be replaced by zero;
Non-specific feature names (e.g. ARG_0001) can also be used in the first column. In that case, a tab-delimited (.txt) or comma-separated (.csv) annotation mapping file
that contains functional annotation information at multiple levels for each feature (ARG) must also be uploaded;
Lastly, when selecting an already compiled database for functional annotation, make sure that the feature (ARG) names in the abundance table are in the format required by the selected database.
For more details on the format for each database, refer to the "Annotation" tab above.
Resistome abundance profile with features (ARGs) annotated through ResFinder database (Download here)
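The formatting rules for the abundance table can be checked before upload with a short script. The following is a minimal sketch, not part of ResistoXplorer: the function name `validate_abundance_table` and the exact sample-name pattern are assumptions based on the rules listed above, and feature names are only checked for uniqueness because other special characters are allowed in them.

```python
import math
import re

# Assumed pattern for sample names: English letters, digits and underscores.
SAMPLE_NAME = re.compile(r"[A-Za-z0-9_]+")


def validate_abundance_table(text, delimiter="\t"):
    """Return a list of formatting problems found in an abundance table."""
    problems = []
    rows = [ln.split(delimiter) for ln in text.splitlines() if ln.strip()]
    if not rows or rows[0][0] != "#NAME":
        return ['first row must start with "#NAME"']
    samples = rows[0][1:]
    if len(set(samples)) != len(samples):
        problems.append("duplicate sample names")
    for name in samples:
        if not SAMPLE_NAME.fullmatch(name):
            problems.append(f"invalid sample name: {name!r}")
    seen = set()
    for row in rows[1:]:
        feature, values = row[0], row[1:]
        if feature in seen:
            problems.append(f"duplicate feature name: {feature!r}")
        seen.add(feature)
        if len(values) != len(samples):
            problems.append(
                f"{feature}: expected {len(samples)} values, found {len(values)}"
            )
            continue
        for value in values:
            # Blank cells, NA and negative numbers are all rejected.
            try:
                number = float(value)
                ok = math.isfinite(number) and number >= 0
            except ValueError:
                ok = False
            if not ok:
                problems.append(f"{feature}: invalid value {value!r}")
    return problems


if __name__ == "__main__":
    table = "#NAME\tS1\tS2\nARG_0001\t10\tNA\nARG_0002\t3\t7\n"
    for problem in validate_abundance_table(table):
        print(problem)
```

For a .csv file, pass `delimiter=","`. An empty result list means none of the checked rules were violated.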
For the Integration module, the user must also upload a taxonomic abundance profile along with the resistome profile.
Taxonomic profiles derived from either 16S rRNA marker gene survey data or whole-genome shotgun metagenomic data can be uploaded.
In a taxonomic abundance profile, the data values are the read counts (abundance) of taxa in each sample. The required file formats
and data formatting for the taxonomic profile are exactly the same as stated above for the resistome profile. The user can also
provide a separate taxonomic annotation mapping file to perform analysis at multiple taxonomic levels (e.g. species, genus, phylum). Please note that
feature (taxa) names containing multiple taxonomic levels in the abundance profile cannot be parsed, so an additional annotation file must always be
provided in such cases.
Resistome abundance profile (Download here)
along with mapping taxonomic annotation file (Download here)
Users can provide annotation information for features (ARGs) either by uploading a separate functional annotation file with their own annotation scheme or by
simply selecting the database (if available) that was used to annotate ARGs during upstream analysis of the resistome data. In ResistoXplorer,
we have manually collected and curated the functional annotation information from 11 (14 in total) of the most widely used antimicrobial resistance (AMR) databases to
support analysis and profiling of resistome abundances at various functional levels. The required annotation file format and the database annotation structure are described below
in detail:
The annotation file also uses the tab-separated (.txt) or comma-separated values (.csv) format. The first row should contain the functional (for a resistome profile) or
taxonomic (for a microbiome profile) levels, beginning with "#ANNOTATION" in the first column. All the feature (ARG or taxa) names must be present in
the first column of the file. There is no requirement to include information for multiple levels,
and there is no minimum or maximum number of functional or taxonomic annotation levels that must be included. Please consider the following points when formatting
the annotation file:
Use the same feature (ARG or taxa) or row names as in your input resistome or taxonomic abundance table;
Use a simple, hierarchical and acyclic annotation structure that lists information from higher to lower level for each feature name, for accurate count-based profiling;
Make sure that your data values do not contain tabs or commas,
as these are used as delimiters to separate values;
Data values should consist of common English letters, special characters and numbers. Latin/Greek letters are not supported;
Blank cells or "NA" values (without quotes) are permitted for missing values in the annotation table.
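To illustrate what count-based profiling at a given annotation level involves, the sketch below sums feature counts per sample for each label at one level of the annotation file. It is an assumption-laden illustration rather than ResistoXplorer's implementation: the function names are hypothetical, both tables are assumed to be tab-separated text, and features with a blank or "NA" label are grouped under a placeholder label "Unannotated" chosen for this example.

```python
from collections import defaultdict


def read_table(text, delimiter="\t"):
    """Split delimited text into a header row and a dict keyed by first column."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split(delimiter)
    body = {}
    for ln in lines[1:]:
        cells = ln.split(delimiter)
        body[cells[0]] = cells[1:]
    return header, body


def aggregate_by_level(abundance_text, annotation_text, level):
    """Sum feature counts per sample for each label at one annotation level."""
    ab_header, counts = read_table(abundance_text)
    an_header, annots = read_table(annotation_text)
    if ab_header[0] != "#NAME" or an_header[0] != "#ANNOTATION":
        raise ValueError("unexpected header in first column")
    col = an_header[1:].index(level)  # raises ValueError if the level is absent
    n_samples = len(ab_header) - 1
    totals = defaultdict(lambda: [0.0] * n_samples)
    for feature, values in counts.items():
        cells = annots.get(feature, [])
        label = cells[col] if col < len(cells) else ""
        if label in ("", "NA"):
            label = "Unannotated"  # placeholder label used only in this sketch
        for i, value in enumerate(values):
            totals[label][i] += float(value)
    return ab_header[1:], dict(totals)


if __name__ == "__main__":
    abundance = "#NAME\tS1\tS2\nARG_0001\t10\t0\nARG_0002\t3\t7\n"
    annotation = (
        "#ANNOTATION\tClass\tMechanism\n"
        "ARG_0001\tbetalactams\tClass D betalactamases\n"
        "ARG_0002\tbetalactams\tClass C betalactamases\n"
    )
    print(aggregate_by_level(abundance, annotation, "Class"))
```

The lookup by feature name is why the row names in the annotation file must match those in the abundance table exactly.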
Currently, ResistoXplorer supports manually curated functional annotation information from several of the most widely
used primary and secondary AMR databases, including ResFinder, CARD, ARDB, ARG-ANNOT, MegaRes, AMRFinder,
SARG, DeepARG-DB, ARGminer, BacMet and the AMP database. Each of these databases uses its own naming, annotation
and classification scheme for the features (i.e., ARGs) identified in the resistome profile. In addition,
the functional annotation information, as well as the annotation levels at which a resistome profile can be analyzed, varies considerably between databases.
Users must make sure that the feature (row) names in their uploaded resistome profile are in the same format as in the
selected database in order to use its hierarchical functional annotations without uploading a separate file. Please note that all
feature (row) names are unique in the annotation tables of the collected databases.
Here is an example of how the features (ARGs) are named (first column) and how the functional annotation levels (first row) are organized in each database:
ResFinder (version: 4.1) (Download example here || Database link here)
consists of 3152 features annotated at three functional levels (Class, Mechanism and Gene).
CARD (version: 3.1.3) (Download example here || Database link here)
consists of 2979 features annotated at three functional levels (Mechanism, Family and Gene).
Note: annotation information is collected only for features (ARGs) present in the "nucleotide fasta protein homolog" model files, which are used for BLAST searches of metagenomic datasets.
MegaRes (version: 2.0) Full (Download example here || Database link here)
consists of 7868 features annotated at four functional levels (Type, Class, Mechanism and Group).
Note: contains annotation information for all the ARGs conferring resistance to drugs, biocides and metals.
MegaRes (version: 2.0) Drugs only (Download example here || Database link here)
consists of 7126 features annotated at three functional levels (Class, Mechanism and Group).
Note: contains annotation information only for the ARGs conferring resistance to drugs.
#ANNOTATION	Class	Mechanism	Group
Bla|OXA-223|JN248564|1-825|825|betalactams|Class_D_betalactamases|OXA	betalactams	Class D betalactamases	OXA
gi|698174209|gb|KM087859.1|betalactams|Class_C_betalactamases|MIR	betalactams	Class C betalactamases	MIR
1172|AF317511.1|AF317511|betalactams|Class_B_betalactamases|VIM	betalactams	Class B betalactamases	VIM
959|M97297.1|TRNVAN|Glycopeptides|VanA-type_accessory_protein|VANZA	Glycopeptides	VanA-type accessory protein	VANZA
Gly|VanY-A|M97297|9052-9963|912|Glycopeptides|VanA-type_accessory_protein|VANYA	Glycopeptides	VanA-type accessory protein	VANYA
Mdr|AY769962.1|gene1|Multi-drug_resistance|Multi-drug_efflux_pumps|ADEAI	Multi-drug resistance	Multi-drug efflux pumps	ADEAI
617|HQ875016.1|HQ875016|Phenicol|Phenicol_efflux_pumps|CML	Phenicol	Phenicol efflux pumps	CML
MegaRes (version: 1.0.1) (Download example here || Database link here)
consists of 3824 features annotated at three functional levels (Class, Mechanism and Group).
#ANNOTATION	Class	Mechanism	Group
Bla|OXA-223|JN248564|1-825|825|betalactams|Class_D_betalactamases|OXA	betalactams	Class D betalactamases	OXA
gi|698174209|gb|KM087859.1|betalactams|Class_C_betalactamases|MIR	betalactams	Class C betalactamases	MIR
1172|AF317511.1|AF317511|betalactams|Class_B_betalactamases|VIM	betalactams	Class B betalactamases	VIM
959|M97297.1|TRNVAN|Glycopeptides|VanA-type_accessory_protein|VANZA	Glycopeptides	VanA-type accessory protein	VANZA
Gly|VanY-A|M97297|9052-9963|912|Glycopeptides|VanA-type_accessory_protein|VANYA	Glycopeptides	VanA-type accessory protein	VANYA
Mdr|AY769962.1|gene1|Multi-drug_resistance|Multi-drug_efflux_pumps|ADEAI	Multi-drug resistance	Multi-drug efflux pumps	ADEAI
617|HQ875016.1|HQ875016|Phenicol|Phenicol_efflux_pumps|CML	Phenicol	Phenicol efflux pumps	CML
AMRFinder (Download example here || Database link here)
consists of 4156 features annotated at two functional levels (Class and Mechanism).
BacMet (version: 2.0) (Download example here || Database link here)
consists of 607 features annotated at two functional levels (Mechanism and Class).
Note: annotations are collected only for experimentally confirmed biocide and metal resistance genes. Also, the
simple, acyclic and hierarchical classification scheme designed for BacMet in MegaRes 2.0 is used here for functional annotation.
#ANNOTATION	Mechanism	Class
abeM	Drug and biocide MATE efflux pumps	Drug and biocide resistance
abeS	Drug and biocide SMR efflux pumps	Drug and biocide resistance
abuO	Multi-biocide RND efflux pump	Multi-biocide resistance
acn	Iron resistance protein	Iron resistance
acr3	Arsenic resistance membrane transporter	Arsenic resistance
acrA	Drug and biocide RND efflux pumps	Drug and biocide resistance
acrB	Drug and biocide RND efflux pumps	Drug and biocide resistance
acrC	Drug and biocide RND efflux pumps	Drug and biocide resistance
Antimicrobial peptide (AMP) dataset (Download example here || Database link here)
consists of 131 features annotated at one functional level (Mechanism).
The metadata file also uses the tab-separated (.txt) or comma-separated values (.csv) format. Sample names or IDs are in the first column, starting with "#NAME" in the first row.
In a metadata file, samples are in rows and metadata types (experimental factors, e.g. Treatment) are in columns. Please consider the following points when
formatting the metadata file:
Use the same sample names or IDs as in your input resistome or microbiome abundance table;
Data values (metadata labels) should be discrete and qualitative (e.g. HIGH, MED, LOW);
The file must not contain blank cells or NA values;
Make sure that neither your metadata type names nor your metadata labels include tabs or commas,
as these are used as delimiters to separate values.
#NAME	Treatment	TimePoint	Gender
Sample1	Control	Day_0	M
Sample2	Control	Day_0	F
Sample3	Control	Day_11	M
Sample4	Control	Day_11	F
Sample5	Antibiotics	Day_0	M
Sample6	Antibiotics	Day_0	F
Sample7	Antibiotics	Day_11	M
Sample8	Antibiotics	Day_11	F
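A metadata file such as the one above can be checked against the sample names of the abundance table before upload. The sketch below is only an illustration under stated assumptions: the function name `validate_metadata` is hypothetical, the metadata is assumed to be tab-separated text, and the sample names of the abundance table are passed in as a plain list.

```python
def validate_metadata(metadata_text, abundance_samples, delimiter="\t"):
    """Return a list of problems found in a metadata table."""
    lines = [ln for ln in metadata_text.splitlines() if ln.strip()]
    if not lines:
        return ["metadata file is empty"]
    header = lines[0].split(delimiter)
    if header[0] != "#NAME":
        return ['first row must start with "#NAME"']
    problems = []
    seen = []
    for ln in lines[1:]:
        cells = ln.split(delimiter)
        sample, labels = cells[0], cells[1:]
        seen.append(sample)
        if len(labels) != len(header) - 1:
            problems.append(f"{sample}: wrong number of columns")
            continue
        for factor, label in zip(header[1:], labels):
            # Blank cells and NA values are not allowed in metadata.
            if label.strip() in ("", "NA"):
                problems.append(f"{sample}: missing value for {factor}")
    missing = sorted(set(abundance_samples) - set(seen))
    extra = sorted(set(seen) - set(abundance_samples))
    if missing:
        problems.append("samples missing from metadata: " + ", ".join(missing))
    if extra:
        problems.append("samples not in abundance table: " + ", ".join(extra))
    return problems


if __name__ == "__main__":
    metadata = (
        "#NAME\tTreatment\tTimePoint\n"
        "Sample1\tControl\tDay_0\n"
        "Sample2\tAntibiotics\tNA\n"
    )
    for problem in validate_metadata(metadata, ["Sample1", "Sample2", "Sample3"]):
        print(problem)
```

An empty result list means the sample names match in both files and every sample has a label for every metadata type.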