# Loading QBMS library
library(QBMS)
# Configuring the connection
set_qbms_config(url = "https://gigwa.icarda.org:8443/gigwa/", engine = "gigwa")
login_gigwa("Tamara", "-Zy2gn5ijQcW!EE")

12 Module 3.2: Using the QBMS Package to Query Genotypic Data
We have explored how to import genotypic data from files into R, but we can also retrieve it directly from online databases, such as Gigwa, using ICARDA’s QBMS package.
Once logged in, we can use the gigwa_list_dbs() function to view all available databases.
gigwa_list_dbs()
[1] "BarleySubData" "Barley_Hvulgare2"
[3] "Barley_Hvulgare3" "Barley_Hvulgare4"
[5] "Barley_Hvulgare5" "Barley_MegaProject1"
[7] "Barley_MegaProject1_public" "Cactus_Copuntia1"
[9] "Chickpea_Carietinum1" "Chickpea_Carietinum2"
[11] "Chickpea_Carietinum3" "Chickpea_Carietinum4"
[13] "Faba_Vfaba1" "GrassPea_Lsativus1"
[15] "GrassPea_Lsativus2" "Musa_Macuminata1"
[17] "WheatDurum_Tdurum1" "WheatDurum_Tdurum2"
[19] "WheatDurum_Tdurum3" "WheatDurum_Tdurum4"
[21] "WheatDurum_Tdurum5" "WheatDurum_Tdurum6"
[23] "WheatDurum_Tdurum7" "WheatDurum_Tdurum8"
[25] "WheatWild_Tspp1" "Wheat_Taestivum1"
[27] "Wheat_Taestivum2" "Wheat_Taestivum3"
[29] "Wheat_Taestivum4" "Wheat_Taestivum5"
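With this many databases, it can help to narrow the list by crop before choosing one. The sketch below assumes that gigwa_list_dbs() returns a plain character vector, as the printed output suggests. To keep it runnable without a server connection, it uses a hard-coded subset of the names printed above in place of the live call.

```r
# Subset of the database names printed above; in a live session this
# could be replaced with: dbs <- gigwa_list_dbs()
dbs <- c("BarleySubData", "Barley_Hvulgare2", "Barley_Hvulgare3",
         "Barley_MegaProject1", "Chickpea_Carietinum1", "Faba_Vfaba1",
         "WheatDurum_Tdurum1", "Wheat_Taestivum1")

# Keep only the names that start with "Barley"
barley_dbs <- grep("^Barley", dbs, value = TRUE)
print(barley_dbs)
```

The same pattern works for any other prefix in the list, such as "^Wheat" or "^Chickpea".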
For this example, we will use the “BarleySubData” database.
# To set a database
gigwa_set_db("BarleySubData")

Once we have set a database, we also need to set a project and a run. We can do this as follows.
# To view available projects
gigwa_list_projects()
      studyName
1 BarleySubData
# To set a project
gigwa_set_project("BarleySubData")
# To view available runs
gigwa_list_runs()
  variantSetName
1           Run1
# To set a run
gigwa_set_run("Run1")

Once a database, a project, and a run are set, several functions are available for extracting information.
- gigwa_get_samples(): Retrieves the list of samples associated with the selected Gigwa project.
- gigwa_get_sequences(): Retrieves the list of chromosomes associated with the selected Gigwa project.
- gigwa_get_markers(start = NULL, end = NULL, chrom = NULL, simplify = TRUE): Retrieves the list of SNP variants from the selected Gigwa run. The following parameters can be set:
  - start: start position of the query
  - end: end position of the query
  - chrom: chromosome
  - simplify: defaults to TRUE, which returns the data in HapMap format with columns for rs#, alleles, chromosome and position
- gigwa_get_allelematrix(samples = NULL, start = 0, end = "", chrom = NULL, snps = NULL, simplify = TRUE): Retrieves a two-dimensional matrix of genotype data from the selected Gigwa run. The following parameters can be set:
  - samples: optional list of sample IDs; if NULL, all samples are included
  - start: start position of the query
  - end: end position of the query
  - chrom: chromosome
  - snps: list of SNP variants to filter by
  - simplify: defaults to TRUE, which returns the data in numeric coding (0, 1, 2 for diploids)
- gigwa_get_metadata(): Retrieves associated metadata (if available).
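As an illustration of how these arguments combine, the sketch below wraps a region query in a small helper. The helper name fetch_region, the chromosome label "1H", and the coordinates are hypothetical and chosen only for illustration; valid chromosome names should be taken from gigwa_get_sequences(). The calls use only the arguments listed above and assume that the QBMS package is installed and that a database, project, and run are already set.

```r
# Hypothetical helper: fetch markers and genotypes for one genomic region.
# Assumes the QBMS package is installed and that a Gigwa database,
# project and run have already been set in the current session.
fetch_region <- function(chrom, start, end) {
  # Marker table for the region (HapMap-style columns, since simplify = TRUE
  # is the default)
  markers <- QBMS::gigwa_get_markers(start = start, end = end, chrom = chrom)

  # Genotype matrix for the same region; samples is left at its default
  # (NULL), so all samples are included
  geno <- QBMS::gigwa_get_allelematrix(start = start, end = end, chrom = chrom)

  list(markers = markers, genotypes = geno)
}

# Example call (not run here; "1H" and the coordinates are placeholders):
# region <- fetch_region(chrom = "1H", start = 1, end = 1000000)
```

Restricting a query to one region in this way is likely to be much faster than retrieving every marker in the run, which may matter for large databases.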
# Get a list of all samples in the selected run
samples <- gigwa_get_samples()
# Get sequence list
chroms <- gigwa_get_sequences()
# Get markers
markers <- gigwa_get_markers()
# Get genotypic matrix
#marker_matrix <- gigwa_get_allelematrix()
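
Once these objects have been retrieved, ordinary base R tools can be used to inspect them. The sketch below uses small made-up stand-ins rather than real query results: the column names (rs#, alleles, chrom, pos), the marker IDs, and the genotype values are assumptions based on the HapMap format and 0/1/2 coding described above, and the orientation of the matrix (markers in rows, samples in columns) is also assumed.

```r
# Toy stand-in for the marker table (invented values, assumed column names)
markers_toy <- data.frame(
  `rs#`   = c("m1", "m2", "m3", "m4"),
  alleles = c("A/G", "C/T", "A/T", "G/C"),
  chrom   = c("1H", "1H", "2H", "2H"),
  pos     = c(1500, 82000, 4300, 99000),
  check.names = FALSE
)

# Markers on chromosome 1H at positions up to 50,000 bp
region <- markers_toy[markers_toy$chrom == "1H" & markers_toy$pos <= 50000, ]
print(region)

# Toy stand-in for the genotype matrix: markers in rows, samples in columns,
# coded 0/1/2 with NA for missing calls (invented values)
geno_toy <- matrix(
  c(0,  1,  2,
    2,  NA, 2,
    0,  0,  1,
    NA, NA, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(markers_toy$`rs#`, c("s1", "s2", "s3"))
)

# Proportion of missing calls per marker
missing_rate <- rowMeans(is.na(geno_toy))
print(missing_rate)
```

With real data, the same expressions could be applied to the markers and marker_matrix objects, after checking their actual column names and dimensions with str() or dim().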