@groovy.util.logging.Log class VariantDB extends java.lang.Object implements java.io.Closeable
VariantDB implements a simple, embedded, variant tracking database that can be used to track variants identified in multiple samples and sequencing runs. It allows quick computation of variant counts across samples, families, batches of samples, etc.
For the full power of the database, you should use a PED file, parsed with the Pedigrees class, when adding variants and samples. However, this parameter can be passed as null, in which case all samples are treated as singletons. Be aware that variant counts will not be family aware in that case, and will be distorted if your sequencing includes pedigrees that are large compared to the overall sample count.
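To see why family-aware counting matters, the sketch below counts carriers of one variant twice: once treating every sample as a singleton, and once collapsing samples by family. The sample and family identifiers are made up, and the code is plain Groovy for illustration only. It is not how VariantDB computes its counts.

```groovy
// Hypothetical data: sample id -> family id, as a PED file would define it.
// None of these names come from VariantDB; they exist only for this sketch.
Map<String, String> familyOf = [
    kid1: 'FAM1', mum1: 'FAM1', dad1: 'FAM1', sib1: 'FAM1',
    s2  : 'FAM2',
    s3  : 'FAM3'
]

// Samples in which the variant was observed
List<String> carriers = ['kid1', 'mum1', 'sib1', 's2']

// Without pedigree information every sample counts as its own family
int singletonCount = carriers.toSet().size()

// With pedigree information, related carriers collapse to one observation
int familyCount = carriers.collect { familyOf[it] }.toSet().size()

println "samples: ${singletonCount}, families: ${familyCount}"
```

Here one large family contributes three of the four carriers, so the per-sample count overstates how often the variant has been seen independently.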
To add variants, parse them using the VCF class and then simply use the #add method:

    VariantDB db = new VariantDB("test.db")
    VCF.parse("test.vcf") { v -> db.add("batch1", null, v) }

Type | Name and description |
---|---|
static java.lang.Object | ANNOVAR_ANNOTATION_PROFILE |
java.util.Map<java.lang.String, java.util.List<java.lang.String>> | annotationProfile |
java.util.Map<java.lang.String, java.lang.Object> | cachedSampleInfo Map of string to database rows of cached sample information |
java.lang.String | connectString The file name of the database that is connected to |
groovy.sql.Sql | db The actual database connection |
java.lang.String | driver The file name of the database that is connected to |
Schema | schema Schema for database |

Constructor and description |
---|
VariantDB(java.lang.String fileName) Open or create a VariantDB using default settings and the given file name. |

Type Params | Return Type | Name and description |
---|---|---|
| | int | add(java.lang.String batch, Pedigrees peds, gngs.Variant v, gngs.Variant$Allele alleleToAdd, java.lang.String cohort, java.lang.String sampleToAdd, java.lang.Object annotations) Add the given variant to the database. |
| | java.util.Map | addCachedSampleInfo(java.util.List<java.lang.String> addSamples, java.lang.String batch, java.lang.String cohort, Pedigrees peds) Returns cached rows from the sample table for each sample. |
| | java.lang.Object | addSample(java.lang.String sampleId, Pedigrees peds, java.lang.String cohort, java.lang.String batch) Add information about the given sample to the database |
| | void | addSampleBatch(java.lang.Long sampleId, java.lang.String batch, java.lang.String cohort) |
| | void | close() |
| | java.util.Map | countObservations(java.lang.String chr, int start, int end, java.lang.String alt) Returns a map containing the following keys: |
| | java.util.Map | countObservations(gngs.Variant v, gngs.Variant$Allele allele, java.lang.String batch) Return counts of the number of observations of the given variant. |
| | java.lang.Object | findSample(java.lang.String sampleId) Return row of database representing sample, or null if it does not exist. |
| | java.lang.Object | findVariant(gngs.Variant variant, gngs.Variant$Allele allele) Find a variant in the database by its start position |
| | java.lang.Object | getAnnotation(java.lang.String type, java.lang.Object annotations) Search for the given value in the annotations as a frequency value. |
| | java.lang.Float | getFreq(java.lang.String type, java.lang.Object annotations) Search for the given value in the annotations as a frequency value. |
| | void | init() Check the database exists and upgrade it if necessary |
| | static void | main(java.lang.String[] args) Simple test program - all it does is create the database |
| | java.util.Map | queryVariantCounts(java.util.Map options, gngs.Variant variant, gngs.Variant$Allele allele, java.lang.String sampleId) Return a set of counts of times the given variant has been observed with in |
| | java.lang.String | trimSampleId(java.lang.String sampleId) |
| | void | tx(groovy.lang.Closure c) Execute the given closure in the scope of a transaction, and roll it back if an exception occurs. |

Methods inherited from class | Name |
---|---|
class java.lang.Object | java.lang.Object#wait(long), java.lang.Object#wait(long, int), java.lang.Object#wait(), java.lang.Object#equals(java.lang.Object), java.lang.Object#toString(), java.lang.Object#hashCode(), java.lang.Object#getClass(), java.lang.Object#notify(), java.lang.Object#notifyAll() |

cachedSampleInfo - Map of string to database rows of cached sample information

connectString - The file name of the database that is connected to

db - The actual database connection

driver - The file name of the database that is connected to

schema - Schema for database

VariantDB(java.lang.String fileName)
Open or create a VariantDB using default settings and the given file name. The default settings connect to a SQLite database with the given name.

add(java.lang.String batch, Pedigrees peds, gngs.Variant v, gngs.Variant$Allele alleleToAdd, java.lang.String cohort, java.lang.String sampleToAdd, java.lang.Object annotations)
Add the given variant to the database. If sampleToAdd is specified, add it for only the given sample. Otherwise, add it for all the samples that are genotyped to have the variant.
Parameters:
annotations - optional annotations to draw from. These can be used when the annotations are not embedded in the VCF file. The annotations are queried for keys defined in the annotationProfile field. The default mappings are set up to look for Annovar annotations. The annotations themselves can be any object having properties which will be queried. In practice, a CSV parser PropertyMapper is what is being used here to pass in Annovar annotations.

addCachedSampleInfo(java.util.List<java.lang.String> addSamples, java.lang.String batch, java.lang.String cohort, Pedigrees peds)
Returns cached rows from the sample table for each sample. If rows do not exist yet for the sample, adds rows to the table. If rows do exist, returns the existing rows. Also adds batch information.

addSample(java.lang.String sampleId, Pedigrees peds, java.lang.String cohort, java.lang.String batch)
Add information about the given sample to the database

countObservations(java.lang.String chr, int start, int end, java.lang.String alt)
Returns a map containing the following keys:
Parameters:
chr - Chromosome of variant
start - starting position of DNA change caused by variant (note this may be different to VCF position depending on the representation of your indel in the VCF!)
end - end position of DNA change caused by variant
alt - alternate sequence

countObservations(gngs.Variant v, gngs.Variant$Allele allele, java.lang.String batch)
Return counts of the number of observations of the given variant. Two values are returned:
Parameters:
batch - If provided, variants will only be counted if they are observed in this batch OR any earlier batch. The intent is to support reproducibility so that new samples can be added to the database without changing the variant counts if queried for old samples.

findSample(java.lang.String sampleId)
Return row of database representing sample, or null if it does not exist.
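
The batch restriction described for countObservations can be illustrated with a small plain-Groovy sketch. How VariantDB decides that one batch is "earlier" than another is not stated here, so the explicit `batchOrder` list below is an assumption, as are the record layout and the `countUpTo` helper.

```groovy
// Assumed ordering of batches, earliest first (hypothetical)
List<String> batchOrder = ['batch1', 'batch2', 'batch3']

// Count observations made in the given batch or any earlier batch.
// A null batch means no restriction.
def countUpTo = { List<Map> obs, String batch ->
    if (batch == null) return obs.size()
    int cutoff = batchOrder.indexOf(batch)
    obs.count { batchOrder.indexOf(it.batch) <= cutoff }
}

// Hypothetical observations of a single variant
List<Map> observations = [
    [sample: 'S1', batch: 'batch1'],
    [sample: 'S2', batch: 'batch2']
]

def before = countUpTo(observations, 'batch2')

// A later batch arrives after the original analysis
observations << [sample: 'S3', batch: 'batch3']

def after = countUpTo(observations, 'batch2')

// The count reported for a batch2 query is unchanged by the new sample
assert before == after
```

Querying with the original batch therefore gives the same answer before and after new samples are loaded, which is the reproducibility property the parameter is meant to provide.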

findVariant(gngs.Variant variant, gngs.Variant$Allele allele)
Find a variant in the database by its start position

getAnnotation(java.lang.String type, java.lang.Object annotations)
Search for the given value in the annotations as a frequency value.
Parameters:
type - The type of value to search for, must be one of the predefined annotation profile keys
annotations - The annotations to search

getFreq(java.lang.String type, java.lang.Object annotations)
Search for the given value in the annotations as a frequency value.
Parameters:
type - The type of frequency to search for, must be a key in the annotation profile
annotations - The annotations to search

init()
Check the database exists and upgrade it if necessary

main(java.lang.String[] args)
Simple test program - all it does is create the database
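
The profile-driven lookup described for getAnnotation and getFreq can be pictured roughly as follows. This is a sketch under assumptions: the profile keys, the column names, and the `lookupAnnotation` and `lookupFreq` helpers are all hypothetical, and a plain Map stands in for the annotations object. The real mappings live in the annotationProfile field and are not reproduced here.

```groovy
// Hypothetical profile: annotation type -> candidate property names to try
Map<String, List<String>> profile = [
    pop_freq: ['FREQ_COL_A', 'FREQ_COL_B'],
    gene    : ['GENE_COL']
]

// Stand-in for an annotations object; a Map is the simplest thing with "properties"
Map<String, String> annotations = [FREQ_COL_B: '0.0123', GENE_COL: 'BRCA1']

// Return the first candidate property that has a value, or null if none does
def lookupAnnotation = { String type, Map ann ->
    List<String> candidates = profile[type]
    if (candidates == null)
        throw new IllegalArgumentException("Unknown annotation type: ${type}")
    candidates.collect { ann[it] }.find { it != null }
}

// Same lookup, but interpret the value as a frequency
def lookupFreq = { String type, Map ann ->
    def raw = lookupAnnotation(type, ann)
    raw == null ? null : raw.toString().toFloat()
}

assert lookupAnnotation('gene', annotations) == 'BRCA1'
assert lookupFreq('pop_freq', annotations) == 0.0123f
```

Keeping the mapping in a profile means a different annotation source could be supported by swapping the map rather than changing the lookup code.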
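
queryVariantCounts(java.util.Map options, gngs.Variant variant, gngs.Variant$Allele allele, java.lang.String sampleId)
Return a set of counts of times the given variant has been observed with in
The returned map has 3 attributes: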

tx(groovy.lang.Closure c)
Execute the given closure in the scope of a transaction, and roll it back if an exception occurs.
Parameters:
c - Closure to execute
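
The transaction behaviour described for tx can be sketched with a stand-in connection, so the snippet runs without a database. `FakeConnection` and this `tx` method are hypothetical. Committing on success and rethrowing after the rollback are assumptions about reasonable behaviour, not statements about VariantDB's implementation.

```groovy
// Hypothetical stand-in for a database connection that records what happened
class FakeConnection {
    List<String> log = []
    void commit()   { log << 'commit' }
    void rollback() { log << 'rollback' }
}

// Run the closure; commit if it completes, roll back and rethrow if it fails
def tx(FakeConnection conn, Closure c) {
    try {
        def result = c.call()
        conn.commit()
        return result
    }
    catch (Exception e) {
        conn.rollback()
        throw e
    }
}

def conn = new FakeConnection()

// Successful work is committed
tx(conn) { conn.log << 'insert variant' }
assert conn.log == ['insert variant', 'commit']

// Failing work is rolled back, and the exception still reaches the caller
boolean failed = false
try {
    tx(conn) { throw new IllegalStateException('simulated failure') }
}
catch (IllegalStateException expected) {
    failed = true
}
assert failed
assert conn.log == ['insert variant', 'commit', 'rollback']
```

Groovy's own groovy.sql.Sql class offers withTransaction(Closure), which follows the same commit-or-rollback pattern against a real connection.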