class SampleInfo extends java.lang.Object
Metadata about a sample.
Designed to be compatible with the MGHA (Melbourne Genomics Health Alliance) sample information format.
Type | Name and description |
---|---|
static java.util.List<java.lang.String> | MG_COLUMNS MGHA redefined the column order and contents to include many fields that are not of interest to others, so a separate mapping is kept for them. |
static java.util.List<java.lang.String> | SIMPLE_COLUMNS |
java.util.List<java.lang.String> | altIds Recognised alternative identifiers for the sample |
java.lang.String | analysisContact |
java.lang.String | batch Id of batch in which the sample was sequenced |
java.util.List<java.util.Date> | captureDates |
Consanguinity | consanguinity Whether the sample is consanguineous |
float | dnaConcentrationNg DNA concentration in nanograms |
java.util.List<java.util.Date> | dnaDates |
float | dnaQuality |
float | dnaQuantity |
Ethnicity | ethnicity |
java.util.Map<java.lang.String, java.util.List<java.lang.String>> | fileMappings |
java.util.Map | files Files containing data specific to this sample, indexed by file type: fastq, coverage (output from coverageBed), vcf, bam, cram |
java.util.Map<java.lang.String, java.lang.Integer> | geneCategories List of genes prioritised for the sample |
java.lang.String | institution Hospital or organization responsible for the patient from which the sample originated |
java.lang.String | library The library |
java.util.List<java.lang.String> | machineIds |
float | meanCoverage Mean coverage as reported by sequencing provider |
java.lang.String | pedigree The pedigree of the family |
java.lang.String | sample Sample name |
SampleType | sampleType Whether the sample is normal or tumor |
java.lang.String | sequencingContact |
java.util.List<java.util.Date> | sequencingDates |
Sex | sex The sex of the sample |
java.lang.String | target Target (flagship) name |
java.lang.String | variantsFile |
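The files property is described as a map indexed by file type, and indexFileType(index, endings) takes a list of file name endings. As a rough illustration of that idea only, the following standalone Java sketch groups file paths by type according to their endings. The class name FileTypeIndexSketch, the method indexByType, and the endings listed are all assumptions made for this example; they are not taken from SampleInfo itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FileTypeIndexSketch {
    // Hypothetical endings for each file type; the real class may recognise different ones.
    static final Map<String, List<String>> ENDINGS = new LinkedHashMap<>();
    static {
        ENDINGS.put("fastq", Arrays.asList(".fastq", ".fastq.gz", ".fq", ".fq.gz"));
        ENDINGS.put("coverage", Arrays.asList(".cov", ".cov.gz"));
        ENDINGS.put("vcf", Arrays.asList(".vcf", ".vcf.gz"));
        ENDINGS.put("bam", Arrays.asList(".bam"));
        ENDINGS.put("cram", Arrays.asList(".cram"));
    }

    /** Groups each path under the first file type whose endings match; unmatched paths are ignored. */
    static Map<String, List<String>> indexByType(List<String> paths) {
        Map<String, List<String>> files = new LinkedHashMap<>();
        for (String type : ENDINGS.keySet()) {
            files.put(type, new ArrayList<>());
        }
        for (String path : paths) {
            for (Map.Entry<String, List<String>> entry : ENDINGS.entrySet()) {
                if (entry.getValue().stream().anyMatch(path::endsWith)) {
                    files.get(entry.getKey()).add(path);
                    break;
                }
            }
        }
        return files;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("s1_R1.fastq.gz", "s1.bam", "s1.vcf.gz", "notes.txt");
        // Each recognised type maps to the matching paths; notes.txt matches no type.
        System.out.println(indexByType(paths));
    }
}
```

Keeping the endings in a map keyed by type means a new file type can be supported by adding one entry, which is roughly what a call such as indexFileType("cram", endings) suggests.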
Constructor and description |
---|
SampleInfo() |
Type Params | Return Type | Name and description |
---|---|---|
| static java.util.Map<java.lang.String, SampleInfo> | fromFiles(java.util.List<java.lang.String> files, java.lang.String mask) Create SampleInfo objects from the provided files, which contain sample information in their header data. |
| static java.util.List<java.lang.String> | getBAMSamples(java.lang.String fileName) |
| static java.util.List<java.lang.String> | getVCFSamples(java.lang.String fileName) |
| void | indexFileType(java.lang.String index, java.util.List<java.lang.String> endings) |
| void | indexFileTypes() |
| static java.util.Date | parseDate(java.lang.String dateValue) |
| static java.util.Map<java.lang.String, SampleInfo> | parse_mg_sample_info(java.lang.String fileName) Parse the given file to extract sample info, where the file is in the extended Melbourne Genomics Health Alliance format. |
| static java.util.Map<java.lang.String, SampleInfo> | parse_sample_info(java.lang.String fileName) Parse sample info, auto-detecting whether the file is in the MGHA extended format or the simplified format. |
| static java.util.Map<java.lang.String, SampleInfo> | parse_sample_info(java.lang.String fileName, java.util.List columns) Parse the given file to extract sample information. |
| static java.util.List<java.lang.String> | readSampleInfoLines(java.lang.String fileName) |
| java.lang.String | toString() |
| java.lang.String | toTsv(java.util.List<java.lang.String> columns, java.lang.String father, java.lang.String mother) Return a tab separated string compatible with the samples.txt file format, containing the details for this sample. |
| void | validate() Validates that text fields are in the correct format. |
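The parse methods above return a map of sample information, and parse_sample_info(fileName, columns) accepts an explicit column list. The following standalone Java sketch shows one way tab separated lines could be turned into such a map when the column order is supplied by the caller. It is an illustration under stated assumptions, not the library's implementation: the class SampleInfoParseSketch, the column names in COLUMNS, keying the result on a column called "sample", and the use of plain string maps instead of SampleInfo objects are all hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SampleInfoParseSketch {
    // Hypothetical column order, chosen only for this illustration.
    static final List<String> COLUMNS = Arrays.asList("sample", "batch", "target", "sex");

    /**
     * Converts each non-blank, tab separated line into a column -> value map,
     * keyed in the result by the value of the "sample" column.
     * Columns missing from a short line are filled with empty strings.
     */
    static Map<String, Map<String, String>> parse(List<String> lines, List<String> columns) {
        Map<String, Map<String, String>> bySample = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] values = line.split("\t", -1); // -1 keeps trailing empty fields
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < columns.size(); i++) {
                row.put(columns.get(i), i < values.length ? values[i] : "");
            }
            bySample.put(row.get("sample"), row);
        }
        return bySample;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "NA12878\tbatch001\tCARDIAC\tFemale",
                "NA12877\tbatch001\tCARDIAC\tMale");
        Map<String, Map<String, String>> samples = parse(lines, COLUMNS);
        System.out.println(samples.get("NA12878").get("sex")); // prints "Female"
    }
}
```

Passing the column list as a parameter is what allows the same parsing loop to serve more than one layout, which fits the existence of separate MG_COLUMNS and SIMPLE_COLUMNS lists.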
Methods inherited from class | Name |
---|---|
class java.lang.Object | java.lang.Object#wait(long), java.lang.Object#wait(long, int), java.lang.Object#wait(), java.lang.Object#equals(java.lang.Object), java.lang.Object#toString(), java.lang.Object#hashCode(), java.lang.Object#getClass(), java.lang.Object#notify(), java.lang.Object#notifyAll() |
toTsv(java.util.List<java.lang.String> columns, java.lang.String father, java.lang.String mother)
Return a tab separated string compatible with the samples.txt file format, containing the details for this sample.
father - optional father id; if provided, it is output into the pedigree field
mother - optional mother id; if provided, it is output into the pedigree field
validate()
Validates that text fields are in the correct format.
This format validation is specific to Melbourne Genomics.
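To make the toTsv description more concrete, here is a standalone Java sketch of producing one tab separated line for a chosen list of columns, with optional father and mother ids written into the pedigree field. This is a guess at the general shape, not the actual method: the class SampleTsvSketch, the representation of a sample as a string map, and in particular the father=...,mother=... encoding are hypothetical, since the real layout of the pedigree field is not specified here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SampleTsvSketch {
    /**
     * Joins the values of the requested columns with tabs.
     * If a father or mother id is given, the "pedigree" column is written using a
     * placeholder encoding (father=<id>,mother=<id>) that is purely hypothetical.
     * Columns with no value are written as empty strings.
     */
    static String toTsv(Map<String, String> fields, List<String> columns,
                        String father, String mother) {
        List<String> parents = new ArrayList<>();
        if (father != null) {
            parents.add("father=" + father);
        }
        if (mother != null) {
            parents.add("mother=" + mother);
        }
        List<String> values = new ArrayList<>();
        for (String column : columns) {
            if (column.equals("pedigree") && !parents.isEmpty()) {
                values.add(String.join(",", parents));
            } else {
                values.add(fields.getOrDefault(column, ""));
            }
        }
        return String.join("\t", values);
    }
}
```

Because both parent ids are optional, calling the sketch with null for each simply writes whatever pedigree value the sample already holds.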
Groovy Documentation