class RangedData extends Regions
RangedData represents a set of genomic regions with data attached. The data is parsed from a tab-separated file in which three columns are expected to hold the genomic coordinates. The other columns are parsed and exposed as properties directly on the Region objects that are loaded.
To use RangedData, first create a RangedData object, and then call the load() method:
// my_file.tsv has columns in first line - chr, start, end, depth
r = new RangedData("my_file.tsv").load()
This form assumes that the first 3 columns of the file specify a genomic position in BED-style representation (chromosome, start, end). These three columns are extracted; the remaining columns are sniffed to infer their types and then added as expandos to the created Region objects. The result is that you can access them directly as properties.
// Find the 'depth' property of all ranges overlapping chr1:100000-200000
println r.grep { it.overlaps("chr1",100000,200000) }.depth
If the file doesn't have column names as the first line then you should specify them yourself:
r = new RangedData("my_file.tsv").load(columnNames:['chr','start','end','depth'])
Note: the 'chr', 'start' and 'end' column names here are arbitrary - they don't affect what is parsed into the ranges.
If you are loading a CSV file, pass the separator to the "load" function:
r = new RangedData("my_file.csv").load(separator:',')
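The loading behaviour described above (a configurable separator, optional column names, type sniffing, and extra columns exposed as properties) can be pictured with a small plain-Groovy sketch. It does not use gngs and is not how RangedData is implemented; `sniff` and `loadRows` are hypothetical helpers, and the sniffing rules shown are an assumption chosen for illustration.

```groovy
import java.util.regex.Pattern

// Illustration only: plain Groovy, not the gngs implementation.

// Hypothetical helper: guess a type for a single raw field.
def sniff(String raw) {
    if (raw ==~ /-?\d+/) return raw.toLong()          // integer-looking
    if (raw ==~ /-?\d+\.\d+/) return raw.toDouble()   // decimal-looking
    return raw                                        // otherwise keep as text
}

// Hypothetical helper: turn delimited text into objects with dynamic properties.
List<Expando> loadRows(String text, Map options = [:]) {
    String separator = options.separator ?: '\t'
    List<String> lines = text.readLines().findAll { it.trim() }
    Closure<List<String>> fields = { String line ->
        line.split(Pattern.quote(separator)) as List<String>
    }

    // Use supplied column names if present; otherwise the first line is the header.
    List<String> names = options.columnNames ?: fields(lines.head())
    List<String> dataLines = options.columnNames ? lines : lines.tail()

    dataLines.collect { String line ->
        List<String> values = fields(line)
        // The first three columns are the genomic position, whatever they are named
        Expando row = new Expando(chr: values[0],
                                  start: values[1].toInteger(),
                                  end: values[2].toInteger())
        // Remaining columns become typed properties named after their column
        (3..<values.size()).each { i -> row.setProperty(names[i], sniff(values[i])) }
        row
    }
}

// Header taken from the first line
def rows = loadRows("chr\tstart\tend\tdepth\nchr1\t100\t200\t35\nchr2\t300\t400\t12.5\n")
println rows*.depth

// No header line: names and separator are supplied by the caller
def named = loadRows("chr1,100,200,35\n",
                     [separator: ',', columnNames: ['chr', 'start', 'end', 'depth']])
println named[0].depth
```

As in the note above, the names given for the first three columns play no role in the sketch: only their positions determine the range.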
RangedData extends the Regions class, so it supports all the usual methods for working with genomic ranges. The only difference is that the ranges involved acquire properties corresponding to the other columns in the input file.
Type | Name and description |
---|---|
int | chrColumn Index of the column containing the reference sequence (or "chromosome") |
java.util.List<java.lang.String> | columns |
int | endColumn Index of the column containing the end index |
int | genomeZeroOffset The starting index for the first base in the genome. |
groovy.lang.Closure | regionParser |
java.lang.String | separator The separator used between values in the file |
graxxia.ReaderFactory | source |
int | startColumn Index of the column containing the start index |
Constructor and description |
---|
RangedData() |
RangedData(java.lang.String sourceFile) Default to first 3 columns of file being the genomic range information in form of chr,start,end |
RangedData(java.lang.String sourceFile, int chrColumn, int startColumn, int endColumn) |
RangedData(java.io.Reader reader, int chrColumn, int startColumn, int endColumn) |
RangedData(java.io.Reader reader, groovy.lang.Closure regionParser) |
RangedData(graxxia.ReaderFactory reader, int chrColumn, int startColumn, int endColumn) |
Type Params | Return Type | Name and description |
---|---|---|
 | static java.lang.Object | getReader(java.lang.String fileName) |
 | RangedData | load(java.util.Map options, groovy.lang.Closure c) |
 | protected gngs.Region | parseRegion(com.xlson.groovycsv.PropertyMapper line) |
 | java.util.List<java.util.Map<java.lang.String, java.lang.Object>> | toListMap() Convert to a list of map objects |
Methods inherited from class | Name |
---|---|
class Regions | addRegion, addRegion, backward, balancedSplit, bkr, contains, coverage, distanceTo, distanceTo, eachOverlap, eachRange, eachRange, endingAt, enhance, forward, getAt, getAtIndex, getContigRegions, getExtrasAtPosition, getNumberOfRanges, getOverlapRegions, getOverlaps, getOverlaps, getSpan, intersect, intersect, intersect, intersect, intersectImpl, intersectRegion, intersectRegions, isEmpty, iterator, nearest, nextRange, overlaps, overlaps, overlaps, overlaps, plus, previousRange, reduce, regionsEndingAt, regionsStartingAt, remove, save, save, save, size, startingAt, subtract, subtractFrom, subtractFrom, thin, thinnedIndices, toHTML, toListMap, toString, uniquify, widen, window |
chrColumn: Index of the column containing the reference sequence (or "chromosome")
endColumn: Index of the column containing the end index
genomeZeroOffset: The starting index for the first base in the genome. Some formats use 1 as the first base, but mostly it is zero.
separator: The separator used between values in the file
startColumn: Index of the column containing the start index
RangedData(java.lang.String sourceFile): Default to first 3 columns of file being the genomic range information in form of chr,start,end
toListMap(): Convert to a list of map objects
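The idea behind genomeZeroOffset, the index a format assigns to the first base of the genome, can be illustrated with a few lines of plain Groovy. This is only a sketch of the arithmetic: `zeroBased` is a hypothetical helper, and how RangedData applies the offset internally is not stated in this page.

```groovy
// Illustration only: plain Groovy, not the gngs implementation.

// Hypothetical helper: express a position relative to the first base of the genome,
// given the index that the source format assigns to that first base.
int zeroBased(int position, int genomeZeroOffset) {
    position - genomeZeroOffset
}

// The first base is written as 0 in a zero-based format and as 1 in a one-based format;
// after subtracting the offset both refer to the same index.
assert zeroBased(0, 0) == zeroBased(1, 1)

// The same base read from a one-based file and from a zero-based file
println zeroBased(101, 1)
println zeroBased(100, 0)
```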