class Regions extends java.lang.Object implements java.lang.Iterable<gngs.Region>
A set of genomic regions, each indexed by chromosome, start and end position.
The Regions class provides operations for working on Ranges but adds awareness of chromosomes (or contigs) to RangeIndex. It offers most of the same functions as exist for a RangeIndex but transparently handles multiple chromosomes.
The internal representation for each genomic range stored in a Regions object is a Groovy groovy.lang.IntRange. For efficiency, many of the methods return these underlying IntRange objects directly whenever a method is clearly applicable only to a single chromosome. When functionality spans chromosomes, the groovy.lang.IntRange objects are converted to gngs.Region objects, which carry an extra field for the chromosome. This adds a small amount of overhead, but enables the class to support various useful functions such as implementing Groovy's enhanced support for java.lang.Iterable objects.
Often Regions will be loaded from a BED file. In that case, see the BED class which extends Regions, but provides extensive support for parsing, loading and filtering BED files. To create a purely in-memory set of regions, use the constructor:
    Regions regions = new Regions()
    regions.addRegion("chr1",100,200)
    regions.addRegion("chr1",150,300)

When added this way, Regions follows the BED file convention that the start of the range is inclusive, but the end value is exclusive of the range added. Once loaded, various operations can be performed on the ranges. Since the Iterable interface is implemented, you can use any of the Groovy special operations that work on iterables:
    // sum() returns null for an empty list, so fall back to 0 when no regions match
    int bases = regions.grep { it.chr == "chrX" || it.chr == "chrY" }*.size().sum() ?: 0
    println "There were $bases bases from sex chromosomes"

The Regions class offers many operations for convenient querying of, and logical operations on, intervals. Finding overlaps:
    Region r = new Region("chr1",120,130)
    assert regions.getOverlaps(r).size() == 1
Many operations are possible using built-in Groovy iterator methods. For example, to find the indexes of overlapping regions:
assert regions.findIndexValues { r.overlaps(it) } == [ 0 ]
Type | Name and description |
---|---|
java.util.Map<java.lang.String, java.util.List<groovy.lang.IntRange>> | allRanges A list of ranges in the order they were loaded |
java.util.Map<java.lang.String, RangeIndex> | index Index for looking up overlaps |
Constructor and description |
---|
Regions(java.util.Map attributes) Create new empty set of regions |
Regions(java.util.Map attributes, java.lang.String chr, java.lang.Iterable<groovy.lang.Range> ranges) Create a Regions from a list of Ranges |
Regions(java.util.Map attributes, java.lang.Iterable<IRegion> regions) Create a Regions from a list of regions |
| Type Params | Return Type | Name and description |
|---|---|---|
| | Regions | addRegion(java.lang.String chr, int start, int end, java.lang.Object extra) Add the specified range to this Regions object. |
| | Regions | addRegion(gngs.Region r) Adds the region to this Regions object in such a way that the same Region object is returned in iteration, preserving any expando properties set on the region. |
| | groovy.lang.IntRange | backward(java.lang.String chr, int pos, int count) Returns the region that is count regions backwards from the given position |
| | java.util.List<Regions> | balancedSplit() Divide these regions into two regions objects with approximately the same total bp in each half |
| | java.util.List<java.util.Map> | bkr() |
| | boolean | contains(java.lang.String chr, int position) Return true iff the given chromosome and position fall into a range covered by this BED file |
| | Regions | coverage() Return a new regions object that has each distinct region of this object with a "coverage" value assigned. |
| | int | distanceTo(java.lang.String chr, int pos) |
| | int | distanceTo(gngs.Region r) |
| | void | eachOverlap(java.lang.String chr, int pos, groovy.lang.Closure c) Call closure c for each range that overlaps the given position |
| | void | eachRange(groovy.lang.Closure c) |
| | void | eachRange(java.util.Map options, groovy.lang.Closure c) |
| | java.util.List<groovy.lang.Range> | endingAt(java.lang.String chr, int pos) Return a list of ranges that end exactly at the specified position. |
| | Regions | enhance() |
| | groovy.lang.IntRange | forward(java.lang.String chr, int pos, int count) Returns the region that is count regions forward of the given position |
| | java.lang.Object | getAt(java.lang.Object obj) |
| | java.lang.Object | getAtIndex(java.lang.Object obj) |
| | Regions | getContigRegions(java.lang.String chr) |
| | java.util.List<java.lang.Object> | getExtrasAtPosition(java.lang.String chr, int position) |
| | int | getNumberOfRanges() |
| | java.util.List<gngs.Region> | getOverlapRegions(IRegion r) Returns the overlaps with the given region as region objects. |
| | java.util.List<groovy.lang.IntRange> | getOverlaps(IRegion r) Find the ranges that have at least 1bp overlap with the given region |
| | java.util.List<groovy.lang.IntRange> | getOverlaps(java.lang.String chr, int start, int end) Return a list of ranges that overlap the specified range. |
| | gngs.Region | getSpan(java.lang.String contig) Returns the total span from the beginning of the first region to the end of the last region on the given contig (chromosome). |
| | java.util.List<groovy.lang.IntRange> | intersect(IRegion region) |
| | java.util.List<groovy.lang.IntRange> | intersect(java.lang.String chr, int start, int end) |
| | Regions | intersect(BED other) |
| | Regions | intersect(Regions other) |
| | Regions | intersectImpl(Regions other) Returns a set of regions representing each region in this Regions intersected with the Regions in the other regions. |
| | Regions | intersectRegion(gngs.Region other) |
| | Regions | intersectRegions(Regions other) |
| | boolean | isEmpty() |
| | java.util.Iterator<gngs.Region> | iterator() |
| | groovy.lang.IntRange | nearest(java.lang.String chr, int pos) Returns the range that is "closest" to the given position. |
| | groovy.lang.IntRange | nextRange(java.lang.String chr, int pos) Returns the next range that has its beginning closest to the given position. |
| | boolean | overlaps(java.lang.String chr, int from, int to) Return true if the given region overlaps any range in this Regions |
| | boolean | overlaps(IRegion r) Return true if the given region overlaps any range in this Regions |
| | boolean | overlaps(Regions other) Returns true if at least one region in other overlaps at least one region in this Regions object |
| | boolean | overlaps(java.lang.Iterable<IRegion> other) Returns true if at least one region in other overlaps at least one region in this Regions object |
| | Regions | plus(Regions other) |
| | groovy.lang.IntRange | previousRange(java.lang.String chr, int pos) Returns the prior range that has its end closest to the given position. |
| | Regions | reduce(groovy.lang.Closure reducer) Simplify all overlapping regions down to a single region, with an optional closure as a callback to combine the attributes of combined regions |
| | java.util.List<gngs.Region> | regionsEndingAt(java.lang.String chr, int pos) Return a list of ranges that end exactly at the specified position. |
| | java.util.List<gngs.Region> | regionsStartingAt(java.lang.String chr, int pos) |
| | void | remove(java.lang.String chr, groovy.lang.Range r) |
| | void | save(java.lang.String fileName) |
| | void | save(java.util.Map options, java.lang.String fileName) Save the regions in BED format. |
| | void | save(java.util.Map options, java.io.Writer w) Save the regions in BED format. |
| | long | size() |
| | java.util.List<groovy.lang.IntRange> | startingAt(java.lang.String chr, int pos) |
| | Regions | subtract(Regions other) |
| | java.util.List<gngs.Region> | subtractFrom(gngs.Region region) |
| | java.util.List<groovy.lang.IntRange> | subtractFrom(java.lang.String chr, int start, int end) Remove all of the regions belonging to this Regions object from the interval specified, and return the list of Ranges that results. |
| | Regions | thin(int desiredRanges, int minRangesPerChromosme) Select the given number of ranges from these, approximately evenly spaced |
| | java.util.List<java.lang.Integer> | thinnedIndices(int desiredRanges, int minRangesPerChromosme) The same as thin(int,int) but returns the indices of the regions to keep. |
| | java.lang.String | toHTML() |
| | java.util.List<java.util.Map<java.lang.String, java.lang.Object>> | toListMap() A convenience method to return the contained regions as a list of Map objects. |
| | java.lang.String | toString() |
| | Regions | uniquify() Return a new Regions that contains the same regions as this one, but ensuring each identical region is only present once. |
| | Regions | widen(int bp) Return a new Regions object that has bp bases added to the beginning and end of each interval. |
| | java.util.List<gngs.Region> | window(gngs.Region r, int n) Return a window of n regions upstream and downstream of the given region |
Methods inherited from class | Name |
---|---|
class java.lang.Object | java.lang.Object#wait(long), java.lang.Object#wait(long, int), java.lang.Object#wait(), java.lang.Object#equals(java.lang.Object), java.lang.Object#toString(), java.lang.Object#hashCode(), java.lang.Object#getClass(), java.lang.Object#notify(), java.lang.Object#notifyAll() |
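allRanges
A list of ranges in the order they were loaded

index
Index for looking up overlaps

Regions(java.util.Map attributes)
Create new empty set of regions

Regions(java.util.Map attributes, java.lang.String chr, java.lang.Iterable<groovy.lang.Range> ranges)
Create a Regions from a list of Ranges

Regions(java.util.Map attributes, java.lang.Iterable<IRegion> regions)
Create a Regions from a list of regions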
addRegion(java.lang.String chr, int start, int end, java.lang.Object extra)
Add the specified range to this Regions object.
NOTE: the 'end' is treated as exclusive of the range covered. This is consistent with BED file notation, but differs from Groovy ranges, which include both ends. Since most methods in this class use Groovy ranges, you will generally get back a Range object whose end is one less than the end you passed to addRegion. Thus, if you are iterating through the ranges of one BED file and adding them to another with this method, you must add one to the end position that you pass to this method.
Will trigger the flag to indicate this BED is loaded in memory.
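The off-by-one described in the note can be illustrated with plain Groovy ranges. This is a minimal sketch that does not use gngs itself; `fromBed` and `toBed` are hypothetical helper names introduced only for illustration, and the class's own conversion may differ in detail.

```groovy
// Hypothetical helpers, not part of gngs: convert between BED-style
// half-open coordinates and Groovy's inclusive IntRange.
IntRange fromBed(int start, int endExclusive) {
    // The BED end is exclusive, so the last covered base is endExclusive - 1
    start..(endExclusive - 1)
}

List<Integer> toBed(IntRange r) {
    // Add one back to the inclusive end before passing it to a BED-style API
    [r.from, r.to + 1]
}

IntRange stored = fromBed(100, 200)
assert stored.from == 100
assert stored.to == 199          // one less than the end that was passed in
assert stored.size() == 100      // 100 bases covered
assert toBed(stored) == [100, 200]
```

Under this convention, an interval added as `("chr1", 100, 200)` covers bases 100 through 199, which is why a range read back from the class needs one added to its end before being re-added.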
addRegion(gngs.Region r)
Adds the region to this Regions object in such a way that the same Region object is returned in iteration, preserving any expando properties set on the region.

backward(java.lang.String chr, int pos, int count)
Returns the region that is count regions backwards from the given position

balancedSplit()
Divide these regions into two regions objects with approximately the same total bp in each half

contains(java.lang.String chr, int position)
Return true iff the given chromosome and position fall into a range covered by this BED file

coverage()
Return a new regions object that has each distinct region of this object with a "coverage" value assigned.

eachOverlap(java.lang.String chr, int pos, groovy.lang.Closure c)
Call closure c for each range that overlaps the given position

endingAt(java.lang.String chr, int pos)
Return a list of ranges that end exactly at the specified position.
NOTE: the position is considered inclusive to the range. This is different to BED file notation.

forward(java.lang.String chr, int pos, int count)
Returns the region that is count regions forward of the given position

getOverlapRegions(IRegion r)
Returns the overlaps with the given region as region objects.
Note: all the internal ranges must be stored as full Region objects, or this operation will throw an exception.
getOverlaps(IRegion r)
Find the ranges that have at least 1bp overlap with the given region
Note that this method returns a list of groovy.lang.IntRange objects. Where these belong to Region objects, they will be GRange instances with the Region set as the gngs.GRange#extra field.
If you want to get the gngs.Region objects back directly, use getOverlapRegions(IRegion).
r - the region to locate overlaps for

getOverlaps(java.lang.String chr, int start, int end)
Return a list of ranges that overlap the specified range. Note: both ends of the range are *inclusive*. The Range objects returned all belong to the reference sequence chr. Note that if the internal ranges are GRange objects (as they often will be), you can get full Region objects out from the 'extra' property:
    List overlaps = regions.getOverlaps('chrX',10000,20000)*.extra
start - first position to look for overlaps
end - last position to look for overlaps (inclusive)

getSpan(java.lang.String contig)
Returns the total span from the beginning of the first region to the end of the last region on the given contig (chromosome).
intersectImpl(Regions other)
Returns a set of regions representing each region in this Regions intersected with the Regions in the other regions.
Note: nothing is done to deal specially with overlapping ranges. Thus if there are overlapping ranges in this Regions then you will find overlapping ranges in the results wherever those overlaps intersect with the other Regions. The same applies with respect to the other Regions. For a "flat" intersection of the two Regions, you should use reduce() to flatten the source and target first before calling this method.

nearest(java.lang.String chr, int pos)
Returns the range that is "closest" to the given position.
1. If no range overlaps the position, returns whichever is closer: a) the nearest prior range, measured from its end, or b) the nearest following range, measured from its start.
2. If one or more ranges overlap the position, returns the overlapping range with the start or end closest to the specified position.
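The two rules for "closest" can be sketched over a plain list of inclusive ranges for a single chromosome. This is only an illustration of the rule as described: `nearestRange` is a hypothetical name, it uses a linear scan, and it is not the lookup this class actually performs.

```groovy
// Illustrative only: pick the "closest" range to pos from a list of
// inclusive IntRanges that all lie on the same chromosome.
IntRange nearestRange(List<IntRange> ranges, int pos) {
    if (!ranges) return null
    // Distance from pos to whichever boundary (start or end) of the range is closer
    Closure<Integer> boundaryDistance = { IntRange r ->
        Math.min(Math.abs(pos - r.from), Math.abs(pos - r.to))
    }
    // Rule 2: if any ranges overlap pos, choose only among those
    List<IntRange> overlapping = ranges.findAll { it.from <= pos && pos <= it.to }
    // Rule 1: otherwise compare the end of prior ranges with the start of following ones
    (overlapping ?: ranges).min(boundaryDistance)
}

// No overlap: 250 is 51 bases past the end of 100..199 and 150 before 400..499
assert nearestRange([(100..199), (400..499)], 250).from == 100
// Overlap: both ranges contain 160, but the start of 150..299 is only 10 bases away
assert nearestRange([(100..199), (150..299)], 160).from == 150
```

For a range that lies entirely before the position its end is always the closer boundary, and for one entirely after it the start is, so a single distance function covers both cases of rule 1.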
nextRange(java.lang.String chr, int pos)
Returns the next range that has its beginning closest to the given position.

overlaps(java.lang.String chr, int from, int to)
Return true if the given region overlaps any range in this Regions

overlaps(IRegion r)
Return true if the given region overlaps any range in this Regions
r - Region to test for overlaps

overlaps(Regions other)
Returns true if at least one region in other overlaps at least one region in this Regions object

overlaps(java.lang.Iterable<IRegion> other)
Returns true if at least one region in other overlaps at least one region in this Regions object

previousRange(java.lang.String chr, int pos)
Returns the prior range that has its end closest to the given position.

reduce(groovy.lang.Closure reducer)
Simplify all overlapping regions down to a single region, with an optional closure as a callback to combine the attributes of combined regions
The closure is passed two IntRange / GRange objects to combine. The callback should return the new gngs.GRange#extra object to assign to the result region.
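The idea of collapsing overlaps while a callback combines the attached attributes can be sketched without gngs. In this hedged example each item is a hypothetical `[range: ..., extra: ...]` map standing in for a GRange, `mergeOverlapping` is an invented name, and treating only ranges that share at least one base as overlapping is an assumption rather than documented behaviour.

```groovy
// Illustrative only: merge overlapping inclusive ranges on one chromosome,
// letting a reducer closure decide the 'extra' value of each merged result.
List<Map> mergeOverlapping(List<Map> items, Closure reducer) {
    List<Map> merged = []
    // Process in start order without modifying the caller's list
    items.sort(false) { it.range.from }.each { Map item ->
        Map last = merged ? merged.last() : null
        if (last != null && item.range.from <= last.range.to) {
            // Overlap: extend the previous range and combine the extras via the callback
            int end = Math.max(last.range.to, item.range.to)
            merged[merged.size() - 1] = [range: (last.range.from)..end, extra: reducer(last, item)]
        } else {
            merged << item
        }
    }
    merged
}

List<Map> result = mergeOverlapping([
    [range: (100..199), extra: 'a'],
    [range: (150..299), extra: 'b'],
    [range: (400..499), extra: 'c']
]) { x, y -> x.extra + y.extra }

assert result.size() == 2
assert result[0].range.from == 100 && result[0].range.to == 299
assert result[0].extra == 'ab'
assert result[1].extra == 'c'
```

As in the description above, the callback sees the two items being combined and returns only the new extra value; the merged coordinates are computed by the surrounding code.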
regionsEndingAt(java.lang.String chr, int pos)
Return a list of ranges that end exactly at the specified position.

save(java.util.Map options, java.lang.String fileName)
Save the regions in BED format. If an 'extra' option is provided, this is called as a closure to return an id field to use for each region.

save(java.util.Map options, java.io.Writer w)
Save the regions in BED format. If an 'extra' option is provided, this is called as a closure to return an id field to use for each region.
options - extra: closure returning data to write into the 4th column; sorted: true to sort lexically, or a Comparator for a custom sort (see NumericRegionComparator)

subtractFrom(java.lang.String chr, int start, int end)
Remove all of the regions belonging to this Regions object from the interval specified, and return the list of Ranges that results.
Note: the end attribute is considered exclusive of the range to be subtracted from.
chr - chromosome of range
start - start of interval to subtract from (inclusive)
end - end of interval to subtract from (exclusive)

thin(int desiredRanges, int minRangesPerChromosme)
Select the given number of ranges from these, approximately evenly spaced
desiredRanges - target number of ranges to preserve
minRangesPerChromosme - the minimum number of ranges to preserve on each chromosome

thinnedIndices(int desiredRanges, int minRangesPerChromosme)
The same as thin(int,int) but returns the indices of the regions to keep.
toListMap()
A convenience method to return the contained regions as a list of Map objects.
This is primarily aimed at interactive use in shell or notebook environments; it is not performant or memory efficient.

uniquify()
Return a new Regions that contains the same regions as this one, but ensuring each identical region is only present once.

widen(int bp)
Return a new Regions object that has bp bases added to the beginning and end of each interval. If this causes overlaps then these are left in the resulting object.
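The effect of widening on neighbouring intervals can be shown with plain Groovy ranges. This sketch is not the class's implementation: `widenAll` is a hypothetical helper, it works on one chromosome's inclusive ranges only, and it does not clamp starts that would fall below zero.

```groovy
// Illustrative only: pad every inclusive range by bp bases on each side.
List<IntRange> widenAll(List<IntRange> ranges, int bp) {
    ranges.collect { IntRange r -> (r.from - bp)..(r.to + bp) }
}

List<IntRange> widened = widenAll([(100..199), (210..299)], 10)
assert widened[0].from == 90 && widened[0].to == 209
assert widened[1].from == 200 && widened[1].to == 309
// The padded ranges now share bases 200..209 and both are kept as they are
assert widened[1].from <= widened[0].to
```

If non-overlapping output is needed, the widened result would have to be flattened afterwards, for example with reduce().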
window(gngs.Region r, int n)
Return a window of n regions upstream and downstream of the given region