Groovy Documentation

[Groovy] Class Stats

java.lang.Object
  org.apache.commons.math3.stat.descriptive.DescriptiveStatistics
      Stats

class Stats
extends DescriptiveStatistics

A Groovy wrapper for Commons-Math DescriptiveStatistics, combined with numerous convenience methods that return SummaryStatistics.

The static summary methods make it easy to create summary statistics for various collections and iterables. A special class of methods supports efficient creation of statistics for defined ranges of integers (e.g. coverage depth values). The motivation is to calculate the median of coverage depth values efficiently without storing the entire set in memory. See CoverageStats for more information.
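
The idea behind the integer-range methods can be illustrated with a counting histogram: when values are known to lie in 0..max, one counter per possible value is enough to recover the median, so memory depends on max rather than on the number of samples. The following is a minimal sketch of that technique only, not the CoverageStats implementation; the class name BoundedMedian and its methods are hypothetical.

```groovy
// Hypothetical illustration: median of integers in 0..max via a histogram.
// Not the CoverageStats implementation.
class BoundedMedian {
    final int[] counts   // counts[v] = number of times value v was seen
    long total = 0

    BoundedMedian(int max) {
        counts = new int[max + 1]
    }

    void add(int value) {
        // In this sketch, values outside 0..max are rejected immediately
        if (value < 0 || value >= counts.length)
            throw new IllegalArgumentException("Value out of range: " + value)
        counts[value]++
        total++
    }

    // Lower median: smallest v such that at least half the samples are <= v
    int median() {
        if (total == 0)
            throw new IllegalStateException("No values added")
        long target = (total + 1).intdiv(2)
        long seen = 0
        for (int v = 0; v < counts.length; v++) {
            seen += counts[v]
            if (seen >= target)
                return v
        }
        throw new IllegalStateException("unreachable")
    }
}

def m = new BoundedMedian(100)
[30, 12, 45, 30, 28].each { m.add(it) }
assert m.median() == 30
```

Only the counts array is retained, so adding a million coverage values costs no more memory than adding five.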

The most basic method takes any Iterable and turns it into a SummaryStatistics object:

 x = [1,4,5,6]
 assert Stats.summary(x).mean == 4
 
As an alternative, a second method accepts a closure, which is called repeatedly until it returns null or throws an exception such as ArrayIndexOutOfBoundsException or NoSuchElementException. This inverts control and so provides a streaming model for objects that aren't necessarily iterable:
 int i = 0
 assert Stats.summary {  x[i++] }.mean == 4
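
One way such a closure-driven loop could be structured is sketched below. It only illustrates the pull model and its stopping rules, and is not the actual Stats.summary source; the helper name meanOf is hypothetical, and it computes only a mean so that it needs no external libraries.

```groovy
// Hypothetical helper: pulls values from a closure until the source is exhausted.
// Illustrates the pull model only; not the Stats.summary implementation.
double meanOf(Closure source) {
    double sum = 0
    long n = 0
    try {
        while (true) {
            def value = source()
            if (value == null)
                break               // a null return ends the stream
            sum += (value as double)
            n++
        }
    }
    catch (NoSuchElementException | IndexOutOfBoundsException e) {
        // an exhausted iterator or an out-of-range index also ends the stream
        // (ArrayIndexOutOfBoundsException is a subclass of IndexOutOfBoundsException)
    }
    return n ? sum / n : Double.NaN
}

// A plain Iterator drives the closure; next() throws NoSuchElementException at the end
def values = [1, 4, 5, 6].iterator()
assert meanOf { values.next() } == 4
```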
 
The Stats class links well with the Matrix class to allow easy and efficient calculation of statistics for matrix columns and rows:
 Matrix m = new Matrix(2,2,[1,2,3,4])
 assert Stats.from(m[][1]).mean == 3
 
Authors:
simon.sadedin@mcri.edu.au


Constructor Summary
Stats()

Stats(int windowSize)

 
Method Summary
static Stats from(double[] values)

static Stats from(double[] values, Closure c)

A concrete implementation of from(Iterable, Closure) specialised for double[] arrays.

static Stats from(java.lang.Iterable values, Closure c)

A flexible method to generate statistics from any iterable object.

void leftShift(java.lang.Object value)

Convenience function to add a sample value to the statistics.

static java.lang.Object mean()

static java.lang.Double mean(java.lang.Iterable iterable)

static java.lang.Double mean(Iterator i)

static java.lang.Double mean(Closure c)

static java.lang.Object median()

Convenience method for returning the median of values read from stdin using a default maximum value of 10,000.

static java.lang.Object median(int max, Closure c)

static java.lang.Object median(Closure c)

static java.lang.Object percentile(int max)

static java.lang.Object percentile()

static java.lang.Object percentile(int max, InputStream i)

Return a CoverageStats object by reading lines from the given input stream until no more lines are left.

static java.lang.Object percentile(int max, Closure c)

Calculate a Percentile object by accepting values from the given closure until it either returns null or throws NoSuchElementException or ArrayIndexOutOfBoundsException.

static Stats read(InputStream values = System.in, Closure c = null)

Compute statistics from values read from the given input stream.

static SummaryStatistics summary(java.lang.Iterable iterable)

static java.lang.Object summary(Iterator i)

static SummaryStatistics summary(double[] values)

static SummaryStatistics summary(Closure c)

Return a SummaryStatistics object obtained by executing the given closure repeatedly until it either returns null or throws NoSuchElementException or ArrayIndexOutOfBoundsException.

 

Constructor Detail

Stats

Stats()


Stats

Stats(int windowSize)


 
Method Detail

from

static Stats from(double[] values)


from

@CompileStatic
static Stats from(double[] values, Closure c)
A concrete implementation of from(Iterable, Closure) specialised for double[] arrays.
Parameters:
values - values to calculate statistics for
c - Closure to filter or transform results
Returns:
Stats object containing statistics about the given values


from

@CompileStatic
static Stats from(java.lang.Iterable values, Closure c)
A flexible method to generate statistics from any iterable object. Values can be streamed from any source that can generate numeric values and behave as an iterator.

An optional closure can be supplied that has dual functionality:

  • It can filter the values
  • It can transform the values

If the result of the closure is a boolean, it is treated as a filter:
 x = [2,3,4,5,6]
 assert Stats.from(x) { it % 2 == 0 }.mean == 4 // mean of 2,4,6
 
Alternatively, if the value returned is numeric, it is treated as a transformation:
 x = [2,3,4,5,6]
 assert Stats.from(x) { it % 2 ? it : 0 }.mean == 1.6 // (0+3+0+5+0)/5
 
Of course, any Iterable could easily be transformed using standard Groovy collection operations to achieve the same effect:
 x = [2,3,4,5,6]
 assert Stats.from(x.collect { it % 2 ? it : 0 }).mean == 1.6 // (0+3+0+5+0)/5
 
However, the latter requires a complete copy of the transformed data to be temporarily created in memory, while the former can stream any number of values while consuming only trivial memory overhead.
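
The dual behaviour described above could be implemented by inspecting the type of the closure's result for each value. The sketch below is an assumption about how such dispatch might look, not the actual Stats.from source; the helper name filteredMean is hypothetical, and it computes only a mean to stay self-contained.

```groovy
// Hypothetical helper showing Boolean-vs-Number dispatch on a closure result.
// Not the Stats.from implementation.
double filteredMean(Iterable values, Closure c = null) {
    double sum = 0
    long n = 0
    for (value in values) {
        def result = (c == null) ? value : c(value)
        if (result instanceof Boolean) {
            if (!result)
                continue            // false: filter the value out
            result = value          // true: keep the original value
        }
        sum += (result as double)   // numeric: use the (possibly transformed) value
        n++
    }
    return n ? sum / n : Double.NaN
}

def x = [2, 3, 4, 5, 6]
assert filteredMean(x) { it % 2 == 0 } == 4   // filter keeps 2, 4, 6
assert filteredMean(x) { it * 10 } == 40      // transform to 20, 30, 40, 50, 60
```

Values are consumed one at a time, so no intermediate collection is built regardless of which branch is taken.
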
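Parameters:
values - Iterable object supplying values that can be parsed as numeric
c - optional closure to filter or transform the values
Returns:
Stats object containing statistics about the given values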


leftShift

void leftShift(java.lang.Object value)
Convenience function to add a sample value to the statistics.
Parameters:
value - the sample value to add


mean

@CompileStatic
static java.lang.Object mean()


mean

static java.lang.Double mean(java.lang.Iterable iterable)


mean

static java.lang.Double mean(Iterator i)


mean

static java.lang.Double mean(Closure c)


median

static java.lang.Object median()
Convenience method for returning the median of values read from stdin using a default maximum value of 10,000.
Returns:


median

static java.lang.Object median(int max, Closure c)


median

static java.lang.Object median(Closure c)


percentile

static java.lang.Object percentile(int max)


percentile

static java.lang.Object percentile()


percentile

@CompileStatic
static java.lang.Object percentile(int max, InputStream i)
Return a CoverageStats object by reading lines from the given input stream until no more lines are left.
Parameters:
max - see percentile(int, Closure)
i - input stream
Returns:
a CoverageStats object


percentile

@CompileStatic
static java.lang.Object percentile(int max, Closure c)
Calculate a Percentile object by accepting values from the given closure until it either:

  • returns null
  • throws NoSuchElementException
  • throws ArrayIndexOutOfBoundsException

Parameters:
max - An estimate of the maximum possible value that the requested percentile values will have. If the actual value is above this level, an exception is thrown when the value is requested.
c - closure supplying the values
Returns:


read

static Stats read(InputStream values = System.in, Closure c = null)
Compute statistics from values read from the given input stream. The values are expected to be numeric. This is especially useful for piping input in from standard input, e.g. in Bash:
 cut -f 6 coverage.txt | groovy -e 'println(Stats.read())'
 
If a closure is provided, the caller can transform the values before they are added. If the closure returns false, the value is not included, which gives the caller the opportunity to filter out values they are not interested in.
Parameters:
values - input stream to read values from (defaults to System.in)
c - optional closure to transform or filter the values
Returns:


summary

static SummaryStatistics summary(java.lang.Iterable iterable)


summary

static java.lang.Object summary(Iterator i)


summary

@CompileStatic
static SummaryStatistics summary(double[] values)


summary

@CompileStatic
static SummaryStatistics summary(Closure c)
Return a SummaryStatistics object obtained by executing the given closure repeatedly until it either:

  • returns null
  • throws NoSuchElementException
  • throws ArrayIndexOutOfBoundsException

Parameters:
c - closure supplying the values
Returns:
SummaryStatistics object
