Groovy NGS

A toolkit for working with genomic sequencing data in Groovy.

The JVM is an incredible platform for data analysis, offering high performance, extraordinary library and platform support, and rock-solid industry support when it comes time to scale up and productionise your work. Groovy NGS aims to unlock the power of the JVM for working with genomic sequencing data by enabling it to be used with the versatile and highly productive Groovy programming language.

Groovy NGS can be used at three levels:

- Directly, as pre-written tools on the command line
- For writing simple scripts (bash-style) or for interactive analysis in Jupyter Notebooks (https://github.com/ssadedin/beakerx)
- As a library of classes for building full-scale applications

Under the hood, Groovy NGS is built on the widely used HTSJDK (https://github.com/samtools/htsjdk). However, Groovy NGS makes it much easier to work with these libraries by adding idiomatic Groovy language constructs and filling in important, commonly used features that are missing.

Examples of supported functionality are:

- Reading, processing and filtering VCF files, including integration with common annotation sources such as VEP
- Working with genomic ranges: a full set of operations, as well as higher-level reading, processing and filtering
- Reading, processing and performing logical operations with pedigree (PED) files and family structures
- Working with BAM/SAM/CRAM files (including generating and working with pileups)
- A range of statistical operations, including R-like data frames and linear modeling constructs
- Many more useful operations

Introduction
- What is Groovy NGS?
- Why Groovy?
- Structure
- API Documentation

How To Use Groovy NGS
- As a Dependency In Projects
- On the Command Line
- Interactively via a Groovy Shell
- In a Jupyter Notebook

Regions And Ranges
- Regions and Ranges

Common Operations
- Creating Regions Objects
- Basic Metrics
- Treating as a Collection
- Finding Overlaps
- Intersection
- Flattening
- Assigning Properties

Loading And Saving
- Loading BED files
- Saving BED files
- Region Based Data Tables

Parsing VCF Files
- Loading a VCF File
- Accessing Header Information Only
- Streaming Processing
- Processing Variants in VCFs
- Indexed VCFs

Working With Variants
- General Notes
- Variant Properties
- Variants as Regions
- Querying VCFs for Presence of Variants
- Info Fields
- Annotated Variants
- Updating Variant Attributes

Opening And Reading Alignment Files
- Introduction
- Opening BAM and CRAM Files
- Accessing Reads
- Accessing Paired Reads
- Generating Pileups
- Calculating Coverage Depth

Plotting
- Basic Form of Plots
- Saving Plots
- Adding Data to Plots
- Specifying Colors
- Adding Legends
- Distribution Plots
- Accessing Generated Images

Miscellaneous Utilities
- Creating Readers, Writers and Streams
- Formatting Tables
- JupyterLab Support
- RefGene Database Access