Welcome to Groovy NGS!

Groovy NGS aims to unlock the power of the JVM as a scripting and rapid application development platform for the analysis of genomic sequencing data (particularly NGS data). It does this by building on the popular HTSJDK library (https://github.com/samtools/htsjdk) and making its functionality much more accessible, primarily by exposing it in an idiomatic way in Groovy, a JVM language.
Groovy is a dynamic programming language that is widely used for scripting, for domain-specific languages (such as Bpipe, https://bpipe.org, and Nextflow, https://nextflow.io) and for full applications. It brings a blend of high performance and dynamic features that is well suited to rapid application development and interactive analysis. As a language, Groovy is similar to Python in many ways, but it avoids some of Python's downsides, such as slow performance and limited parallelisation capabilities, and it integrates seamlessly with the enormous Java library ecosystem.
This guide first presents some conceptual and foundational topics and is then split into sections based on the broad categories of functionality that are available.

Chapter 1: Key foundations - installing, running and writing simple scripts
Chapter 2: Working with Genomic Regions
Chapter 3: Working with VCFs
Chapter 4: Working with alignment files (BAM, CRAM)
Chapter 5: Miscellaneous Utilities

This guide is designed to give you a user-friendly introduction to Groovy NGS, with plenty of examples and explanations. There are, however, many details that it does not cover. To fully understand how to use the classes, as well as their important limitations, refer to the API documentation (https://ssadedin.github.io/groovy-ngs-utils/doc/).