FASTAIndex

gngs.FASTAIndex

```
class FASTAIndex
extends java.lang.Object
```
A simple FASTA index that looks up sequence names by sequence content, using a fixed-length prefix seed. Only prefix lookups are supported: the query must begin at the start of an indexed sequence, or at an offset of up to a specified number of bases from the start (default=5). Both the original sequence and its reverse complement are indexed, so a single query can identify a sequence even when the strand or orientation of the query is unknown.
NOTE: this class is intended for indexing large numbers of SHORT sequences. It is not suitable for indexing, for example, a reference sequence for an organism: you would only be able to look up each chromosome by a short prefix, which is not very useful.
NOTE2: this class is not for low-level work with pre-indexed FASTA files (e.g. .fai format); for those, use the gngs.FASTA class. This class takes a gngs.FASTA object and adds the ability to look up sequences by their content.
Example:
```
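import gngs.FASTA
import gngs.FASTAIndex

// Index seeds at offsets 0..20 from the start of each sequence
def index = new FASTAIndex(new FASTA("tests/test.fasta"), 0..20)
assert index.querySequenceName("AGTCCCTATTACAAA") == "amplicon_1"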
```
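The seed-and-offset scheme described above can be illustrated with a minimal, self-contained sketch. It is not the gngs implementation: `SeedIndexSketch`, `add`, `query` and `reverseComplement` are hypothetical names chosen for illustration, and the sketch assumes that one seed is stored per offset and per strand, with later sequences overwriting earlier ones on a seed collision.

```groovy
// Illustrative only: SeedIndexSketch is a hypothetical class, not part of gngs.
class SeedIndexSketch {
    int seedSize = 10                 // length of each indexed seed (bp)
    IntRange offsetRange = 0..5       // start offsets at which seeds are taken
    Map<String, String> seeds = [:]   // seed -> sequence name

    // Reverse complement of a DNA string; characters other than ACGT are left unchanged.
    static String reverseComplement(String seq) {
        seq.reverse().tr('ACGT', 'TGCA')
    }

    // Index one named sequence: a seed at every offset, on both strands.
    void add(String name, String sequence) {
        [sequence, reverseComplement(sequence)].each { String strand ->
            offsetRange.each { int offset ->
                // Skip offsets where a full seed no longer fits
                if (offset + seedSize <= strand.size()) {
                    seeds[strand.substring(offset, offset + seedSize)] = name
                }
            }
        }
    }

    // Look up by the first seedSize bases of the query; null if no seed matches.
    String query(String sequence) {
        sequence.size() < seedSize ? null : seeds[sequence.substring(0, seedSize)]
    }
}

def sketch = new SeedIndexSketch(seedSize: 10, offsetRange: 0..2)
sketch.add('amplicon_1', 'AGTCCCTATTACAAAGGCT')

assert sketch.query('AGTCCCTATTACAAA') == 'amplicon_1'   // query starts at offset 0
assert sketch.query('TCCCTATTACAAAGG') == 'amplicon_1'   // query starts 2 bases in
```

Because the reverse complement is indexed under the same name, a query taken from the opposite strand resolves to the same sequence without the caller needing to know the orientation.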
Authors:
simon.sadedin@mcri.edu.au

Properties Summary

Properties
Type	Name and description
`int`	`maxSize` Maximum number of sequences to index (0 means unlimited)
`groovy.lang.IntRange`	`offsetRange` Range of offsets from the beginning of each sequence to index (increasing this substantially is memory-expensive)
`int`	`seedSize` Size of seed to use (bp)
`java.util.Map<java.lang.String, java.lang.String>`	`sequenceNames` Index of amplicon names; maps name => full sequence
`java.util.Map<java.lang.String, java.lang.String>`	`sequences` Index of amplicon sequences; maps subsequence => amplicon name(s)
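To see why a wide `offsetRange` is costly, the following rough estimate may help. It is a sketch under a stated assumption, namely that the index holds at most one seed entry per sequence, per offset, per strand; `estimateSeedEntries` is a hypothetical helper, not part of gngs.

```groovy
// Hypothetical helper: upper bound on seed entries, assuming one entry
// per sequence, per offset, per strand (forward and reverse complement).
long estimateSeedEntries(long numSequences, IntRange offsetRange) {
    numSequences * offsetRange.size() * 2
}

assert estimateSeedEntries(10000, 0..5) == 120000    // 6 offsets
assert estimateSeedEntries(10000, 0..20) == 420000   // 21 offsets
```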

Constructor Summary

Constructors
Constructor and description
`FASTAIndex (gngs.FASTA fasta, Regions regions)` Create an index from the given fasta, where each fasta sequence corresponds to a single amplicon
`FASTAIndex ()` For unit tests only
`FASTAIndex (gngs.FASTA fasta, groovy.lang.IntRange offsetRange, int maxSize, int seedSize, BED bed)` Create an index from the given fasta, where each fasta sequence corresponds to a single amplicon

Methods Summary

Methods
Type Params	Return Type	Name and description
	`void`	`index(gngs.FASTA fasta, Regions regions)` Index the given fasta, masked using the given regions (e.g. from a BED file)
	`java.lang.String`	`querySequenceName(java.lang.String sequence)`

Inherited Methods Summary

Inherited Methods
Methods inherited from class	Name
`class java.lang.Object`	`java.lang.Object#wait(long), java.lang.Object#wait(long, int), java.lang.Object#wait(), java.lang.Object#equals(java.lang.Object), java.lang.Object#toString(), java.lang.Object#hashCode(), java.lang.Object#getClass(), java.lang.Object#notify(), java.lang.Object#notifyAll()`

- Property Detail
  - int maxSize
    
    Maximum number of sequences to index (0 means unlimited)
  - groovy.lang.IntRange offsetRange
    
    Range of offsets from the beginning of each sequence to index (increasing this substantially is memory-expensive)
  - int seedSize
    
    Size of seed to use (bp)
  - java.util.Map<java.lang.String, java.lang.String> sequenceNames
    
    Index of amplicon names; maps name => full sequence
  - java.util.Map<java.lang.String, java.lang.String> sequences
    
    Index of amplicon sequences; maps subsequence => amplicon name(s)
- Constructor Detail
  - FASTAIndex(gngs.FASTA fasta, Regions regions)
    
    Create an index from the given fasta, where each fasta sequence corresponds to a single amplicon
    Parameters:
    fasta
  - FASTAIndex()
    
    For unit tests only
  - FASTAIndex(gngs.FASTA fasta, groovy.lang.IntRange offsetRange, int maxSize, int seedSize, BED bed)
    
    Create an index from the given fasta, where each fasta sequence corresponds to a single amplicon
    Parameters:
    fasta
- Method Detail
  - void index(gngs.FASTA fasta, Regions regions)
    
    Index the given fasta, masked using the given regions (e.g. from a BED file)
    Parameters:
    fasta
    regions
  - java.lang.String querySequenceName(java.lang.String sequence)

Groovy Documentation