vcf-js
High performance streaming VCF parser in pure JavaScript
Usage
This module is best used when combined with some easy way of retrieving the header and individual lines from a VCF, like the @gmod/tabix module.
const { TabixIndexedFile } = require('@gmod/tabix')
const VCF = require('@gmod/vcf')
const tbiIndexed = new TabixIndexedFile({ path: '/path/to/my.vcf.gz' })
async function doStuff() {
  const headerText = await tbiIndexed.getHeader()
  const tbiVCFParser = new VCF({ header: headerText })
  const variants = []
  await tbiIndexed.getLines('ctgA', 200, 300, line =>
    variants.push(tbiVCFParser.parseLine(line)),
  )
  console.log(variants)
}

Given a VCF with a single variant line:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT HG00096
contigA 3000 rs17883296 G T,A 100 PASS NS=3;DP=14;AF=0.5;DB;H2 GT:AP 0|0:0.000,0.000

The variant object returned by parseLine() would be:
{
  CHROM: 'contigA',
  POS: 3000,
  ID: ['rs17883296'],
  REF: 'G',
  ALT: ['T', 'A'],
  QUAL: 100,
  FILTER: 'PASS',
  INFO: {
    NS: '3',
    DP: '14',
    AF: '0.5',
    DB: null,
    H2: null,
  },
  SAMPLES: {
    HG00096: {
      GT: '0|0',
      AP: '0.000,0.000',
    },
  },
}

The parser will try to use metadata from the header, if present, to convert INFO and FORMAT values to their proper type (int, float) or split them into an array if they represent multiple values.
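The conversion described above can be pictured with a small sketch. This is an illustration only, not the library's implementation: `convertValue` is a hypothetical helper, and it assumes a header entry shaped like `{ Number, Type }`. It leaves values alone when no metadata is available, which is why the example object above still shows strings.

```javascript
// Hypothetical helper, not part of @gmod/vcf: convert one raw INFO or FORMAT
// string using a header entry such as { Number: 1, Type: 'Integer' }.
function convertValue(raw, meta) {
  // A key without a value (such as a flag) has nothing to convert.
  if (raw === undefined || raw === null) return null
  // Without header metadata, leave the raw string untouched.
  if (!meta) return raw
  const convertOne = item => {
    if (item === '.') return null // '.' marks a missing value in VCF
    if (meta.Type === 'Integer') return parseInt(item, 10)
    if (meta.Type === 'Float') return parseFloat(item)
    return item
  }
  const items = raw.split(',').map(convertOne)
  // Assumption: a declared Number of 1 yields a scalar, anything else an array.
  return meta.Number === 1 ? items[0] : items
}

console.log(convertValue('14', { Number: 1, Type: 'Integer' }))
console.log(convertValue('0.5,0.25', { Number: NaN, Type: 'Float' }))
```

Whether single values are returned as scalars or as one-element arrays is a design choice; check the output of the version you have installed rather than relying on this sketch.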
Metadata can be accessed with the getMetadata() method. With no parameters it
will return all the data. Any parameters passed will further filter the
metadata. For example, a VCF with this header:
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency">
##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele">
##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP membership, build 129">
##INFO=<ID=H2,Number=0,Type=Flag,Description="HapMap2 membership">
#CHROM POS ID REF ALT QUAL FILTER INFO

you can access the metadata like this:
> console.log(this.getMetadata())
{ INFO:
   { NS:
      { Number: 1,
        Type: 'Integer',
        Description: 'Number of Samples With Data' },
     DP: { Number: 1, Type: 'Integer', Description: 'Total Depth' },
     AF: { Number: NaN, Type: 'Float', Description: 'Allele Frequency' },
     AA: { Number: 1, Type: 'String', Description: 'Ancestral Allele' },
     DB:
      { Number: 0,
        Type: 'Flag',
        Description: 'dbSNP membership, build 129' },
     H2: { Number: 0, Type: 'Flag', Description: 'HapMap2 membership' } } }
> console.log(this.getMetadata('INFO'))
{ NS:
   { Number: 1,
     Type: 'Integer',
     Description: 'Number of Samples With Data' },
  DP: { Number: 1, Type: 'Integer', Description: 'Total Depth' },
  AF: { Number: NaN, Type: 'Float', Description: 'Allele Frequency' },
  AA: { Number: 1, Type: 'String', Description: 'Ancestral Allele' },
  DB:
   { Number: 0,
     Type: 'Flag',
     Description: 'dbSNP membership, build 129' },
  H2: { Number: 0, Type: 'Flag', Description: 'HapMap2 membership' } }
> console.log(this.getMetadata('INFO', 'DP'))
{ Number: 1, Type: 'Integer', Description: 'Total Depth' }
> console.log(this.getMetadata('INFO', 'DP', 'Number'))
1

Samples are also available.
> console.log(this.samples)
[ 'HG00096' ]

API
VCF
Class representing a VCF parser, instantiated with the VCF header.
Parameters
_parseMetadata
Parse a VCF metadata line (i.e. a line that starts with "##") and add its properties to the object.
Parameters
line (string): A line from the VCF. Supports both LF and CRLF newlines.
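As a rough illustration of the first step in handling such a line, the sketch below splits a "##" line into its key and value and notes whether the value is structured. `splitMetadataLine` is a hypothetical name, and the regular expression is an assumption rather than the library's own code.

```javascript
// Hypothetical helper, not part of @gmod/vcf.
function splitMetadataLine(line) {
  // trim() removes a trailing LF or CRLF before matching.
  const match = line.trim().match(/^##(.+?)=(.*)/)
  if (!match) {
    throw new Error(`Line is not a valid metadata line: ${line}`)
  }
  const [, key, value] = match
  // Structured values look like '<ID=...>'; plain values are bare strings.
  return { key, value, structured: value.startsWith('<') }
}

console.log(splitMetadataLine('##fileformat=VCFv4.3\r\n'))
```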
_parseStructuredMetaVal
Parse a VCF header structured meta string (i.e. a meta value that starts with "<ID=...")
Parameters
metaVal (string): The VCF metadata value
Returns Array: an array with two entries: 1) a string of the metadata ID and 2) an object with the other key-value pairs in the metadata
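The two-entry return shape can be sketched with a regular expression that treats a double-quoted value as a single unit. This is an illustration under stated assumptions, not the library's implementation: `parseStructuredMetaVal` here is a stand-alone function, and it does not handle escaped quotes inside a quoted value.

```javascript
// Illustrative sketch only; the library's own implementation may differ.
function parseStructuredMetaVal(metaVal) {
  // Strip the surrounding angle brackets: '<ID=DP,...>' -> 'ID=DP,...'
  const inner = metaVal.replace(/^<|>$/g, '')
  const fields = {}
  // A value is either a double-quoted string (which may contain commas)
  // or a run of characters up to the next comma.
  const pairPattern = /([^=,]+)=("[^"]*"|[^,]*)/g
  let match
  while ((match = pairPattern.exec(inner)) !== null) {
    const key = match[1]
    // Remove the enclosing quotes, if any, from the value.
    fields[key] = match[2].replace(/^"|"$/g, '')
  }
  // Separate the ID from the remaining key-value pairs.
  const { ID: id, ...rest } = fields
  return [id, rest]
}

console.log(
  parseStructuredMetaVal(
    '<ID=DB,Number=0,Type=Flag,Description="dbSNP membership, build 129">',
  ),
)
```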
getMetadata
Get metadata filtered by the elements in args. For example, you can pass ('INFO', 'DP') to get only the information from a metadata line like "##INFO=<ID=DP,...>".
Parameters
args (...string): List of metadata filter strings.
Returns any: an object, string, or number, depending on the filtering
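The filtering behaves like a walk down a nested object, one level per argument. The sketch below shows that idea with a hypothetical `filterMetadata` function and a hand-written `metadata` object; how the library itself reacts to an unknown key is not covered by the text above, so returning `undefined` is an assumption of this sketch.

```javascript
// Hypothetical stand-in for the parser's metadata store.
const metadata = {
  INFO: {
    DP: { Number: 1, Type: 'Integer', Description: 'Total Depth' },
  },
}

// Walk one level deeper for each filter; stop early if a key is missing.
function filterMetadata(store, ...args) {
  let current = store
  for (const key of args) {
    if (current === null || typeof current !== 'object' || !(key in current)) {
      return undefined // assumption: unknown filters yield undefined
    }
    current = current[key]
  }
  return current
}

console.log(filterMetadata(metadata, 'INFO', 'DP'))
console.log(filterMetadata(metadata, 'INFO', 'DP', 'Number'))
```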
_parseKeyValue
Sometimes VCFs have key-value strings that allow the separator within the value if it's in quotes, like: 'ID=DB,Number=0,Type=Flag,Description="dbSNP membership, build 129"'
This is parsed at a low level because the string cannot simply be split at "," (or whatever the separator is). The line above would be parsed to: {ID: 'DB', Number: '0', Type: 'Flag', Description: 'dbSNP membership, build 129'}
Parameters
str (string): Key-value pairs in a string
pairSeparator (string, optional, default ';'): A string that separates sets of key-value pairs
Returns object: an object containing the key-value pairs
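A character-by-character scan is one way to respect quoted separators. The sketch below is a minimal state machine written for illustration, not the library's source. It assumes a single-character `pairSeparator`, and it stores `null` for keys that appear without a value, matching the `DB: null` shown in the usage example.

```javascript
// Illustrative sketch only; assumes pairSeparator is a single character.
function parseKeyValue(str, pairSeparator = ';') {
  const data = {}
  let key = ''
  let value = ''
  let inKey = true // true while reading a key, false while reading a value
  let inQuotes = false // true between a pair of double quotes

  // Store the pair read so far and reset for the next one.
  const store = () => {
    if (key !== '') data[key] = inKey ? null : value
    key = ''
    value = ''
    inKey = true
  }

  for (const char of str) {
    if (inQuotes) {
      // Inside quotes every character, including the separator, is literal.
      if (char === '"') inQuotes = false
      else value += char
    } else if (char === '"' && !inKey) {
      inQuotes = true
    } else if (char === '=' && inKey) {
      inKey = false
    } else if (char === pairSeparator) {
      store()
    } else if (inKey) {
      key += char
    } else {
      value += char
    }
  }
  store() // the final pair has no trailing separator
  return data
}

console.log(parseKeyValue('NS=3;DP=14;AF=0.5;DB;H2'))
```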
_percentDecode
Decode any of the eight percent-encoded values allowed in a string by the VCF spec.
Parameters
str (string): A string that may contain percent-encoded characters
Returns string: a string with any percent-encoded characters decoded
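For reference, the eight encodings listed in the VCF 4.3 specification are %3A, %3B, %3D, %25, %2C, %0D, %0A and %09. The sketch below decodes them in a single pass. It is an illustration, not the library's code: `percentDecode` and `VCF_PERCENT_CODES` are hypothetical names, and accepting lowercase hex digits is an assumption.

```javascript
// The eight percent-encoded characters named in the VCF 4.3 specification.
const VCF_PERCENT_CODES = {
  '%3A': ':',
  '%3B': ';',
  '%3D': '=',
  '%25': '%',
  '%2C': ',',
  '%0D': '\r',
  '%0A': '\n',
  '%09': '\t',
}

// Hypothetical helper, not part of @gmod/vcf.
function percentDecode(str) {
  // A single pass ensures '%253A' decodes to '%3A' rather than ':'.
  return str.replace(/%(?:3A|3B|3D|25|2C|0D|0A|09)/gi, code =>
    VCF_PERCENT_CODES[code.toUpperCase()],
  )
}

console.log(percentDecode('dbSNP membership%2C build 129'))
```

Decoding in one pass, rather than replacing each code in turn, avoids decoding the output of an earlier replacement a second time.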
parseLine
Parse a VCF line into an object with the keys CHROM, POS, ID, REF, ALT, QUAL, FILTER, and INFO, plus SAMPLES if sample columns are present in the VCF.
Parameters
line (string): A string of a line from a VCF. Supports both LF and CRLF newlines.
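To make the column layout concrete, here is a simplified sketch of splitting a tab-separated variant line into the fields named above. It is not the library's parser: `parseVcfLine` is a hypothetical function, the sample names are passed in by hand rather than read from the header, and INFO and the sample values are left as raw strings with no type conversion or percent-decoding.

```javascript
// Illustrative sketch only, not the @gmod/vcf implementation.
function parseVcfLine(line, sampleNames = []) {
  // Drop a trailing LF or CRLF, then split on tabs.
  const fields = line.replace(/\r?\n$/, '').split('\t')
  const [CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, ...sampleFields] =
    fields
  const variant = {
    CHROM,
    POS: parseInt(POS, 10),
    ID: ID === '.' ? null : ID.split(';'), // IDs are semicolon-separated
    REF,
    ALT: ALT === '.' ? null : ALT.split(','), // alternate alleles are comma-separated
    QUAL: QUAL === '.' ? null : parseFloat(QUAL),
    FILTER: FILTER === '.' ? null : FILTER,
    INFO, // left as a raw string in this sketch
  }
  // Sample columns exist only when a FORMAT column is present.
  if (FORMAT !== undefined) {
    const keys = FORMAT.split(':')
    variant.SAMPLES = {}
    sampleNames.forEach((name, i) => {
      const values = (sampleFields[i] || '').split(':')
      variant.SAMPLES[name] = {}
      keys.forEach((key, j) => {
        variant.SAMPLES[name][key] = values[j] === undefined ? null : values[j]
      })
    })
  }
  return variant
}

const exampleLine =
  'contigA\t3000\trs17883296\tG\tT,A\t100\tPASS\tNS=3;DP=14;AF=0.5;DB;H2\tGT:AP\t0|0:0.000,0.000\r\n'
console.log(parseVcfLine(exampleLine, ['HG00096']))
```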