@gmod/bgzf-filehandle
Transparently read indexed block-gzipped (BGZF) files, such as those created by bgzip, using coordinates from the uncompressed file. The module is used in @gmod/indexedfasta to read bgzip-indexed fasta files (with gzi index, fai index, and fa).
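As a rough illustration of how uncompressed coordinates can be mapped onto BGZF blocks, the sketch below parses the .gzi layout written by bgzip: a little-endian uint64 entry count followed by pairs of uint64 compressed and uncompressed block offsets, with the first block at (0, 0) left implicit. The function names `parseGzi` and `findBlock` and the offsets in the example are made up for illustration; this is not the package's internal code.

```javascript
// Parse a .gzi index buffer into [compressedOffset, uncompressedOffset] pairs.
// Assumes offsets fit in a JavaScript number (below 2^53).
function parseGzi(buf) {
  const count = Number(buf.readBigUInt64LE(0))
  const entries = [[0, 0]] // the first block is implicit in the index
  for (let i = 0; i < count; i++) {
    const off = 8 + i * 16
    entries.push([
      Number(buf.readBigUInt64LE(off)),
      Number(buf.readBigUInt64LE(off + 8)),
    ])
  }
  return entries
}

// Binary search for the last block that starts at or before `position`
// (a coordinate in the uncompressed file).
function findBlock(entries, position) {
  let lo = 0
  let hi = entries.length - 1
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1
    if (entries[mid][1] <= position) {
      lo = mid
    } else {
      hi = mid - 1
    }
  }
  const [compressedOffset, uncompressedOffset] = entries[lo]
  return { compressedOffset, offsetInBlock: position - uncompressedOffset }
}

// Hypothetical index with two explicit entries (made-up offsets)
const gzi = Buffer.alloc(8 + 2 * 16)
gzi.writeBigUInt64LE(2n, 0)
gzi.writeBigUInt64LE(1000n, 8)
gzi.writeBigUInt64LE(65280n, 16)
gzi.writeBigUInt64LE(2100n, 24)
gzi.writeBigUInt64LE(130560n, 32)

const hit = findBlock(parseGzi(gzi), 70000)
console.log(hit) // { compressedOffset: 1000, offsetInBlock: 4720 }
```

A reader would then seek to `compressedOffset` in the compressed file, decompress that block, and skip `offsetInBlock` bytes before copying data out.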
The unzip function decompresses a whole bgzip file (which pako has trouble
with natively).
The unzipChunkSlice function decompresses the ranges given by BAI or TBI
indexes for the BAM and tabix file formats (which are bgzip-based).
The unzip utility function properly decompresses BGZF chunks in both node and
the browser.
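To make "BGZF chunk" concrete, here is a minimal sketch of reading one BGZF block with Node's built-in zlib. A BGZF block is a gzip member whose extra field carries a `BC` subfield holding the total block size minus one. The helper name `readBgzfBlock` is hypothetical and this is not how the package itself is implemented (it relies on pako); the sample input is the standard 28-byte BGZF end-of-file marker, which is an empty block.

```javascript
const zlib = require('zlib')

// The standard BGZF end-of-file marker: a valid block with an empty payload
const BGZF_EOF = Buffer.from([
  0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0x06, 0x00,
  0x42, 0x43, 0x02, 0x00, 0x1b, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00,
])

// Read the BGZF block starting at `offset` and inflate its payload.
function readBgzfBlock(buf, offset = 0) {
  // gzip magic bytes, deflate method, and the FEXTRA flag must be present
  if (
    buf[offset] !== 0x1f ||
    buf[offset + 1] !== 0x8b ||
    buf[offset + 2] !== 8 ||
    !(buf[offset + 3] & 4)
  ) {
    throw new Error('not a BGZF block')
  }
  const xlen = buf.readUInt16LE(offset + 10)
  const extraEnd = offset + 12 + xlen
  let blockSize = -1
  // Scan the extra subfields for 'B','C' with a 2-byte payload (BSIZE)
  for (let p = offset + 12; p + 4 <= extraEnd; ) {
    const slen = buf.readUInt16LE(p + 2)
    if (buf[p] === 66 && buf[p + 1] === 67 && slen === 2) {
      blockSize = buf.readUInt16LE(p + 4) + 1 // BSIZE stores size minus one
    }
    p += 4 + slen
  }
  if (blockSize < 0) {
    throw new Error('BC subfield not found')
  }
  // The raw deflate data sits between the header and the 8-byte trailer
  // (CRC32 followed by the uncompressed size)
  const payload = buf.subarray(extraEnd, offset + blockSize - 8)
  return { blockSize, data: zlib.inflateRawSync(payload) }
}

const block = readBgzfBlock(BGZF_EOF)
console.log(block.blockSize, block.data.length) // 28 0
```

Because each block records its own size, a reader can hop from block to block by adding `blockSize` to the current offset.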
Install
$ npm install --save @gmod/bgzf-filehandle
Usage
const {
  BgzfFilehandle,
  unzip,
  unzipChunkSlice,
} = require('@gmod/bgzf-filehandle')
const f = new BgzfFilehandle({ path: 'path/to/my_file.gz' })
// assumes a .gzi index exists at path/to/my_file.gz.gzi. can also
// pass `gziPath` to set it explicitly. Can also pass filehandles
// for the files: `filehandle` and `gziFilehandle`
// supports a subset of the NodeJS v10 filehandle API. currently
// just read() and stat()
const myBuf = Buffer.alloc(300)
await f.read(myBuf, 0, 300, 23234)
// now use the data in the buffer
const { size } = await f.stat() // stat gives the size as if the file were uncompressed
// unzip takes a buffer and returns a promise for a new buffer
const chunkDataBuffer = readDirectlyFromFile(someFile, 123, 456)
const unzippedBuffer = await unzip(chunkDataBuffer)
// unzipChunkSlice unzips the buffer and trims it to the chunk, dropping the
// data before chunk.minv.dataPosition and the data after chunk.maxv.dataPosition
//
// the dpositions and cpositions indicate the block boundaries in compressed
// and decompressed coordinates which can be used for generating stable feature
// IDs across chunk boundaries
const { buffer, dpositions, cpositions } = await unzipChunkSlice(
chunkDataBuffer,
chunk,
)
Academic Use
This package was written with funding from the NHGRI as part of the JBrowse project. If you use it in an academic project that you publish, please cite the most recent JBrowse paper, which will be linked from jbrowse.org.
License
MIT © Robert Buels
Note
This repo cannot be upgraded to pako v2 at this time because pako v2 removed the Z_SYNC_FLUSH capability. With pako v2, decompression produces "invalid distance too far back" errors.