| FaFile {Rsamtools} | R Documentation |
Manipulate indexed fasta files.
Description
Use FaFile() to create a reference to an indexed fasta
file. The reference remains open across calls to methods, avoiding
costly index re-loading.
FaFileList() provides a convenient way of managing a list of
FaFile instances.
Usage
## Constructors
FaFile(file, index=sprintf("%s.fai", file),
gzindex=sprintf("%s.gzi", file))
FaFileList(...)
## Opening / closing
## S3 method for class 'FaFile'
open(con, ...)
## S3 method for class 'FaFile'
close(con, ...)
## accessors; also path(), index()
## S4 method for signature 'FaFile'
gzindex(object, asNA=TRUE)
## S4 method for signature 'FaFileList'
gzindex(object, asNA=TRUE)
## S4 method for signature 'FaFile'
isOpen(con, rw="")
## actions
## S4 method for signature 'FaFile'
indexFa(file, ...)
## S4 method for signature 'FaFile'
scanFaIndex(file, ...)
## S4 method for signature 'FaFileList'
scanFaIndex(file, ..., as=c("GRangesList", "GRanges"))
## S4 method for signature 'FaFile'
seqinfo(x)
## S4 method for signature 'FaFile'
countFa(file, ...)
## S4 method for signature 'FaFile,GRanges'
scanFa(file, param, ...,
as=c("DNAStringSet", "RNAStringSet", "AAStringSet"))
## S4 method for signature 'FaFile,IntegerRangesList'
scanFa(file, param, ...,
as=c("DNAStringSet", "RNAStringSet", "AAStringSet"))
## S4 method for signature 'FaFile,missing'
scanFa(file, param, ...,
as=c("DNAStringSet", "RNAStringSet", "AAStringSet"))
## S4 method for signature 'FaFile'
getSeq(x, param, ...)
## S4 method for signature 'FaFileList'
getSeq(x, param, ...)
Arguments
con, object, x |
An instance of |
file, index, gzindex |
A character(1) vector of the fasta or fasta index
or bgzip index file path (for |
asNA |
logical indicating if missing output should be NA or character() |
param |
An optional |
... |
Additional arguments.
|
rw |
Mode of file; ignored. |
as |
A character(1) vector indicating the type of object to return.
|
Objects from the Class
Objects are created by calls of the form FaFile().
Fields
The FaFile class inherits fields from the
RsamtoolsFile class.
Functions and methods
FaFileList inherits methods from
RsamtoolsFileList and SimpleList.
Opening / closing:
- open.FaFile
Opens the (local or remote)
pathandindexfiles. Returns aFaFileinstance.- close.FaFile
Closes the
FaFilecon; returning (invisibly) the updatedFaFile. The instance may be re-opened withopen.FaFile.
Accessors:
- path
Returns a character(1) vector of the fasta path name.
- index
Returns a character(1) vector of fasta index name (minus the '.fai' extension).
Methods:
- indexFa
Visit the path in
path(file)and create an index file (with the extension ‘.fai’).- scanFaIndex
Read the sequence names and and widths of recorded in an indexed fasta file, returning the information as a
GRangesobject.- seqinfo
Consult the index file for defined sequences (
seqlevels()) and lengths (seqlengths()).- countFa
Return the number of records in the fasta file.
- scanFa
Return the sequences indicated by
paramas aDNAStringSetinstance.seqnames(param)selects the sequences to return;start(param)andend{param}define the (1-based) region of the sequence to return. Values ofend(param)greater than the width of the sequence cause an error; useseqlengths(FaFile(file))to discover sequence widths. Whenparamis missing, all records are selected. Whenlength(param)==0no records are selected.- getSeq
Returns the sequences indicated by
paramfrom the indexed fasta file(s) offile.For the
FaFilemethod, the return type is aDNAStringSet. ThegetSeq,FaFileandscanFa,FaFile,GRangesmethods differ in thatgetSeqwill reverse complement sequences selected from the minus strand.For the
FaFileListmethod, theparamargument must be aGRangesListof the same length asfile, creating a one-to-one mapping between the ith element offileand the ith element ofparam; the return type is aSimpleListofDNAStringSetinstances, with elements of the list in the same order as the input elements.- show
Compactly display the object.
Author(s)
Martin Morgan
Examples
fl <- system.file("extdata", "ce2dict1.fa", package="Rsamtools",
mustWork=TRUE)
fa <- open(FaFile(fl)) # open
countFa(fa)
(idx <- scanFaIndex(fa))
(dna <- scanFa(fa, param=idx[1:2]))
ranges(idx) <- narrow(ranges(idx), -10) # last 10 nucleotides
(dna <- scanFa(fa, param=idx[1:2]))