Documentation ¶
Overview ¶
Package blast provides functions and types for running any of the BLAST suite of programs. It defines an interface, `Blaster`; any value whose type implements it can be passed to this package's `Blast` function to execute a BLAST search.
The results of a BLAST search are captured as XML data and loaded into the `BlastResults` structure automatically.
Note that this package does not execute remote BLAST queries on NCBI's web page. It runs local programs like "blastp" against a local database.
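The automatic loading described above rests on Go's `encoding/xml` struct tags. The following is a minimal sketch of that mechanism only, not this package's code: `miniResults`, `miniIteration`, `decode`, and the hand-written `sampleXML` fragment are all hypothetical stand-ins, trimmed down from the `BlastResults` shape documented below.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// miniResults is a hypothetical, trimmed-down stand-in for BlastResults.
// The tag names follow the element names used by the types in this package.
type miniResults struct {
	XMLName    xml.Name        `xml:"BlastOutput"`
	Program    string          `xml:"BlastOutput_program"`
	DB         string          `xml:"BlastOutput_db"`
	Iterations []miniIteration `xml:"BlastOutput_iterations>Iteration"`
}

// miniIteration is a hypothetical stand-in for BlastIteration.
type miniIteration struct {
	Num      int    `xml:"Iteration_iter-num"`
	QueryDef string `xml:"Iteration_query-def"`
}

// sampleXML is a hand-written fragment for illustration, not real BLAST output.
const sampleXML = `<BlastOutput>
  <BlastOutput_program>blastp</BlastOutput_program>
  <BlastOutput_db>amino</BlastOutput_db>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_iter-num>1</Iteration_iter-num>
      <Iteration_query-def>YAL001C</Iteration_query-def>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>`

// decode unmarshals XML text into the stand-in structure.
func decode(data string) (*miniResults, error) {
	var r miniResults
	if err := xml.Unmarshal([]byte(data), &r); err != nil {
		return nil, err
	}
	return &r, nil
}

func main() {
	r, err := decode(sampleXML)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(r.Program, r.DB, len(r.Iterations))
}
```

The `a>b` form in a tag tells `encoding/xml` to descend through a wrapper element, which is how a slice field can collect every `Iteration` nested under `BlastOutput_iterations`.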
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type BlastHSP ¶
type BlastHSP struct {
	XMLName xml.Name `xml:"Hsp"`
	Num int `xml:"Hsp_num"`
	BitScore float64 `xml:"Hsp_bit-score"`
	Score float64 `xml:"Hsp_score"`
	EValue float64 `xml:"Hsp_evalue"`
	QueryFrom int `xml:"Hsp_query-from"`
	QueryTo int `xml:"Hsp_query-to"`
	HitFrom int `xml:"Hsp_hit-from"`
	HitTo int `xml:"Hsp_hit-to"`
	PatternFrom int `xml:"Hsp_pattern-from"`
	PatternTo int `xml:"Hsp_pattern-to"`
	QueryFrame int `xml:"Hsp_query-frame"`
	HitFrame int `xml:"Hsp_hit-frame"`
	Identity int `xml:"Hsp_identity"`
	Positive int `xml:"Hsp_positive"`
	Gaps int `xml:"Hsp_gaps"`
	AlignLength int `xml:"Hsp_align-len"`
	Density int `xml:"Hsp_density"`
	AlignQuery string `xml:"Hsp_qseq"`
	AlignHit string `xml:"Hsp_hseq"`
	AlignMiddle string `xml:"Hsp_midline"`
}
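The `Identity` and `AlignLength` fields are enough to derive a percent-identity figure for an HSP. A minimal sketch, assuming a local stand-in type (`hsp`) and a hypothetical helper (`percentIdentity`), neither of which is part of this package:

```go
package main

import "fmt"

// hsp is a hypothetical stand-in carrying only the BlastHSP fields used here.
type hsp struct {
	Identity    int // number of identical positions in the alignment
	AlignLength int // total alignment length, gaps included
}

// percentIdentity returns identical positions as a percentage of the
// alignment length. It returns 0 for an empty alignment to avoid
// dividing by zero.
func percentIdentity(h hsp) float64 {
	if h.AlignLength == 0 {
		return 0
	}
	return 100 * float64(h.Identity) / float64(h.AlignLength)
}

func main() {
	h := hsp{Identity: 45, AlignLength: 90}
	fmt.Printf("%.1f%% identity\n", percentIdentity(h))
}
```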
type BlastIteration ¶
type BlastIteration struct {
	XMLName xml.Name `xml:"Iteration"`
	Num int `xml:"Iteration_iter-num"`
	QueryID string `xml:"Iteration_query-ID"`
	QueryDef string `xml:"Iteration_query-def"`
	QueryLen int `xml:"Iteration_query-len"`
	Hits []BlastHit `xml:"Iteration_hits>Hit"`
	Stats BlastStatistics `xml:"Iteration_stat>Statistics"`
	Message string `xml:"Iteration_message"`
}
type BlastParams ¶
type BlastParams struct {
	XMLName xml.Name `xml:"Parameters"`
	Matrix string `xml:"Parameters_matrix"`
	Expect float64 `xml:"Parameters_expect"`
	Include float64 `xml:"Parameters_include"`
	ScMatch int `xml:"Parameters_sc-match"`
	ScMismatch int `xml:"Parameters_sc-mismatch"`
	GapOpen int `xml:"Parameters_gap-open"`
	GapExtend int `xml:"Parameters_gap-extend"`
	Filter string `xml:"Parameters_filter"`
	Pattern string `xml:"Parameters_pattern"`
	EntrezQuery string `xml:"Parameters_entrez-query"`
}
type BlastResults ¶
type BlastResults struct {
	XMLName xml.Name `xml:"BlastOutput"`
	Program string `xml:"BlastOutput_program"`
	Version string `xml:"BlastOutput_version"`
	Reference string `xml:"BlastOutput_reference"`
	DB string `xml:"BlastOutput_db"`
	QueryID string `xml:"BlastOutput_query-ID"`
	QueryDef string `xml:"BlastOutput_query-def"`
	QueryLen int `xml:"BlastOutput_query-len"`
	QuerySeq string `xml:"BlastOutput_query-seq"`
	Params BlastParams `xml:"BlastOutput_param>Parameters"`
	Iterations []BlastIteration `xml:"BlastOutput_iterations>Iteration"`
}
BlastResults is the top-level struct for representing XML output of the BLAST family of programs. Subsequent XML elements are represented with other `Blast*` types.
The types are meant to be comprehensive with respect to NCBI's DTD found here: http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd. Note that the meat is really here: http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.mod.dtd.
func Blast ¶
func Blast(blaster Blaster) (*BlastResults, error)
Blast executes the search query described by blaster. Search results are parsed from BLAST's XML output format.
Example ¶
ExampleBlast demonstrates a very simple protein BLAST search. Note that you'll need to change `dbPath` to your own local BLAST database. The one I used in the example is a BLAST database containing all of the protein sequences from each strain of yeast from http://www.yeastgenome.org.
dbPath := "/home/andrew/research/repeats/data/blast/amino"
sequence := seq.Sequence{
	Name: "YAL001C",
	Residues: []seq.Residue(`
MVLTIYPDELVQIVSDKIASNKGKITLNQLWDISGKYFDLSDKKVKQFVLSCVILKKDIE
VYCDGAITTKNVTDIIGDANHSYSVGITEDSLWTLLTGYTKKESTIGNSAFELLLEVAKS
GEKGINTMDLAQVTGQDPRSVTGRIKKINHLLTSSQLIYKGHVVKQLKLKKFSHDGVDSN
PYINIRDHLATIVEVVKRSKNGIRQIIDLKRELKFDKEKRLSKAFIAAIAWLDEKEYLKK
VLVVSPKNPAIKIRCVKYVKDIPDSKGSPSFEYDSNSADEDSVSDSKAAFEDEDLVEGLD
NFNATDLLQNQGLVMEEKEDAVKNEVLLNRFYPLQNQTYDIADKSGLKGISTMDVVNRIT
GKEFQRAFTKSSEYYLESVDKQKENTGGYRLFRIYDFEGKKKFFRLFTAQNFQKLTNAED
EISVPKGFDELGKSRTDLKTLNEDNFVALNNTVRFTTDSDGQDIFFWHGELKIPPNSKKT
PNKNKRKRQVKNSTNASVAGNISNPKRIKLEQHVSTAQEPKSAEDSPSSNGGTVVKGKVV
NFGGFSARSLRSLQRQRAILKVMNTIGGVAYLREQFYESVSKYMGSTTTLDKKTVRGDVD
LMVESEKLGARTEPVSGRKIIFLPTVGEDAIQRYILKEKDSKKATFTDVIHDTEIYFFDQ
TEKNRFHRGKKSVERIRKFQNRQKNAKIKASDDAISKKSTSVNVSDGKIKRRDKKVSAGR
TTVVVENTKEDKTVYHAGTKDGVQALIRAVVVTKSIKNEIMWDKITKLFPNNSLDNLKKK
WTARRVRMGHSGWRAYVDKWKKMLVLAIKSEKISLRDVEELDLIKLLDIWTSFDEKEIKR
PLFLYKNYEENRKKFTLVRDDTLTHSGNDLAMSSMIQREISSLKKTYTRKISASTKDLSK
SQSDDYIRTVIRSILIESPSTTRNEIEALKNVGNESIDNVIMDMAKEKQIYLHGSKLECT
DTLPDILENRGNYKDFGVAFQYRCKVNELLEAGNAIVINQEPSDISSWVLIDLISGELLN
MDVIPMVRNVRPLTYTSRRFEIRTLTPPLIIYANSQTKLNTARKSAVKVPLGKPFSRLWV
NGSGSIRPNIWKQVVTMVVNEIIFHPGITLSRLQSRCREVLSLHEISEICKWLLERQVLI
TTDFDGYWVNHNWYSIYEST*
	`),
}
blaster := NewBlastp([]seq.Sequence{sequence}, dbPath)
blaster.SetFlag("evalue", 0.1)
results, err := Blast(blaster)
if err != nil {
	fmt.Println(err)
	return
}
hit := results.Iterations[0].Hits[0].Def
fmt.Println(strings.Contains(strings.ToLower(hit), "tfc3"))
Output: true
type BlastStatistics ¶
type BlastStatistics struct {
	XMLName xml.Name `xml:"Statistics"`
	NumSequences int `xml:"Statistics_db-num"`
	Length int `xml:"Statistics_db-len"`
	HSPLength int `xml:"Statistics_hsp-len"`
	EffSpace float64 `xml:"Statistics_eff-space"`
	Kappa float64 `xml:"Statistics_kappa"`
	Lambda float64 `xml:"Statistics_lambda"`
	Entropy float64 `xml:"Statistics_entropy"`
}
type Blaster ¶
type Blaster interface {
	// Executable should return the blast executable to run.
	Executable() string
	// CmdArgs should return a list of command line flags to pass to the
	// blast executable. This list must not include the `-outfmt` flag,
	// since clients of this interface may set it in order to retrieve
	// results in an expected format.
	CmdArgs() []string
	// Stdin, when not nil, will be used for the stdin of the blast process.
	Stdin() io.Reader
}
Blaster represents values that can execute a BLAST search. This package provides some slim implementations of this interface for a couple of variations of BLAST. Clients requiring access to some of BLAST's more sophisticated options should provide their own Blaster.
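Providing your own Blaster amounts to implementing three methods. The sketch below is illustrative only: the interface is copied locally so the example compiles on its own, `fastaBlastn` and the database name `nt_local` are hypothetical, and `buildCmd` is an assumption about how a runner could assemble the process, not a description of how `Blast` does it. It relies on two BLAST+ behaviors: `-outfmt 5` selects XML output, and the query is read from standard input when `-query` is not given.

```go
package main

import (
	"fmt"
	"io"
	"os/exec"
	"strings"
)

// Blaster mirrors the interface documented above so that this sketch
// is self-contained.
type Blaster interface {
	Executable() string
	CmdArgs() []string
	Stdin() io.Reader
}

// fastaBlastn is a hypothetical Blaster that feeds a FASTA query to
// blastn over stdin.
type fastaBlastn struct {
	db    string // name or path of a local BLAST database
	fasta string // query in FASTA format
}

func (b fastaBlastn) Executable() string { return "blastn" }

// CmdArgs deliberately omits -outfmt, as the interface requires.
func (b fastaBlastn) CmdArgs() []string {
	return []string{"-db", b.db, "-evalue", "0.001"}
}

func (b fastaBlastn) Stdin() io.Reader { return strings.NewReader(b.fasta) }

// buildCmd shows one way a runner might turn a Blaster into a process.
// This is an assumption for illustration, not this package's code.
func buildCmd(b Blaster) *exec.Cmd {
	args := append([]string{}, b.CmdArgs()...)
	args = append(args, "-outfmt", "5") // request XML output
	cmd := exec.Command(b.Executable(), args...)
	if in := b.Stdin(); in != nil {
		cmd.Stdin = in
	}
	return cmd
}

func main() {
	cmd := buildCmd(fastaBlastn{db: "nt_local", fasta: ">q1\nACGTACGT\n"})
	// Print the argument vector instead of running it, since blastn may
	// not be installed on the machine running this sketch.
	fmt.Println(cmd.Args)
}
```

Keeping `-outfmt` out of `CmdArgs` leaves the choice of output format to whatever code runs the search, which is the reason the interface comment forbids it.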
type Query ¶
type Query struct {
	// The BLAST executable to use.
	Exec string
	// contains filtered or unexported fields
}
Query is a generic blaster for any type of BLAST search. It provides a thin wrapper around setting command line flags to pass to a BLAST executable.
func NewQuery ¶
NewQuery constructs a generic blast search with default parameters. Parameters can be overridden using the `SetFlag` method.
Note that `queries` may have length 0. If it does, then the obligation is on the caller to set the `-query` flag (or provide some other means of giving BLAST a search query).
This also sets the `-num_threads` flag to the number of logical CPUs on your machine.
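For reference, the logical CPU count mentioned above is what Go's `runtime.NumCPU` reports. A minimal sketch of building such a default flag pair, where `defaultThreadArgs` is a hypothetical helper and not a function of this package:

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
)

// defaultThreadArgs returns a -num_threads flag set to the number of
// logical CPUs usable by the current process.
func defaultThreadArgs() []string {
	return []string{"-num_threads", strconv.Itoa(runtime.NumCPU())}
}

func main() {
	fmt.Println(defaultThreadArgs())
}
```

On a shared machine it may be worth overriding this default with `SetFlag` so a single search does not claim every core.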
func (*Query) Executable ¶
func (*Query) SetFlag ¶
SetFlag adds a command line switch (without the preceding "-") to the set of BLAST arguments. `value` should be a string, integer, float, bool or other type with an appropriate `Stringer` implementation that results in a valid command line flag value.
If `value` is `false`, then the flag is removed from the BLAST arguments.
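The set-or-remove behavior described for SetFlag can be pictured with a small map-based sketch. Everything here is hypothetical (`flagSet`, `set`, `args`) and is not this package's implementation. In particular, treating `true` as a bare switch with no value is an assumption; the documentation above only specifies what `false` does.

```go
package main

import (
	"fmt"
	"sort"
)

// flagSet is a hypothetical store mapping flag names (without the
// leading "-") to their rendered values.
type flagSet map[string]string

// set records a flag. A false value removes the flag. A true value is
// stored as a bare switch (an assumption). Anything else is rendered
// with fmt.Sprint, which honors Stringer implementations.
func (f flagSet) set(name string, value interface{}) {
	if b, ok := value.(bool); ok {
		if !b {
			delete(f, name)
			return
		}
		f[name] = ""
		return
	}
	f[name] = fmt.Sprint(value)
}

// args renders the flags as command line arguments, sorted by name so
// the result is deterministic.
func (f flagSet) args() []string {
	names := make([]string, 0, len(f))
	for n := range f {
		names = append(names, n)
	}
	sort.Strings(names)
	out := make([]string, 0, 2*len(names))
	for _, n := range names {
		out = append(out, "-"+n)
		if f[n] != "" {
			out = append(out, f[n])
		}
	}
	return out
}

func main() {
	f := flagSet{}
	f.set("evalue", 0.1)
	f.set("num_threads", 4)
	fmt.Println(f.args())
	f.set("evalue", false) // removes the flag again
	fmt.Println(f.args())
}
```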