Using BLAST
Below is an online tutorial designed
to help the first-time BLAST user. The tutorial will teach you to
input a sequence into the Basic BLAST web page, choose a program
and database to search, and examine the results. The core of NCBI's
BLAST services is a program called BLAST 2.0, otherwise known
as "Gapped BLAST". This service is designed to take protein
and nucleic acid sequences and compare them against a selection
of NCBI databases. For this class you will generally want to search
the protein databases. To do so, you will find it easier to search
using your amino acid sequence rather than the nucleotide sequence.
I recommend using the blastp service and searching the "nr"
database.
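The same choices made on the web page (a program, a database, and a query sequence) can also be written down as a scripted request. The sketch below only builds such a request and does not send it. It assumes the parameter names of NCBI's BLAST URL API (CMD, PROGRAM, DATABASE, QUERY) and the address https://blast.ncbi.nlm.nih.gov/Blast.cgi, neither of which appears in this tutorial, so check both against NCBI's current documentation. The function name build_blast_submission is hypothetical.

```python
from urllib.parse import urlencode

# Assumed address of the NCBI BLAST URL API (not given in this tutorial).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blast_submission(sequence, program="blastp", database="nr"):
    """Return (url, form_body) for a BLAST search submission.

    Hypothetical helper: it only assembles the request and never
    contacts NCBI. The defaults follow the tutorial's recommendation
    of the blastp service and the "nr" database.
    """
    params = {
        "CMD": "Put",          # assumed: "Put" submits a new search
        "PROGRAM": program,    # e.g. blastp for a protein query
        "DATABASE": database,  # e.g. nr
        "QUERY": sequence,     # amino acid sequence or FASTA text
    }
    return BLAST_URL, urlencode(params)


if __name__ == "__main__":
    url, body = build_blast_submission("ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVHCTNLMNTTVTTGLLLNGSY")
    print(url)
    print(body)
```

For this class the web form is the simpler route; the sketch is only meant to show that the form has three essential inputs.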
To search the databases, your sequence must be in the FASTA
format. A sequence in FASTA format begins with a single-line
description, followed by lines of sequence data. The description
line is distinguished from the sequence data by a greater-than (">")
symbol in the first column. It is recommended that all lines of
text be shorter than 80 characters in length. An example sequence
in FASTA format is:
>gi|532319|pir|TVFV2E|TVFV2E envelope protein
ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVHCTNLMNTTVTTGLLLNGSY
SENRTQIWQKHRTSNDSALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQ
KYNLRLRQAWCHFPSNWKGAWKEVKEEIVNLPKERYRGTNDPKRIFFQRQWGDP
ETANLWFNCHGEFFYCKMDWFLNYLNNLTVDADHNECKNTSGTKSGNKRAPGPC
VQRTYVACHIRSVIIWLETISKKTYAPPREGHLECTSTVTGMTVELNYIPKNRTNV
TLSPQIESIWAAELDRYKLVEITPIGFAPTEVRRYTGGHERQKRVPFVXXXXXXXX
XXXXXXXXXXXXXXVQSQHLLAGILQQQKNLLAAVEAQQQMLKLTIWGVK
Many bioinformatics programs, including MacVector, can output sequence
data in the FASTA format.
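If you want to check a sequence file before pasting it into BLAST, the FASTA rules described above are simple enough to test in a few lines of Python. This is a minimal sketch, not a full FASTA reader: the function names parse_fasta and long_lines are hypothetical, and the only rules it knows are the ones stated here (a ">" description line first, sequence lines after it, and every line shorter than 80 characters).

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (description, sequence) tuples."""
    records = []
    description = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:].strip()
            chunks = []
        elif description is None:
            raise ValueError("sequence data found before a '>' description line")
        else:
            chunks.append(line)
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


def long_lines(text, limit=80):
    """Return the 1-based numbers of lines with `limit` or more characters."""
    return [n for n, line in enumerate(text.splitlines(), start=1)
            if len(line) >= limit]


if __name__ == "__main__":
    # First two sequence lines of the example above, for illustration.
    example = (
        ">gi|532319|pir|TVFV2E|TVFV2E envelope protein\n"
        "ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVHCTNLMNTTVTTGLLLNGSY\n"
        "SENRTQIWQKHRTSNDSALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQ\n"
    )
    for description, sequence in parse_fasta(example):
        print(description, len(sequence))
    print("lines too long:", long_lines(example))
```

Note that a description split over two lines would be read as sequence data by this sketch, which is why the description must stay on a single line.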
- What is BLAST, and how do I use it?
- What is Entrez, and how do I use it?
- What is PubMed, and how do I use it?
- What is MacVector, and how do I use it?