
Exercise 2

Exercise 2 overview. You will locate all Paramecium PP2B sequences related to the one you have downloaded. PP2B is sometimes called calcineurin, so if a keyword search for PP2B finds little, try calcineurin instead. You can also conduct a BLAST search of the database using the amino acid sequence you downloaded last week.
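
If you would rather script the keyword search than type it into Entrez, the sketch below shows one way to combine both names into a single query. It only builds the search URL and does not contact NCBI. The function name build_keyword_search_url, the choice of the "protein" database, and the retmax value are my assumptions for illustration, not part of the exercise.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (ESearch).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_keyword_search_url(keywords, organism, db="protein", retmax=50):
    """Return an ESearch URL matching any of the keywords within one organism.

    Hypothetical helper: the name and default arguments are illustrative.
    """
    # Join the synonyms with OR so that either name can match a record.
    synonyms = " OR ".join(keywords)
    term = f"({synonyms}) AND {organism}[Organism]"
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_keyword_search_url(["PP2B", "calcineurin"], "Paramecium")
print(url)
```

The same term, (PP2B OR calcineurin) AND Paramecium[Organism], can also be pasted directly into the Entrez search box.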

This part of the exercise acquaints you with the program BLAST 2.0, a searching program that helps you identify related sequences. Coupled with Entrez, it is a powerful way to explore what is known about your sequence or related sequences at a variety of levels. I have linked this page to a variety of tutorials offered by NCBI. Please go over them if you have trouble using BLAST.

After you have your sequences, we will align the nucleotide and amino acid sequences using MacVector. A tutorial explaining how to conduct alignments in MacVector is available on the Cell Physiology links page.

Using BLAST
Below is an online tutorial designed to help the first-time BLAST user. The tutorial will teach you to input a sequence into the Basic BLAST web page, choose a program and database to search, and examine the results. The core of NCBI's BLAST services is a program called BLAST 2.0, otherwise known as "Gapped BLAST". This service takes protein and nucleic acid sequences and compares them against a selection of NCBI databases. Generally for this class you will want to search the protein databases, which is easier to do with your amino acid sequence than with the nucleotide sequence. I recommend using the blastp service and searching the "nr" database.

To search the databases, your sequence must be in FASTA format. A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. It is recommended that all lines of text be shorter than 80 characters in length. An example sequence in FASTA format is:


>gi|532319|pir|TVFV2E|TVFV2E envelope protein
ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVHCTNLMNTTVTTGLLLNGSY
SENRTQIWQKHRTSNDSALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQ
KYNLRLRQAWCHFPSNWKGAWKEVKEEIVNLPKERYRGTNDPKRIFFQRQWGDP
ETANLWFNCHGEFFYCKMDWFLNYLNNLTVDADHNECKNTSGTKSGNKRAPGPC
VQRTYVACHIRSVIIWLETISKKTYAPPREGHLECTSTVTGMTVELNYIPKNRTNV
TLSPQIESIWAAELDRYKLVEITPIGFAPTEVRRYTGGHERQKRVPFVXXXXXXXX
XXXXXXXXXXXXXXVQSQHLLAGILQQQKNLLAAVEAQQQMLKLTIWGVK
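
To make the format rules concrete, here is a minimal Python sketch that reads FASTA text and checks the line-length recommendation. It is only an illustration of the rules described above, not the parser that BLAST itself uses. The function names parse_fasta and long_lines are my own, and the example string contains just the first two sequence lines of the record above.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (description, sequence) tuples."""
    records = []
    description = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # tolerate blank lines between sequence lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:].strip()
            chunks = []
        elif description is None:
            raise ValueError("sequence data found before a '>' description line")
        else:
            chunks.append(line)
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


def long_lines(text, limit=80):
    """Return 1-based numbers of lines that are not shorter than limit."""
    return [n for n, line in enumerate(text.splitlines(), start=1)
            if len(line) >= limit]


example = """>gi|532319|pir|TVFV2E|TVFV2E envelope protein
ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVHCTNLMNTTVTTGLLLNGSY
SENRTQIWQKHRTSNDSALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQ
"""

records = parse_fasta(example)
print(records[0][0])        # the description line without the ">"
print(long_lines(example))  # empty list when every line is under 80 characters
```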

Many bioinformatics programs, including MacVector, can output sequence data in the FASTA format.
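
For those who want to see what a blastp search against "nr" looks like outside the web page, the sketch below prepares a submission for NCBI's BLAST URL API. It only builds the form-encoded request body and does not send anything. The helper name build_blast_submission and the demo sequence are hypothetical, and you should check the parameter names against NCBI's current documentation before relying on them.

```python
from urllib.parse import urlencode

# Address of the NCBI BLAST URL API.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blast_submission(fasta_text, program="blastp", database="nr"):
    """Return the form-encoded body of a BLAST URL API 'Put' request.

    Hypothetical helper: it prepares the request but does not send it.
    """
    # The query must be FASTA, so the first character must be ">".
    if not fasta_text.lstrip().startswith(">"):
        raise ValueError("query must be in FASTA format (first line starts with '>')")
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": program,    # blastp compares a protein query to proteins
        "DATABASE": database,  # "nr" is the database recommended for this class
        "QUERY": fasta_text,
    }
    return urlencode(params)


# Made-up demo sequence, for illustration only.
body = build_blast_submission(">demo hypothetical protein\nMKTAYIAKQR")
print(body)
```

Posting this body to BLAST_URL should return a page containing a request ID, which is then used to retrieve the results once the search finishes. For this exercise the web form is all you need.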

 

 


Last Updated: August 24, 2004
Dean Fraga dfraga@wooster.edu