Exercise: Finding genes in human DNA (Thalassemia)
Start by identifying the gene structure, then use a gene prediction tool.
- Check whether the gene is an ORF gene (a single uninterrupted open reading frame).
- Open http://www.dnai.org/geneboy/.
- Click Genic 2, then Find Genes, ORFs.
- Record the position and length of ORFs and the length of their protein products.
- Go to http://www.ncbi.nlm.nih.gov/.
- Find and click BLAST.
- Find and click BLASTN.
- Paste your sequence into the window, click BLAST.
- Record the request ID.
- Click Format!.
- Using the output from the BLAST search, identify the gene/protein. How long is the protein?
- Is the ORF determined above capable of encoding this protein?
- Determine the gene structure by aligning the protein sequence with the translated nucleotide sequence.
- Use a gene prediction tool to identify the gene in the human sequence above.
- Highlight and copy the sequence.
- Go to http://www.softberry.com/berry.phtml?topic=gfind&prg=FGENESH.
- Paste the sequence into the window, click PERFORM SEARCH.
- What structure does FGenesH (Find Genes Human) predict for the gene in this sequence?
- Align the mRNA sequence displayed in the FGenesH result window with the human DNA sequence above using the tool at http://pbil.univ-lyon1.fr/sim4.php/.
- What is the structure of the gene?
- Where in this gene structure does the ORF identified earlier by the ORF finder lie?
- Which is the true ORF for this gene?
- Identify the nature of the gene by performing a BLAST search of the human genome using Map Viewer.
- Open http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi?taxid=9606.
- Find and click "BLAST search the human genome", paste the sequence into the window, then click Begin Search.
- Record the request ID.
- Then, click Format.
- Click Genome View.
- How many hits are there? Which chromosome(s) are the hits located on?
- Click the number underneath the chromosome.
- How many genes are shown as hits?
- What type of gene?
- Find the gene that matches the query sequence with 100% identity.
- Move the cursor over this gene and zoom in (left-click on the vertical line, then click Zoom 4x; zoom in approximately 16 times).
- How does the structure of the gene match the FGenesH gene prediction?
- Note the arrow next to the gene; what does it denote?
- For a nucleotide view of the gene, click on sv.
- Identify the various structural elements of the gene, including introns, exons, start codon, stop codon, poly(A) signal, and TATA box.
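
To make the ORF search from the first part of the exercise and the motif hunt in the last step more concrete, here is a minimal Python sketch. It is not the algorithm used by GeneBoy, FGenesH, or NCBI. The function names `find_orfs` and `find_motif`, the `min_codons` cutoff, and the toy sequences are all hypothetical, and the scan covers the forward strand only. It also assumes the commonly cited consensus sequences TATAAA (TATA box) and AATAAA (poly(A) signal); real promoters and poly(A) sites often deviate from these.

```python
# Illustrative only: a naive forward-strand ORF scan and an exact motif search.
# Names, cutoff, and test sequences are assumptions, not part of the exercise tools.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=30):
    """Return a sorted list of (start, end, protein_length) tuples.

    start and end are 1-based; end is the last base of the stop codon.
    protein_length counts amino acids, including the initial Met
    and excluding the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] != "ATG":
                i += 3
                continue
            # Walk codon by codon until an in-frame stop codon is reached.
            j = i
            while j + 3 <= len(dna) and dna[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(dna):
                break  # no in-frame stop left in this frame
            n_codons = (j - i) // 3
            if n_codons >= min_codons:
                orfs.append((i + 1, j + 3, n_codons))
            i = j + 3  # resume scanning after the stop codon
    return sorted(orfs)


def find_motif(dna, motif):
    """Return 1-based start positions of every exact match of motif."""
    dna, motif = dna.upper(), motif.upper()
    positions = []
    k = dna.find(motif)
    while k != -1:
        positions.append(k + 1)
        k = dna.find(motif, k + 1)
    return positions


toy = "CCATGAAATTTGGGTAACC"            # ATG AAA TTT GGG TAA in frame 3
print(find_orfs(toy, min_codons=1))     # [(3, 17, 4)]

toy_flanks = "GGTATAAAGGAATAAAGG"
print(find_motif(toy_flanks, "TATAAA"))  # [3]
print(find_motif(toy_flanks, "AATAAA"))  # [11]
```

A scan like this treats the DNA as if it were uninterrupted coding sequence, so in a gene with introns the longest ORF it reports need not correspond to the real protein. That is the contrast the questions comparing the ORF finder result with the FGenesH prediction and the sim4 alignment are meant to bring out.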