What are the frame and length of the longest ORF found when running the program under default settings?

UMUC BIOT630
Week 8 Assignment

Question 1.

Using the FASTA-formatted Human genomic sequence provided at the end of this Exercise, perform Gene Prediction using the “Pattern-based” program “ORFfinder”. What are the frame and length of the longest ORF found when the program is run under default settings?

ORFfinder url = https://www.ncbi.nlm.nih.gov/orffinder/
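
To make the idea of pattern-based prediction concrete, the following is a minimal Python sketch of a six-frame ORF scan. It is not the ORFfinder implementation: the function names (`find_orfs`, `longest_orf`), the ATG-only start rule, the 75-nucleotide minimum length, and the frame labels (+1 to +3 forward, -1 to -3 on the reverse complement) are all assumptions made for illustration and should be checked against the settings shown on the ORFfinder form.

```python
# Illustrative six-frame ORF scan. This is a sketch, not NCBI ORFfinder's code.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=75):
    """Yield (frame, start, end, length) for each ATG...stop ORF.

    frame is +1..+3 on the given strand and -1..-3 on its reverse complement.
    start and end are 0-based, end-exclusive offsets on the strand scanned.
    length is in nucleotides and includes the stop codon.
    min_len=75 and ATG-only starts are assumptions, not confirmed defaults.
    """
    seq = "".join(seq.split()).upper()  # drop blank lines and whitespace
    for sign, strand in ((1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            start = None
            for i in range(offset, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOPS:
                    length = i + 3 - start
                    if length >= min_len:
                        yield (sign * (offset + 1), start, i + 3, length)
                    start = None  # look for the next ORF in this frame


def longest_orf(seq, min_len=75):
    """Return the longest ORF tuple, or None if no ORF meets min_len."""
    return max(find_orfs(seq, min_len), key=lambda orf: orf[3], default=None)
```

Calling `longest_orf` on the Human sequence returns a `(frame, start, end, length)` tuple. The frame labels and coordinates follow this sketch's own convention and may differ from how the web tool reports them, so the web tool's output remains the answer to record.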

Answer = ?

Question 2.

In the results page returned for Question 1, which is the longest ORF that has a significant human BLAST hit by E-value? To answer, examine the ORFs one by one, from the longest to the shortest, until “SmartBLAST” returns a significant human result.

Answer = ?

Question 3.

Using the FASTA-formatted Human genomic sequence provided at the end of this Exercise, perform Gene Prediction using the “Content-based” program “geneid”. First, click the “Reset Form” button to ensure the tool's default parameters are loaded. Second, paste the sequence to be analyzed into the top input window. Third, select “geneid including CDS sequence” from the “Output options” drop-down box. Lastly, click the “Submit” button. Paste a copy of the predicted results below as your answer.

geneid url = https://genome.crg.cat/geneid.html

Answer = ?

Question 4.

Using the FASTA-formatted nucleotide sequence returned by the geneid prediction tool for Question 3, use the TESTCODE tool to verify that the predicted ORF sequence is indeed coding. Copy and paste the results returned by TESTCODE as your answer.

TESTCODE url = http://www.bioinformatics.org/SMS/testcode.html

Answer = ?

Question 5.

Using the FASTA-formatted protein sequence returned by the geneid prediction tool for Question 3, perform a protein BLAST (i.e., blastp) search to determine which known “human” gene geneid has predicted.

Answer = ?

Question 6.

Using the FASTA-formatted Human genomic sequence and the FASTA-formatted “Related” sequence provided at the end of this Exercise, perform Gene Prediction using the “Comparative-based” program “AUGUSTUS”. First, paste the “Human” sequence into the top input window. Second, paste the “Related” sequence into the “expert options” input window titled “Upload cDNA (ESTs, mRNAs) sequences”. Lastly, click the “Run AUGUSTUS” button. How many genes are predicted to be present in the sequence, and how many exons does each gene have?

AUGUSTUS url = http://bioinf.uni-greifswald.de/augustus/submission.php

Answer = ?

Question 7.

Take the protein sequence returned by the AUGUSTUS prediction tool for Question 6, put it in FASTA format, and perform a protein BLAST (i.e., blastp) search to determine which known “human” gene AUGUSTUS has predicted.
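
Putting a sequence in FASTA format means adding a header line that starts with “>” and wrapping the residues onto fixed-width lines. A minimal Python sketch follows; the function name `to_fasta`, the record name, and the 70-character line width are illustrative assumptions, and any comment markers or brackets that the prediction tool's output wraps around the sequence would need to be removed before calling it.

```python
def to_fasta(name, sequence, width=70):
    """Return one FASTA record: a '>' header line plus wrapped sequence lines.

    name and width are caller-chosen; 70 is an assumed line width that
    matches the sequence rows in this Exercise.
    """
    residues = "".join(sequence.split())  # drop spaces and line breaks
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"
```

For example, `to_fasta("augustus_prediction", protein_text)` produces text that can be pasted directly into the blastp query box, where `augustus_prediction` is a hypothetical label and `protein_text` holds the copied residues.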

Answer = ?

Question 8.

Based on the results of this Exercise, which Gene Prediction method performed best, or did they all seem to perform equally well?

Answer = ?

Sequences to use for this Exercise:

>Human

AACCGCATCTGCAGCGAGCATCTGAGAAGCCAAGACTGAGCCGGCGGCCGCGGCGCAGCGAACGAGCAGT

GACCGTGCTCCTACCCAGCTCTGCTCCACAGCGCCCACCTGTCTCCGCCCCTCGGCCCCTCGCCCGGCTT

TGCCTAACCGCCACGATGATGTTCTCGGGCTTCAACGCAGACTACGAGGCGTCATCCTCCCGCTGCAGCA

GCGCGTCCCCGGCCGGGGATAGCCTCTCTTACTACCACTCACCCGCAGACTCCTTCTCCAGCATGGGCTC

GCCTGTCAACGCGCAGGTAAGGCTGGCTTCCCGTCGCCGCGGGGCCGGGGGCTTGGGGTCGCGGAGGAGG

AGACACCGGGCGGGACGCTCCAGTAGATGAGTAGGGGGCTCCCTTGTGCCTGGAGGGAGGCTGCCGTGGC

CGGAGCGGTGCCGGCTCGGGGGCTCGGGACTTGCTCTGAGCGCACGCACGCTTGCCATAGTAAGAATTGG

TTCCCCCTTCGGGAGGCAGGTTCGTTCTGAGCAACCTCTGGTCTGCACTCCAGGACGGATCTCTGACATT

AGCTGGAGCAGACGTGTCCCAAGCACAAACTCGCTAACTAGAGCCTGGCTTCTCCGGGGAGGTGGCAGAA

AGCGGCAATCCCCCCTCCCCCGGCAGCCTGGAGCACGGAGGAGGGATGAGGGAGGAGGGTGCAGCGGGCG

GGTGTGTAAGGCAGTTTCATTGATAAAAAGCGAGTTCATTCTGGAGACTCCGGAGCGGCGCCTGCGTCAG

CGCAGACGTCAGGGATATTTATAACAAACCCCCTTTCAAGCAAGTGATGCTGAAGGGATAACGGGAACGC

AGCGGCAGGATGGAAGAGACAGGCACTGCGCTGCGGAATGCCTGGGAGGAAAAGGGGGAGACCTTTCATC

CAGGATGAGGGACATTTAAGATGAAATGTCCGTGGCAGGATCGTTTCTCTTCACTGCTGCATGCGGCACT

GGGAACTCGCCCCACCTGTGTCCGGAACCTGCTCGCTCACGTCGGCTTTCCCCTTCTGTTTTGTTCTAGG

ACTTCTGCACGGACCTGGCCGTCTCCAGTGCCAACTTCATTCCCACGGTCACTGCCATCTCGACCAGTCC

GGACCTGCAGTGGCTGGTGCAGCCCGCCCTCGTCTCCTCCGTGGCCCCATCGCAGACCAGAGCCCCTCAC

CCTTTCGGAGTCCCCGCCCCCTCCGCTGGGGCTTACTCCAGGGCTGGCGTTGTGAAGACCATGACAGGAG

GCCGAGCGCAGAGCATTGGCAGGAGGGGCAAGGTGGAACAGGTGAGGAACTCTAGCGTACTCTTCCTGGG

AATGTGGGGGCTGGGTGGGAAGCAGCCCCGGAGATGCAGGAGCCCAGTACAGAGGATGAAGCCACTGATG

GGGCTGGCTGCACATCCGTAACTGGGAGCCCTGGCTCCAAGCCCATTCCATCCCAACTCAGACTCTGAGT

CTCACCCTAAGAAGTACTCTCATAGTTTCTTCCCTAAGTTTCTTACCGCATGCTTTCAGACTGGGCTCTT

CTTTGTTCTCTTGCTGAGGATCTTATTTTAAATGCAAGTCACACCTAGTCTGCAACTGCAGGTCAGAAAT

GGTTTCACAGTGGGGTGCCAGGAAGCAGGGAAGCTGCAGGAGCCAGTTCTACTGGGGTGGGTGAATGGAG

GTGATGGCAGACACTTTTACTGAATGTCGGTCTTTTTTTGTGATTATTCTAGTTATCTCCAGAAGAAGAA

GAGAAAAGGAGAATCCGAAGGGAAAGGAATAAGATGGCTGCAGCCAAATGCCGCAACCGGAGGAGGGAGC

TGACTGATACACTCCAAGCGGTAGGTACTCTGTGGGTTGCTCCTTTTTAAAACTTAAGGGGAAAGTTGGA

GATTGAGCATAAGGGCCCTTGAGTAAGACTGTGTCTTATGCTTTCCTTTATCCCTCTGTATACAGGAGAC

AGACCAACTAGAAGATGAGAAGTCTGCTTTGCAGACCGAGATTGCCAACCTGCTGAAGGAGAAGGAAAAA

CTAGAGTTCATCCTGGCAGCTCACCGACCTGCCTGCAAGATCCCTGATGACCTGGGCTTCCCAGAAGAGA

TGTCTGTGGCTTCCCTTGATCTGACTGGGGGCCTGCCAGAGGTTGCCACCCCGGAGTCTGAGGAGGCCTT

CACCCTGCCTCTCCTCAATGACCCTGAGCCCAAGCCCTCAGTGGAACCTGTCAAGAGCATCAGCAGCATG

GAGCTGAAGACCGAGCCCTTTGATGACTTCCTGTTCCCAGCATCATCCAGGCCCAGTGGCTCTGAGACAG

CCCGCTCCGTGCCAGACATGGACCTATCTGGGTCCTTCTATGCAGCAGACTGGGAGCCTCTGCACAGTGG

CTCCCTGGGGATGGGGCCCATGGCCACAGAGCTGGAGCCCCTGTGCACTCCGGTGGTCACCTGTACTCCC

AGCTGCACTGCTTACACGTCTTCCTTCGTCTTCACCTACCCCGAGGCTGACTCCTTCCCCAGCTGTGCAG

CTGCCCACCGCAAGGGCAGCAGCAGCAATGAGCCTTCCTCTGACTCGCTCAGCTCACCCACGCTGCTGGC

CCTGTGAGGGGGCAGGGAAGGGGAGGCAGCCGGCACCCACAAGTGCCACTGCCCGAGCTGGTGCATTACA

GAGAGGAGAAACACATCTTCCCTAGAGGGTTCCTGTAGACCTAGGGAGGACCTTATCTGTGCGTGAAACA

CACCAGGCTGTGGGCCTCAAGGACTTGAAAGCATCCATGTGTGGACTCAAGTCCTTACCTCTTCCGGAGA

TGTAGCAAAACGCATGGAGTGTGTATTGTTCCCAGTGACACTTCAGAGAGCTGGTAGTTAGTAGCATGTT

GAGCCAGGCCTGGGTCTGTGTCTCTTTTCTCTTTCTCCTTAGTCTTCTCATAGCATTAACTAATCTATTG

GGTTCATTATTGGAATTAACCTGGTGCTGGATATTTTCAAATTGTATCTAGTGCAGCTGATTTTAACAAT

AACTACTGTGTTCCTGGCAATAGTGTGTTCTGATTAGAAATGACCAATATTATACTAAGAAAAGATACGA

CTTTATTTTCTGGTAGATAGAAATAAATAGCTATATCCATGTACTGTAGTTTTTCTTCAACATCAATGTT

CATTGTAATGTTACTGATCATGCATTGTTGAGGTGGTCTGAATGTTCTGACATTAACAGTTTTCCATGAA

AACGTTTTATTGTGTTTTTAATTTATTTATTAAGATGGATTCTCAGATATTTATATTTTTATTTTATTTT

TTTCTACCTTGAGGTCTTTTGACATGTGGAAAGTGAATTTGAATGAAAAATTTAAGCATTGTTTGCTTAT

TGTTCCAAGACATTGTCAATAAA

 

>Related

ATGATGTTCTCGGGCTTCAACGCAGACTACGAGGCGTCATCCTCCCGCTGCAGCAGCGCGTCCCCGGCCG

GGGATAGCCTCTCTTACTACCACTCACCCGCAGACTCCTTCTCCAGCATGGGTTCGCCTGTCAACGCGCA

GGACTTCTGCACGGACCTGGCCGTCTCCAGTGCCAACTTCATTCCCACGGTCACTGCCATCTCGACCAGT

CCGGACCTGCAGTGGCTGGTGCAGCCCGCCCTCGTCTCCTCCGTGGCCCCATCGCAGACCAGAGCCCCTC

ACCCTTTCGGAGTCCCCACCCCCTCCGCTGGGGCTTACTCCAGGGCTGGCGTTGTGAAGACCATGACAGG

AGGCCGAGCGCAGAGCATTGGCAGGAGGGGCAAGGTGGAACAGTTATCTCCAGAAGAAGAAGAGAAAAGG

AGAATCCGAAGGGAAAGGAATAAGATGGCTGCAGCCAAATGCCGCAACCGGAGGAGGGAGCTGACTGATA

CACTCCAAGCGGAGACAGACCAACTAGAAGATGAGAAGTCTGCTTTGCAGACCGAGATTGCCAACCTGCT

GAAGGAGAAGGAAAAACTAGAGTTCATCCTGGCAGCTCACCGACCTGCCTGCAAGATCCCTGATGACCTG

GGCTTCCCAGAAGAGATGTCTGTGGCTTCCCTTGATCTGACTGGGGGCCTGCCAGAGGTTGCCACCCCGG

AGTCTGAAGAGGCCTTCACCCTGCCTCTCCTCAATGACCCTGAGCCCAAGCCCTCAGTGGAACCTGTCAA

GAGCATTAGCAGCATGGAGCTGAAGACCGAGCCCTTTGATGACTTCCTGTTCCCAGCATCATCCAGGCCC

AGTGGCTCTGAGACAGCCCGCTCCGTGCCAGACATGGACCTATCTGGGTCCTTCTATGCAGCAGACTGGG

AGCCTCTGCACAGTGGCTCCCTGGGGATGGGGCCCATGGCCACAGAGCTGGAGCCCCTGTGCACTCCGGT

GGTCACCTGTACTCCCAGCTGCACTGCTTACACGTCTTCCTTCGTCTTCACCTACCCCGAGGCTGACTCC

TTCCCCAGCTGTGCAGCTGCCCACCGCAAGGGCAGCAGCAGCAATGAGCCTTCCTCTGACTCGCTCAGCT

CACCCACGCTGCTGGCCCTGTGA
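
The two records above are listed with blank lines between sequence rows. If a tool or script does not accept that layout, a small Python sketch like the one below can rebuild each record as one continuous string. The function name `parse_fasta` is hypothetical, and the sketch assumes every record starts with a “>” header line.

```python
def parse_fasta(text):
    """Return {record_name: sequence} from FASTA text, ignoring blank lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank or whitespace-only rows
        if line.startswith(">"):
            name = line[1:].strip()  # header line opens a new record
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}
```

With the Exercise text stored in a string, `parse_fasta(text)["Human"]` and `parse_fasta(text)["Related"]` give the two sequences without line breaks.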
