|
Basic Molecular Biology Questions

1 - The non-coding regions of spliced mRNA are (a) CDS (b) 3’UTR and 5’UTR (c) Introns (d) Promoter region (e) All of the above
(a) CDS (b) 3’UTR and 5’UTR (c) upstream region of a gene (d) Downstream region of a gene (e) All of the above
(a) CDS (b) 3’UTR and 5’UTR (c) Introns (d) Any DNA segment outside of a gene (e) All of the above
(a) Protein (b) DNA (c) mRNA (d) EST (e) All of the Above
(a) Donor/acceptor sites of introns (b) PolyA tail (c) Exons (d) Promoter (e) All of the above
(a) The end of a protein (b) The 3’UTR (c) Before the mature peptide (d) On the tRNA (e) All of the above
(a) A segment that starts with a start codon and goes until the next stop codon. (b) A partial coding region with no start or stop codon. (c) A partial coding region with no STOP codon (d) A coding region with a gap in it. (e) All of the above
8 - Use a search engine to find the URL for the CENSOR program, which masks repeats. What is that URL?
9 - Use a search engine to find a large laboratory facility that specializes in hosting hundreds of mouse strains and is also heavily involved in sequencing mouse genes and the mouse genome. (Hint: it is located in coastal Maine.) What is the name of that laboratory?
10 - How do you prevent words from being term-mapped in Entrez? (a) Put them in parentheses (b) Put a ‘+’ sign in front of each term. (c) Put them within single quotes (d) Put them within double quotes (e) Add butnot pubref [PROP] to the query.
(a) It’s a term used in the meshed index of Entrez (b) A controlled vocabulary of keywords assigned manually by indexers: Medical Subject Heading. (c) It’s a synonym dictionary used by Entrez: Medical Synonym and Homonyms (d) It’s a set of keywords that each submitter of an article assigns to his article. (e) All of the above.
(a) A curated collection of gene loci (b) A database of the genomic location of genes. (c) A clustering of sequences that attempts to provide one record for each gene by using sequence similarity. (d) The BLAST database of all the ESTs (e) All of the above.
(a) NM_001241 (b) NP_001241 (c) P001241 (d) AAA001241 (e) NT_001241
(a) A50517 (b) AAA01241 (c) P23356 (d) A23D561 (e) NP_001241
(a) A23D561 (b) I12345 (c) NX_001241 (d) R01241 (e) AAA26521
(a) StructBase (b) PDB (c) PRF (d) Genbank (e) Swissprot
(a) The limits field in Entrez restricted to RefSeq sequences (b) In Entrez, limit to srcdb_refseq [PROP] (c) Search using the LocusLink resource (d) All of the above
19 - What kind of alignment method is the Smith-Waterman method? (a) Hash-indexed (b) Global Alignment (c) Multiple-alignment (d) Local Alignment (e) Gibbs Sampling
(a) Hidden-Markov modeling (b) Regular expression searching. (c) Blast alignments (d) Codon Usage (e) Profile searching.
(a) 10e10 (b) 65536 (c) 16777216 (d) 1048576 (e) 1024
(a) SAGE (b) EST+UniGene+DDD (c) Affymetrix microarrays (d) 2D GELS (e) All of the Above
(a) About 1 variation every 3 bases in the coding region. (b) About 1 variation per million bases (c) About 1 variation per 1000 bases (d) There is possible polymorphism at every base outside the coding region. (e) All of the above.
24 - I want to search a 5’ human EST against 5’ Drosophila ESTs. Which tool should I use? (a) Blastn (b) Blastx (c) PSI-blast (d) Tblastx (e) Megablast
(a) Find evidence of Horizontally transferred genes. (b) Find evidence of yeast infection in the human patients. (c) Find human exons, since the introns will have evolved away. (d) Find sequencing contamination in human sequence. (e) All of the above
(a) NO filtering whatsoever (b) Low-complexity filtering AND human repeat filtering (c) Low-complexity filtering only. (d) Human repeat filtering only. (e) No filtering, but mask for lookup table only.
P1 - (5 pts, 1-10 minutes) Find one rodent genomic sequence with a sequence length between 10040 and 10050 that contains at least one CDS feature. Give me the accession number.
Accession:_____________________

P2 - (10 pts, 5-30 minutes) After much analysis, your collaborators have determined that a gene involved in diabetes lies between Genethon markers AFM242ZG5 and AFM266YB5. Of all the genes in the gene_seq map between those two markers, which gene is most likely to be involved in diabetes? (Don’t forget to set the display settings to see what you want: zoom in enough, or change enough parameters, to see ALL the genes and markers, and use the verbose mode.)
Gene name or defline:______________________________________________ Accession:_______________

P3 - (10 pts, 5-10 minutes, mostly waiting) BLAST the protein for the PAX6 isoform a gene, NP_000271, against nr. You should find a hit for PAX2a; if not, increase the number of definitions to display.
a) What is the E-value of the PAX2 (pax gene 2) hit (near the bottom of the hit list)?
E-value:_____________
Use the FASTA (without the defline) of NP_000271 to search PROSITE at the ExPASy website (use a search engine if you don’t remember the URL) for PROSITE patterns. Exclude from the search patterns with a high probability of occurrence.
b) What are the two longest patterns that you find in this sequence (give the PROSITE names)?
Pattern 1: __________________________ Pattern 2:___________________________
c) Use PHI-BLAST with NP_000271 (without defline) and the PROSITE pattern (one of the ones you already found)
[LIVMFYG]-[ASLVR]-x(2)-[LIVMSTACN]-x-[LIVM]-x(4)-[LIV]-[RKNQESTAIY]-[LIVFSTNKH]-W-[FYVC]-x-[NDQTAH]-x(5)-[RKNAIMW]
Do you see PAX2a? (If you see it, what is the E-value?)
Yes/No (E-value = ) |
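
The PROSITE pattern in P3 (c) can be read as a regular expression, which is a handy way to check by hand where a pattern falls in a sequence before handing it to PHI-BLAST. The following is a minimal Python sketch, not the PROSITE or PHI-BLAST implementation. The function names `prosite_to_regex` and `find_pattern` are hypothetical, only a subset of the pattern syntax is handled (single residues, `x`, `[...]`, `{...}`, and `(n)` or `(n,m)` repeats; the `<` and `>` anchors are not), and the test peptide is made up for illustration. It is not NP_000271.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex string.

    Handles only: single residues, x (any residue), [..] allowed sets,
    {..} forbidden sets, and (n) or (n,m) repeat counts.
    """
    pieces = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Separate the residue specification from an optional repeat count.
        parsed = re.fullmatch(r"([^()]+)(?:\((\d+(?:,\d+)?)\))?", element.strip())
        if parsed is None:
            raise ValueError(f"unsupported element: {element!r}")
        core, repeat = parsed.groups()
        if core == "x":
            piece = "."                        # any residue
        elif core.startswith("[") and core.endswith("]"):
            piece = core                       # allowed residues, same syntax
        elif core.startswith("{") and core.endswith("}"):
            piece = "[^" + core[1:-1] + "]"    # forbidden residues
        elif len(core) == 1 and core.isalpha():
            piece = core                       # one specific residue
        else:
            raise ValueError(f"unsupported element: {element!r}")
        if repeat:
            piece += "{" + repeat + "}"
        pieces.append(piece)
    return "".join(pieces)


def find_pattern(pattern: str, sequence: str) -> list:
    """Return (0-based start, matched text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    # Pattern copied from P3 (c); the peptide below is a made-up example.
    pax_pattern = (
        "[LIVMFYG]-[ASLVR]-x(2)-[LIVMSTACN]-x-[LIVM]-x(4)-[LIV]-"
        "[RKNQESTAIY]-[LIVFSTNKH]-W-[FYVC]-x-[NDQTAH]-x(5)-[RKNAIMW]"
    )
    toy_peptide = "GGLAAALALAAAALRLWFANAAAAARGG"
    print(prosite_to_regex(pax_pattern))
    print(find_pattern(pax_pattern, toy_peptide))
```

To use it on the real exercise, paste the sequence of NP_000271 (without the defline) in place of the toy peptide. The sketch reports non-overlapping matches only, so it may list fewer hits than a dedicated PROSITE scanner.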