BIO 224 Laboratory CSU, Sacramento September 3, 2004

Lab Assignment 9 (15 pts total; due Wed Nov 17th by midnight)

1. Go to the MEGA website (http://www.megasoftware.net/) to install the program on your local computer. Go to the download for MEGA 4 (not the beta version) and enter your information for the free download. Follow the steps to download it onto your computer.

2. Go to the “Tutorial on How to Use MEGA” and click the tutorial “Aligning Sequences” to learn how to create multiple sequence alignments. Perform Ex 1.0.2 to Ex 1.1.5 to learn how to open mRNA sequences. (Note: there seems to be a misprint in their instructions: the file seems to be hsp20.meg and not hsp20.fas.)

3. Once you finish Ex 1.1.5, open the file Drosophila_Adh.meg using the same protocol as above. (Note: you must open the aligned data file in the MEGA program itself; the analyses cannot be done in the Alignment Explorer program.) When you have opened your aligned file in the MEGA program, go to the Data drop-down menu and select "Data Explorer" to view the mRNA alignment. Then go to the Highlight drop-down menu and answer the following questions:
   a. How many 0-fold degenerate sites are there?
   b. How many 2-fold degenerate sites are there?
   c. How many 4-fold degenerate sites are there?
   d. Explain why there are so many more 0-fold degenerate sites than the others.
   e. Explain why there are more 4-fold degenerate sites than 2-fold sites.
   Go to the Data drop-down menu and translate the sequences, then go back to the Highlight menu to answer the following questions:
   f. How many conserved sites are there?
   g. How many parsimony-informative sites are there?
   h. How are parsimony-informative sites used for tree building?

4. Then, use the next exercise within the same tutorial “Aligning Sequences” (Ex 1.2.1 onward) to upload your human mRNA sequence and the mRNAs for the homologs you identified in question #1 of assignment #7.
   When you upload your mRNA sequences (step 1.2.4), be sure to upload only the coding sequence (CDS) portion of the mRNA, displayed in FASTA format. Also, translate your data into protein sequences before aligning them. Then save your data with a “meg” file extension as the tutorial states. (Note: the question from assignment 7 is re-listed for you below.)
   [Question 1 from assignment 7: Using your assigned human protein/mRNA sequence, find one other mammalian sequence in addition to your mouse & rat sequences (one outside of the rodent family) and also two sequences from non-mammalian species.]
   Save both the alignment file and a MEGA file to use for the rest of the analysis.
   a. How many conserved sites do you have in the protein alignment?
   b. How many parsimony-informative sites are there?
   c. Go to the Distances drop-down menu (in the main program, not the “Sequence Data Explorer” window) and “compute the pairwise distances” for the protein using the p-distance substitution model (go to the Substitution Model part of the dialog box and choose the amino acid setting and the p-distance parameter). Paste the table below (use the export/print function within the File drop-down menu).
      [Definition of p-distance (Nucleotide) from MEGA: “This distance is the proportion (p) of nucleotide sites at which two sequences being compared are different. It is obtained by dividing the number of nucleotide differences by the total number of nucleotides compared.”]
   d. Perform the same analysis as in part c, but change the substitution model to “Poisson correction”. Paste the table below.
      [Definition of Poisson Correction (PC) distance from MEGA: “The Poisson correction distance assumes equality of substitution rates among sites and equal amino acid frequencies while correcting for multiple substitutions at the same site.”]
   e. Compare the two tables you generated in c and d and explain why they are different.

5. Tree analysis
   a.
      Go to the Phylogeny drop-down menu, select "Bootstrap test of Phylogeny", and then select the neighbor-joining tree analysis option. Build the neighbor-joining tree using the translated protein dataset and the Poisson correction (that is, use the amino acid substitution model: Poisson correction) with a bootstrap analysis (500 replicates; note that this is the default setting). Paste the bootstrap consensus tree below. What do the numbers at the various nodes mean?
   b. Use the branch swapping function on the left of the tree and flip several of your branches to show yourself that these are still the same tree although they may look different. Paste one of these trees and describe the superficial differences between the trees.
   c. Go back to the Phylogeny drop-down menu and select "Bootstrap test of Phylogeny" again, but this time select the maximum parsimony tree analysis. Base the settings on your protein alignment by going to the "->Codon Positions" option and selecting "Translated Amino Acid Sequences" under the Data to Analyze section. Perform a bootstrap analysis of 500 replicates (note: if the program shuts down, rerun the bootstrap with fewer replicates, for example 100). Paste the consensus tree below.
   d. Does the branching order match between the neighbor-joining and maximum parsimony trees? Describe the tree differences. Why might these two phylogenetic analyses generate different branching orders? (Note: answer this question theoretically if your trees happen to have the same branching order.)

6. Perform the relative rate test (found under the Phylogeny drop-down menu) on all three of your mammalian mRNA sequences using an appropriate outgroup. (That is, you will need the following tests: human with mouse, human with rat, and mouse with rat.) Why did you use this outgroup? What were the results for all three relative rate tests? Even if your relative rate tests did not show any differences, what would it mean if you had found significant differences between species comparisons?
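
Optional illustration for questions 4c and 4d: the two distance definitions quoted from MEGA above can be sketched in a few lines of Python. This is only a minimal sketch, not MEGA's implementation. The function names and the toy sequences are hypothetical, the Poisson correction is written in its standard textbook form d = -ln(1 - p), and skipping gapped columns pair by pair is an assumption (MEGA's own gap-handling options may differ).

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: ignore any column where either sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


def poisson_distance(p):
    """Poisson-corrected distance, d = -ln(1 - p), for a p-distance p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p)


# Hypothetical toy alignment (made-up residues, not real proteins).
human = "MKTAYIAKQR"
mouse = "MKTAYLAKQR"
chick = "MRTSYLA-QK"

for name, other in (("mouse", mouse), ("chick", chick)):
    p = p_distance(human, other)
    d = poisson_distance(p)
    print(f"human vs {name}: p = {p:.4f}, Poisson d = {d:.4f}")
```

Printing both values for a close pair and a more distant pair is one way to see how the correction behaves as sequences diverge; your own answer to 4e should still be based on the tables MEGA produces for your data.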