Monday 10 January 2011

bioinformatics - Source of DNA sequences

This is more of a comment, but too long to put in a comment box, so I'm putting it here.



This is a fun idea you're working on. I have a half-baked idea (assuming you're looking for input) if you want to explore it further -- or not ... by the time I finish writing this, I might realize it's too silly, but still ... let's see.



It might be fun to take the sequences of two organisms, let's say mouse and human, and align certain regions to each other -- imagine this is like playing a piano, where the "left hand" is the mouse sequence and the "right hand" is the human sequence.



So, say you take a gene that is shared by both, like CCND1. You can align the two versions against each other and you'll find that large portions of the sequences are common (with some mismatches, obviously). In these regions, the left and right hands are playing together (in different octaves).



You'll also find gaps in the alignment where you'll have a stretch of "mouse only" or "human only" sequence, and in these regions the left or right hand plays alone (solo).



For instance, say the alignment of the two sequences looks like this:



mouse: CGTGGGAGGCTCTTGAGCCTGGAAACACTATCGCAGTTTGTACGGAATGCACTTGTTCTTTACAAAAGG
human: CTTGGGCGACA---GAGC---GAGACTTTGTCTCAAAAAAGAAG--------------------AAAAG


In this case, you see stretches of the alignment where the mouse (left hand) will be playing a solo, and other times the two hands play in "harmony."
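
To make the idea a bit more concrete, here is a rough Python sketch of how you might split an alignment like the one above into "both hands" and "solo" stretches. Everything in it is my own assumption for illustration: the function names (`classify_column`, `find_segments`), the labels, and the choice to treat a mismatched column as both hands still playing together. It only looks at where the gap characters (`-`) fall; how you turn bases into actual notes is left open.

```python
from itertools import groupby

# The two aligned sequences from the example above ('-' marks a gap).
MOUSE = "CGTGGGAGGCTCTTGAGCCTGGAAACACTATCGCAGTTTGTACGGAATGCACTTGTTCTTTACAAAAGG"
HUMAN = "CTTGGGCGACA---GAGC---GAGACTTTGTCTCAAAAAAGAAG--------------------AAAAG"


def classify_column(mouse_base, human_base):
    """Label one alignment column by which hand(s) would be playing.

    Assumption: a mismatch still counts as 'both hands', since both
    sequences have a base at that position.
    """
    if mouse_base == "-" and human_base == "-":
        return "rest"
    if human_base == "-":
        return "mouse solo"
    if mouse_base == "-":
        return "human solo"
    return "both hands"


def find_segments(mouse, human):
    """Return (label, start, end) runs over the alignment, end exclusive."""
    if len(mouse) != len(human):
        raise ValueError("aligned sequences must have the same length")
    labels = [classify_column(m, h) for m, h in zip(mouse, human)]
    segments = []
    start = 0
    # groupby collapses consecutive columns that share a label into one run.
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, start + length))
        start += length
    return segments


if __name__ == "__main__":
    for label, start, end in find_segments(MOUSE, HUMAN):
        print(f"{label:>10}  columns {start}-{end - 1}  {MOUSE[start:end]}")
```

Running it on the example lists the stretches in order, so you could hand each one to whatever turns bases into notes: one voice for the solo stretches, two voices for the shared ones.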
