Materials
This exercise uses 12 pieces. Each piece represents the paired reads and the intervening unknown sequence for a single clone/insert. To keep the exercise manageable, every insert has been reduced in size by a factor of 50. For example, a 3-kilobase (kb) insert is represented by a piece that is 60 base pairs (bp) long.
The 12 to 14 base pairs at each end of a piece represent the 600- to 700-base-pair reads. Because these portions were sequenced, they are written as As, Ts, Cs and Gs. The intervening sequence is made up of Ns, each of which stands for an unknown nucleotide: this region was not sequenced, so no A, T, C, or G could be assigned. In this exercise, all of the sequence is single stranded, which removes the need to determine the direction of each piece.
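The scaling can be checked with a few lines of arithmetic. The sketch below is only an illustration: the name `paper_length_bp` is made up for this handout, and the only value taken from the exercise is the reduction factor of 50.

```python
SCALE = 50  # reduction factor stated in the exercise


def paper_length_bp(real_length_bp: int, scale: int = SCALE) -> int:
    """Length (in bp) of the paper piece that stands for a real insert or read."""
    return real_length_bp // scale


# Real sizes mentioned in the exercise and their paper equivalents.
for label, real_bp in [
    ("3-kb insert", 3_000),
    ("8-kb insert", 8_000),
    ("600-bp read", 600),
    ("700-bp read", 700),
]:
    print(f"{label}: {paper_length_bp(real_bp)} bp on paper")
```

This reproduces the 60-bp and 160-bp pieces and the 12- to 14-bp reads described above.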
Starting Material:
Group 1
3 small paired reads (~60 bp in length, representing a 3-kb insert)
2 large paired reads (~160 bp in length, representing an 8-kb insert)
Group 2
3 small paired reads
3 large paired reads
Shared Material
1 very large paired read connected by string (representing a 40-kb insert)
Assemble your group's pieces by matching regions where their sequences overlap, following these rules:
- Two pieces must overlap by at least 7 bp before they can be joined.
- This overlap must not include any N bases.
- The reads also have to be in the same orientation so that the sequence can be read from left to right.
- Note that each group has a single-base-pair difference in one of its pieces. Such a difference can arise from factors such as a genetic polymorphism or a low quality score at that base.
- Note that the assembly examples below do not represent double-stranded DNA, where A pairs with T and G pairs with C. Rather, they represent two different sequencing reactions that sequenced the SAME strand of DNA, so A matches A, C matches C, and so forth.
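The overlap rules above can be sketched as a small check on two reads (the sequenced A/C/G/T ends of two pieces). This is a minimal sketch, not part of the exercise materials: the function name `find_overlap` is hypothetical, and allowing at most one mismatch is an assumption made here to accommodate the single base difference mentioned in the rules.

```python
MIN_OVERLAP = 7      # minimum overlap, from the rules above
MAX_MISMATCHES = 1   # assumption: tolerate one differing base


def find_overlap(left: str, right: str,
                 min_overlap: int = MIN_OVERLAP,
                 max_mismatches: int = MAX_MISMATCHES) -> int:
    """Length of the longest valid overlap between the end of `left`
    and the start of `right`, or 0 if no overlap satisfies the rules."""
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        tail, head = left[-size:], right[:size]
        if "N" in tail or "N" in head:
            continue  # the overlap must not include any N bases
        # Same-strand comparison: A matches A, C matches C, and so on.
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatches:
            return size
    return 0


# Reads taken from the valid and invalid examples in this handout.
print(find_overlap("GGACTATGATTCG", "TGATTCGAGGCTAA"))   # 7
print(find_overlap("CGGACTATGATT", "ATGATTCGAGGCTAA"))   # 0 (only 6 bp match)
```

Both reads are passed in the same left-to-right orientation, so no reverse complement is ever tried.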
Valid Assemblies:
Example 1:
..NNNNGGACTATGATTCG
            |||||||
            TGATTCGAGGCTAANN..
Example 2:
..NNNNNNNNCGATTCTGATCCGA
          |||||||
     GTCCTCGATTCTNNNNNNNN..
Invalid Assemblies:
Invalid Example 1 (the overlap is only 6 bp, below the 7-bp minimum):
..NNNNCGGACTATGATT
            ||||||
            ATGATTCGAGGCTAANN..
Invalid Example 2 (the overlap has too many mismatches: 2 of the 8 aligned bases differ):
..NNNNNNNNCGCTACTGATCCGA
          || | |||
     GTCCTCGATTCTGNNNNNNN..
Continue joining pieces until all of your group's pieces form a single contig.
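Building a contig piece by piece can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure the exercise prescribes: an N is treated as compatible with any base, sequenced bases must match exactly (the single base difference mentioned in the rules is not tolerated here), and the first placement that satisfies the 7-bp rule is accepted. The names `place` and `assemble` are hypothetical.

```python
def place(contig, piece, min_overlap=7):
    """Slide `piece` along `contig`; return the merged sequence for the first
    offset with >= min_overlap consecutive matching non-N bases, else None."""
    for offset in range(-(len(piece) - min_overlap), len(contig) - min_overlap + 1):
        start = min(0, offset)
        end = max(len(contig), offset + len(piece))
        merged, run, best_run, ok = [], 0, 0, True
        for col in range(start, end):
            a = contig[col] if 0 <= col < len(contig) else "N"
            b = piece[col - offset] if 0 <= col - offset < len(piece) else "N"
            if a != "N" and b != "N":
                if a != b:
                    ok = False  # two sequenced bases disagree
                    break
                run += 1
                best_run = max(best_run, run)
            else:
                run = 0  # an N interrupts the run of sequenced matches
            merged.append(a if a != "N" else b)
        if ok and best_run >= min_overlap:
            return "".join(merged)
    return None


def assemble(pieces, min_overlap=7):
    """Greedily add pieces to a growing contig; return (contig, unplaced)."""
    contig, remaining = pieces[0], list(pieces[1:])
    progress = True
    while remaining and progress:
        progress = False
        for piece in remaining:
            merged = place(contig, piece, min_overlap)
            if merged is not None:
                contig = merged
                remaining.remove(piece)
                progress = True
                break
    return contig, remaining


# The two pieces from Valid Example 2, without the ".." continuation marks.
contig, unplaced = assemble(["NNNNNNNNCGATTCTGATCCGA", "GTCCTCGATTCTNNNNNNNN"])
print(contig)    # NNNGTCCTCGATTCTGATCCGAN
print(unplaced)  # []
```

Because each merged column keeps whichever base was sequenced, Ns in one piece are filled in by reads from another, which is how the paper pieces stack on the table.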
Once each group has completed its contig, use the 40-kb string piece to join the two contigs. The 5’ end goes on the group 1 contig and the 3’ end goes on the group 2 contig.
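Joining two contigs with a spanning paired read can be sketched as below. This is an illustration under assumptions that go beyond the handout: the two end reads are located by exact matching, the insert length is supplied as a known number (in the exercise the string stands in for the unsequenced insert and is not measured), and the gap is filled with Ns. The function name `scaffold` and its parameters are hypothetical.

```python
def scaffold(contig1, contig2, read5, read3, insert_len):
    """Join two contigs using a paired read whose 5' read lies in contig1
    and whose 3' read lies in contig2; the unknown gap is written as Ns."""
    pos5 = contig1.find(read5)  # where the 5' read lands in contig 1
    pos3 = contig2.find(read3)  # where the 3' read lands in contig 2
    if pos5 < 0 or pos3 < 0:
        raise ValueError("a read does not match its contig")
    covered1 = len(contig1) - pos5   # insert bases that lie inside contig 1
    covered2 = pos3 + len(read3)     # insert bases that lie inside contig 2
    gap = insert_len - covered1 - covered2
    if gap < 0:
        raise ValueError("insert is too short to span both contigs")
    return contig1 + "N" * gap + contig2


# Made-up contigs and reads, for illustration only.
joined = scaffold("AAACCCGGGTTT", "GATTACAGATTACA",
                  read5="CCCGGGTTT", read3="GATTACAG", insert_len=30)
print(joined)
```

The result keeps both contigs in their original order and orientation, separated by a run of Ns whose length is implied by the insert size.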