Step 1
Separate the students into two groups (or four groups if running two assemblies). Groups of three to seven students generally work best.
Step 2
Separate the pieces into two groups. Each piece has a barcode on the 5′ end. A barcode consists of a single character, a dash, six digits, and a final letter. The character before the dash designates the group that the piece belongs to: barcode 2-664789P belongs to group 2, and barcode 1-371164C belongs to group 1. The barcode for the 40-kb piece has an initial character of B, which stands for bridge. B1 is the part of the 40-kb piece that goes with group 1; B2 goes with group 2. The 40-kb piece is introduced later.
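If you want to check the sorting before class, the barcode format can be sketched in a few lines of Python. This is only an illustration: the function names and the regular expression are assumptions based on the description above, and the exact layout of the bridge (B) barcodes is not specified here, so the sketch only reads the leading character.

```python
import re

# Assumed pattern, taken from the description: one character, a dash,
# six digits, and one final letter (for example 2-664789P).
BARCODE_RE = re.compile(r"^([0-9A-Za-z])-(\d{6})([A-Za-z])$")


def parse_barcode(barcode):
    """Split a barcode into (group, digits, suffix); raise ValueError if malformed."""
    match = BARCODE_RE.match(barcode.strip())
    if match is None:
        raise ValueError(f"unrecognized barcode: {barcode!r}")
    return match.groups()


def split_into_groups(barcodes):
    """Map each leading group character to the barcodes that carry it."""
    groups = {}
    for barcode in barcodes:
        group, _digits, _suffix = parse_barcode(barcode)
        groups.setdefault(group, []).append(barcode)
    return groups


# The three barcodes quoted in these instructions.
print(split_into_groups(["2-664789P", "1-371164C", "1-653687F"]))
```

A barcode that does not fit the assumed pattern raises an error rather than being silently assigned to a group.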
Step 3
Go through the Student Instructions with the students.
Step 4
Give the students the pieces.
Notes
- If the students need help during the assembly, you can assist them. The barcodes also contain information on the placement of the pieces. The third digit after the dash indicates the position (reading from left to right) of the piece within its group. Thus barcode 2-664789P is the fourth piece from the left in group 2, and barcode 1-371164C is the left-most piece in group 1. The third digit in the barcodes of the 40-kb fragment pieces has no significance.
- The two pieces that contain mismatches are 1-653687F and 2-664789P.
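The placement rule in the notes above can be turned into a quick answer key. The following is a minimal sketch, assuming the third digit after the dash is the only ordering information needed; the function names are hypothetical, and bridge (B) pieces are rejected because their third digit carries no meaning.

```python
def position_in_group(barcode):
    """Return the left-to-right position given by the third digit after the dash."""
    group, rest = barcode.strip().split("-", 1)
    if group.upper().startswith("B"):
        # The third digit of a 40-kb bridge piece has no significance.
        raise ValueError("bridge pieces do not encode a position")
    return int(rest[2])


def expected_order(barcodes):
    """Sort the barcodes of one group into their left-to-right order."""
    return sorted(barcodes, key=position_in_group)


print(position_in_group("2-664789P"))             # third digit is 4
print(expected_order(["1-653687F", "1-371164C"]))  # left-most piece first
```

Run `expected_order` on one group at a time, since positions are only meaningful within a group.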
Step 5
Once the students have completed the assembly, make the following points:
- Show the idea of depth of coverage (the number of times a particular base is sequenced on different reads) by pointing out areas where the assembly is 3 to 4 reads deep.
- A true assembly is much more difficult than this for the following reasons:
  - The clones and reads are 50 times longer.
  - The clones are double stranded and their orientation is unknown.
  - Genomes contain repeat areas.
  - There is potential for contamination.
  - There may be polymorphisms.
  - The number of reads in an assembly is in the thousands to millions. Ideally, there are enough reads that every base is represented 10 times, giving 10x depth of coverage.
- These difficulties make it necessary to use computers to perform the assemblies.
- Note that the 40-kb piece bridged the two contigs, resulting in a single contig. This is the main reason why the JGI does 40-kb libraries. If the portion of the assembly covered by a 40-kb fragment does not have enough depth of coverage, that fragment is often broken up into a 3-kb library to fill in the area.
- Note that the base-pair mismatch was in the last base of the read. This illustrates that quality scores typically drop toward the end of a read.
- To illustrate other points relating to sequence quality, work through the quality activity.
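For students who ask how a computer would measure depth of coverage, the idea from Step 5 can be sketched as follows. This is an illustration only: the read coordinates, read length, and genome length are made-up numbers, and the function names are hypothetical rather than part of any real assembler.

```python
def depth_per_base(reads, length):
    """Count how many reads cover each base.

    `reads` holds (start, end) pairs, where start is included and end is
    excluded, measured along an assembly that is `length` bases long.
    """
    depth = [0] * length
    for start, end in reads:
        for pos in range(start, end):
            depth[pos] += 1
    return depth


def mean_coverage(read_count, read_length, genome_length):
    """Average depth if the reads were spread evenly over the genome."""
    return read_count * read_length / genome_length


# Four made-up overlapping reads on a 12-base stretch.
print(depth_per_base([(0, 6), (2, 8), (4, 10), (5, 12)], 12))

# Made-up sizes: 10,000 reads of 500 bases over a 500,000-base genome.
print(mean_coverage(10_000, 500, 500_000))
```

The first call shows depth rising where reads overlap and falling toward the ends, which matches what students see on the table. The second shows why a target such as 10x coverage fixes the number of reads once the read length and genome size are known.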
Step 6
Go through the Computer Demo with the students.