SEQanswers

08-17-2010, 08:51 AM   #1
james hadfield
Moderator
Location: Cambridge, UK

Multi-Genome Alignment for QC...

In a previous post on our HiSeq I mentioned that we were running a multi-genome alignment (MGA) as a QC tool. Comments made me think it would be an interesting topic to post in the Bioinformatics section, not one I usually post in!

The work for this was done by Matt Eldridge, our head of bioinformatics. Big thanks to him for doing it!
  1. The MGA takes a sample of sequence reads from a lane and aligns the first 36 bp of each using Bowtie. Sampling keeps the MGA fast enough to run as part of our normal data pipeline, and we get to see the report in our LIMS alongside the GERALD report (which I think we will soon be ditching entirely).
  2. Of course reads can align to multiple genomes (conserved regions). If this happens we assign the read to the genome with the most reads. This approach should show up cases of genome contamination and maximise the difference between the first and second genomes in the list.
  3. We also use Exonerate to identify sequences containing Illumina adapters.
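
A minimal Python sketch of steps 1 and 2, under stated assumptions. The function names (sample_and_trim, assign_reads) are hypothetical and are not taken from the MGA itself. Reservoir sampling is just one way to draw the sample, and "the genome with the most reads" is interpreted here as the genome with the highest total number of aligned sampled reads; the post does not say how either is actually done. The Bowtie alignment itself is not shown: the second function assumes you already have, for each read, the set of genomes it aligned to.

```python
import random
from collections import Counter


def sample_and_trim(reads, n, length=36, seed=0):
    """Reservoir-sample n reads and keep the first `length` bases of each.

    reads: iterable of (read_id, sequence) tuples (hypothetical input shape).
    """
    rng = random.Random(seed)
    sample = []
    for i, (read_id, seq) in enumerate(reads):
        trimmed = (read_id, seq[:length])
        if i < n:
            sample.append(trimmed)
        else:
            # Replace an existing entry with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                sample[j] = trimmed
    return sample


def assign_reads(hits):
    """Assign each read to exactly one genome.

    hits: dict of read_id -> set of genome names the read aligned to
          (an empty set means the read did not align anywhere).
    Returns a dict of read_id -> genome name, or None for unaligned reads.
    """
    # First pass: count every genome each read aligned to.
    totals = Counter()
    for genomes in hits.values():
        totals.update(genomes)

    # Second pass: a multi-genome read goes to the genome with the most reads.
    assigned = {}
    for read_id, genomes in hits.items():
        if not genomes:
            assigned[read_id] = None
        else:
            # sorted() makes ties resolve alphabetically (an arbitrary choice).
            assigned[read_id] = max(sorted(genomes), key=lambda g: totals[g])
    return assigned
```

With this rule, a read that hits both human and mouse in a mostly human lane is counted as human, which is what widens the gap between the first and second genomes in the report.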

Currently we run against: Human, Mouse, Rat, Xenopus, Arabidopsis, C. elegans, Yeast, Bacteria and Viruses (the last two being amalgamations of >1500 genomes each). There are other genomes as well which are specific to projects in our lab. I guess at some level it would be possible to run against all genomes?

The output is a list of genomes in descending order of aligned reads, each expressed as a percentage. Hopefully the genome at the top is the one the user was expecting! We did have a case about three years ago where one user accidentally sequenced, to 80x coverage, the genome of an organism that was also growing in his lab. It took a little time to work out what was wrong with his experiment and I believe the data was handed over to that community. Serendipity at its best!
There are often unaligned reads, and the initial assumption was that these were low-quality junk. Running this kind of aligner might let us test that assumption, but we have not looked at it yet.
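
The report described above could be built along these lines. This is only a sketch: genome_report is a hypothetical name, and it assumes percentages are taken over all sampled reads with unaligned reads reported separately, which the post does not spell out.

```python
from collections import Counter


def genome_report(assignments):
    """Summarise per-read genome assignments as a descending percentage list.

    assignments: list with one entry per sampled read, either a genome name
                 or None for a read that did not align (hypothetical shape).
    Returns (rows, unaligned_pct), where rows is a list of
    (genome, percent_of_sampled_reads) sorted from most to fewest reads.
    """
    total = len(assignments)
    if total == 0:
        return [], 0.0
    counts = Counter(g for g in assignments if g is not None)
    # most_common() already returns genomes in descending order of count.
    rows = [(genome, 100.0 * n / total) for genome, n in counts.most_common()]
    unaligned_pct = 100.0 * (total - sum(counts.values())) / total
    return rows, unaligned_pct


if __name__ == "__main__":
    rows, unaligned = genome_report(["human"] * 8 + ["phix"] + [None])
    for genome, pct in rows:
        print(f"{genome}\t{pct:.1f}%")
    print(f"unaligned\t{unaligned:.1f}%")
```

Keeping the unaligned fraction as its own figure makes it easy to go back later and check whether those reads really are low quality.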

The reason I wanted this MGA in our pipeline was to see how much PhiX was in lanes where we had not actually put it. The assumption was that any sloppy practices in a lab where all flowcells are set up would show up as PhiX in those lanes. It was immediately clear that the level of PhiX ‘contamination’ from lane to lane was very low. We identified two or three flowcells with a potential issue, but this was out of many hundred. We were also able to get run reports and data from another large centre nearby and they had similar results. All in all I am very happy with the low lane-to-lane contamination and that the protocols are reasonably robust.
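
For anyone wanting to run the same PhiX check on their own reports, a sketch of the idea follows. Everything here is an assumption for illustration: the function name, the input shape, and especially the 1.0% default threshold, which is a placeholder and not a figure from our data.

```python
def flag_phix_lanes(lanes, threshold_pct=1.0):
    """Return the lanes that show PhiX although none was loaded.

    lanes: iterable of (lane_id, phix_spiked, phix_pct) tuples, where
           phix_spiked says whether PhiX was deliberately loaded and
           phix_pct is the PhiX percentage from the MGA-style report.
    threshold_pct: hypothetical cut-off; choose one that suits your own runs.
    """
    return [
        lane_id
        for lane_id, phix_spiked, phix_pct in lanes
        if not phix_spiked and phix_pct > threshold_pct
    ]
```

Lanes where PhiX was loaded on purpose are skipped, so only unexpected PhiX is reported.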

PhiX must be being breathed in as aerosols in labs the world over. Might we get some Cronenberg-style PhiX-Human hybrid? Let me know if you see one...

Let me know what you think.