  • comparing results from two different reference genomes

    We have a Solexa experiment that seems to be contaminated with a genome other than the one we were originally targeting. The two genomes differ greatly in size (mouse vs. pombe), and if I understand correctly, the quality scores in the fastq output depend on the size of the reference genome.
    In our particular case it seems to me that the smaller genome will always get lower scores (because of the smaller reference genome). Is there a way to account for that and make the quality scores comparable?

    To clarify my confusion a bit:
    I got a Gerald output in s_1_sequence.txt, with pombe as the reference genome, that starts like this:
    @PF2:1:1:1644:1100
    ATGAATTTCAGCCTCTGGTCAGGCAGGGTTCCTTTT
    +PF2:1:1:1644:1100
    OOOOOOOOPOKOOPPOKKOOOOKOOEKKOOGGGGGA
    @PF2:1:1:1702:1050
    ACCAAGCGCAAATTTACGATTTAATTAGTATTTATA
    +PF2:1:1:1702:1050
    OPOOOOPOOOOOOOOOOOOOOOOOOOOHOOCHAHHE
    @PF2:1:1:1532:1901
    TTCAAATATTCCTGATCCAATGACAAGTTGAACCGT


    And I get another file with the mouse genome as reference:
    @PF2:1:1:1644:1100
    ATGAATTTCAGCCTCTGGTCAGGCAGGGTTCCTTTT
    +PF2:1:1:1644:1100
    VVVVVVVVVVOVVVVVOOVVVVMVVCMNVVQQRRRE
    @PF2:1:1:1702:1050
    ACCAAGCGCAAATTTACGATTTAATTAGTATTTATA
    +PF2:1:1:1702:1050
    VVVVVVVVVVVVVVVVVVVVVVVVVVVIVVGRCRRP
    @PF2:1:1:1532:1901
    TTCAAATATTCCTGATCCAATGACAAGTTGAACCGT


    Clearly the quality scores are different.
    This makes me believe that both peak information and alignment information are used. Peak information is used because I see different quality scores for the same sequence. The reference genome is used in calculating the quality scores because the exact same clusters receive different scores with different reference genomes.
    Now the question: how do I make the values comparable across different reference genomes? I want to identify sequences that align better to one reference genome than to the other, in order to get some understanding of the possible contamination.
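
    To put numbers on how the two outputs differ, here is a minimal Python sketch that pairs reads by ID and reports the per-base difference between the quality characters. The names parse_fastq and quality_shift are hypothetical helpers, not part of the Illumina pipeline. The comparison uses raw character codes, so it only assumes that both files use the same quality encoding and plain 4-line records.

```python
from typing import Dict, Iterator, List, Tuple

# First record from each of the two outputs quoted above.
POMBE = """@PF2:1:1:1644:1100
ATGAATTTCAGCCTCTGGTCAGGCAGGGTTCCTTTT
+PF2:1:1:1644:1100
OOOOOOOOPOKOOPPOKKOOOOKOOEKKOOGGGGGA
"""
MOUSE = """@PF2:1:1:1644:1100
ATGAATTTCAGCCTCTGGTCAGGCAGGGTTCCTTTT
+PF2:1:1:1644:1100
VVVVVVVVVVOVVVVVOOVVVVMVVCMNVVQQRRRE
"""


def parse_fastq(text: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from 4-line FASTQ records."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Step through complete records; a truncated trailing record is skipped.
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at non-blank line {i + 1}")
        yield header[1:], seq, qual


def quality_shift(fastq_a: str, fastq_b: str) -> Dict[str, List[int]]:
    """Per-base quality difference (b minus a) for reads present in both inputs."""
    a = {rid: (seq, qual) for rid, seq, qual in parse_fastq(fastq_a)}
    shifts: Dict[str, List[int]] = {}
    for rid, seq, qual_b in parse_fastq(fastq_b):
        # Only compare reads with the same ID, same bases and same length.
        if rid in a and a[rid][0] == seq and len(a[rid][1]) == len(qual_b):
            qual_a = a[rid][1]
            shifts[rid] = [ord(cb) - ord(ca) for ca, cb in zip(qual_a, qual_b)]
    return shifts


shifts = quality_shift(POMBE, MOUSE)
for read_id, diff in shifts.items():
    print(read_id, diff)
```

    A list of zeros for a read would mean both runs assigned it identical scores. For the record above, the differences vary from base to base, so no single constant offset maps one set of scores onto the other.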


    Any comment is appreciated.

    Thanks,

    Bernd
    Last edited by BAJ; 02-23-2009, 08:52 AM.

  • #2
    I just heard back from tech support at Illumina that this is a property of pipeline 1.0, whereas in pipeline 1.3.2 the quality scores are independent of the alignment.



    • #3
      Oh, so that's changed in 1.3.
      Yes, this was the case, as Illumina updated its quality values based on the Gerald alignments. If you check the quality values in the Bustard folder, they should match irrespective of the reference.
      --
      bioinfosm
