  • Too many mismatches?

    Hello guys,

    I've just been hit with my first SOLiD data...
    Reading the posts here, I already feel better to see that other people are struggling as well.

    I'm trying to map the reads (75 bp) to prokaryotic reference genomes and detect SNPs. Because I couldn't get any color-space aligners to work I've converted to base-space and used Bowtie2 for alignment. I'm getting about 10 mismatches per read on average. Some reads have as few as 2 mismatches, but others have more than 20. Because I was using ECC chemistry, I did not expect it to turn out this badly...

    My question is this: does it even make sense to try and detect SNPs if I have that many mismatches in my reads? Should I rather focus on getting the alignment to work in color-space?

    thanks for your help

  • #2
    I would focus on getting the color aligners to work. I would just use lifescope because it knows what to do with the ECC data and I think it has a few scripts for converting reference genomes to color space.

    If you just convert reads from color space to base space, the ECC is useless.


    • #3
      Originally posted by BambooGarden
      Because I couldn't get any color-space aligners to work I've converted to base-space and used Bowtie2 for alignment.
      A very hesitant +1 for lifescope, because they know the most about colour-space.

      Are you aware that Bowtie (v1) can do colour-space alignment and has very similar input/output parameters to Bowtie2?

      How are you converting to base-space? If you're doing a naive conversion in the absence of a reference sequence (e.g. G1122330 = GTGAGCGG, regardless of error), then you're going to end up with plenty of rubbish sequence every time there's a colour error. At the risk of repeating myself too much, colour-space is not an intuitive way of representing sequence, and you'll save yourself a lot of pain and time by shifting to a different sequencing platform.
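
To make the problem concrete, here is a minimal Python sketch of a naive, reference-free conversion. It is an illustration only, not what any particular converter does internally. It assumes the standard SOLiD two-base encoding, where each colour is the XOR of the 2-bit codes of adjacent bases (A=0, C=1, G=2, T=3); the function name `decode_naive` is made up for this example. The output keeps the leading primer base so that it matches the G1122330 example above.

```python
# Base order chosen so that colour = XOR of the two adjacent base codes
# (A=0, C=1, G=2, T=3), which reproduces the SOLiD colour table.
BASES = "ACGT"


def decode_naive(csread):
    """Decode a colour-space read (primer base + colour digits) without a reference.

    Each colour is applied to the previous base, so every decoded base
    depends on all colours before it.
    """
    prev = BASES.index(csread[0])
    out = [csread[0]]  # keep the primer base, as in the example above
    for colour in csread[1:]:
        prev ^= int(colour)
        out.append(BASES[prev])
    return "".join(out)


good = decode_naive("G1122330")  # GTGAGCGG
bad = decode_naive("G1132330")   # same read with one wrong colour: GTGCTATT

# Count base differences caused by the single colour error
mismatches = sum(1 for a, b in zip(good, bad) if a != b)
print(good, bad, mismatches)
```

One wrong colour (the third digit, 2 read as 3) changes every base downstream of it, so a single colour error in a 75 bp read can show up as dozens of base-space mismatches.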


      • #4
        I am using NGS plumbing to convert to base-space. I guess this is what you call a naive conversion, because I'm not supplying any reference sequence at that point. Which software can do the conversion while taking a reference sequence into account?

        Yeah, I agree. Definitely next time another platform. But for now I'll have to make do with this data somehow.

        Thanks for the help.


        • #5
          Originally posted by BambooGarden
          Which software can do the conversion while taking a reference sequence into account?
          Bowtie can do this; you just have to map the reads to your reference first (which is a bit of a chicken-and-egg problem). The base-space sequence reported by Bowtie is corrected to match the reference sequence, while still including any discovered SNPs.
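
The idea behind reference-aware correction can be sketched in a few lines of Python. This is a simplified illustration of the principle, not Bowtie's actual algorithm. It assumes an ungapped alignment, read colours that already line up with the reference window (primer-base transition dropped), and the usual XOR colour encoding (A=0, C=1, G=2, T=3). The function names `encode` and `classify` are hypothetical. A true single-base substitution changes two adjacent colours by the same amount, whereas a sequencing error usually changes one colour in isolation.

```python
BASES = "ACGT"  # A=0, C=1, G=2, T=3, so colour = XOR of adjacent base codes


def encode(seq):
    """Convert a base-space sequence to its list of colours."""
    codes = [BASES.index(b) for b in seq]
    return [a ^ b for a, b in zip(codes, codes[1:])]


def classify(ref, read_colours):
    """Split colour mismatches into likely errors and SNP candidates.

    Returns (errors, snps): colour positions of isolated mismatches, and
    0-based reference base positions where two adjacent colours differ
    by the same XOR delta (consistent with one substituted base).
    """
    ref_colours = encode(ref)
    diff = [r ^ c for r, c in zip(ref_colours, read_colours)]
    errors, snps = [], []
    i = 0
    while i < len(diff):
        if diff[i] == 0:
            i += 1
        elif i + 1 < len(diff) and diff[i + 1] == diff[i]:
            snps.append(i + 1)  # the base shared by colours i and i+1
            i += 2
        else:
            errors.append(i)  # isolated colour change: treat as read error
            i += 1
    return errors, snps


ref = "GTGAGCGG"
snp_read = encode("GTGCGCGG")       # reference with base 3 changed A -> C
err_read = [1, 1, 3, 2, 3, 3, 0]    # reference colours with one colour wrong

print(classify(ref, snp_read))
print(classify(ref, err_read))
```

With a reference available, the isolated colour error can be reverted to the reference colour before decoding, while the paired change is kept and reported as a SNP. That distinction is lost if the reads are converted to base-space first.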


          • #6
            Older versions of BWA worked with "SOLiD".
            Colorspace was disabled in 0.6.1, I don't know if it was re-enabled.
            As I remember, it required the solid2fastq.pl program.
            The "bioscope" aligner was too aggressive in aligning reads; BWA did a better job of dropping and clipping reads that had mis-transitions in the middle of the reads.

            The newer "lifescope" (?) software may have improved the situation.

            I'd recommend getting an old copy of BWA and using it.


            • #7
              We compared bioscope, lifescope, bwa, bowtie and shrimp for color-space alignments and found that Shrimp2 worked the best. All of these tests were done before my arrival, so I don't have all of the details, but generally Shrimp2 seems to work well and maps in color space.
