  • Bioscope or Bowtie+TopHat ??

    Hi all

    We have a built-in Bioscope pipeline that runs all our RNA-seq files for us.
    I don't know the Bioscope pipeline myself, but I am familiar with Bowtie and TopHat.
    Which one do you recommend, and why?
    It would be helpful if you could explain any important differences between these commercial and public pipelines.

    Thanks in advance

  • #2
    To my knowledge, Bioscope finds splice/fusion junctions based on known exons only. It doesn't give you new exons, while TopHat does.



    • #3
      Last I checked, TopHat doesn't support color-space reads without some adjustments. I really like Bowtie, and that's what we've been using for our mapping. Most of our samples are bacterial, so junction mapping isn't particularly important for us right now. Overall, I find BioScope to be slow, compute-intensive, and difficult to troubleshoot when there are glitches, bugs or network errors. Bowtie and BioScope seem to have comparable accuracy.
      As long as you don't need junction mapping, or are able to get TopHat working with color-space reads, Bowtie is definitely the way to go.



      • #4
        There are some very good tools for aligning SOLiD reads that could replace the alignment stage in Bioscope.
        Have a look at BFAST, BWA and Novoalign for colorspace alignment.



        • #5
          Re-open bowtie vs. bioscope mapping

          Dear all,
          I am wondering how to interpret Bioscope's mappings. As far as I can see, H in a CIGAR string means "hard clipped", so many of the alignments below are clipped at the beginning of the read. What does this mean biologically? For example, if I need to calculate the centre of a nucleosome position, should I include the 9 bases that were hard clipped?

          I am currently running paired-end tests to see whether Bowtie or Bioscope maps better.

          Would appreciate any new observations since this last post.

          Thank you,

          John.

          1609_868_1720 147 chr2L 21095 0 9H26M = 20984 -136 GCGGTGGCCGAGTAATTTTTTGAACT :IIF<%%0=%%II%%?@IIEEIIIII RG:Z:20110622114026958 NH:i:2 CM:i:3 SM:i:4 CQ:Z:6>?8>:,:89(8%>=4%7'*%2+<<6%&079+-2' CS:Z:G22102100000333123303111033100300200
          1702_1269_1280 147 chr2L 21095 0 9H26M = 20976 -144 GCGGTGGCCGAGTAATTTTTTGAACT AII((II11..II''FBIIIIIIIII RG:Z:20110622114026958 NH:i:2 CM:i:4 SM:i:4 CQ:Z:5A?=@@-A><'@'A><.31=76(-A=%&)?6%%,+ CS:Z:G22102100000333123333013033210301333
          1730_1944_1244 147 chr2L 21095 0 9H26M = 20988 -132 GCGGTGGCCGAGTAATTTTTTGAACT >II%%II'',,IIIDIHIIIIIIIII RG:Z:20110622114026958 NH:i:2 CM:i:3 SM:i:4 CQ:Z:;?A>A?5@>81;*B?>,:'919%2=:%'6@1&%-& CS:Z:G22102100000303123333013033100300101
          1808_1086_1406 147 chr2L 21095 0 9H26M = 20989 -131 GCGGTGGCCGAGTAATTTTTTGAACT ?II</0?--**IIIEGAIIE=IIIII RG:Z:20110622114026958 NH:i:2 CM:i:2 SM:i:4 CQ:Z:=A@.=6(>87+=)B<>*7-8()'6=;%+1:6%-5% CS:Z:G22102100000303123333011033100300003
          1844_699_1062 147 chr2L 21095 0 9H26M = 20990 -130 GCGGTGGCCGAGTAATTTTTTGAACT @IIGHII))33IIIID?IIIIIIIIE RG:Z:20110622114026958 NH:i:2 CM:i:2 SM:i:4 CQ:Z:*<A2A9><?9'>.A;;3=8817A<%'+@2.%&) CS:Z:G22102100000303123333011033200301101
          2239_913_1797 147 chr2L 21095 0 9H26M = 20990 -130 GCGGTGGCCGAGTAATTTTTTGAACT *BI(&*7=C=%)ID<%%FI=8H:CII RG:Z:20110622114026958 NH:i:2 CM:i:3 SM:i:4 CQ:Z:9?>&54%98/%6'>-)%9+3%&(;=&%3:8%(.8' CS:Z:G22102100003303102303010033100300000
          937_1422_208 147 chr2L 21095 0 9H26M = 20989 -131 GCGGTGGCCGAGTAATTTTTTGAACT CII((>54&(EIICEIIIIIFIIIII RG:Z:20110622114026958 NH:i:2 CM:i:2 SM:i:4 CQ:Z:<A<8>70A><39-7;<*(&/'8(8@9+1-;=)46+ CS:Z:G22102100000303122203013033100300000
          1552_1879_1107 147 chr2L 21096 0 8H27M = 20990 -132 CGGTGGCCGAGTAATTTTTTGAACTAT IFIB?IIH"""=II8IID//A3IIIAI RG:Z:20110622114026958 NH:i:2 CM:i:3 SM:i:4 CQ:Z:B)93</%=//6>,-?5)*>72=;%>255/)(3-&5 CS:Z:G23321021100003031133030110300133003
          191_923_442 147 chr2L 21096 0 10H25M = 20988 -132 CGGTGGCCGAGTAATTTTTTGAACT IG+/G99I44IIIIFBIIIIIIIII RG:Z:20110622114026958 NH:i:2 CM:i:2 SM:i:4 CQ:Z:@@B>B:3B@=&A==<84<5%53/+=;()8::85/9 CS:Z:G22102100000303123303010030030300303
          373_1670_1840 147 chr2L 21096 0 10H25M = 21023 -97 CGGTGGCCGAGTAATTTTTTGAACT :I%%6<?IC2II7@IHGIIA55III RG:Z:20110622114026958 NH:i:2 CM:i:1 SM:i:4 CQ:Z:8;?0&02>8098)/A)*:00-*%>1*+.:*%'/&% CS:Z:G22102100000303122303010030000000003
          408_1737_482 147 chr2L 21096 0 10H25M = 20991 -129 CGGTGGCCGAGTAATTTTTTGAACT BI%%B9.54;FA>B;?IIIDIIIII RG:Z:20110622114026958 NH:i:2 CM:i:1 SM:i:4 CQ:Z:8=>.<6/>;6*21.43),*%5.%?4/12>01607. CS:Z:G22102100000303122303013030000000000
          1270_638_531 147 chr2L 21100 0 8H27M = 20991 -135 GGCCGAGTAATTTTTTGAACTATTTTA )89.%%GIACE?III;BCDH&&IIIII RG:Z:20110622114026958 NH:i:2 CM:i:2 SM:i:4 CQ:Z:>3;9?A&,=(<'5><6*<(:?)%)&4%%,.:28*6 CS:Z:G13000322102100000303123
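For the nucleosome-centre question, one option is to work from the fragment ends rather than from the clipped read alone. The following is a minimal Python sketch, not Bioscope's own method. The helper names are hypothetical. It assumes POS is the 1-based leftmost aligned base, as in the SAM specification, and that the mate's POS marks the left end of the fragment, which ignores any clipping on the mate (whose record is not shown above).

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar):
    """Split a CIGAR string such as '9H26M' into (length, op) pairs."""
    return [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]

def reference_end(pos, cigar):
    """Last 1-based reference position covered by the aligned bases.

    Only M, D, N, = and X consume the reference; H and S do not.
    """
    ref_len = sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")
    return pos + ref_len - 1

def fragment_midpoint(pos, cigar, mate_pos):
    """Midpoint of a fragment whose reverse-strand read starts at `pos`
    and whose forward-strand mate starts at `mate_pos` (assumed <= pos)."""
    return (mate_pos + reference_end(pos, cigar)) / 2

# First record above: POS 21095, CIGAR 9H26M, mate POS 20984
print(reference_end(21095, "9H26M"))             # 21120
print(fragment_midpoint(21095, "9H26M", 20984))  # 21052.0
```

With paired-end data the centre estimate depends on the two outer fragment ends, so whether the clipped bases matter depends on which end of the read they sit at.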



          • #6
            CIGAR H (hard clip) means part of the input read was cut out of the record, e.g. ATCGAAAAATCGAAA becomes ATCGAAAAATC: the first part is kept and the last 4 nucleotides are thrown away.

            This is as opposed to a soft clip (S), which keeps the clipped bases in the sequence field. BWA sometimes produces soft clips.

            I *think* the frequent "H" operator in the Bioscope pipeline's output CIGAR strings means this: the aligner in Bioscope figured out that part of the read was junk, so it just chopped it off.

            I suspect it has to do with transitions from one nucleotide to another in colorspace: the aligner gets confused partway through the read, can tell later in the alignment that it got confused, and deals with it. Maybe it knows there's a problem and tags those bases with a low quality score. Regardless, it's dropping bad or unalignable edges of the read.
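The practical difference between the two clip types shows up in the SEQ column of the SAM record. Here is a small Python sketch (function names are hypothetical) of which CIGAR operations count towards SEQ according to the SAM specification: M, I, S, = and X do, while H, D, N and P do not.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def seq_length(cigar):
    """Number of bases expected in the SEQ field for this CIGAR."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MIS=X")

def clip_totals(cigar):
    """Total hard-clipped (H) and soft-clipped (S) bases in the CIGAR."""
    totals = {"H": 0, "S": 0}
    for n, op in CIGAR_RE.findall(cigar):
        if op in totals:
            totals[op] += int(n)
    return totals

seq = "GCGGTGGCCGAGTAATTTTTTGAACT"  # SEQ of the first record in post #5
print(seq_length("9H26M") == len(seq))  # True: the 9 hard-clipped bases are absent
print(seq_length("9S26M"))              # 35: soft-clipped bases would stay in SEQ
print(clip_totals("9H26M"))             # {'H': 9, 'S': 0}
```

For the 35-colour reads in post #5, 9H26M accounts for all 35 positions, but only the 26 aligned bases are stored in SEQ.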



            • #7
              Dear Richard,
              Thanks for the feedback. I think for things like mutation detection, this hard/soft clipping could be OK as long as the read does not map to many locations in the genome. However, for things like nucleosome positioning, where you want to find the central point, it is hard to know whether the first part of the read was just junk or whether it did not map (perhaps because of a fusion or an indel).
              1) If only bad/low-quality bases have been clipped and the read maps uniquely, this is OK in my mind.
              2) But if the clip is there because part of the read could not be aligned, this is not so good.

              I guess not knowing which case applies is the problem. As I understood it, reads are usually of higher quality at the 5' end, so any 5' clipping seems suspicious.

              Thanks again for your useful answer.

              Kind regards,

              John.
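One thing worth checking before treating these as 5' clips is the strand. Under the SAM specification, flag bit 0x10 means the read is reported reverse-complemented, and SEQ and CIGAR are both written in reference orientation, so a leading clip in the CIGAR of a reverse-strand read sits at the end of the read as it was originally stored. This Python sketch decodes the flag and reorients the clips. The function names are hypothetical, and it assumes Bioscope follows the SAM convention here.

```python
import re

FLAG_BITS = {
    0x1: "paired", 0x2: "proper_pair", 0x4: "unmapped", 0x8: "mate_unmapped",
    0x10: "reverse", 0x20: "mate_reverse", 0x40: "first_in_pair",
    0x80: "second_in_pair", 0x100: "secondary", 0x200: "qc_fail",
    0x400: "duplicate", 0x800: "supplementary",
}

def decode_flag(flag):
    """Names of the SAM flag bits that are set."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]

def clips_in_read_orientation(flag, cigar):
    """Return (clipped at read start, clipped at read end), counting H and S."""
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    leading = 0
    for n, op in ops:
        if op not in "HS":
            break
        leading += n
    trailing = 0
    for n, op in reversed(ops):
        if op not in "HS":
            break
        trailing += n
    if flag & 0x10:  # reverse strand: CIGAR order is the reverse of read order
        leading, trailing = trailing, leading
    return leading, trailing

print(decode_flag(147))                         # ['paired', 'proper_pair', 'reverse', 'second_in_pair']
print(clips_in_read_orientation(147, "9H26M"))  # (0, 9)
```

If that convention holds for the records in post #5 (all flag 147), the 9 clipped positions would be at the end of each read rather than at its start, which fits with quality dropping towards the end of a read.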



              • #8
                I find that BioScope consistently maps more reads (50-60%) and TopHat/Bowtie fewer (30-40%). However, I don't think the number of reads mapped can determine which mapper is "better". The Bowtie manual says the best alignments are guaranteed, so the difference in percentage mapped between Bowtie and BioScope might just be that Bowtie is stricter in deciding what counts as an alignment.

                Personally, I would use Bowtie for assembly. For expression levels, I use both Bowtie and BioScope and see how they stack up against each other.
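When comparing mapped percentages between tools, it helps to count them the same way for both outputs. The sketch below is a hypothetical Python helper, not part of either tool. It counts primary records without the unmapped flag (0x4) and optionally requires a minimum MAPQ. The MAPQ threshold is an assumption worth testing, because the Bioscope records in post #5 have MAPQ 0 and NH:i:2, meaning multi-mapping reads are reported as mapped.

```python
def mapped_fraction(sam_lines, min_mapq=0):
    """Fraction of primary SAM records that are mapped with MAPQ >= min_mapq."""
    total = mapped = 0
    for line in sam_lines:
        if line.startswith("@"):        # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x900:                # skip secondary/supplementary records
            continue
        total += 1
        if not flag & 0x4 and int(fields[4]) >= min_mapq:
            mapped += 1
    return mapped / total if total else 0.0

# Usage on a file (the path is a placeholder):
# with open("aligned.sam") as handle:
#     print(mapped_fraction(handle), mapped_fraction(handle_reopened, min_mapq=1))
```

Running it with `min_mapq=0` and `min_mapq=1` on each tool's output gives a rough idea of how much of the difference comes from multi-mapping reads.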



                • #9
                  I made a similar observation comparing Bowtie and Bioscope. When I aligned SOLiD data with Bowtie, I got 25% fewer reads aligned than in the Bioscope results from our core facility (Bowtie aligned 60% of my total reads). When I asked about this, the reply mentioned the "clipping" that Bioscope does, which might well account for the 25% difference. I'm wondering whether clipping would increase the read count for a gene, since shorter sequences might have a higher chance of aligning where they don't belong. I may be totally wrong too. Any ideas?
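A back-of-the-envelope way to look at the "shorter reads align by chance" question is to count expected exact matches of a k-base sequence in a random genome, which is roughly 2G / 4^k for genome size G and both strands. The Python sketch below uses a deliberately crude model: uniformly random bases, exact matching only, and an illustrative genome size that is not taken from the thread. Real genomes contain repeats and aligners tolerate mismatches, so the numbers are order-of-magnitude at best.

```python
def expected_chance_hits(read_length, genome_size):
    """Expected exact matches of one random read in a random genome (both strands)."""
    return 2 * genome_size / 4 ** read_length

GENOME_SIZE = 120_000_000  # illustrative value only

for k in (15, 20, 25, 35):
    print(k, expected_chance_hits(k, GENOME_SIZE))
```

Under this model, chance matches only become a concern for very short alignments, so placement in repeats and the number of mismatches allowed are probably more relevant than the loss of about ten clipped bases.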
