  • #16
    Originally posted by Cole Trapnell View Post
    Hmm - that's a new one. What version of OS X are you running this on?
    Mac OS X 10.5.5

    2.8 GHz Quad-Core Intel Xeon



    • #17
      Originally posted by NJD View Post
      I was getting the same message with bowtie-0.11.2-bin-macos-10.5-x86_64.zip. Working from source and setting BITS=64 seems to be fine. Mac OS X 10.5.8.
      Thanks for this. I hope not to have to work from source since I never have much luck doing that kind of thing.



      • #18
        Originally posted by greggrant View Post
        Thanks for this. I hope not to have to work from source since I never have much luck doing that kind of thing.
        Hi guys,

        Please try the 10.4 package instead and let me know if that gives the same error. If 10.4 works fine, then I think I know what's wrong.

        Thanks for the reports,
        Ben



        • #19
          Originally posted by Ben Langmead View Post
          Hi guys,

          Please try the 10.4 package instead and let me know if that gives the same error. If 10.4 works fine, then I think I know what's wrong.

          Thanks for the reports,
          Ben
          There is no 10.4, the options are 10.0 and 10.1, then it goes to 11.2 and 11.3.



          • #20
            Originally posted by greggrant View Post
            There is no 10.4, the options are 10.0 and 10.1, then it goes to 11.2 and 11.3.
            Sorry, I was unclear. I meant the Mac OS X 10.4 binary package. There is one for Bowtie version 0.11.2.

            Ben



            • #21
              Originally posted by Ben Langmead View Post
              Hi guys,

              Please try the 10.4 package instead and let me know if that gives the same error. If 10.4 works fine, then I think I know what's wrong.

              Thanks for the reports,
              Ben
              OK I managed to compile 11.1 using BITS=64 as suggested, it compiled and ran without crashing. BUT the results are the same "Warning: junction database is empty!".

              Here is the tophat_out directory:


              This is really frustrating!



              • #22
                I can't reproduce the issue you are seeing with just that tarball.

                Can you post the small sample of reads you are using? The left_kept_reads.fq file contains reads that have different lengths etc. TopHat is really designed for Illumina reads - you can certainly use 454, but you'll need to trim them down to be all the same length.



                • #23
                  Originally posted by Cole Trapnell View Post
                  I can't reproduce the issue you are seeing with just that tarball.

                  Can you post the small sample of reads you are using? The left_kept_reads.fq file contains reads that have different lengths etc. TopHat is really designed for Illumina reads - you can certainly use 454, but you'll need to trim them down to be all the same length.
                  This file has the reads:



                  • #24
                    Originally posted by Cole Trapnell View Post
                    I can't reproduce the issue you are seeing with just that tarball.

                    Can you post the small sample of reads you are using? The left_kept_reads.fq file contains reads that have different lengths etc. TopHat is really designed for Illumina reads - you can certainly use 454, but you'll need to trim them down to be all the same length.
                    Thanks for your help. I split my reads so that they are all 50 bases in length. I didn't lose any sequence since I tiled the longer reads with 50-base segments. This time it ran and did not report the warning. However, the junctions file is empty, and it can't be that there are no junctions. Here is a tarball of my input file and the tophat_out directory:



                    Does this look right?

                    Thanks again for your help!
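
For readers who want to try the same preprocessing, here is a minimal sketch of one way to tile long reads into fixed-length pieces. It is not necessarily what the poster did: the function name `tile_read` and the choice to anchor the last tile at the 3' end (so that it overlaps the previous tile rather than coming out short) are assumptions made for illustration. Reads shorter than the tile length produce no tiles, which keeps every output read at exactly the same length.

```python
def tile_read(seq, qual, tile_len=50):
    """Split one read into tiles of exactly tile_len bases.

    Tiles are taken left to right. If the read length is not a multiple of
    tile_len, a final tile is anchored at the 3' end, so it overlaps the
    previous tile instead of being shorter. Reads shorter than tile_len
    yield no tiles. (Hypothetical helper, not part of TopHat or Bowtie.)
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    n = len(seq)
    if n < tile_len:
        return []
    starts = list(range(0, n - tile_len + 1, tile_len))
    if starts[-1] + tile_len < n:
        # Cover the leftover 3' bases with an overlapping, full-length tile.
        starts.append(n - tile_len)
    return [(seq[s:s + tile_len], qual[s:s + tile_len]) for s in starts]


# A 120-base read becomes three 50-base tiles starting at 0, 50 and 70.
tiles = tile_read("ACGT" * 30, "I" * 120)
print([len(s) for s, _ in tiles])  # [50, 50, 50]
```

Overlapping the last tile means some bases are represented twice, so downstream coverage counts will be slightly inflated near the 3' end of long reads.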



                    • #25
                      Originally posted by greggrant View Post
                      OK I managed to compile 11.1 using BITS=64 as suggested, it compiled and ran without crashing.
                      Hi guys,

                      For what it's worth, I believe that I have now fixed the problem with the Bowtie macos-10.5 binary packages from version 0.11.3. I replaced the bad packages with good ones up on sourceforge. Let me know if you have more problems like that.

                      Apologies,
                      Ben



                      • #26
                        Originally posted by Ben Langmead View Post
                        Hi guys,

                        For what it's worth, I believe that I have now fixed the problem with the Bowtie macos-10.5 binary packages from version 0.11.3. I replaced the bad packages with good ones up on sourceforge. Let me know if you have more problems like that.

                        Apologies,
                        Ben
                        Thank you! Can you take a look at my file and see why it reports no junctions? There have to be some junctions in the transcripts. Thanks again for your help.



                        • #27
                          I just ran the left_kept_reads.fq file from the above package - there are a handful of reads that are 49 bp, which explains your result. When the reads are less than 50bp long, TopHat uses a coverage-island based algorithm to find junctions. When they are longer than that, TopHat starts using a split-segment algorithm in addition to the coverage-based approach. For reads 75bp or longer, TopHat disables the coverage-based algorithm (since it's slower and has a larger memory footprint), and uses only the split-segment algorithm. Since your file has those 49bp reads, TopHat is reverting only to the coverage-based algorithm, and since this input set is small, TopHat has a hard time identifying possible splice junctions.

                          I was able to get junctions by running the pipeline with --segment-length 24, to force TopHat to use both the coverage-based search and the split-segment search.

                          You may want to make 75bp reads out of your reads if possible, as that should dramatically improve your junction sensitivity. If you do that, you may also want to explicitly pass --coverage-search to TopHat, to further improve sensitivity. You should also consider passing a GFF file of mouse annotations to TopHat.
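
Since a few stray 49 bp reads were enough to change TopHat's behavior here, it can help to tally read lengths before a run. The sketch below is only an illustration: `read_lengths` and `expected_search` are hypothetical helpers, the FASTQ is assumed to use exactly four lines per record, and the thresholds simply restate the description in this post. Treating the shortest read as the deciding one, and counting a read of exactly 50 bp as long enough for the split-segment search, are my reading of the post rather than documented TopHat behavior.

```python
from collections import Counter
import io


def read_lengths(fastq_handle):
    """Count read lengths in a FASTQ stream with four lines per record."""
    counts = Counter()
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # the sequence line of each record
            counts[len(line.strip())] += 1
    return counts


def expected_search(min_len):
    """Junction search implied by the post, keyed on the shortest read.

    Assumption: thresholds of 50 and 75 bp as described in the thread.
    """
    if min_len < 50:
        return "coverage-based only"
    if min_len < 75:
        return "coverage-based + split-segment"
    return "split-segment only"


# Two toy records: one 50 bp read and one 49 bp read.
sample = io.StringIO(
    "@r1\n" + "A" * 50 + "\n+\n" + "I" * 50 + "\n"
    "@r2\n" + "C" * 49 + "\n+\n" + "I" * 49 + "\n"
)
lengths = read_lengths(sample)
print(sorted(lengths.items()))        # [(49, 1), (50, 1)]
print(expected_search(min(lengths)))  # coverage-based only
```

On a real file, replace the `io.StringIO` object with `open("left_kept_reads.fq")`.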



                          • #28
                            Originally posted by Cole Trapnell View Post
                            I just ran the left_kept_reads.fq file from the above package - there are a handful of reads that are 49 bp, which explains your result. When the reads are less than 50bp long, TopHat uses a coverage-island based algorithm to find junctions. When they are longer than that, TopHat starts using a split-segment algorithm in addition to the coverage-based approach. For reads 75bp or longer, TopHat disables the coverage-based algorithm (since it's slower and has a larger memory footprint), and uses only the split-segment algorithm. Since your file has those 49bp reads, TopHat is reverting only to the coverage-based algorithm, and since this input set is small, TopHat has a hard time identifying possible splice junctions.

                            I was able to get junctions by running the pipeline with --segment-length 24, to force TopHat to use both the coverage-based search and the split-segment search.

                            You may want to make 75bp reads out of your reads if possible, as that should dramatically improve your junction sensitivity. If you do that, you may also want to explicitly pass --coverage-search to TopHat, to further improve sensitivity. You should also consider passing a GFF file of mouse annotations to TopHat.
                            That helps a lot. I got it running and reran it with read length 75 and --coverage-search. It returned 30 junctions, which seems low; I was expecting thousands. Is the power to find junctions usually that low with 100K 75 bp reads?

                            I tried to upload the bed and wig files output by TopHat to the genome browser, and it rejected both of them with the following errors:

                            > Error File 'junctions.bed' - Unrecognized format line 2 of custom track: gi|94389945|ref|NT_039515.6|Mm11_39555_37 19897107 19897630 JUNC00000001 1 + 19897107 19897630 255,0,0 2 47,28 0,495 (note: chrom names are case sensitive)

                            > Error File 'coverage.wig' - Unrecognized format type=bedGraph line 2 of custom track

                            Am I missing something?

                            Thanks again for your help!



                            • #29
                              The track errors are easy to resolve - UCSC expects user-supplied tracks to have chromosome names that it knows about. This typically means "chr1", "chrX", etc. One way to guarantee compatibility between TopHat, Cufflinks, and UCSC is to map reads against a Bowtie index built from UCSC chromosomes. You can always convert them after each run, but this is annoying, IMO.
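
For the "convert them after each run" route, a minimal sketch of renaming the first column of a BED file might look like the following. Everything here is an assumption for illustration: `rename_chroms` is a hypothetical helper, the input is assumed to be tab-delimited, and the name mapping has to be supplied by you. It only rewrites names, so it is suitable when the coordinates are already chromosome-based (for example "1" to "chr1"). Contig-based names like the NT_ identifier in the error above would also need their coordinates shifted into chromosome space, which this sketch does not attempt.

```python
def rename_chroms(lines, name_map):
    """Yield BED lines with the first column replaced via name_map.

    Blank lines and "track", "browser" or "#" header lines pass through
    unchanged. A name missing from name_map raises KeyError, so an
    unmapped sequence is noticed before the file is uploaded.
    (Hypothetical helper; assumes tab-delimited BED.)
    """
    for line in lines:
        stripped = line.rstrip("\n")
        if not stripped or stripped.startswith(("track", "browser", "#")):
            yield stripped
            continue
        fields = stripped.split("\t")
        fields[0] = name_map[fields[0]]
        yield "\t".join(fields)


# Toy input: a track header plus one junction on a sequence named "1".
bed = ["track name=junctions\n", "1\t100\t200\tJUNC00000001\t1\t+\n"]
for out_line in rename_chroms(bed, {"1": "chr1"}):
    print(out_line)
```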

                              Also: I agree, the junction count is disturbingly low. In the data you sent me, approximately 60% of the reads were contiguously mappable by TopHat. I just ran a handful of the remaining unmappable ones through BLAT against mm9, and I didn't turn up any plausible spliced alignments, just a bunch of relatively low-identity hits to repeats, chrM, etc. Have you tried running the whole set through BLAT? It didn't seem like TopHat was missing junctions left and right, but then again, neither TopHat nor Bowtie is designed for 454 reads, so it's worth running an independent check.

                              I'm happy to look at this more with you, but we should probably take this offline at this point. Please email me directly if you want to continue looking at it together.



                              • #30
                                Originally posted by Cole Trapnell View Post
                                Also: I agree, the junction count is disturbingly low.
                                TopHat doesn't look like it will work for me, so I put together a plan to do it another way using Bowtie. Then I found out that on my Mac Pro I am getting an alignment speed of approximately 60 reads/hour. That's significantly less than the 25,000,000/hour that I expected. This run of one sequence against the m_musculus index that I downloaded from the Bowtie site takes about 45 seconds. Here's the command I used. Any idea why it would take so long? I rebooted to make sure there was nothing else taxing the resources.

                                bowtie -c /Applications/bowtie-0.10.0/indexes/m_musculus GAAAGTCATGCGTTTCAAGTTTGGCAAGGAATAGAAACAGACGGGCTTATGAAAATAAGGAAAACATCACCCCCAGGCG

                                That sequence should have no spaces, I don't know why the forum inserts a space in the middle of it...

                                Thanks in advance for any suggestions.
