
  • #16
    Didn't happen in the previous version, so I'm guessing it's a result of the fix they put in for my last issue. I'm impressed with how fast they are turning around fixes, so it makes it worth the effort.

    Comment


    • #17
      Customer service makes all the difference! I don't know if you have ever tried to get help from the Tophat/cufflinks people but man...those guys are impossible. So far Wei has been on top of things. The same goes for the BWA, STAR and RSEM devs. I've had pretty quick responses from all of them.

      It's sort of convenient for me that these subread people are just getting to work when I'm about to go to sleep. I get back to the lab the next day and they've fixed stuff during the night.
      /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
      Salk Institute for Biological Studies, La Jolla, CA, USA */

      Comment


      • #18
        Dear sdriscoll and Jon,

        Many thanks for your nice comments. We really appreciate you putting up with the bugs and helping us to improve our programs.

        As you said, the sam output bug was introduced in v1.3.5. We have fixed it in v1.3.5-p1. We also enhanced the subread-buildindex to let it check the integrity of the provided reference sequences and report any unexpected characters in a more informative way.

        The latest version, v1.3.5-p1, can be downloaded from http://subread.sourceforge.net . We have done more thorough testing using much bigger test datasets. Hope it works for you, but please let me know if you find any other bugs.

        Best wishes,

        Wei

        Comment


        • #19
          Running a test tonight...

          Comment


          • #20
            Hi Shi,

            Just wanted to thank you; this last patch seems to be much improved. Now moving on to test featureCounts.

            Comment


            • #21
              Dear Jon,

              No worries. Thanks for letting me know.

              Please make sure you are using the latest version (1.3.5-p3). We made changes to featureCounts today. Let me know if you run into any problems.

              Best wishes,

              Wei

              Comment


              • #22
                Hello!

                I've hit a problem with subread-align. I receive the following output from subread-align when trying to map paired end RNAseq data.

                Code:
                $ subread-align -r sample_R1.trimmed.fq.gz -R sample_R2.trimmed.fq.gz -o sample.bam -i ../refseqs/TAIR10_gen/TAIR10_gen
                
                Number of selected subreads = 10
                Consensus threshold = 3
                Number of threads=1
                Number of indels allowed=5
                
                
                Performing paired-end alignment:
                Maximum fragment length=600
                Minimum fragment length=50
                Threshold on number of subreads for a successful mapping (the minor end in the pair)=1
                Number of anchors=10
                The directions of the two input files are: forward, reversed
                
                Out of memory. If you are using Rsubread in R, please save your working environment and restart R.

                This is on the following platform:

                Code:
                $ uname -a
                Linux host 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux
                
                $ free -m
                             total       used       free     shared    buffers     cached
                Mem:        129161      56634      72526          0        196      54215
                -/+ buffers/cache:       2222     126938
                Swap:        95356         12      95344

                subread-align -v gives "Subread 1.3.5-p4", even though it's from the 1.3.5-p5 tarball.


                I can attach the FASTQ files if you would like, as these are ~7 MB test files I've used to debug my analysis pipeline.

                Kevin

                Comment


                • #23
                  Dear Kevin,

                  It seems you provided gzipped FASTQ files to subread-align for alignment, but subread-align does not support gzipped input files. Please unzip them and run it again to see if you still have the same problem.

                  Best wishes,

                  Wei
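
For readers hitting the same error, here is a minimal sketch of decompressing gzipped FASTQ files before alignment. The helper name `plain_fastq` is hypothetical; the file names and the `-r`, `-R`, `-o`, `-i` options in the usage comment are the ones from post #22. `gunzip -c` leaves the original `.gz` file in place and writes the decompressed copy next to it.

```shell
#!/bin/sh
# Hypothetical helper: if the given FASTQ file ends in .gz, decompress it
# alongside the original and print the path of the plain-text copy.
# Otherwise print the path unchanged.
plain_fastq() {
    case "$1" in
        *.gz)
            out="${1%.gz}"
            # gunzip -c writes to stdout, so the .gz file is kept
            gunzip -c "$1" > "$out" && printf '%s\n' "$out"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

# Usage with the file names from post #22 (uncomment to run):
# R1=$(plain_fastq sample_R1.trimmed.fq.gz)
# R2=$(plain_fastq sample_R2.trimmed.fq.gz)
# subread-align -r "$R1" -R "$R2" -o sample.bam -i ../refseqs/TAIR10_gen/TAIR10_gen
```

Keeping the compressed originals costs disk space but makes it easy to rerun the pipeline from the same inputs.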

                  Comment


                  • #24
                    Originally posted by shi View Post
                    Dear Kevin,

                    It seems you provided gzipped FASTQ files to subread-align for alignment, but subread-align does not support gzipped input files. Please unzip them and run it again to see if you still have the same problem.

                    Best wishes,

                    Wei
                    Hi Wei,

                    That fixed the problem, thanks very much. Apologies for the stupid question.

                    Kevin

                    Comment


                    • #25
                      There are no stupid questions here. I'm glad that fixed the problem.

                      Cheers,
                      Wei

                      Comment


                      • #26
                        Hi Shi,

                        How should one run subjunc for an RNA-seq experiment such that (i) the input reads have only one insertion, and (ii) the maximum length of the insertion is specified by the user (or how does one specify the maximum length of an intron; this is needed because it differs from organism to organism)?

                        Comment


                        • #27
                          Hi @ndaniel,

                          I don't quite understand your questions. It looks like you are asking how to specify the maximum intron size in subjunc? Firstly, an exon-spanning read may span more than one exon, so do you want to limit the detection of introns in each read to only one intron? One of the strengths of subjunc is that it can detect up to 4 introns in each read.

                          Secondly, you do not know what the maximum length of introns in your data is, so you'd better let subjunc detect it for you. Subjunc uses donor/receptor sites to accurately detect the boundaries of introns.

                          Wei

                          Comment


                          • #28
                            Originally posted by shi View Post
                            Hi @ndaniel,

                            I don't quite understand your questions. It looks like you asked how to specify the maximum intron size in subjunc?

                            Wei
                            Sorry for not explaining my question very well. :-(

                            Yes, what is the maximum intron size that subjunc can handle? Is there an (affine?) penalty related to intron length? Does subjunc weight an intron 10,000,000 bp long the same as one 10,000 bp long? Is subjunc able to find an intron 10,000,000 bp long? What is the minimum read overhang that subjunc can handle (e.g. 10 bp, 17 bp, 20 bp)?

                            How does subjunc treat the case where a 100 bp read is split 80+20 with an intron of length (a) 1,000 bp or (b) 100,000 bp (that is, the 20 bp maps equally well 1,000 bp away from the 80 bp or 100,000 bp away)?

                            Originally posted by shi View Post
                            Firstly, an exon-spanning read may span more than one exon
                            I know, but my question is not about those kinds of reads. I am interested only in reads which span two and only two exons (and one intron).

                            Originally posted by shi View Post
                            so do you want to limit the detection of introns in each read to only one intron?
                            Yes. :-)

                            Originally posted by shi View Post
                            Secondly, you do not know what the maximum length of introns in your data is
                            Yes, I do know the maximum length of the introns in my data. Actually, there are quite exact estimates of this for annotated genomes!

                            Originally posted by shi View Post
                            Hi @ndaniel,

                            so you'd better let subjunc detect it for you
                            I prefer to search for introns whose lengths fall within a given range, in order to limit the search space for subjunc.

                            Originally posted by shi View Post
                            Hi @ndaniel,
                            Subjunc uses donor/receptor sites to accurately detect the boundaries of introns.
                            Is subjunc able to look for intron boundaries without using the donor/receptor sites (i.e. conventional sites)? Or does subjunc allow non-conventional donor/acceptor sites to be weighted equally with the conventional ones?
                            Last edited by ndaniel; 09-24-2014, 10:28 AM.

                            Comment


                            • #29
                              Yes, what is the maximum intron size that subjunc can handle? Is there an (affine?) penalty related to intron length? Does subjunc weight an intron 10,000,000 bp long the same as one 10,000 bp long? Is subjunc able to find an intron 10,000,000 bp long? What is the minimum read overhang that subjunc can handle (e.g. 10 bp, 17 bp, 20 bp)?
                              The maximum allowed intron size is 500,000 bases in subjunc. There is no penalty applied for intron length. Long introns and short introns are treated in the same manner. Subjunc is capable of detecting introns at any position of the reads.

                              How does subjunc treat the case where a 100 bp read is split 80+20 with an intron of length (a) 1,000 bp or (b) 100,000 bp (that is, the 20 bp maps equally well 1,000 bp away from the 80 bp or 100,000 bp away)?
                              Subjunc treats them as the equally best mapping locations.

                              I prefer to search for introns whose lengths fall within a given range, in order to limit the search space for subjunc.
                              Subjunc is very fast. You do not need to limit the search space for it.

                              Is subjunc able to look for intron boundaries without using the donor/receptor sites (i.e. conventional sites)? Or does subjunc allow non-conventional donor/acceptor sites to be weighted equally with the conventional ones?
                              Use the '--allJunctions' option, which allows the detection of exon splicing that uses non-canonical donor/receptor sites.
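
The thread does not mention a subjunc option for restricting results to reads with a single intron of bounded length, so one workaround is to filter the alignments afterwards. What follows is a minimal sketch, assuming the output is SAM text and that introns appear as `N` operations in the CIGAR string (column 6), as the SAM format defines them. The function name, the file names, and the 20 to 500000 range in the usage comment are illustrative assumptions, not subjunc features.

```shell
#!/bin/sh
# Hypothetical post-filter: keep SAM records whose CIGAR contains exactly one
# N operation (skipped region, i.e. intron) with MIN <= length <= MAX.
# Header lines (starting with @) are passed through unchanged.
filter_single_intron() {
    awk -F'\t' -v min="$1" -v max="$2" '
        /^@/ { print; next }
        {
            cigar = $6
            n = 0; len = 0
            # Consume the CIGAR string one operation at a time
            while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
                op = substr(cigar, RLENGTH, 1)
                if (op == "N") {
                    n++
                    len = substr(cigar, 1, RLENGTH - 1) + 0
                }
                cigar = substr(cigar, RLENGTH + 1)
            }
            if (n == 1 && len >= min && len <= max) print
        }
    '
}

# Usage (file names are placeholders):
# filter_single_intron 20 500000 < subjunc_out.sam > single_intron.sam
```

Filtering after alignment does not reduce the aligner's search space; it only narrows which reported alignments are kept for downstream analysis.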

                              Comment


                              • #30
                                Originally posted by shi View Post
                                The maximum allowed intron size is 500,000 bases in subjunc. There is no penalty applied for intron length. Long introns and short introns are treated in the same manner. Subjunc is capable of detecting introns at any position of the reads.



                                Subjunc treats them as the equally best mapping locations.



                                Subjunc is very fast. You do not need to limit the search space for it.



                                Use the '--allJunctions' option, which allows the detection of exon splicing that uses non-canonical donor/receptor sites.
                                Thanks Shi! The answers are really great (for me at least)!

                                Is there any way for the user to change the 500,000 bp limit (besides making changes to the source code)?

                                Also, what is the minimum read overhang that subjunc is able to handle (for example, is it able to split a 100 bp read as 80 bp + 20 bp, 83 bp + 17 bp, 85 bp + 15 bp, 90 bp + 10 bp, or 95 bp + 5 bp)? There must be a limit on the overhang, and as far as I know no aligner would split a 100 bp read as 95 bp + 5 bp (here most aligners would just soft-clip the last 5 bp, and this is OK)!
                                Last edited by ndaniel; 09-26-2014, 11:22 PM.
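
The overhang question can also be explored empirically on one's own output. This is a minimal sketch, again assuming SAM text with introns encoded as `N` in the CIGAR string. The function name is hypothetical, it only handles simple `xMyNzM` CIGAR strings, and it reports what was observed in a given file, which is not necessarily the aligner's internal limit.

```shell
#!/bin/sh
# Hypothetical helper: for reads with a simple <left>M<intron>N<right>M CIGAR,
# print read name, left flank, intron length, right flank, and the overhang
# (the shorter of the two flanks), tab-separated.
junction_overhangs() {
    awk -F'\t' '
        /^@/ { next }
        $6 ~ /^[0-9]+M[0-9]+N[0-9]+M$/ {
            # Splitting on M and N leaves the three numeric lengths
            split($6, part, /[MN]/)
            left = part[1] + 0
            intron = part[2] + 0
            right = part[3] + 0
            over = (left < right) ? left : right
            print $1 "\t" left "\t" intron "\t" right "\t" over
        }
    '
}

# Usage (file name is a placeholder): show the read with the smallest overhang
# junction_overhangs < subjunc_out.sam | sort -k5,5n | head -n 1
```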

                                Comment
