  • #31
    Is there any way for the user to change the 500,000 bp limit (besides doing changes in the source code)?
    This is currently hard coded, although we may allow users to change it in the future. But I believe the best approach is to let subjunc determine the intron length for you and then filter the subjunc output afterwards. Subjunc is not biased toward long or short introns.
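One way to do the filtering mentioned above is sketched here. It assumes the subjunc output is in SAM format and that introns appear as `N` operations in the CIGAR string, as the SAM specification defines them. The function names and the `max_intron` parameter are hypothetical, chosen only for illustration.

```python
import re

# CIGAR operations as defined in the SAM specification; 'N' marks a
# skipped reference region, which spliced aligners use for introns.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def intron_lengths(cigar):
    """Return the lengths of all N (skipped-region) operations in a CIGAR string."""
    return [int(length) for length, op in CIGAR_OP.findall(cigar) if op == "N"]


def keep_alignment(sam_line, max_intron=500_000):
    """Keep header lines and alignments whose introns are all <= max_intron."""
    if sam_line.startswith("@"):
        return True
    fields = sam_line.rstrip("\n").split("\t")
    cigar = fields[5]  # column 6 of a SAM record is the CIGAR string
    return all(length <= max_intron for length in intron_lengths(cigar))


def filter_sam(lines, max_intron=500_000):
    """Return only the SAM lines that pass the intron-length filter."""
    return [line for line in lines if keep_alignment(line, max_intron)]
```

Calling `filter_sam(lines, max_intron=100_000)` on the lines of a SAM file would drop every alignment that spans an intron longer than 100,000 bp while leaving headers and unspliced reads untouched.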

    Also, what is the minimum overhang of a read that subjunc can handle? For example, can it map/split a 100bp read as 80bp+20bp, 83bp+17bp, 85bp+15bp, 90bp+10bp, or 95bp+5bp? There must be a limit for the overhang; as far as I know, no aligner would split a 100bp read as 95bp+5bp (most aligners would just soft clip the last 5bp, and this is OK)!
    As I said in my last reply, subjunc can detect the splicing site at any location of the read. It can split a 100bp read as 95bp+5bp, or even 99bp+1bp, if a confident splicing site was discovered. Subjunc achieves this by first generating a complete list of splicing sites from all high-confidence junction reads (these reads typically contain splicing sites near the middle of the read), and then re-aligning all the reads using these discovered splicing sites. Have a look at the paper below for more details about the algorithm:

    Read alignment is an ongoing challenge for the analysis of data from sequencing technologies. This article proposes an elegantly simple multi-seed strategy, called seed-and-vote, for mapping reads to a reference genome. The new strategy chooses the mapped genomic location for the read directly from …
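The two-pass idea described in the reply above (collect junctions from confident reads, then re-align every read against them) can be illustrated with a toy sketch. This is not subjunc's implementation: it uses exact string matching on a made-up reference, and all names, sequences, and the `min_anchor` threshold are assumptions for illustration only.

```python
def discover_junctions(reference, reads, min_anchor=8):
    """Pass 1 (toy): collect junctions from reads whose two parts are both
    at least min_anchor long and each occur exactly once in the reference."""
    junctions = set()
    for read in reads:
        for k in range(min_anchor, len(read) - min_anchor + 1):
            left, right = read[:k], read[k:]
            # Crude confidence test: each part must be placed unambiguously.
            if reference.count(left) != 1 or reference.count(right) != 1:
                continue
            donor = reference.find(left) + k   # index of the first intron base
            acceptor = reference.find(right)   # index of the first base of the next exon
            if acceptor > donor:               # a gap between the parts means a junction
                junctions.add((donor, acceptor))
    return junctions


def align_with_junctions(reference, read, junctions):
    """Pass 2 (toy): place a read across a known junction at any split,
    including very short overhangs. Returns (start, split, donor, acceptor)."""
    for donor, acceptor in sorted(junctions):
        for k in range(1, len(read)):
            start = donor - k
            if start < 0:
                continue
            left_ok = reference[start:donor] == read[:k]
            right_ok = reference[acceptor:acceptor + len(read) - k] == read[k:]
            if left_ok and right_ok:
                return start, k, donor, acceptor
    return None


# Made-up reference: two 20-base exons separated by a 24-base intron.
exon1 = "ACGTTGCAAGCTTAGGCTAC"
intron = "GTAAGT" + "C" * 10 + "T" * 6 + "AG"
exon2 = "TGCATCGGATCCGATTACGA"
reference = exon1 + intron + exon2

middle_read = exon1[-10:] + exon2[:10]   # junction in the middle of the read
short_overhang = exon1[5:] + exon2[:3]   # only 3 bases cross the junction

junctions = discover_junctions(reference, [middle_read])
hit = align_with_junctions(reference, short_overhang, junctions)
```

In this toy, the read with a 3-base overhang cannot contribute a junction in the first pass because its short side is below `min_anchor`, but it can still be split in the second pass because the junction is already known from the other read.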

    • #32
      Originally posted by shi
      This is currently hard coded, although we may allow users to change it in the future. But I believe the best approach is to let subjunc determine the intron length for you and then filter the subjunc output afterwards. Subjunc is not biased toward long or short introns.

      As I said in my last reply, subjunc can detect the splicing site at any location of the read. It can split a 100bp read as 95bp+5bp, or even 99bp+1bp, if a confident splicing site was discovered. Subjunc achieves this by first generating a complete list of splicing sites from all high-confidence junction reads (these reads typically contain splicing sites near the middle of the read), and then re-aligning all the reads using these discovered splicing sites. Have a look at the paper below for more details about the algorithm:

      http://www.ncbi.nlm.nih.gov/pubmed/23558742
      Thanks! I will definitely try subjunc!

      • #33
        So subjunc is now able to find non-canonical junctions?
        Does this influence the alignment?

        Because the subjunc help still says:

        Subjunc requires donor/receptor sites to be present when detecting exon-exon junctions. It can detect up to four junction locations in each exon-spanning read.


        Are there plans to make subjunc support annotation files?

        What happens when only one read of a mate pair is mapped? Are both reported as unpaired?

        Which settings would you recommend for 150bp single-end reads?

        When I built the reference index ungapped, I observed a decrease in the percentage of aligned reads compared to a gapped index. Does this make sense?


        Many thanks for your help

        • #34
          So subjunc is now able to find non-canonical junctions?
          Does this influence the alignment?

          Because the subjunc help still says:

          Subjunc requires donor/receptor sites to be present when detecting exon-exon junctions. It can detect up to four junction locations in each exon-spanning read.
          Yes, subjunc can now detect non-canonical exon-exon junctions. Use the "--allJunctions" option. This affects your alignment: not only will more junctions be reported, but more exon-spanning reads will be reported as well. However, because the alignment becomes more aggressive, you may also see an increase in false alignments. You do not need to do this if you just want to perform an expression analysis.
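As a rough sketch, a subjunc call with this option could be assembled as below. Only `--allJunctions` comes from this thread; `-i`, `-r`, and `-o` are assumed here to be the index, reads, and output options, so please check the subjunc help of your installed version. The file names and the helper function are hypothetical, and nothing is executed.

```python
import shlex


def build_subjunc_command(index, reads, output, all_junctions=False):
    """Assemble a subjunc argument list; the command is not run here."""
    # -i/-r/-o are assumed option names for index, reads and output.
    cmd = ["subjunc", "-i", index, "-r", reads, "-o", output]
    if all_junctions:
        # Option named in the reply above: also report non-canonical junctions.
        cmd.append("--allJunctions")
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_subjunc_command("my_index", "reads.fastq", "out.sam", all_junctions=True)
print(shlex.join(cmd))
```

Leaving `all_junctions` at its default keeps the stricter behaviour, which matches the advice that the option is unnecessary for expression analysis.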

          Are there plans to make subjunc support annotation files?
          Yes, this is on our to-do list. We hope it will further improve subjunc's accuracy in junction detection. Subjunc collects all candidate junction locations from the initial scan of reads and then uses the collected junctions to realign all the reads and to remove spurious junctions. We found this already works very well.

          What happens when only one read of a mate pair is mapped? Are both reported as unpaired?
          I am not sure I understand your question. If only one read of a pair was mapped, that read will be reported as unpaired. The unmapped read from the same pair will also be reported, alongside the mapped one.
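To see how such a pair looks in the output, the FLAG column of each SAM record can be decoded. The bit values below come from the SAM specification; that subjunc sets exactly these bits for a half-mapped pair is an assumption, and the function name is hypothetical.

```python
# Bit values from the FLAG field of the SAM specification.
PAIRED = 0x1         # read is part of a pair
PROPER_PAIR = 0x2    # both mates aligned as a proper pair
UNMAPPED = 0x4       # this read is unmapped
MATE_UNMAPPED = 0x8  # the mate of this read is unmapped


def describe_flag(flag):
    """Summarise the pairing and mapping state encoded in a SAM FLAG value."""
    if not flag & PAIRED:
        return "single-end read"
    if flag & UNMAPPED and flag & MATE_UNMAPPED:
        return "pair with both reads unmapped"
    if flag & UNMAPPED:
        return "unmapped read whose mate is mapped"
    if flag & MATE_UNMAPPED:
        return "mapped read whose mate is unmapped"
    if flag & PROPER_PAIR:
        return "mapped read in a proper pair"
    return "mapped read, mate mapped, not a proper pair"
```

For example, a FLAG of 73 (1 + 8 + 64) describes a mapped first read whose mate is unmapped, and 133 (1 + 4 + 128) describes the unmapped second read of that same pair.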

          Which settings would you recommend for 150bp single-end reads?
          The default settings should work well. Subjunc scales well and has no problem mapping reads that are hundreds of bases long.

          When I built the reference index ungapped, I observed a decrease in the percentage of aligned reads compared to a gapped index. Does this make sense?
          Could you please show me the commands you used for mapping with the ungapped and gapped indices, along with the percentage of aligned reads from each approach? Which version of the Subread package are you using?

          Wei
