
  • #16
    @Xinwu, my segment length was the default, 25.
    Last edited by tsucheta; 01-30-2011, 01:31 PM.



    • #17
      @plabaj (and anyone else),
      I have read your comment with interest and some concern. I plan to generate SOLiD RNA-seq data (paired-end, 50 bp + 25 bp) and wonder whether TopHat is any better at handling paired-end SOLiD data. I was planning to use TopHat to help with new transcript/exon discovery. BTW, this is all with mouse, so I have an annotated genome to work with.
      Thanks.



      • #18
        Hi All,

        I am trying to align paired-end SOLiD whole-transcriptome reads (50+35 FR) using Bowtie 0.12.7.

        Strangely, when mapping in paired-end mode, no read pairs map to the genome other than a few in repeat regions. In addition, no read pairs map to the transcriptome (an Ensembl gene-based reference).

        When mapping individual reads, about 30% of 50bp reads and 20% of 35bp reads map successfully.

        I must be doing something wrong. Using --ff changes nothing (the reads are actually FR; the older SOLiD mate-pairs were FF). The read csfasta files are 100% pair-matched (with read name suffixes _F3 and _F5-BC).
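A pairing check like the one described above can be sketched in Python. This is a minimal sketch, not part of the original post: it assumes csfasta header lines start with ">" and that mates appear in the same order in both files. The function names and the color strings in the example are hypothetical.

```python
import io
from itertools import zip_longest


def read_ids(handle, suffix):
    """Yield read IDs from a csfasta stream with the given name suffix removed."""
    for line in handle:
        line = line.strip()
        if line.startswith(">"):  # header line; '#' comments and color lines are skipped
            name = line[1:]
            if not name.endswith(suffix):
                raise ValueError(f"unexpected read name: {name}")
            yield name[: -len(suffix)]


def first_mismatch(f3_handle, f5_handle, f3_suffix="_F3", f5_suffix="_F5-BC"):
    """Return the 1-based index of the first record that does not pair up, or None."""
    pairs = zip_longest(read_ids(f3_handle, f3_suffix), read_ids(f5_handle, f5_suffix))
    for index, (f3_id, f5_id) in enumerate(pairs, start=1):
        if f3_id != f5_id:
            return index
    return None


# Tiny in-memory example; the color strings are invented for illustration only.
f3 = io.StringIO("# header comment\n>17_213_1598_F3\nT0120\n>17_213_1601_F3\nT3312\n")
f5 = io.StringIO(">17_213_1598_F5-BC\nG0011\n>17_213_1601_F5-BC\nG2210\n")
print(first_mismatch(f3, f5))
```

With real data, the two `io.StringIO` objects would be replaced by `open(...)` handles on the F3 and F5 csfasta files.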

        Examples (using only 1k reads, but the results hold for the full set, as well as with exon-only Ensembl genes as the reference, one FASTA element per gene):
        PE:
        bowtie -f -p 8 --fr -C -S --sam-nohead --sam-nosq -s 1000000 -u 1000 \
            --Q1 $dir/F3_QV.qual --Q2 $dir/F5_QV.qual \
            -1 $dir/F3.csfasta -2 $dir/F5.csfasta \
            ~/p2/indexes/bowtie/hg19_c align.sam
        # reads processed: 1000
        # reads with at least one reported alignment: 1 (0.10%)
        # reads that failed to align: 999 (99.90%)
        Reported 1 paired-end alignments to 1 output stream(s)

        The aligned pair:
        17_213_1598_F3 67 chr2 154876116 255 48M = 154876139 56 CACACACACACACACACACACACACACACACACACACACACACACACA LbRL\TMUVBH\TQ.8[[bOM```LG]^_LJ^\_SN]bcca_aaabbW XA:i:1 MD:Z:48 NM:i:0 CM:i:1
        17_213_1598_F5-BC 131 chr2 154876140 255 33M = 154876115 -58 CACACACACACACACACACACACACACAACAGC @NcPIAE`c^E:S_^KI]`cPGZZSED)!E1!3 XA:i:0 MD:Z:29A3 NM:i:1 CM:i:5
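When checking mate orientation, it can help to decode the FLAG column (the second field) of the two SAM records above. The following is a minimal sketch using the standard SAM flag bits; the function name `decode_flag` is hypothetical.

```python
# Bit meanings follow the SAM format specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


print(decode_flag(67))   # ['paired', 'proper_pair', 'first_in_pair']
print(decode_flag(131))  # ['paired', 'proper_pair', 'second_in_pair']
```

As decoded here, neither record has the reverse-strand bit (0x10) or the mate-reverse-strand bit (0x20) set, which is worth comparing against the orientation expected for the library.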

        F3-ends (50b) only
        [markus@q34 run]$ bowtie -f -p 8 -C -S --sam-nohead --sam-nosq -s 1000000 -u 1000 \
            -Q $dir/F3_QV.qual ~/p2/indexes/bowtie/hg19_c $dir/F3.csfasta algn.sam
        # reads processed: 1000
        # reads with at least one reported alignment: 319 (31.90%)
        # reads that failed to align: 681 (68.10%)
        Reported 319 alignments to 1 output stream(s)

        F5-ends (35b) only
        [markus@q34 run]$ bowtie -f -p 8 -C -S --sam-nohead --sam-nosq -s 1000000 -u 1000 \
            -Q $dir/F5_QV.qual ~/p2/indexes/bowtie/hg19_c $dir/F5.csfasta algn.sam
        # reads processed: 1000
        # reads with at least one reported alignment: 287 (28.70%)
        # reads that failed to align: 713 (71.30%)
        Reported 287 alignments to 1 output stream(s)
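To compare runs like the ones above side by side, the summary lines that bowtie prints can be parsed into counts. This is a minimal sketch that assumes the summary has exactly the "# reads ..." format shown in this post; the function name `parse_summary` and the dictionary keys are hypothetical.

```python
import re

# Matches the three "# reads ..." summary lines as they appear in the post.
SUMMARY_RE = re.compile(
    r"^# reads (processed|with at least one reported alignment|that failed to align): (\d+)"
)

KEYS = {
    "processed": "processed",
    "with at least one reported alignment": "aligned",
    "that failed to align": "failed",
}


def parse_summary(text):
    """Extract read counts from a bowtie summary block."""
    counts = {}
    for line in text.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match:
            counts[KEYS[match.group(1)]] = int(match.group(2))
    return counts


# Summary copied from the F3-only run.
f3_only = """# reads processed: 1000
# reads with at least one reported alignment: 319 (31.90%)
# reads that failed to align: 681 (68.10%)"""

counts = parse_summary(f3_only)
print(counts)
print(counts["aligned"] / counts["processed"])
```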

        Running TopHat gave a decent result, so plenty of reads should map in pairs.

        I'd be grateful if anyone can help me out, or provide any hint.

        Edit: I got it working now; I may have forgotten the --fr option for the full data set.
        Last edited by mackan; 04-20-2011, 03:46 AM.
