
  • Align paired and unpaired reads with Tophat

    I have a stranded PE RNAseq data set that I want to align with tophat using the --library-type fr-firststrand option. After adaptor trimming I end up with my 4 files:

    paired_1
    paired_2
    unpaired_1
    unpaired_2

    Is there a way to align these so I do not lose the strand information for the unpaired reads? When I run tophat with the files listed like this:

    paired_1,unpaired_1 paired_2,unpaired_2

    It seems to want to try and align the two unpaired files as paired files. If I combine the two unpaired files and run tophat with the files listed like this:

    paired_1 paired_2,unpaired

    It recognizes the last file as unpaired. But am I losing my strand specificity by aligning this way? Thanks.

  • #2
    Inputting the files like:
    Code:
    paired_1,unpaired_1 paired_2,unpaired_2
    Will result in "unpaired_1" and "unpaired_2" being treated as paired, which is the opposite of what you want. To keep the strandedness correct you'll need to run things twice: first using "paired_1,unpaired_1 paired_2" with library-type set to fr-firststrand, and then just "unpaired_2" by itself with fr-secondstrand. I should note that aligning unpaired_2 is usually not worthwhile (the reads are often crap), but perhaps you'll get luckier than I have with that.
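A minimal sketch of the two runs described above, written as a small Python helper that only builds and prints the command lines. The index prefix `genome_index` and the output directory names are placeholders, not values from this thread; `--library-type` and `-o` are standard tophat options.

```python
import shlex


def tophat_cmd(index, library_type, out_dir, reads_1, reads_2=None):
    """Build a tophat argument list.

    reads_1 / reads_2 are lists of FASTQ paths; tophat takes each side
    as one comma-separated argument.
    """
    cmd = ["tophat", "--library-type", library_type, "-o", out_dir,
           index, ",".join(reads_1)]
    if reads_2:
        cmd.append(",".join(reads_2))
    return cmd


# Placeholder index prefix (assumption, not from the thread).
INDEX = "genome_index"

# Run 1: the pairs plus the orphaned read-1 file, as first-strand.
pass_1 = tophat_cmd(INDEX, "fr-firststrand", "tophat_pass1",
                    ["paired_1", "unpaired_1"], ["paired_2"])

# Run 2: the orphaned read-2 file on its own, as second-strand.
pass_2 = tophat_cmd(INDEX, "fr-secondstrand", "tophat_pass2",
                    ["unpaired_2"])

for cmd in (pass_1, pass_2):
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` once tophat is on the PATH. Separate output directories keep the second run from overwriting the first.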

    • #3
      Thank you very much for your response, that makes a lot of sense.

      • #4
        On this note, I have a question. I am doing PE RNA-Seq analysis of mouse data. My read length is 90 bp and the fragment size is 127 bp, which means I have overlapping reads. I used Flash and found that not all reads overlap.

        I now have one file of merged (overlapping) reads and two files of non-overlapping reads, so basically three fastq files.

        How do I go about running Tophat with this? I am also not really sure how to calculate the mean inner mate distance and SD.

        Has anyone come across this situation?

        • #5
          There's no need to run Flash on them, just use tophat2 and bowtie2 instead of bowtie1.

          For the mean inner distance, just try 0 and see if that produces acceptable results (I recall reading that tophat re-estimates the insert length as it runs, though I can't say I've ever checked if that's correct).
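For reference, tophat's `-r/--mate-inner-dist` is the expected distance between the inner ends of the two mates, i.e. the fragment length minus both read lengths. A quick check with the numbers from #4, assuming the 127 bp figure is the fragment length without adapters (the thread does not say):

```python
def inner_mate_distance(fragment_len, read_len):
    """Fragment length minus both read lengths; negative when mates overlap."""
    return fragment_len - 2 * read_len


# Numbers from post #4: 90 bp reads, 127 bp fragments (assumed adapter-free).
inner = inner_mate_distance(127, 90)
print(inner)  # -53
```

A negative value just reflects that the mates overlap on average. The standard deviation goes in `--mate-std-dev`; it cannot be derived from these two numbers and would have to come from the library's measured size distribution.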

          • #6
            Thanks a lot, Ryan. I will try that.

            • #7
              Or just use STAR, which does not need an inner distance to be specified and is also much faster, with the same (or even better) results.
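A rough sketch of what a paired-end STAR call could look like, again only building the command line. The directory, file names and thread count are placeholders; `--genomeDir` must point to an index built beforehand with STAR's `genomeGenerate` run mode. Note there is no inner-distance option to set.

```python
import shlex


def star_cmd(genome_dir, reads_1, reads_2, out_prefix, threads=4):
    """Build a STAR paired-end alignment argument list."""
    return ["STAR",
            "--runThreadN", str(threads),
            "--genomeDir", genome_dir,
            "--readFilesIn", reads_1, reads_2,
            "--outFileNamePrefix", out_prefix]


# Placeholder paths (assumptions, not from the thread).
cmd = star_cmd("star_index", "reads_1.fastq", "reads_2.fastq", "star_out/")
print(shlex.join(cmd))
```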
