  • pbrand
    Member
    • Feb 2012
    • 13

    SOAPdenovo-trans alternative splicing

    Hi,
    I am testing several assemblers to find the best one for my RNA-Seq data.
    Besides Trinity and Oases, I used SOAPdenovo-Trans.

    While Oases found a large number of sequences with possible alternative splice products, SOAPdenovo-Trans did not find a single one. I used 12 different k-mer sizes from 19 to 89, with e = 1, 3, 5 and d = 1, 3, 5 in all combinations. I allowed up to 10 alternative splicing products.
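A sweep like the one described here (every k-mer size combined with every e and d value) can be enumerated with a short script. This is only a minimal sketch: it assumes the settings are passed as -K, -e and -d, and that the 31kmer binary is used for K up to 31 and the 127mer binary for larger K. The k-mer list, config file name and output prefix are placeholders, not the values actually used.

```python
from itertools import product


def build_commands(kmers, e_values, d_values, config="config", prefix="asm"):
    """Return one SOAPdenovo-Trans command (as an argument list) per combination."""
    commands = []
    for k, e, d in product(kmers, e_values, d_values):
        # Assumption: the 31kmer build accepts K <= 31, the 127mer build larger K.
        binary = "SOAPdenovo-Trans-31kmer" if k <= 31 else "SOAPdenovo-Trans-127mer"
        # Hypothetical naming scheme so that each run writes to its own prefix.
        out = f"{prefix}_K{k}_e{e}_d{d}"
        commands.append([binary, "all", "-s", config, "-K", str(k),
                         "-e", str(e), "-d", str(d), "-o", out])
    return commands


if __name__ == "__main__":
    # Placeholder k-mer sizes; only the 19..89 range is taken from the post.
    cmds = build_commands([19, 31, 89], [1, 3, 5], [1, 3, 5])
    print(len(cmds), "runs")
    print(" ".join(cmds[0]))
```

With 12 k-mer sizes this gives 12 x 3 x 3 = 108 runs, so writing each run to its own output prefix keeps the results apart.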

    Is this behavior normal for this program?

    Cheers,
    Philipp
  • Kate.W
    Member
    • Aug 2012
    • 10

    #2
    Hello,

    I have also noticed some odd behaviour with SOAPdenovo-Trans, but I can't answer your question. How were you able to try k-mer sizes from 19 to 89? I believed SOAPdenovo-Trans was limited to 31.

    Cheers,

    K8

    • Kate.W
      Member
      • Aug 2012
      • 10

      #3
      oops... didn't see the SOAPdenovo-Trans-127mer file...

      • Jeremy
        Senior Member
        • Nov 2009
        • 190

        #4
        Which file are the splice variants supposed to be in? I don't see any file in my output that would contain that information.

        This thread claims that the trans and regular SOAP are giving the same output.

        I tried the 31kmer version and the 127mer version of this and in both cases I do not get the sequence of all the contigs. The .readOnContig and the .cnt2Read files show that all the contigs have reads but the .contig file is missing the sequence data of many contigs, including several with a high read count.

        • pbrand
          Member
          • Feb 2012
          • 13

          #5
          Hi Jeremy,
          the variants should be saved to the .scafSeq file. With my data, though, I didn't manage to get any of them. Maybe it is because I used single-end reads.
          What kind of reads do you have?

          Philipp

          • Jeremy
            Senior Member
            • Nov 2009
            • 190

            #6
            I have paired end reads. Ah yes I see them, looks like I got a few splice variants. The locus numbering is not consecutive. Is it the same for you?
            Code:
            >scaffold1 Locus_0_0 5 891 COMPLEX
            153823     0          -   175 
            177411     154        -   246 
            169783     410        +   212 
            125249     614        +   133 
            122882     760        +   131 
            >scaffold2 Locus_0_1 2 478 COMPLEX
            153823     0          -   175 
            171731     260        +   218 
            >scaffold3 Locus_0_2 4 783 COMPLEX
            169195     0          +   210 
            169783     302        +   212 
            125249     506        +   133 
            122882     652        +   131 
            >scaffold4 Locus_1_0 2 406 LINEAR
            122884     0          +   131 
            154865     260        -   177 
            >scaffold5 Locus_4_0 3 698 LINEAR
            174798     0          +   230 
            180285     272        -   274 
            122890     598        +   131 
            >scaffold6 Locus_5_0 3 490 LINEAR
            158619     0          -   184 
            164579     190        +   197 
            122892     390        +   131 
            >scaffold7 Locus_6_0 2 354 LINEAR
            125953     0          +   134 
            122894     254        +   131 
            >scaffold8 Locus_8_0 2 428 LINEAR
            122898     0          +   131 
            168645     251        +   208
            Is there some site or blog that gives all the details of the output files?
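Until the format is documented, a listing like this can be summarised with a few lines of Python. This is a minimal sketch under stated assumptions: the header tag is read as Locus_<locus>_<variant>, the third header field appears to match the number of contig rows that follow, and the fourth is assumed to be the scaffold length. None of this is confirmed by official documentation, and the function names are made up for illustration.

```python
from collections import defaultdict


def group_by_locus(lines):
    """Map locus number -> list of (scaffold, variant, n_contigs, length, kind)."""
    loci = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line.startswith(">"):
            continue  # contig rows are ignored in this sketch
        name, locus_tag, n_contigs, length, kind = line[1:].split()
        # Assumed tag layout: Locus_<locus>_<variant>
        _, locus, variant = locus_tag.split("_")
        loci[int(locus)].append(
            (name, int(variant), int(n_contigs), int(length), kind))
    return dict(loci)


def missing_locus_ids(loci):
    """Locus numbers skipped between the smallest and largest one seen."""
    return [i for i in range(min(loci), max(loci) + 1) if i not in loci]


if __name__ == "__main__":
    headers = [
        ">scaffold1 Locus_0_0 5 891 COMPLEX",
        ">scaffold2 Locus_0_1 2 478 COMPLEX",
        ">scaffold3 Locus_0_2 4 783 COMPLEX",
        ">scaffold4 Locus_1_0 2 406 LINEAR",
        ">scaffold5 Locus_4_0 3 698 LINEAR",
        ">scaffold6 Locus_5_0 3 490 LINEAR",
        ">scaffold7 Locus_6_0 2 354 LINEAR",
        ">scaffold8 Locus_8_0 2 428 LINEAR",
    ]
    loci = group_by_locus(headers)
    multi = {locus: len(v) for locus, v in loci.items() if len(v) > 1}
    print("loci with several variants:", multi)
    print("unused locus numbers:", missing_locus_ids(loci))
```

For the eight headers in the listing, this reports locus 0 as the only locus with more than one variant (three of them) and 2, 3 and 7 as unused locus numbers.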

            • pbrand
              Member
              • Feb 2012
              • 13

              #7
              As I said, I haven't managed to get any splice variants, so I can't say anything about the numbering.
              I also haven't found a site with useful information on the output files yet.
              There is an option that controls the number of splice variants, though: -t, which is 5 by default. Your results may change if you increase the -t value.

              Could you post your configuration file? I am curious to see whether I made a mistake writing mine.

              Philipp

              • Jeremy
                Senior Member
                • Nov 2009
                • 190

                #8
                Even without splice variants, you should still get locus information in the .scaff file, right? I only just started playing with the program, so for the moment almost everything is on default settings. I did change the -G option, though, since I noticed that my insert sizes have a wider spread than 50.

                config
                Code:
                max_rd_len=150
                [LIB]
                avg_ins=320
                asm_flags=3
                reverse_seq=0
                rank=1
                q1=*file.fastq
                q2=*file.fastq
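Since single-end versus paired-end input comes up in this thread as a possible cause, a config like the one above can be checked programmatically. This is a minimal sketch, assuming the usual SOAPdenovo-style layout: global settings first, then one or more [LIB] sections, with q1/q2 (or f1/f2) naming paired files and a plain q or f naming single-end files. The function names are made up for illustration.

```python
def parse_config(text):
    """Split a SOAPdenovo-style config into global settings and [LIB] sections."""
    global_settings, libs = {}, []
    current = global_settings
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "[LIB]":
            current = {}
            libs.append(current)
            continue
        key, _, value = line.partition("=")
        # A key may occur several times (e.g. several read files), so keep a list.
        current.setdefault(key.strip(), []).append(value.strip())
    return global_settings, libs


def is_paired(lib):
    """True if the library names both mates (assumed keys: q1/q2 or f1/f2)."""
    return ("q1" in lib and "q2" in lib) or ("f1" in lib and "f2" in lib)


if __name__ == "__main__":
    config = """max_rd_len=150
[LIB]
avg_ins=320
asm_flags=3
reverse_seq=0
rank=1
q1=*file.fastq
q2=*file.fastq
"""
    settings, libs = parse_config(config)
    for number, lib in enumerate(libs, start=1):
        kind = "paired-end" if is_paired(lib) else "single-end"
        print(f"library {number}: {kind}, avg_ins={lib.get('avg_ins')}")
```

The sketch only reports how each library is declared; it does not check whether the read files exist or match each other.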
                commands
                Code:
                SOAPdenovo-Trans-31kmer all -s config -K 31 -G 100 -o *file
                I think I just figured out how the contigs work. For something as strange as this, there really should be some description of the output files.

                The .newcontigindex lists all contigs in consecutive order (no missing numbers), and both the .readoncontig and .cnt2read files show that reads were used to make all contigs, BUT the .contig file only has about half the contigs. The .newcontigindex has a 2 for contigs that I do get a sequence for and a 0 for contigs that are not in the output file. I think contigs with a 0 are assembled using the reverse reads, and the reverse complement is then integrated into the forward contigs.

                The confusing part for me was that not all of my contigs had a reverse complement. This information is in .contigindex, which lists how many reverse complements each of the forward contigs has. I have 49 forward contigs without a reverse complement, which makes the contig numbering in the .contig file appear random. I can't find the file that lists which contig is the reverse complement of which. Based on the read count per contig, they often look to be consecutively numbered contigs, but not always. sigh.

                Would be nice to know what the headers are for the .links file too ...

                I think I'll just try a few other programs, I have no idea exactly what this one did.
                Last edited by Jeremy; 10-10-2012, 12:53 AM.

                • pbrand
                  Member
                  • Feb 2012
                  • 13

                  #9
                  Strangely, I do not have any entries in .ctg2read, .readONcontig and .links.
                  It must have something to do with single-end versus paired-end libraries, because my config file doesn't seem to be incomplete.

                  I also worked with Trinity and Oases and both did a better job than SOAP, anyway.
                  Maybe this thread helps http://seqanswers.com/forums/showthread.php?t=17959

                  Cheers
