
  • Originally posted by CPCantalapiedra View Post
    I am sure I am missing the true naming convention of the foundlinks file (I mean, r1 f1 does mean contig1 in formattedcontigs file, and so on?). Any light on this please?
    Yes, this is indeed the case.

    I'm not sure whether I still need to reply to the remaining questions; please contact me by personal message if you need further help.

    Regards,
    Boetsie

    Comment


    • I am using SSPACE to improve a genome assembly, and unfortunately it is giving me results that conflict with a genetic linkage map that I have made. I am trying to figure out what is causing the discrepancy. My organism has a repetitive genome so I suspect that is playing into it. Here are my questions:

      1) I've read that SSPACE does not use reads that map to multiple locations within the genome. How does it obtain this information? Does it map to the entire scaffold or just the scaffold edges?

      2) Is there a way to extract the exact positions where pairs used for scaffolding are mapped?

      3) My ratio of pairs that satisfy vs. do not satisfy the distance/logic requirements within contigs is very different from the ratio for pairs that map to different contigs. Is this normal? For example:

      Satisfied in distance/logic within contigs (i.e. -> <-, distance on target: 600 +/-570): 110771
      Unsatisfied in distance within contigs (i.e. distance out-of-bounds): 300
      Unsatisfied pairing logic within contigs (i.e. illogical pairing ->->, <-<- or <-->): 1897
      ---
      Satisfied in distance/logic within a given contig pair (pre-scaffold): 6691
      Unsatisfied in distance within a given contig pair (i.e. calculated distances out-of-bounds): 56729

      Thanks so much for your help and for writing a very useful program!

      Comment


      • Originally posted by pjuneja View Post
        1) I've read that SSPACE does not use reads that map to multiple locations within the genome. How does it obtain this information? Does it map to the entire scaffold or just the scaffold edges?
        It only uses the edges of the contigs/scaffolds based on the max insert size you have provided (insert size + (insert size * stdev)).
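The edge-length rule in the reply above can be sketched in a few lines of Python. This is only an illustration, not SSPACE's own code: the function names `max_insert_size` and `contig_edges` are hypothetical, and taking the first and last bases of each contig by simple slicing is an assumption about how "edges" are cut.

```python
def max_insert_size(insert_size: int, error: float) -> int:
    """Maximum insert size as described in the reply:
    insert size + (insert size * allowed deviation)."""
    return round(insert_size + insert_size * error)


def contig_edges(sequence: str, edge_length: int) -> tuple[str, str]:
    """Return the left and right edge of a contig (assumed: plain slicing).

    Contigs shorter than edge_length are returned whole on both sides.
    """
    if edge_length <= 0:
        raise ValueError("edge_length must be positive")
    return sequence[:edge_length], sequence[-edge_length:]


if __name__ == "__main__":
    # Library values taken from a library file quoted later in this thread:
    # insert size 20000, deviation 0.35.
    limit = max_insert_size(20000, 0.35)
    print("edge length used for mapping:", limit)

    left, right = contig_edges("ACGTACGTAC", 4)
    print(left, right)
```

Under this reading, reads that fall in the middle of a contig longer than twice the edge length would never be considered during mapping.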

        Originally posted by pjuneja View Post
        2) Is there a way to extract the exact positions where pairs used for scaffolding are mapped?
        No, I'm sorry, this cannot be extracted from SSPACE. Only pairs that could not be mapped correctly are stored in the 'pairinfo' folder.

        You could of course map the reads to the edges yourself. The edges are in the 'alignoutput' folder, and the reads in the 'reads' folder.

        Originally posted by pjuneja View Post
        3) My ratio of pairs that satisfy vs. do not satisfy the distance/logic requirements within contigs is very different from the ratio for pairs that map to different contigs. Is this normal? For example:

        Satisfied in distance/logic within contigs (i.e. -> <-, distance on target: 600 +/-570): 110771
        Unsatisfied in distance within contigs (i.e. distance out-of-bounds): 300
        Unsatisfied pairing logic within contigs (i.e. illogical pairing ->->, <-<- or <-->): 1897
        ---
        Satisfied in distance/logic within a given contig pair (pre-scaffold): 6691
        Unsatisfied in distance within a given contig pair (i.e. calculated distances out-of-bounds): 56729

        Thanks so much for your help and for writing a very useful program!
        Well, it is common for the number of pairs within a contig to be higher than the number of pairs between contigs. However, the number of unsatisfied pairs between two contigs is very high here. I can't tell you why this is the case.

        Regards,
        Boetsie

        Comment


        • Hi,
          I'm using SSPACE to extend a hybrid (454 + Illumina PE) de novo assembly using more Illumina PE reads. I'm running into a problem with Perl after the mapping and extension.
          The system is Ubuntu 12.04 64-bit. The log looks like this:

          =>Mon Apr 15 18:27:32 2013: Reading, filtering and converting input sequences of library file initiated

          ------------------------------------------------------------

          =>Mon Apr 15 18:54:15 2013: Building Bowtie index for contigs

          =>Mon Apr 15 18:54:20 2013: Mapping reads to Bowtie index

          =>Mon Apr 15 19:30:28 2013: Contig extension initiated

          LIBRARY Lib1
          ------------------------------------------------------------

          =>Mon Apr 15 19:48:03 2013: Reading contig file

          =>Mon Apr 15 19:48:03 2013: Building Bowtie index for contigs

          =>Mon Apr 15 19:48:08 2013: Mapping reads to contigs. Reading bowtie output and pairing contigs

          =>Mon Apr 15 20:19:00 2013: Building scaffolds file

          =>Mon Apr 15 20:19:01 2013: Merging contigs and creating fasta file of scaffolds
          100Quantifier follows nothing in regex; marked by <-- HERE in m/* <-- HERE TTAAAAAA*CGTTTCTAACAGCTCTAGCAATATTCTAATTTCGAAAGT/ at /home/mron003/Programs/SSPACE-BASIC-2.0_linux-x86_64/SSPACE_Basic_v2.0.pl line 447, <IN> line 102.

          Any ideas on what's going on?
          Cheers,

          Miguel
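The pattern shown in the error message above starts with `*` and contains another `*` inside the sequence, which a regex engine reads as a quantifier rather than as a literal character. A minimal sketch, assuming the contig FASTA is the source of these characters (the thread does not confirm this): the helper names below are hypothetical, and the check is written in Python rather than in SSPACE's Perl.

```python
import re

ALLOWED_BASES = set("ACGTN")


def unexpected_characters(fasta_lines):
    """Map each FASTA header to the set of characters outside ACGTN."""
    problems = {}
    header = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()[0]
            continue
        if header is None:
            continue  # sequence data before any header: ignored in this sketch
        bad = set(line.upper()) - ALLOWED_BASES
        if bad:
            problems.setdefault(header, set()).update(bad)
    return problems


def compiles_as_regex(text):
    """Return True if text can be used directly as a regex pattern."""
    try:
        re.compile(text)
        return True
    except re.error:
        return False


if __name__ == "__main__":
    records = [">contig1", "ACGTTTAAAAAA*CGTTTCTAACAG", ">contig2", "ACGTNNACGT"]
    print(unexpected_characters(records))
    # A sequence starting with '*' is not a valid pattern until it is escaped.
    print(compiles_as_regex("*TTAAAAAA*CGTT"))
    print(compiles_as_regex(re.escape("*TTAAAAAA*CGTT")))
```

If such a scan reports `*` in the contigs, removing or replacing those characters before running the scaffolder would be one thing to try.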

          Comment


          • Will the single end reads specified with the -u option be incorporated if -x is set to 0?

            And is there a way to tell from the output files of SSPACE if these reads were used?

            Comment


            • SSPACE error

              Dear all,

              I have a paired-end 454 library, from which I extracted the paired-end sequences myself into two files like this:

              left
              >H68R2DI01DH3A5/1
              TTTCAAAGGAGATTGTCTGATAACTTCTCAAGAAAGAGAGCGTATGAATAGAGTTCCATATGCTTTGGCAG
              >H68R2DI01DUQPJ/1
              TTTCAAAGGAGATTGTCTGATAACTTCTCAAGAAAGAGAGCGTATGAATAGAGTTCCATATGCTTTGGCAG
              >H68R2DI01DYR3Y/1
              ACAATCTTCCTATACCAATCAAAATGACCATCTAGCAATGATATCCGATGTTCGGATAGGTCAAAAGATTGCAAAGTATCATTCAAGAACCTATTGGCAT


              right
              >H68R2DI01DH3A5/2
              CTTGAAAATCAAAAGGCCGTATATGATAGGGCCGGACTTGGCTATAACCCTAC
              >H68R2DI01DUQPJ/2
              CGTAAAGAAACTAAAGTCTCGTAAAGTAAAATTTATTTAGTAAGTTAAATTTACTTAACGTAAAGTTAAAGTTAACGTTACCCTAAACCTAAATTAACCT
              >H68R2DI01DYR3Y/2
              GAGAACTGTGATGACAATCAAACTTTTATTCTCTGTAATGTAGGGATATCATTTTTGTATTAAGAGAATGTCATCGACATAC
              >H68R2DI01DJ0QZ/2
              AATAAATATACATATTCAATGCAACAATGAATAGGTACTCCTTGAAGTTTAAAAATCATATAAATT



              Then, I run SSPACE like this:

              perl /home/jingjing/software/SSPACE-BASIC-2.0_linux-x86_64/SSPACE_Basic_v2.0.pl -l library.txt -s ../oilpalm.gapclose.fa -k 5 -a 0.7 -x 0 -b oil_palm_no_extension


              the library file is like this:

              lib1 /backup/454/left_short.fa /backup/454/right_short.fa 20000 0.35 RF


              However, the log looks very strange:

              Your inserted inputs on [SSPACE_Basic_v2.0_linux] at Tue May 7 04:54:10 2013:
              Required inputs:
              -l = library.txt
              -s = ../oilpalm.gapclose.fa
              -b = oil_palm_no_extension

              Optional inputs:
              -x = 0
              -z = 0
              -k = 5
              -a = 0.7
              -n = 15
              -T = 1
              -p = 0


              =>Tue May 7 04:54:10 2013: Reading, filtering and converting input sequences of library file initiated
              Reading read-pairs lib1.1 @ 0 //there are no pair reads

              ------------------------------------------------------------

              =>Tue May 7 04:54:12 2013: Storing contigs to format for scaffolding

              LIBRARY lib1
              ------------------------------------------------------------

              =>Tue May 7 04:56:33 2013: Reading contig file

              =>Tue May 7 04:57:07 2013: Building Bowtie index for contigs


              In the reads folder, I can see that it parsed the reads correctly:

              [jingjing@tll-bioinfo02 reads]$ less -h 5 oil_palm_no_extension.lib1.file1.fa
              >read0/1
              TTTCAAAGGAGATTGTCTGATAACTTCTCAAGAAAGAGAGCGTATGAATAGAGTTCCATATGCTTTGGCAG
              >read0/2
              CTTGAAAATCAAAAGGCCGTATATGATAGGGCCGGACTTGGCTATAACCCTAC
              >read1/1
              TTTCAAAGGAGATTGTCTGATAACTTCTCAAGAAAGAGAGCGTATGAATAGAGTTCCATATGCTTTGGCAG
              >read1/2
              CGTAAAGAAACTAAAGTCTCGTAAAGTAAAATTTATTTAGTAAGTTAAATTTACTTAACGTAAAGTTAAAGTTAACGTTACCCTAAACCTAAATTAACCT
              >read2/1
              ACAATCTTCCTATACCAATCAAAATGACCATCTAGCAATGATATCCGATGTTCGGATAGGTCAAAAGATTGCAAAGTATCATTCAAGAACCTATTGGCAT
              >read2/2
              GAGAACTGTGATGACAATCAAACTTTTATTCTCTGTAATGTAGGGATATCATTTTTGTATTAAGAGAATGTCATCGACATAC


              Can anyone give me some suggestions?

              Jingjing

              Comment


              • Hello boetsie,

                I've posted a new thread about a problem I have when merging overlapping contigs using SSPACE:

                Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


                regards,

                seb.


                Originally posted by boetsie View Post
                You could decrease the -a value to 0.5 (meaning that there should at least be 2 times more links) if multiple links are found.
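The ratio rule described above can be illustrated with a small Python sketch. It is not taken from SSPACE: the function name `choose_link` is hypothetical, and treating the boundary as inclusive (a ratio exactly equal to -a is accepted) is an assumption based on the phrase "at least 2 times more links".

```python
from typing import Dict, Optional


def choose_link(link_counts: Dict[str, int], max_ratio: float = 0.7) -> Optional[str]:
    """Pick the best-supported neighbouring contig, or None if ambiguous.

    link_counts maps a candidate contig name to its number of read-pair links.
    The best candidate is accepted only when second_best / best <= max_ratio.
    """
    ranked = sorted(link_counts.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return None
    if len(ranked) == 1:
        return ranked[0][0]
    (best_name, best_links), (_, second_links) = ranked[0], ranked[1]
    if second_links / best_links <= max_ratio:
        return best_name
    return None


if __name__ == "__main__":
    candidates = {"contigB": 10, "contigC": 6}
    print(choose_link(candidates, max_ratio=0.7))  # ratio 0.6 is within 0.7
    print(choose_link(candidates, max_ratio=0.5))  # ratio 0.6 exceeds 0.5
```

In this sketch a lower value makes the choice stricter: with 0.5 the best candidate needs at least twice as many links as the runner-up before it is accepted.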

                The -n parameter is useful for merging two contigs. Say you have contigA and contigB, they are scaffolded with a gap of -20bp. Then SSPACE will search for an overlap of -n or more nucleotides:

                contigA
                AGATGATATAAAAGTATAGATTA
                contigB
                ATAAAAGTATAGATTAGGGGTTATGATA

                overlap:
                AGATGATATAAAAGTATAGATTA
                -------ATAAAAGTATAGATTAGGGGTTATGATA


                So if the size of the overlap is above the defined -n parameter, they are merged together;
                AGATGATATAAAAGTATAGATTAGGGGTTATGATA
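The merge step in this example can be reproduced with a short Python sketch. It is an illustration only, not SSPACE's implementation: the function name `merge_with_overlap` is hypothetical, preferring the longest exact overlap is an assumption, and the default of 15 is simply the -n value shown in a log elsewhere in this thread.

```python
from typing import Optional


def merge_with_overlap(left: str, right: str, min_overlap: int = 15) -> Optional[str]:
    """Merge two contigs if the end of `left` exactly matches the start of `right`.

    Tries the longest possible overlap first and returns None when no overlap
    of at least min_overlap bases exists.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None


if __name__ == "__main__":
    contig_a = "AGATGATATAAAAGTATAGATTA"
    contig_b = "ATAAAAGTATAGATTAGGGGTTATGATA"
    # The shared stretch ATAAAAGTATAGATTA is 16 bases long.
    print(merge_with_overlap(contig_a, contig_b, min_overlap=15))
    print(merge_with_overlap(contig_a, contig_b, min_overlap=17))
```

With a minimum of 15 the two example contigs are merged, while a minimum of 17 leaves them separate because the shared stretch is only 16 bases.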

                regards,
                Boetsie

                Comment


                • Hi seb,

                  could you maybe send me a personal message and show me by an example what you mean?

                  Regards,
                  Boetsie

                  Comment


                  • Hi Boetsie,

                    I am trying to run SSPACE and I am having an error "Can't locate getopts.pl in @INC (@INC contains: /home/annet/Programs/SSPACE/dotlib/ /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at /home/annet/Programs/SSPACE/SSPACE_Basic_v2.0.pl line 87."

                    Could you please tell me, where I can find this file getopts.pl??

                    Looking forward to your answer!

                    Anna

                    Comment


                      Hmmm, it seems that the getopts.pl library was removed in the newest version of Perl (see http://search.cpan.org/~rjbs/perl-5....s_and_Pragmata). At this site they explain how to solve this issue:


                      Hope this helps.
                      Boetsie

                      Originally posted by OTU View Post
                      Hi Boetsie,

                      I am trying to run SSPACE and I am having an error "Can't locate getopts.pl in @INC (@INC contains: /home/annet/Programs/SSPACE/dotlib/ /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at /home/annet/Programs/SSPACE/SSPACE_Basic_v2.0.pl line 87."

                      Could you please tell me, where I can find this file getopts.pl??

                      Looking forward to your answer!

                      Anna

                      Comment


                      • Thank you, Boetsie! It helped!

                        Anna

                        Comment


                        • Boetsie,

                          I am having troubles with my input data. At the very beginning of the run I get an error:
                          >> Can't write to single file filereads//home/annet/output6/TM7.Lib1.filtered.readpairs.singles.fasta-- fatal

                          What could be causing this? My data consists of two FASTQ files of FR-oriented reads and a FASTA file of contigs.

                          Anna

                          Comment


                            Hi Anna,
                            Have you resolved this problem, and if so, how?
                            I have the same problem as you.

                            xu
                            Originally posted by OTU View Post
                            Boetsie,

                            I am having troubles with my input data. At the very beginning of the run I get an error:
                            >> Can't write to single file filereads//home/annet/output6/TM7.Lib1.filtered.readpairs.singles.fasta-- fatal

                            What could be causing this? My data consists of two FASTQ files of FR-oriented reads and a FASTA file of contigs.

                            Anna

                            Comment


                            • Xu,

                              Have you specified the output directory in your command line?
                              If so, remove it from the command.

                              Comment


                              • Hi, Anna

                                That's right! Thank you!


                                Originally posted by OTU View Post
                                Xu,

                                Have you specified the output directory in your command line?
                                If yes - delete it.

                                Comment
