  • #31
    Thanks for your prompt reply.

    I get the following error:

    =>Sun Feb 13 22:37:39 2011: Building Bowtie index for contigs (tmp.alboxf_scaffolds_no_extension/subset_contigs.fasta)

    Bowtie-build error; -1 at /scratch/yang/tools/SSPACE-1.1_linux-x86_64/bin/mapWithBowtie.pl line 38.
    WARNING: No scaffolding, because no reads found on contigs

    I believe it might have something to do with bowtie, but I am unsure.

    Thanks again!

    Comment


    • #32
      How should I pronounce it?

      SSPACE is a very nice tool for us. Thank you for your good work.

      By the way, how should I pronounce SSPACE?

      es es pace?
      es pace?
      es space?

      Regards.

      Comment


      • #33
        Yes, that's a common problem. Which version of SSPACE do you have?

        The problem is usually solved by going, on the command line, to the directory where the main SSPACE script (SSPACE_v1-x.pl) and its folders are stored, and then running one of the following:

        chmod a+x bowtie/*

        or

        chmod 777 *

        If that doesn't work, you can try downloading the newest Bowtie version from http://sourceforge.net/projects/bowt...bowtie/0.12.7/

        Replace the files in the bowtie folder with the ones you've downloaded.
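For anyone who would rather script the permission fix than run chmod by hand, here is a minimal Python sketch of the same idea as `chmod a+x bowtie/*`. The function name `make_executable` is made up for illustration and is not part of SSPACE; the only thing taken from this thread is that the bundled binaries live in a folder called `bowtie`.

```python
import stat
from pathlib import Path


def make_executable(folder):
    """Add the execute bit for user, group and others to every regular file
    in `folder` (the equivalent of `chmod a+x folder/*`).

    Returns the names of the files whose permissions were changed.
    """
    exec_bits = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    changed = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & exec_bits != exec_bits:
            path.chmod(mode | exec_bits)
            changed.append(path.name)
    return changed
```

Run from the SSPACE directory, `make_executable("bowtie")` lists the files it had to change; an empty list means the permissions were already fine and the Bowtie error probably has another cause.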

        Kind regards,
        Boetsie

        Comment


        • #34
          Downloading the newest version of bowtie worked (I am using SSPACE-1.1_linux-x86_64). Also, I had extra annotation in my reference (assembled file) and that screwed up bowtie as well (if anyone else runs into the same problem).
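The post does not say what the extra annotation looked like. Assuming it was additional text after the contig name on the FASTA header lines, a minimal Python sketch for stripping it could look like this (`simplify_fasta_headers` is a hypothetical helper, not something shipped with SSPACE or Bowtie):

```python
def simplify_fasta_headers(lines):
    """Yield FASTA lines with every header cut at the first whitespace.

    Assumption: the problematic annotation is everything after the first
    whitespace-separated token of a '>' line. Sequence lines are unchanged.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            fields = line[1:].split()
            yield ">" + (fields[0] if fields else "")
        else:
            yield line
```

For example, the header `>contig1 length=500 cov=12.3` becomes `>contig1`. Check that the shortened names are still unique before using the cleaned file.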

          Thanks again!

          Comment


          • #35
            Error with '-a' and insert stdev values

            I'm getting the following error when running the SSPACE perl script using: -a = 0.70 (default) and insert stdev of 0.50:

            Code:
            ERROR: -a must be a number between 0.00 and 1.00. Your inserted -a is .70 ...Exiting.
            ERROR: Insert stdev must be a number between 0.00 and 1.00. Your library lib1 has insert size of 0.50. Exiting.
            Here are the contents of library.txt:
            Code:
            lib1 s_6_1_sequence.txt s_6_2_sequence.txt 250 0.50 0
            and the command that was run:
            Code:
            perl SSPACE_v1-1.pl -l libraries.txt -s sk2_originalreads_contigs.fa -x 0 -m 32 -o 20 -t 0 -k 5 -n 15 -p 1 -v 0 -b sk2_origreads_no_extension

            This was run on a 64-bit OSX server w/ 32gb RAM.


            Edit: I believe this issue was fixed by correcting the permissions on the files involved. However, I'm having the same issue as the user above: WARNING: No scaffolding, because no reads found on contigs
            Edit #2: Never mind. Changing the permissions to 777 in the directories took care of this issue.

            Thanks,


            Rsw3284
            Last edited by rsw3284; 02-16-2011, 09:51 AM.

            Comment


            • #36
              Hi rsw3284,

              Is it fixed now? To be honest, we did not test SSPACE on a 64-bit Mac OS X server, only on a 32-bit server. However, the problems above look more like a Perl problem than an SSPACE problem.

              Boetsie

              Comment


              • #37
                Yes, it's working just fine now. Thanks!



                - Rsw3284

                Comment


                • #38
                  Hi boetsie,
                  Thank you for SSPACE. I have a question that came up while reading the MANUAL file that comes with SSPACE:

                  The libraries.txt file contains information about each library. For each library, columns 2 and 3 are FASTA or FASTQ files for the two ends. Should these FASTA/FASTQ files be different files? In the MANUAL file I found this example:

                  Lib1 file1.fasta file2.fasta 400 0.5 1
                  Lib1 file2.fasta file2.fasta 400 0.5 1
                  Lib2 file3.fastq file3.fastq 4000 0.75 0

                  I'm a bit confused. In what kind of cases can file2.fasta or file3.fastq be placed in both columns 2 and 3?

                  Comment


                  • #39
                    Hi Hliang,

                    Thank you for your question. I see there are some mistakes in the MANUAL.

                    About your question:

                    Columns 2 and 3 should always be in the same format within one line. For example, if the file with the first reads is FASTA, then the file with the second reads should also be FASTA.

                    However, if you have multiple library files, some of them may hold paired reads in FASTQ format, and those can be used as well.

                    So, these libraries are OK:

                    lib1 file1.1.fastA file1.2.fastA 400 0.5 0
                    lib1 file2.1.fastQ file2.2.fastQ 400 0.5 0

                    While these are not correct:
                    lib1 file1.1.fastA file1.2.fastQ 400 0.5 0
                    lib1 file2.1.fastQ file2.2.fastA 400 0.5 0
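The rules in this thread can be turned into a small pre-flight check for libraries.txt. This is only a sketch under a few assumptions: the file format is guessed from the file extension (SSPACE may well detect it differently), the insert size is treated as any positive number, and the sixth column is not checked at all. The name `check_library_line` is made up for illustration.

```python
import os

# Assumed extension lists; extend them to match your own file names.
FASTA_EXT = {".fa", ".fasta", ".fna"}
FASTQ_EXT = {".fq", ".fastq"}


def guess_format(filename):
    """Guess 'fasta' or 'fastq' from the extension, or None if unknown."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in FASTA_EXT:
        return "fasta"
    if ext in FASTQ_EXT:
        return "fastq"
    return None


def check_library_line(line):
    """Return a list of problems found in one libraries.txt line."""
    fields = line.split()
    if len(fields) != 6:
        return [f"expected 6 columns, found {len(fields)}"]
    name, file1, file2, insert, error, flag = fields
    problems = []
    if file1 == file2:
        # Later in the thread: the pairs have to be in two separate files.
        problems.append("both read files are the same file")
    fmt1, fmt2 = guess_format(file1), guess_format(file2)
    if fmt1 and fmt2 and fmt1 != fmt2:
        problems.append(f"mixed formats: {fmt1} and {fmt2}")
    try:
        if float(insert) <= 0:
            problems.append("insert size must be positive")
    except ValueError:
        problems.append("insert size is not a number")
    try:
        if not 0.0 <= float(error) <= 1.0:
            problems.append("insert stdev must be between 0.00 and 1.00")
    except ValueError:
        problems.append("insert stdev is not a number")
    return problems
```

A line such as `lib1 file1.1.fastA file1.2.fastQ 400 0.5 0` is reported as mixed formats, while files with an unrecognised extension (for example `s_6_1_sequence.txt`) simply skip the format comparison.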

                    Is this what you mean?

                    Kind regards,
                    Boetsie



                    Comment


                    • #40
                      Thanks for the info.

                      So columns 2 and 3 should be PAIRED and have the same file format?

                      Can I concatenate file1.1.fastA and file1.2.fastA into one single file, file_combo.fastA (separating the paired-end sequences with ":"), and use the following line?
                      lib1 file_combo.fastA file_combo.fastA 400 0.5 0

                      One more question: is SSPACE suitable for scaffolding with 454 paired-end data? 454 paired-end reads are longer than Illumina/Solexa reads and come in a mix of lengths (200-500 bp).



                      Comment


                      • #41
                        Hi Hliang,

                        No, I'm sorry, this is not possible. The reads should be paired in two files.

                        We use Bowtie for mapping, and we use only reads that map entirely for scaffolding. If the whole read can be mapped to the contig (thus without gaps), it should be possible. Whether it really works, I don't know. You can give it a try. The differences in size do not matter; Illumina reads with different read lengths are also possible. In the future it would be a good idea to have a mapper for longer sequences. Do you know any?

                        Boetsie

                        Comment


                        • #42
                          gotcha.

                          I'm not doing a lot of mapping at the moment, but there are a bunch of programs you can take a look at here: http://en.wikipedia.org/wiki/List_of...nment_software
                          MUMmer and MAQ can handle long reads.

                          There is another one, called LAST, that is not mentioned there: http://last.cbrc.jp/


                          Comment


                          • #43
                            I have a question or two about the mapping stage.

                            I'm working with datasets that consist of a contig file assembled using both paired-end and mate-pair data. I'm running SSPACE with that contig file against the mate-pair reads for scaffolding. In my best case I have 80 million inserted pairs, 10 million single reads and 7 million pairs with pairing contigs. In other cases I have 25 million inserted pairs, 600k single reads and 400k pairs with pairing contigs.

                            In the first case I do end up with extensive scaffolding despite only ~6% of the reads mapping. In the other cases, with less than 1% of the reads used for mapping, I get very little scaffolding. I'm a little concerned about the low level of reads mapping to my contigs. Without getting into the details of my datasets (they are different species, which could be the source of the difference), I'm curious whether you have any thoughts on this from the program's point of view.

                            Perhaps I just need some clarification of some of the terms.
                            #number of single reads found on contigs =
                            (I use an insert size of 3000bp with a std dev of .5)
                            Regarding the mapping step, does this mean you take the 4500bp from the left and right edge of each contig to use for the mapping step, or do you delete 4500bp off each edge and just use the middle of the contigs for mapping? I assume it's the first option, but you use the word "subtracted" in the readme file, which is somewhat misleading.

                            #number of pairs found with pairing contigs =
                            For "pairing contigs" I get numbers that are greater than half the single reads. If SSPACE uses 10 million single reads for mapping, I would imagine that at most I could get 5 million pairs.

                            #total pairs =
                            I'm unclear about what this number means. Total read pairs used in mapping? If so, I'm unclear how this relates to the single reads. My understanding is that SSPACE/Bowtie takes all the read pairs that don't have Ns, then maps each single read to the contigs. It then determines which of the reads are paired, which contigs those lie on, etc.

                            Any light you could shed would be greatly appreciated. I'm fully ready to realize I'm just being dense.
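The bound described under "pairs found with pairing contigs" can be written down as a quick sanity check on the summary numbers. This is just the arithmetic from the post (each pair needs two mapped single reads); the function names are made up for illustration.

```python
def max_pairs(single_reads):
    """Upper bound on pairs when every pair needs two mapped single reads."""
    return single_reads // 2


def pairs_consistent(single_reads, pairs):
    """True if the reported number of pairs does not exceed that bound."""
    return pairs <= max_pairs(single_reads)


# Figures quoted in the post above.
best_case = pairs_consistent(10_000_000, 7_000_000)   # bound is 5,000,000
other_case = pairs_consistent(600_000, 400_000)       # bound is 300,000
```

Both quoted cases exceed the bound, which is why the numbers in the summary file look suspicious.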

                            Comment


                            • #44
                              Hi themwg,

                              Thank you for the good points you mention. I do indeed see some vague descriptions and mistakes in the summary file.

                              About your questions:

                              - I indeed take 4500bp from the left and right edge for scaffolding, which is of course the obvious method.

                              - You are absolutely right, the number of pairs should be at most half the number of single reads. I see that I displayed the wrong variable in my script. I will fix this in the next release.

                              - As said above, the total number of single reads is calculated incorrectly. Total pairs is a sort of filtering step for the pairs. The number of pairs actually used for scaffolding is the value given at "Assembled pairs".
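The edge extraction confirmed in the first point can be sketched in a few lines of Python. The 4500bp figure matches an insert size of 3000bp plus 50% of it, so the sketch assumes the edge length is insert size × (1 + error); that formula is inferred from the numbers in this thread, not taken from the SSPACE source. Keeping short contigs whole is also an assumption, and both function names are hypothetical.

```python
def edge_length(insert_size, error_fraction):
    """Assumed edge length: the insert size plus its allowed deviation."""
    return int(round(insert_size * (1 + error_fraction)))


def contig_edges(sequence, insert_size, error_fraction):
    """Return the left and right edge of a contig for mapping.

    Contigs no longer than two edge lengths are returned whole
    (an assumption made for this sketch).
    """
    n = edge_length(insert_size, error_fraction)
    if len(sequence) <= 2 * n:
        return [sequence]
    return [sequence[:n], sequence[-n:]]
```

With an insert size of 3000 and an error of 0.5, `edge_length` gives 4500, so a 12,000bp contig contributes its first and last 4500bp and the middle 3000bp is ignored during mapping.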

                              I'm sorry for the mistakes. As said, I will fix this in the next release, which will probably come out next week.

                              Kind regards,
                              Boetsie

                              Comment


                              • #45
                                Files in ./reads/ folder really small

                                Hey,
                                I am a little worried because my input files were each about 3G of gzipped fastq, and the .fasta files in the ./reads/ folder are only about 100M each. I am pretty sure that there are more perfect reads without N's than that... I trimmed bases from the beginning and end of reads prior to running the program, and the file should only have reads that are over 30nt.

                                One possible bug is that I noticed a few of the fastq reads have 0 length in my input file, but they are still paired properly, and have the right new lines and everything so the two files are the right relative length. Do you think that is causing issues for the program?

                                UPDATE: I fixed the issue with the few 0-length reads, but the output files are still very small compared with the input files. Maybe I just don't understand what the files in the ./reads folder are?
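One way to see how many reads should survive is to count them directly in the input. The sketch below assumes standard four-line FASTQ records and counts reads that are non-empty and contain no N, which follows the filtering described earlier in this thread ("read pairs that don't have Ns"); whether SSPACE applies exactly this rule is an assumption. `count_usable_reads` is a made-up name.

```python
import gzip


def count_usable_reads(path):
    """Return (total, usable) read counts for a FASTQ file.

    Assumes four lines per record. A read counts as usable when its
    sequence is non-empty and contains no 'N'. Files ending in .gz are
    read through gzip.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    total = usable = 0
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 != 1:
                continue  # only the second line of each record is sequence
            seq = line.strip().upper()
            total += 1
            if seq and "N" not in seq:
                usable += 1
    return total, usable
```

Comparing the usable count for both input files with the number of records in the ./reads/ FASTA files shows whether reads are being dropped for some other reason. Keep in mind that this counts each file on its own; a pair may be discarded when only one of its two reads fails the filter.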

                                Thanks!
                                -John
                                Last edited by jstjohn; 02-23-2011, 07:26 PM.

                                Comment
