  • #31
    Thanks for the quick reply, but when I ran the command as you suggested, fastq_screen still seems to look for bowtie index files rather than the bowtie2 ones specified with the --aligner flag.

    This is the command I executed:
    Code:
    fastq_screen --subset 1000000  --illumina1_3 --threads 22 --outdir /someOutDirPath  --aligner bowtie2 --paired  /pathTo/raw_data/SomeFastq_L008_R1_001.fastq  /pathTo/raw_data/SomeRealtedFastq_R2_001.fastq
    And this is the error message I get:
    Code:
    Reading configuration from '/PathTo/fastq_screen_v0.4.1/fastq_screen.conf'
    Using '/PathTo/bowtie2' as bowtie2 path
    Using 8 threads for searches
    Skipping DATABASE 'Human' since no bowtie index was found at '/PathTo/GRCh37/Sequence/Bowtie2Index'
    Skipping DATABASE 'Mouse' since no bowtie index was found at '/PathTo/GRCm38/Sequence/Bowtie2Index'
    Skipping DATABASE 'Rat' since no bowtie index was found at '/PathTo/RGSC3.4/Sequence/Bowtie2Index'
    Skipping DATABASE 'Ecoli' since no bowtie index was found at '/PathTo/EB1/Sequence/Bowtie2Index'
    Skipping DATABASE 'Yeast' since no bowtie index was found at '/PathTo/Ensembl/EF4/Sequence/Bowtie2Index'
    Skipping DATABASE 'PhiX' since no bowtie index was found at '/PathTo/1993-04-28/Sequence/Bowtie2Index'
    Skipping DATABASE 'Adapters' since no bowtie index was found at '/PathTo/Contaminants'
    No search libraries were configured at /PathTo/fastq_screen_v0.4.1//fastq_screen line 119.
    Any other suggestions?
    It doesn't really matter to me which aligner I use, bowtie or bowtie2.

    Cheers.
    Chen



    • #32
      I'm assuming that the /PathTo bits are replaced by a real path in your actual run?

      Can you please list the contents of /PathTo/GRCh37/Sequence/Bowtie2Index? That should tell us why it's not finding the indices.



      • #33
        Yes, I replaced the real path with /PathTo.
        Please note that before I added the flag, the program seemed to recognize the Bowtie index files.
        I did notice the difference between the index files of the two Bowtie versions.
        Here's the content of the folder:

        Code:
        /GRCh37/Sequence/Bowtie2Index->ls -l
        total 3963084
        
        -rwxrwxr-x 1 chenzr biogroup 957090229 Apr 11  2012 genome.1.bt2
        -rwxrwxr-x 1 chenzr biogroup 714668672 Apr 11  2012 genome.2.bt2
        -rwxrwxr-x 1 chenzr biogroup      3239 Apr 11  2012 genome.3.bt2
        -rwxrwxr-x 1 chenzr biogroup 714668666 Apr 11  2012 genome.4.bt2
        lrwxrwxrwx 1 chenzr biogroup        29 Jul 10 16:20 genome.fa -> ../WholeGenomeFasta/genome.fa
        -rwxrwxr-x 1 chenzr bioinfo 957090229 Apr 11  2012 genome.rev.1.bt2
        -rwxrwxr-x 1 chenzr bioinfo 714668672 Apr 11  2012 genome.rev.2.bt2
        Thanks.
        Chen.



        • #34
          Hi,

          I’m responsible for developing fastq_screen. The path to the index should include the genome base name. Adjust the configuration file so the index paths are:
          /GRCh37/Sequence/Bowtie2Index/genome

          And so on. Also include the "--aligner bowtie2" option.
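To find the value to put in the config, it can help to list the base names actually present in an index directory. The following is a minimal sketch, assuming the small-index `.bt2` suffix seen in the directory listing earlier in the thread. `bt2_basenames` is a hypothetical helper name, not part of fastq_screen or bowtie2.

```shell
# List candidate Bowtie 2 index base names found in a directory.
# A base name is the path of the "<name>.1.bt2" file with that suffix removed.
# bt2_basenames is a hypothetical helper, not a fastq_screen or bowtie2 command.
bt2_basenames() {
    dir=$1
    for f in "$dir"/*.1.bt2; do
        # Skip the literal pattern when nothing matched.
        [ -e "$f" ] || continue
        # "<name>.rev.1.bt2" also matches the glob; it is not a separate index.
        case "$f" in
            *.rev.1.bt2) continue ;;
        esac
        printf '%s\n' "${f%.1.bt2}"
    done
}
```

For the directory listed above, `bt2_basenames /GRCh37/Sequence/Bowtie2Index` would print `/GRCh37/Sequence/Bowtie2Index/genome`, which is the form of path the configuration file expects.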

          I hope that helps.
          Steven



          • #35
            Hi Steven,
            I am trying to use fastq_screen to remove Wolbachia contamination from my Illumina paired-end reads. The program appears to run fine, but my --nohits output fastq files are the same as the input, even though the log file reports 0.02% of reads hitting the Wolbachia database I have.

            #Fastq_screen version: 0.4.2
            Library #Reads_processed #Unmapped %Unmapped #One_hit_one_library %One_hit_one_library #Multiple_hits_one_library %Multiple_hits_one_library #One_hit_multiple_libraries %One_hit_multiple_libraries Multiple_hits_multiple_libraries %Multiple_hits_multiple_libraries
            Wolbachia 33070644 33063580 99.98 3201 0.01 3863 0.01 0 0.00 0 0.00

            %Hit_no_libraries: 99.98
            Is there another option that will remove the reads that hit the database from the --nohits output?

            Thanks, Nathan



            • #36
              thanks,
              it will be useful for me
              Last edited by collacor; 03-07-2014, 04:46 PM. Reason: not complete
              Want to Learn more and more



              • #37
                Checking the number of reads is the same

                Hi Nathan,

                Thank you for your email. I would expect the “nohits” file to be very similar to the input file, since 99.98% of the reads do not map to any genome. Just to check that this is not a bug, is the number of lines in both files the same? You can check with the Unix command wc -l [filename].
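A line count converts to a read count by dividing by four. The sketch below assumes the usual four-lines-per-record FASTQ layout with no wrapped sequence or quality lines; `fastq_records` is a hypothetical helper name. Counting with grep -c '@' is less reliable, because '@' is also a valid quality character and can appear in quality lines.

```shell
# Count FASTQ records, assuming exactly four lines per record
# (header, sequence, '+' separator, quality).
# fastq_records is a hypothetical helper name.
fastq_records() {
    echo $(( $(wc -l < "$1") / 4 ))
}
```

Comparing `fastq_records input.fq` with the same count for the no-hits file (both file names are placeholders) shows directly whether any reads were removed.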

                Please let me know if the results are the same or different and I shall investigate further if need be.

                Regards,
                Steven



                • #38
                  Hi Steven, I ran grep -c '@' and the sequence counts are the same for the input and output files.

                  Nathan



                  • #39
                    I also ran the wc -l command, and the two files have identical line counts:
                    1_CATTTT_L003_R1_001.fastq.cor.pair_1.fq = 132282576
                    1_CATTTT_L003_R1_001.fastq.cor.pair_1.fq_screen.txt_no_hits.txt = 132282576
                    The other odd thing is that I processed 8 sets of paired-end reads, and the last R2 file processed actually tripled in size, from 11382611519 to 34147834557.

                    Any advice?

                    Nathan
                    Last edited by fireant; 03-11-2014, 07:22 AM.



                    • #40
                      Hi Simon, I am testing this application and I get the following error: No search libraries were configured at /Users/ZainA/Downloads/fastq_screen_v0.4.4/fastq_screen line 124

                      Code:
                      --threads 8 --aligner bowtie2 --conf=/Users/ZainA/Downloads/Dmel_520/Dmel5_20.conf --outdir /Output --paired /Users/ZainA/Downloads/Dmel_520/forward_1p.fastq /Users/ZainA/Downloads/Dmel_520/reverse_2p.fastq
                      My configuration file looks like this:

                      Code:
                      BOWTIE /Users/ZainA/Downloads/bowtie-1.1.0
                      BOWTIE2 /Users/ZainA/Downloads/bowtie2-2.2.3
                      
                      ##Drosophila Genome 5.20
                      /Users/ZainA/Downloads/Dmel_520/dmel-all-chromosome-r5.20.fasta
                      
                      DATABASE Dmel_5_20 /Users/ZainA/Downloads/Dmel_520/genome
                      If you could kindly help me, I would greatly appreciate it.

                      Also, what is the easiest way to install GD::Graph?

                      Thank you in advance.



                      • #41
                        You can install GD::Graph with the CPAN shell (you might need 'sudo' depending on your setup):
                        Code:
                        perl -MCPAN -e 'install GD::Graph'
                        For the fastq_screen issue, make sure your database is indexed with bowtie-build (or bowtie2-build when using bowtie2) before running the program. It's not clear from your post whether that is the case.



                        • #42
                          I actually built the index offline on my laptop (OSX) using bowtie2-build genome.fasta genome_index_bowtie2 . Previously, I was using the bowtie2 index made from iPlant. Unfortunately, I still received the same error as before.

                          Code:
                          fastq_screen --threads 8 --aligner bowtie2 --conf=/Users/ZainA/Downloads/Dmel_520/Dmel5_20.conf --outdir /Output --paired /Users/ZainA/Downloads/Dmel_520/S4A15_SRR070422_1p.fastq /Users/ZainA/Downloads/Dmel_520/S4A15_SRR070422_2p.fastq 
                          Using fastq_screen v0.4.4
                          Reading configuration from '/Users/ZainA/Downloads/Dmel_520/Dmel5_20.conf'
                          Using '/Users/ZainA/Downloads/bowtie2-2.2.3' as bowtie2 path
                          No search libraries were configured at /Users/ZainA/Downloads/fastq_screen_v0.4.4/fastq_screen line 124.
                          As for GD::Graph,

                          I get the following error:

                          Code:
                          Warning: prerequisite GD 1.18 not found.
                          Warning: prerequisite GD::Text 0.80 not found.
                          only nested arrays of non-refs are supported at /System/Library/Perl/5.12/ExtUtils/MakeMaker.pm line 664
                          Warning: No success on command[/usr/bin/perl Makefile.PL]
                          'YAML' not installed, will not store persistent state
                            RUZ/GDGraph-1.48.tar.gz
                            /usr/bin/perl Makefile.PL -- NOT OK
                          Running make test
                            Make had some problems, won't test
                          Running make install
                            Make had some problems, won't install
                          How should I go about this? Thank you in advance for the help.



                          • #43
                            The reason it's saying that you don't have any libraries configured is that you're specifying that you should use bowtie2, but your library isn't marked as a bowtie2 library in your config file. Have a look at the example config file we ship with the distribution to see what the syntax looks like.
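To see at a glance which entries carry an aligner tag, the DATABASE lines can be pulled out of the config. This is a rough sketch that assumes the v0.4.x layout used in this thread (`DATABASE <name> <index base path> [BOWTIE|BOWTIE2]`) and paths without spaces. `list_databases` is a hypothetical helper name, not a fastq_screen option.

```shell
# Print name, index path and aligner tag for each DATABASE line of a config.
# Assumed layout: DATABASE <name> <index base path> [BOWTIE|BOWTIE2]
# list_databases is a hypothetical helper; paths containing spaces are not handled.
list_databases() {
    awk '$1 == "DATABASE" {
        tag = (NF >= 4) ? $4 : "(untagged)"
        printf "%s\t%s\t%s\n", $2, $3, tag
    }' "$1"
}
```

When running with --aligner bowtie2, any entry reported as "(untagged)" is worth comparing against the example config file shipped with the distribution.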

                            For GD::Graph, you didn't say what OS you were using. Assuming some type of Linux, the easiest way is to install it from your package repository. On CentOS/Fedora I'd do:

                            yum install perl-GD-Graph

                            ..but there's likely to be something similar on whatever OS you're using.

                            If that's not an option then using the perl CPAN module is the next easiest way:

                            perl -MCPAN -e 'install GD::Graph'

                            Hope this helps



                            • #44
                              I am trying to run this on OS X Mountain Lion, so I don't think yum will work. The CPAN command generates the same GD::Graph error as before.

                              I rearranged the configuration file to match the example version as closely as I could. Unfortunately, I still get this error.

                              I used the following to build the index database for my reference genome.

                              Code:
                              bowtie2-build reference.fasta reference_index_bowtie2
                              and
                              Code:
                              bowtie-build  reference.fasta reference_index_bowtie
                              My configuration file looks like this:

                              Code:
                              BOWTIE /Users/ZainA/Downloads/bowtie-1.1.0
                              BOWTIE2 /Users/ZainA/Downloads/bowtie2-2.2.3
                              
                              ##Dmel_5_20 - sequences available from
                              /Users/ZainA/Downloads/Dmel_520/dmel-all-chromosome-r5.20.fasta
                              
                              DATABASE Dmel520_Bowtie /Users/ZainA/Downloads/Dmel_520/Genomes/Dmel520_Bowtie BOWTIE
                              DATABASE Dmel520_Bowtie2 /Users/ZainA/Downloads/Dmel_520/Genomes/Dmel520_Bowtie2 BOWTIE2
                              The output error for the following command is like this:

                              Code:
                              Dmel_520 ZainA$ fastq_screen --threads 8 --aligner bowtie2 --conf=/Users/ZainA/Downloads/Dmel_520/Dmel5_20.conf --outdir /Output --paired /Users/ZainA/Downloads/Dmel_520/forward.fastq /Users/ZainA/Downloads/Dmel_520/reverse.fastq
                              The error:
                              Code:
                              Using fastq_screen v0.4.4
                              Reading configuration from '/Users/ZainA/Downloads/Dmel_520/Dmel5_20.conf'
                              Using '/Users/ZainA/Downloads/bowtie2-2.2.3' as bowtie2 path
                              Skipping DATABASE 'Dmel520_Bowtie2' since no bowtie index was found at '/Users/ZainA/Downloads/Dmel_520/Genomes/Dmel520_Bowtie2'
                              No search libraries were configured at /Users/ZainA/Downloads/fastq_screen_v0.4.4/fastq_screen line 124.
                              If you could kindly help, I would really appreciate it.
                              Thank you in advance.



                              • #45
                                You appear to be using the directory, rather than the index base name, as the database path in the config file. Your bowtie2 index base name, as given on your bowtie2-build command line, is "reference_index_bowtie2", so the conf file should have this line:
                                Code:
                                DATABASE Dmel520_Bowtie2 /Users/ZainA/Downloads/Dmel_520/Genomes/Dmel520_Bowtie2/reference_index_bowtie2 BOWTIE2
                                This assumes that your bowtie2 index files are in the "/Users/ZainA/Downloads/Dmel_520/Genomes/Dmel520_Bowtie2/" directory. If they are not there, replace it with the appropriate path.
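Before re-running fastq_screen, the configured path can be checked against the files on disk. The sketch below assumes the six-file layout that bowtie2-build writes for a standard (small) index. Large indexes use a `.bt2l` suffix instead, so the suffix would need changing in that case. `check_bt2_index` is a hypothetical helper name.

```shell
# Report which Bowtie 2 index files are missing for a given base name.
# Returns 0 when all six small-index files exist, 1 otherwise.
# check_bt2_index is a hypothetical helper, not part of bowtie2 or fastq_screen.
check_bt2_index() {
    base=$1
    missing=0
    for suffix in 1 2 3 4 rev.1 rev.2; do
        if [ ! -e "$base.$suffix.bt2" ]; then
            echo "missing: $base.$suffix.bt2" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Passing a directory instead of a base name makes every file show up as missing, which corresponds to the situation behind the "no bowtie index was found" message above.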
