  • #61
    Hi,
    I am responsible for developing FastQ Screen.

    The standard way to remove contamination is:
    Run FastQ Screen (latest version) with --subset 0 (to process the entire dataset) and --nohits. In the config file, include the Bowtie 1/2 indices of all the potential contaminants (human genome indices should not be included).

    A FastQ file should then be produced containing all the reads that did not map to any of the contaminants.
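The steps above could look roughly like the following. This is a minimal sketch, not an official recipe: the database names (PhiX, Ecoli, Adapters), the index paths, and the file names `contaminants.conf` and `sample.fastq.gz` are all placeholders, and the flags should be checked against `fastq_screen --help` for the installed version.

```shell
#!/bin/sh
# Sketch only: database names and index paths below are hypothetical placeholders.
conf="contaminants.conf"

# Contaminant-only config: one DATABASE line per index, no host genome entries.
# Fields are tab-separated: keyword, label, Bowtie 2 index basename.
{
    printf 'DATABASE\t%s\t%s\n' PhiX     /path/to/indices/phix/phix
    printf 'DATABASE\t%s\t%s\n' Ecoli    /path/to/indices/ecoli/ecoli
    printf 'DATABASE\t%s\t%s\n' Adapters /path/to/indices/adapters/adapters
} > "$conf"

# --subset 0 screens every read; --nohits writes the reads that hit no database.
# (The placeholder paths contain no spaces, so a plain string is safe here.)
cmd="fastq_screen --conf $conf --subset 0 --nohits sample.fastq.gz"

# Dry run by default: print the command; set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
    $cmd
else
    echo "$cmd"
fi
```

Running it as-is only writes the config and prints the command, which makes it easy to inspect both before committing to a full run with `RUN=1`.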

    I am wondering: why do you need to remove only the hits that are classified as 'one-hit/one-library' AND 'multiple-hits/one-library'? Also, is this single-end data?

    Please feel free to contact me directly to discuss this further.

    Kindest regards,
    Steven W



    • #62
      Thanks Steven

      I apologize for the delayed reply.

      Yes, I am doing something similar to what you mentioned. I have two config files in place:
      - one with just the contaminants, for filtering the FastQ file as you described
      - and one with the contaminants plus mammalian genomes, to generate a figure with your tool that shows the level of contamination in comparison to real hits in a mammal (similar to the example plot on the FastQ Screen webpage).

      This is single-end data.

      The question about single hits arose because we have some contaminant-like sequences (custom to our study) that we want to quantify but not remove if they have homologs in mammalian genomes. Using a stepwise approach similar to the one above, I am able to handle that.

      Thanks for a very useful tool.



      • #63
        Excellent, so everything is fine?

        PS I'm releasing a new version of the software today.



        • #64
          Hey, whatever happened to the --paired option? Is it still possible to screen paired-end data?



          • #65
            Hi,
            Thanks for your message; I am part of the team responsible for developing FastQ Screen.
            We removed the --paired option from the script in a recent update, as we felt it was unnecessary and was causing confusion. Mapping forward or reverse reads independently should be perfectly adequate to ascertain whether there is contamination, and will also show the user whether the forward reads are generally more prone to contamination than the reverse reads (or vice versa). Also, some users reported that the script sometimes failed to detect contamination in --paired mode, for example when the read pair did not constitute a contiguous region of DNA, or when the paired reads were separated by a large distance (as in RNA-seq).
            So we now recommend that you screen both read files independently.
            Is there any particular reason you would have to use the --paired mode?
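Screening each read file on its own, as recommended above, can be sketched as a simple loop. The file names `sample_R1.fastq.gz` and `sample_R2.fastq.gz` are placeholders, and `screen_cmd` is a hypothetical helper used only for the dry run.

```shell
#!/bin/sh
# Sketch only: file names are placeholders for one paired-end sample.
# Each read file is screened independently instead of using --paired.
screen_cmd() {
    # $1 = FASTQ file; prints the command that would screen it.
    printf 'fastq_screen %s\n' "$1"
}

for reads in sample_R1.fastq.gz sample_R2.fastq.gz; do
    if [ "${RUN:-0}" = "1" ]; then
        # Real run: screen this file on its own.
        fastq_screen "$reads"
    else
        # Dry run (default): just show the command.
        screen_cmd "$reads"
    fi
done
```

Comparing the two resulting reports then shows whether one read direction is more affected by contamination than the other.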



            • #66
              FastQ Screen for bisulfite samples

              Hello,

              I'm trying to use fastq screen for bisulfite sequencing samples. I've run the test data and that works fine. However, I get a file handle error when running my bisulfite samples:
              My code:

              fastq_screen --bisulfite G3.S22.fastq.gz

              Output:

              Using fastq_screen v0.11.2
              Defaulting to Bowtie 2 for --bisulfite mode
              Reading configuration from '/data/Bismark/fastq_screen_v0.11.2/fastq_screen.conf'
              Using '/usr/lib/bowtie2/bin/bowtie2' as Bowtie 2 path
              Using '/data/Bismark/bismark' as Bismark path
              Adding database Daphnia
              Using 8 threads for searches
              Option --subset set to 100000 reads
              Processing G3.S22.fastq.gz
              Counting sequences in G3.S22.fastq.gz
              Making reduced sequence file with ratio 69:1
              Searching G3.S22.fastq.gz_temp_subset.fastq against Daphnia
              open: No such file or directory
              [main_samview] fail to open "/data/Bismark/fastq_screen_v0.11.2/Daphnia.G3.S22.fastq.gz_temp_subset_bismark_bt2.bam" for reading.
              Cannot close filehandle on '/data/Bismark/fastq_screen_v0.11.2/Daphnia.G3.S22.fastq.gz_temp_subset_bismark_bt2.bam' :  at fastq_screen line 1059.

              I do get the output file and mapping report of the subsample against the first database, so it seems that the mapping did work. The error occurs regardless of which databases I use. However, when I run my samples in non-bisulfite mode and map them against the regular genome indices, it does not happen, so I do not think my sample file is the issue. I also know my Bismark genome indices are fine, as I have used them with Bismark directly.

              Any ideas on what is wrong or why this is happening?

              Thanks!
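One way to narrow down an error like the one above is to check whether the BAM file named in the message exists and, if not, which Bismark BAM files were actually written next to it. This is only a diagnostic sketch: it does not identify the cause, `find_bismark_bams` is a hypothetical helper, and the path is copied from the log in the post.

```shell
#!/bin/sh
# Diagnostic sketch: compare the BAM name FastQ Screen tried to open with
# the Bismark BAM files actually present in the same directory.
find_bismark_bams() {
    # $1 = directory to inspect; lists any *_bismark_bt2.bam files directly in it.
    find "$1" -maxdepth 1 -name '*_bismark_bt2.bam' -print
}

# Path taken verbatim from the error message in the post.
expected="/data/Bismark/fastq_screen_v0.11.2/Daphnia.G3.S22.fastq.gz_temp_subset_bismark_bt2.bam"
dir=$(dirname "$expected")

if [ -e "$expected" ]; then
    echo "expected BAM exists: $expected"
elif [ -d "$dir" ]; then
    echo "expected BAM missing; Bismark BAMs present in $dir:"
    find_bismark_bams "$dir"
else
    echo "directory not found: $dir"
fi

# The '[main_samview]' line comes from samtools view, so samtools must be reachable.
command -v samtools >/dev/null 2>&1 || echo "samtools not found on PATH"
```

If a BAM with a slightly different name turns up in the listing, that mismatch is worth reporting to the developers along with the Bismark version in use.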



              • #67
                FastQ Screen Bisulfite Problem

                Hi jaas,

                I am one of the developers of FastQ Screen. Hopefully we can get this problem resolved quickly.

                Would you be able to send me the configuration file you used when running FastQ Screen? This will help me resolve the problem.

                Many thanks,

                Steven



                • #68
                  Here's the config file. I can't figure out how to send it to you privately, so I have changed the extension to .txt to be able to upload it here.

                  Thanks in advance for your help
                  Attached Files
