Thanks for the reply Simon. Could you also advise on how to feed the fastqc "contaminants.txt" data to the program?
-
Originally posted by albireo:
Thanks for the reply Simon. Could you also advise on how to feed the fastqc "contaminants.txt" data to the program?
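Code:
#!/usr/bin/perl
use warnings;
use strict;

# Convert the tab-delimited contaminant list into FASTA
open (IN,'contaminant_list.txt') or die $!;
open (OUT,'>','contaminant_list.fa') or die $!;

while (<IN>) {
    next if (/^\#/);                   # skip comment lines
    chomp;
    next unless ($_);                  # skip blank lines
    my ($name,$seq) = split(/\t+/);    # name and sequence are tab-separated
    next unless ($seq);                # skip lines without a sequence
    $name =~ s/\s+/_/g;                # replace whitespace in names with underscores
    print OUT ">$name\n$seq\n";
}

close IN;
close OUT or die $!;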
bowtie-build -f contaminant_list.fa contaminants
You can then put the contaminants database into fastq_screen.
Hope this helps
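As a rough sketch of that last step: fastq_screen reads its databases from DATABASE lines in fastq_screen.conf, each giving a label and the basename of the index. The helper name `add_screen_database` and the paths below are hypothetical, and the exact columns expected can differ between fastq_screen versions, so check the example config shipped with your release.

```shell
# Append a DATABASE entry to a fastq_screen config file.
# Assumption: entries are tab-separated as "DATABASE <label> <index basename>".
add_screen_database() {
    conf_file=$1   # path to fastq_screen.conf
    db_name=$2     # label shown in the report
    index_base=$3  # basename given to bowtie-build (no .ebwt suffix)
    printf 'DATABASE\t%s\t%s\n' "$db_name" "$index_base" >> "$conf_file"
}

# Example (hypothetical path to the index built above):
# add_screen_database fastq_screen.conf Contaminants /path/to/contaminants
```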
-
Hi,
I have a problem running fastq_screen on mouse paired-end ChIP-seq data. For all four of my libraries, more than 99% of the reads show up as no hits in the final fastq_screen graph.
Code:Mmus 99.96 0.02 0.02 0.00 0.00
The reads are 51 bp paired-end, and I call the program as follows:
Code:fastq_screen --nohits --conf=fastq_screen.conf --paired <library>_2_sequence.fastq.gz <library>_1_sequence.fastq.gz
Separately, I had used bwa to align the reads against mm9, and the sequences did align. This is the output of samtools flagstat for one of the four bam files:
Code:
78666176 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
76266600 + 0 mapped (96.95%:-nan%)
78666176 + 0 paired in sequencing
39333088 + 0 read1
39333088 + 0 read2
74908040 + 0 properly paired (95.22%:-nan%)
75455201 + 0 with itself and mate mapped
811399 + 0 singletons (1.03%:-nan%)
346117 + 0 with mate mapped to a different chr
130284 + 0 with mate mapped to a different chr (mapQ>=5)
-
Originally posted by albireo:
Hi,
Any idea on what I might be doing wrong? Apologies if I'm missing something really obvious.
If you can put a subset of your sequences up somewhere where we can see them (just 100k or so would be plenty) then we could take a look and see what's happening with your data.
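One way to pull out such a subset, as a minimal sketch: a FASTQ record is four lines, so the first 100k reads are the first 400,000 lines. The function name `fastq_head` and the file names are hypothetical, and taking reads from the top of the file is not a random sample.

```shell
# Print the first N records of a gzipped FASTQ file (4 lines per record).
fastq_head() {
    in_gz=$1      # gzipped FASTQ input
    n_reads=$2    # number of records to keep
    gzip -dc "$in_gz" | head -n "$((n_reads * 4))"
}

# Example (hypothetical file names); run once per mate so the pairs stay in sync:
# fastq_head library_1_sequence.fastq.gz 100000 > subset_1.fastq
# fastq_head library_2_sequence.fastq.gz 100000 > subset_2.fastq
```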
-
Originally posted by simonandrews:
If you run the screen against just the first of your paired reads do you find any hits from that? If you don't then there's probably something odd going on in the search.
Last edited by albireo; 12-06-2012, 08:19 AM.
-
Hello,
The problem had to do with the gzip compression of my fastq files. When I unpacked the gz files and used the .fastq files as input instead, the program ran correctly. Any idea why that should be the case?
By the way, the .fastq files are very large, ranging from 7 to 12 GB. I'm actually using the sampling function in fastq_screen to operate on 5000000 reads only, but I completed one successful run without subsampling as well.
-
Sorry to take a while to get back to you. I tried your sequences and they worked fine on my system. I also tried gzipping them and that worked OK too.
When fastq_screen runs it simply pipes the original file through zcat, so in terms of the searches there's nothing different between normal and gzipped files. Could it simply be that you don't have zcat installed on your system? (It's a standard part of gzip, so it should be on most Unix systems.)
Can you try running 'which zcat' and see if that finds anything? If it doesn't then this is the problem, but I'd have thought that that would have returned a more sensible error message.
-
Should be. A simple test would be to run:
zcat [some file which failed] > /dev/null
...and see if that produces any errors. You might also want to check whether the disk you're using was getting close to being full. If you analyse a large file, the temp files it makes could be pretty big. You could try running the screen with --subset 100000 to see if that works (which is what we'd normally do anyway).
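The checks suggested above could be wrapped up roughly like this. It is only a sketch: the function name `check_gzip_input` is made up, and the file is fed to zcat on standard input (equivalent to passing it as an argument on most Linux systems) so the same line also works where zcat expects a .Z suffix.

```shell
# Return 0 if zcat is on the PATH and can decompress the file cleanly.
check_gzip_input() {
    f=$1
    if ! command -v zcat >/dev/null 2>&1; then
        echo "zcat not found on PATH" >&2
        return 2
    fi
    if ! zcat < "$f" > /dev/null 2>&1; then
        echo "zcat failed on $f" >&2
        return 1
    fi
    return 0
}

# Example (hypothetical file name), followed by a look at free disk space
# in the directory where the screen is being run:
# check_gzip_input library_1_sequence.fastq.gz && df -h .
```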
-
fastq screen search libraries
Hello all,
I am having a problem with fastq screen version 0.4.1 while trying to execute this command:
Code:fastq_screen --subset 1000000 --illumina1_3 --threads 22 --outdir /someOutDirPath --paired /pathTo/raw_data/SomeFastq_L008_R1_001.fastq /pathTo/raw_data/SomeRealtedFastq_R2_001.fastq
This is the output I get when I try to execute the command:
Code:
Reading configuration from '/fastq_screen_v0.4.1/fastq_screen.conf'
Using 8 threads for searches
No search libraries were configured at /fastq_screen_v0.4.1//fastq_screen line 119.
Any help would be appreciated.
Cheers, Chen
-
If you're using the latest release (0.4.1) then you'll need to pass the option --aligner bowtie2, since all of the indices you have are bowtie2. This is probably the reason it's failing.
We should actually handle this better. I'll get it set up so that if your config file doesn't contain both bowtie1 and bowtie2 indices then it will automatically select the correct one for your run.
Let me know if this fixes things.
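When chasing a "No search libraries were configured" message, it can also help to see which database entries the config file actually declares. A minimal sketch, assuming active entries are lines starting with the keyword DATABASE and that commented-out ones start with #; the function name `list_screen_databases` is hypothetical.

```shell
# Print the active DATABASE lines from a fastq_screen config file.
# Returns non-zero (grep's exit status) if there are none.
list_screen_databases() {
    conf_file=$1
    grep -E '^DATABASE[[:space:]]' "$conf_file"
}

# Example, using the config path from the output above:
# list_screen_databases /fastq_screen_v0.4.1/fastq_screen.conf
```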