Thanks for the reply Simon. Could you also advise on how to feed the fastqc "contaminants.txt" data to the program?
-
Originally posted by albireo:
Thanks for the reply Simon. Could you also advise on how to feed the fastqc "contaminants.txt" data to the program?

You'd need to convert it into a fasta file. The script below should do this:
Code:
#!/usr/bin/perl
use warnings;
use strict;

# Convert the tab-delimited contaminant list (name<TAB>sequence) to FASTA
open (IN,'contaminant_list.txt') or die $!;
open (OUT,'>','contaminant_list.fa') or die $!;

while (<IN>) {
    next if (/^\#/);                  # skip comment lines
    chomp;
    next unless ($_);                 # skip blank lines
    my ($name,$seq) = split(/\t+/);
    next unless ($seq);               # skip lines without a sequence column
    $name =~ s/\s+/_/g;               # replace whitespace so the whole name is kept as the ID
    print OUT ">$name\n$seq\n";
}

close IN;
close OUT or die $!;

Once you have that you can index it with bowtie-build using something like:
Code:
bowtie-build -f contaminant_list.fa contaminants
You can then put the contaminants database into fastq_screen.
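As a rough sketch of that last step, the helper below appends an entry for the new index to a config file. It assumes the tab-separated `DATABASE name index_basename` layout used in the example fastq_screen.conf, so check your own copy before relying on it. The function name `add_screen_database` and the paths in the usage note are made up for illustration.

```shell
#!/bin/sh
# Append a DATABASE entry to a fastq_screen config file.
# Assumption: entries look like "DATABASE<TAB>name<TAB>index_basename",
# as in the example fastq_screen.conf; verify against your own copy.
add_screen_database() {
    name=$1     # label shown in the report (avoid spaces)
    index=$2    # basename given to bowtie-build, e.g. /some/dir/contaminants
    conf=$3     # path to fastq_screen.conf
    printf 'DATABASE\t%s\t%s\n' "$name" "$index" >> "$conf"
}
```

For example, `add_screen_database Contaminants /some/dir/contaminants fastq_screen.conf` would add one line pointing at the index built above.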
Hope this helps
-
Hi,
I have a problem when running fastq_screen on mouse paired-end ChIP-seq data. For all four of my libraries, I'm getting more than 99% no hits in the final fastq_screen graph.
Code:
Mmus 99.96 0.02 0.02 0.00 0.00
The sequences I'm checking my libraries against are human, mouse, rat, fly, vectors and adapters. I downloaded the mouse mm9 fasta from UCSC and generated the bowtie index with bowtie 0.12.7. The same version of bowtie is used in the fastq_screen.conf file.
The reads are 51 bp paired-end and I call the program as follows:
Code:
fastq_screen --nohits --conf=fastq_screen.conf --paired <library>_2_sequence.fastq.gz <library>_1_sequence.fastq.gz
I also tried using the --bowtie="--trim5 10" option, as well as --trim3, but this didn't change the 99% to 100% no-hits results.
Separately, I had used bwa to align the reads against mm9, and the sequences did align. This is the output of samtools flagstat for one of the four bam files:
Code:
78666176 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
76266600 + 0 mapped (96.95%:-nan%)
78666176 + 0 paired in sequencing
39333088 + 0 read1
39333088 + 0 read2
74908040 + 0 properly paired (95.22%:-nan%)
75455201 + 0 with itself and mate mapped
811399 + 0 singletons (1.03%:-nan%)
346117 + 0 with mate mapped to a different chr
130284 + 0 with mate mapped to a different chr (mapQ>=5)
Any idea on what I might be doing wrong? Apologies if I'm missing something really obvious.
-
Originally posted by albireo:
Hi,
Any idea on what I might be doing wrong? Apologies if I'm missing something really obvious.

I can't immediately see why this would be going wrong from the data you've provided. If you run the screen against just the first of your paired reads, do you find any hits from that? If you don't, then there's probably something odd going on in the search. If you find hits from analysing each of the files as single end, but not when you pair them, then that suggests that either something is going wrong in the pairing of sequences or that you have oddly separated pairs.
If you can put a subset of your sequences up somewhere where we can see them (just 100k or so would be plenty) then we could take a look and see what's happening with your data.
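One way to cut such a subset is sketched below. It assumes the usual four-lines-per-record FASTQ layout, and `first_reads` is just an illustrative name. For paired data you would take the same number of reads from both files so the pairs stay in step, which only holds if the two files list the reads in the same order.

```shell
#!/bin/sh
# Write the first N reads of a FASTQ file (gzipped or plain) to stdout.
# Assumes 4 lines per record; N defaults to 100000.
first_reads() {
    file=$1
    n=${2:-100000}
    lines=$((n * 4))
    case $file in
        *.gz) gzip -dc "$file" | head -n "$lines" ;;
        *)    head -n "$lines" "$file" ;;
    esac
}
```

Something like `first_reads <library>_1_sequence.fastq.gz 100000 | gzip > subset_1.fastq.gz`, repeated for the second file, should give a pair small enough to share.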
-
Originally posted by simonandrews:
If you run the screen against just the first of your paired reads do you find any hits from that? If you don't then there's probably something odd going on in the search.

Hi Simon, no, I don't find any hits even using one of the paired reads.
Last edited by albireo; 12-06-2012, 08:19 AM.
-
Hello,
The problem had to do with the gzip compression of my fastq files. When I unpacked the gz files and used the .fastq files as input instead, the program ran correctly. Any idea why that should be the case?
By the way, the .fastq files are very large, ranging from 7 to 12 GB. I'm actually using the sampling function in fastq_screen to operate on 5,000,000 reads only, but I completed one successful run without subsampling as well.
-
Sorry to take a while to get back to you. I tried your sequences and they worked fine on my system. I also tried gzipping them and that worked OK too.
When fastq_screen runs it simply pipes the original file through zcat, so in terms of the searches there's nothing different between normal and gzipped files. Could it simply be that you don't have zcat installed on your system? It's a standard part of gzip, so it should be on most unix systems.
Can you try running 'which zcat' and see if that finds anything? If it doesn't then this is the problem, but I'd have thought that it would have returned a more sensible error message.
-
Should be. A simple test would be to run:
zcat [some file which failed] > /dev/null
..and see if that produces any errors. You might also want to check if the disk you're using was getting close to being full. If you analyse a large file the temp files it makes could be pretty big. You could try running the screen with --subset 100000 to see if that works (which is what we'd normally do anyway).
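The two zcat checks suggested in this thread can be rolled into one small helper, sketched below. `check_gz_input` is a made-up name, and the return codes are arbitrary choices for the example.

```shell
#!/bin/sh
# Check that zcat is on PATH and that it can read a gzipped file to the end.
# Prints "ok" on success; returns 1 if zcat is missing, 2 if the file fails.
check_gz_input() {
    file=$1
    if ! command -v zcat >/dev/null 2>&1; then
        echo "zcat not found on PATH"
        return 1
    fi
    if ! zcat "$file" > /dev/null 2>&1; then
        echo "zcat reported an error reading $file"
        return 2
    fi
    echo "ok"
}
```

Running `df -h` on the directory where the temp files are written would cover the disk space question separately.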
-
fastq screen search libraries
Hello all,
I am having a problem with fastq_screen version 0.4.1 while trying to execute this command:
Code:
fastq_screen --subset 1000000 --illumina1_3 --threads 22 --outdir /someOutDirPath --paired /pathTo/raw_data/SomeFastq_L008_R1_001.fastq /pathTo/raw_data/SomeRealtedFastq_R2_001.fastq
I have downloaded all the databases that I needed and configured them in the fastq_screen.conf file.
This is the output I get when I try to execute the command:
Code:
Reading configuration from '/fastq_screen_v0.4.1/fastq_screen.conf'
Using 8 threads for searches
No search libraries were configured at /fastq_screen_v0.4.1//fastq_screen line 119.
From a quick peek at the code, it seems that the "libraries" variable is never initialized. Maybe it needs to be hard-coded somehow, or set within the configuration file?
Any help would be appreciated.
Cheers, Chen
-
Originally posted by chenz123:
No search libraries were configured at /fastq_screen_v0.4.1//fastq_screen line 119.

It sounds like a problem with your configuration file. Could you message me the contents of your /fastq_screen_v0.4.1/fastq_screen.conf file so we can see what's going wrong?
-
If you're using the latest release (0.4.1) then you'll need to pass the option --aligner bowtie2, since all of the indices you have are bowtie2. This is probably the reason it's failing.
We should actually handle this better. I'll get it set up so that if your config file doesn't contain both bowtie1 and bowtie2 indices then it will automatically select the correct one for your run.
Let me know if this fixes things.
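If you're not sure which aligner an index was built for, the file extensions next to the index basename usually tell you: bowtie writes `.ebwt` files and bowtie2 writes `.bt2` files, with `.ebwtl` / `.bt2l` for large indices. The sketch below uses that to label an index. `index_type` is an illustrative helper, not part of fastq_screen.

```shell
#!/bin/sh
# Guess which aligner built an index from the files beside its basename.
# If both kinds are present, bowtie2 is reported first.
index_type() {
    base=$1
    if [ -e "$base.1.bt2" ] || [ -e "$base.1.bt2l" ]; then
        echo bowtie2
    elif [ -e "$base.1.ebwt" ] || [ -e "$base.1.ebwtl" ]; then
        echo bowtie
    else
        echo unknown
    fi
}
```

Calling it on each index basename listed in fastq_screen.conf would show whether every entry is bowtie2, which is the case where --aligner bowtie2 is needed.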