SEQanswers

Old 12-06-2012, 06:17 AM   #21
albireo
Member
 
Location: Europe

Join Date: Sep 2012
Posts: 39

Quote:
Originally Posted by simonandrews
If you run the screen against just the first of your paired reads do you find any hits from that? If you don't then there's probably something odd going on in the search.
Hi Simon, no, I don't find any hits even when using just one of the paired reads.

Last edited by albireo; 12-06-2012 at 07:19 AM.
Old 12-07-2012, 01:41 AM   #22
albireo

Hello,

The problem had to do with the gzip compression of my fastq files. When I unpacked the gz files and used the .fastq files as input instead, the program ran correctly. Any idea why that should be the case?

By the way, the .fastq files are very large, ranging from 7 to 12 GB. I'm actually using the sampling function in fastq_screen to operate on 5000000 reads only, but I completed one successful run without subsampling as well.
Old 12-10-2012, 02:08 AM   #23
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

Sorry to take a while to get back to you. I tried your sequences and they worked fine on my system. I also tried gzipping them and that worked OK too.

When fastq_screen runs it simply pipes the original file through zcat, so in terms of the searches there's nothing different between normal and gzipped files. Could it simply be that you don't have zcat installed on your system? (It's a standard part of gzip, so it should be on most Unix systems.)

Can you try running 'which zcat' and see if that finds anything? If it doesn't then this is the problem, but I'd have thought that would have returned a more sensible error message.
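
For anyone wanting to run that check in one go, here is a minimal sketch. It is not part of fastq_screen: the helper names have_tool and peek_fastq_gz are made up for illustration, and gzip -dc is used as the equivalent of zcat for .gz input.

```shell
#!/bin/sh
# have_tool NAME: succeed if NAME is found on the PATH (hypothetical helper).
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# peek_fastq_gz FILE: print the first FASTQ record (4 lines) of a gzipped file.
# gzip -dc does the same job as zcat for .gz input.
peek_fastq_gz() {
    gzip -dc "$1" 2>/dev/null | head -n 4
}

# Report whether the decompression tools are visible from this shell.
for tool in zcat gzip; do
    if have_tool "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: NOT found"
    fi
done
```

Calling peek_fastq_gz on one of the failing files should print a single read (header, sequence, "+", qualities); empty output would suggest the file cannot be decompressed from this shell.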
Old 12-10-2012, 02:16 AM   #24
albireo

Hi Simon, zcat is there. I'm not an expert on gzip, but I wonder whether there are alternative algorithms/encodings around?
Old 12-10-2012, 02:27 AM   #25
simonandrews

Should be. A simple test would be to run:

zcat [some file which failed] > /dev/null

..and see if that produces any errors. You might also want to check if the disk you're using was getting close to being full. If you analyse a large file the temp files it makes could be pretty big. You could try running the screen with --subset 100000 to see if that works (which is what we'd normally do anyway).
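
The two checks suggested here (does the file decompress cleanly, and is there room on the disk) could be scripted roughly as follows. This is only a sketch: gz_ok and free_kb are hypothetical helper names, and gzip -t is used because it tests integrity without writing output, much like zcat to /dev/null.

```shell
#!/bin/sh
# gz_ok FILE: succeed if FILE decompresses cleanly.
# gzip -t tests the archive without producing any output.
gz_ok() {
    gzip -t "$1" 2>/dev/null
}

# free_kb DIR: print the free space, in 1K blocks, on the filesystem
# that holds DIR. -P forces the one-line-per-filesystem POSIX layout.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}
```

For example, gz_ok reads.fastq.gz || echo "corrupt or truncated", and free_kb on whichever directory holds the temporary files (the thread does not say which one that is).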
Old 12-10-2012, 02:33 AM   #26
albireo

Ok thanks a lot, will try this and report back.
Old 07-10-2013, 06:19 AM   #27
chenz123
Junior Member
 
Location: Beer Sheva, Israel

Join Date: Nov 2012
Posts: 7
fastq screen search libraries

Hello all,
I am having a problem with fastq screen version 0.4.1 while trying to execute this command:

Code:
fastq_screen --subset 1000000  --illumina1_3 --threads 22 --outdir /someOutDirPath  --paired  /pathTo/raw_data/SomeFastq_L008_R1_001.fastq  /pathTo/raw_data/SomeRealtedFastq_R2_001.fastq
I have downloaded all the databases that I needed and configured them in the fastq_screen.conf file.
This is the output I get when I try to execute the command:
Code:
Reading configuration from '/fastq_screen_v0.4.1/fastq_screen.conf'
Using 8 threads for searches
No search libraries were configured at /fastq_screen_v0.4.1//fastq_screen line 119.
From a quick peek at the code, it seems that the "libraries" variable is never initialized. Maybe it needs to be hard-coded somehow, or set within the configuration file?

any help would be appreciated.
Cheers, Chen
Old 07-10-2013, 06:51 AM   #28
simonandrews

It sounds like a problem with your configuration file. Could you message me the contents of your /fastq_screen_v0.4.1/fastq_screen.conf file so we can see what's going wrong?
Old 07-10-2013, 08:28 AM   #29
chenz123

I've sent you a message containing the content of the configuration file.

Thanks for the help.
Cheers.
Old 07-10-2013, 01:31 PM   #30
simonandrews

If you're using the latest release (0.4.1) then you'll need to pass the option --aligner bowtie2, since all of the indices you have are bowtie2. This is probably the reason it's failing.

We should actually handle this better. I'll get it set up so that if your config file doesn't contain both bowtie1 and bowtie2 indices then it will automatically select the correct one for your run.

Let me know if this fixes things.
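
Until that is handled automatically, one way to tell which aligner an index was built for is to look at the file extensions: bowtie 1 indices end in .ebwt (.ebwtl for large genomes) and bowtie2 indices end in .bt2 (.bt2l). The sketch below is not how fastq_screen itself does it; index_type is a hypothetical helper that takes the index path without any extension.

```shell
#!/bin/sh
# index_type BASE: print which aligner the index at BASE was built for.
# BASE is the index path without extension, e.g. /some/dir/genome.
# If both kinds exist, bowtie2 is reported first.
index_type() {
    base=$1
    if [ -e "$base.1.bt2" ] || [ -e "$base.1.bt2l" ]; then
        echo bowtie2
    elif [ -e "$base.1.ebwt" ] || [ -e "$base.1.ebwtl" ]; then
        echo bowtie
    else
        echo none
    fi
}
```

A result of "bowtie2" would mean the run needs --aligner bowtie2; "none" would mean no index files sit at that path at all.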
Old 07-10-2013, 01:43 PM   #31
chenz123

Thanks for the quick reply.
When I ran the command as you suggested, fastq_screen still seemed to look for bowtie index files rather than bowtie2 ones, despite the --aligner flag.

This is the command I executed:
Code:
fastq_screen --subset 1000000  --illumina1_3 --threads 22 --outdir /someOutDirPath  --aligner bowtie2 --paired  /pathTo/raw_data/SomeFastq_L008_R1_001.fastq  /pathTo/raw_data/SomeRealtedFastq_R2_001.fastq
And this is the error message I get:
Code:
Reading configuration from '/PathTo/fastq_screen_v0.4.1/fastq_screen.conf'
Using '/PathTo/bowtie2' as bowtie2 path
Using 8 threads for searches
Skipping DATABASE 'Human' since no bowtie index was found at '/PathTo/GRCh37/Sequence/Bowtie2Index'
Skipping DATABASE 'Mouse' since no bowtie index was found at '/PathTo/GRCm38/Sequence/Bowtie2Index'
Skipping DATABASE 'Rat' since no bowtie index was found at '/PathTo/RGSC3.4/Sequence/Bowtie2Index'
Skipping DATABASE 'Ecoli' since no bowtie index was found at '/PathTo/EB1/Sequence/Bowtie2Index'
Skipping DATABASE 'Yeast' since no bowtie index was found at '/PathTo/Ensembl/EF4/Sequence/Bowtie2Index'
Skipping DATABASE 'PhiX' since no bowtie index was found at '/PathTo/1993-04-28/Sequence/Bowtie2Index'
Skipping DATABASE 'Adapters' since no bowtie index was found at '/PathTo/Contaminants'
No search libraries were configured at /PathTo/fastq_screen_v0.4.1//fastq_screen line 119.
Any other suggestions?
It doesn't really matter to me which aligner I use, whether it is bowtie or bowtie2.

Cheers.
Chen
Old 07-10-2013, 02:02 PM   #32
simonandrews

I'm assuming that the /PathTo bits are replaced by a real path in your actual run?

Can you please list the contents of /PathTo/GRCh37/Sequence/Bowtie2Index? That should tell us why it's not finding the indices.
Old 07-10-2013, 11:19 PM   #33
chenz123

Yes, I replaced the real path with /PathTo.
Please note that before I added the flag, the program seemed to recognize the Bowtie index files.
I did notice the difference between the index files of the two Bowtie versions.
Here's the content of the folder:

Code:
/GRCh37/Sequence/Bowtie2Index->ls -l
total 3963084

-rwxrwxr-x 1 chenzr biogroup 957090229 Apr 11  2012 genome.1.bt2
-rwxrwxr-x 1 chenzr biogroup 714668672 Apr 11  2012 genome.2.bt2
-rwxrwxr-x 1 chenzr biogroup      3239 Apr 11  2012 genome.3.bt2
-rwxrwxr-x 1 chenzr biogroup 714668666 Apr 11  2012 genome.4.bt2
lrwxrwxrwx 1 chenzr biogroup        29 Jul 10 16:20 genome.fa -> ../WholeGenomeFasta/genome.fa
-rwxrwxr-x 1 chenzr bioinfo 957090229 Apr 11  2012 genome.rev.1.bt2
-rwxrwxr-x 1 chenzr bioinfo 714668672 Apr 11  2012 genome.rev.2.bt2
Thanks.
Chen.
Old 07-11-2013, 02:40 AM   #34
StevenW
Member
 
Location: UK

Join Date: May 2011
Posts: 11

Hi,

I’m responsible for developing fastq_screen. The path to the index should include the genome base name. Adjust the configuration file so the index paths are:
/GRCh37/Sequence/Bowtie2Index/genome

And so on. Also include the "--aligner bowtie2" option.

I hope that helps.
Steven
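
To check a whole configuration file for this mistake, something like the following could work. It is a rough sketch, assuming DATABASE lines have the form "DATABASE name path" and that the indices are bowtie2; check_conf is a hypothetical helper, not a fastq_screen feature.

```shell
#!/bin/sh
# check_conf CONF: for every DATABASE line in CONF, report whether a
# bowtie2 index exists at the given path. The path must be the index
# base name (e.g. .../Bowtie2Index/genome), not the directory.
check_conf() {
    while read -r key name base rest || [ -n "$key" ]; do
        [ "$key" = "DATABASE" ] || continue
        if [ -e "$base.1.bt2" ] || [ -e "$base.1.bt2l" ]; then
            echo "OK      $name $base"
        else
            echo "MISSING $name $base"
        fi
    done < "$1"
}
```

Run as check_conf fastq_screen.conf. A DATABASE entry that points at the Bowtie2Index directory itself would show up as MISSING, because there is no file named Bowtie2Index.1.bt2.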
Old 03-07-2014, 09:26 AM   #35
fireant
Junior Member
 
Location: Austin, Texas

Join Date: Sep 2013
Posts: 9

Hi Steven,
I am trying to use fastq_screen to remove Wolbachia contamination from my Illumina paired-end reads. The program appears to run fine, but my --nohits output fastq files are the same as the input, even though the log file reports 0.02% of reads hitting the Wolbachia db I have.

#Fastq_screen version: 0.4.2
Library #Reads_processed #Unmapped %Unmapped #One_hit_one_library %One_hit_one_library #Multiple_hits_one_library %Multiple_hits_one_library #One_hit_multiple_libraries %One_hit_multiple_libraries Multiple_hits_multiple_libraries %Multiple_hits_multiple_libraries
Wolbachia 33070644 33063580 99.98 3201 0.01 3863 0.01 0 0.00 0 0.00

%Hit_no_libraries: 99.98
Is there another command that will enable the removal of the reads that hit the db from the --nohits output?

Thanks, Nathan
Old 03-07-2014, 03:45 PM   #36
collacor
Junior Member
 
Location: joyoland

Join Date: Feb 2014
Posts: 2

thanks,
it will be useful for me
__________________
Want to Learn more and more

Last edited by collacor; 03-07-2014 at 03:46 PM. Reason: not complete
Old 03-10-2014, 03:21 AM   #37
StevenW
Checking the number of reads is the same

Hi Nathan,

Thank you for your email. I would expect the “nohits” file to be very similar to the input file, since 99.98% of the reads do not map to any genome. Just to check this is not a bug, is the number of lines the same in both files? You can check with the Unix command wc -l [filename].

Please let me know if the results are the same or different and I shall investigate further if need be.

Regards,
Steven
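
The comparison can be wrapped up as below. This is just a sketch: line_count and same_line_count are hypothetical helper names, and files ending in .gz are read through gzip -dc so that compressed and plain files can be compared directly.

```shell
#!/bin/sh
# line_count FILE: print the number of lines in FILE,
# decompressing on the fly when the name ends in .gz.
line_count() {
    case "$1" in
        *.gz) gzip -dc "$1" | wc -l | tr -d ' ' ;;
        *)    wc -l < "$1" | tr -d ' ' ;;
    esac
}

# same_line_count A B: succeed if A and B have the same number of lines.
same_line_count() {
    [ "$(line_count "$1")" = "$(line_count "$2")" ]
}
```

For example, same_line_count input.fastq output_no_hits.txt && echo "no reads were removed" (both file names are placeholders). For standard 4-line FASTQ records, dividing the line count by 4 gives the number of reads.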
Old 03-10-2014, 09:34 AM   #38
fireant

Hi Steven, I ran grep -c '@' and the sequence counts are the same for the input and output files.

Nathan
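
A note on counting with grep: grep -c '@' counts every line that contains an '@', and '@' is also a valid quality character, so it can report more lines than there are reads. A positional count avoids that. The sketch below assumes standard 4-line FASTQ records (no wrapped sequence lines); count_records is a hypothetical helper name.

```shell
#!/bin/sh
# count_records FILE: count FASTQ records by position. In 4-line FASTQ
# the header is line 1, 5, 9, ... so count lines where NR % 4 == 1.
count_records() {
    awk 'NR % 4 == 1 { n++ } END { print n + 0 }' "$1"
}
```

As an illustration, a two-read file whose first quality string happens to contain an '@' makes grep -c '@' report 3, while count_records reports 2.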
Old 03-11-2014, 07:05 AM   #39
fireant

I also ran the wc -l command and the file sets are identical in line number:
1_CATTTT_L003_R1_001.fastq.cor.pair_1.fq = 132282576
1_CATTTT_L003_R1_001.fastq.cor.pair_1.fq_screen.txt_no_hits.txt = 132282576
The other odd thing is that I processed 8 sets of paired-end reads and the last R2 of the files processed actually tripled in size, from 11382611519 to 34147834557.

Any advice?

Nathan

Last edited by fireant; 03-11-2014 at 07:22 AM.
Old 08-01-2014, 09:16 AM   #40
Zapages
Member
 
Location: NJ

Join Date: Oct 2012
Posts: 94

Hi Simon, I am testing this application and I get the following error: No search libraries were configured at /Users/ZainA/Downloads/fastq_screen_v0.4.4/fastq_screen line 124

Code:
--threads 8 --aligner bowtie2 --conf=/Users/ZainA/Downloads/Dmel_520/Dmel5_20.conf --outdir /Output --paired /Users/ZainA/Downloads/Dmel_520/forward_1p.fastq /Users/ZainA/Downloads/Dmel_520/reverse_2p.fastq
My configuration file looks like this:

Code:
BOWTIE /Users/ZainA/Downloads/bowtie-1.1.0
BOWTIE2 /Users/ZainA/Downloads/bowtie2-2.2.3

##Drosophila Genome 5.20
/Users/ZainA/Downloads/Dmel_520/dmel-all-chromosome-r5.20.fasta

DATABASE Dmel_5_20 /Users/ZainA/Downloads/Dmel_520/genome
If you could kindly help me, I would greatly appreciate it.

Also, what is the easiest way to install GD::Graph?

Thank you in advance.

Tags
contamination, quality, screening, search
