**Zapages** · 08-02-2014, 12:08 PM

Thank you GenoMax. That worked perfectly. So whatever I name the bowtie1/2 index database should be the name used for the database location at the end. I will definitely remember that.

I have tested two sets. In the first set, I know the sequences belong to the genome I am providing as the database, but everything comes back as Unmapped. This is strange; am I doing something wrong?

The bowtie1/2 indexes were made using bowtie-build on the reference genome found on NCBI.

In the other sample, there was Arabidopsis contamination (somewhere between 0.2% and 2%), and I am trying to keep only the reads that are not contaminated by using the --nohits option.

The same thing occurred: everything came back as Unmapped, which is strange.

The version that should be good:

Code:

fastq_screen --threads 8 --aligner bowtie2 --conf=/Users/ZainA/Downloads/Dmel_520/Dmel5_20.conf --paired /Users/ZainA/Downloads/Dmel_520/forward.fastq /Users/ZainA/Downloads/Dmel_520/reverse.fastq --outdir Output

Contamination version:

Code:

fastq_screen --threads 8 --aligner bowtie2 --conf=/Users/ZainA/Downloads/Dmel_520/Arabidopsis/Arabidopsis_gnomon_mRNA.conf --paired /Users/ZainA/Downloads/Dmel_520/Arabidopsis/forward_1p3.fastq /Users/ZainA/Downloads/Dmel_520/Arabidopsis/reverse_2p3.fastq --nohits --outdir output

Any ideas what is happening, and why everything is coming back unmapped?

EDIT: Is there a method to filter out the reads that actually match a certain genome (or genomes) into one or separate files (paired-end or single-end FASTQ)?

**simonandrews** · 08-10-2014, 11:55 PM

Sorry to get to this late - have been away from internet access for a week so am still catching up.

Without seeing your files it's difficult to know why they might not be mapping. The first suspicion with any paired end files is that there's a problem with the pairing in your data. Could you try running the screen with just one of your files and remove the --paired option. Depending on whether that gives any hits will determine where you next look for problems.
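
One way to test the pairing suspicion directly is to compare the read names in the two files. The following is a minimal sketch, not part of FastQ Screen: `read_names` and `pairs_in_sync` are hypothetical helpers, and it assumes 4-line FASTQ records whose names differ at most by a trailing /1 or /2 or by a comment after the first space.

```shell
# Print the read names of a FASTQ file, one per line, dropping the leading @,
# anything after the first whitespace, and a trailing /1 or /2.
read_names() {
    awk 'NR % 4 == 1 { name = $1; sub(/^@/, "", name); sub(/\/[12]$/, "", name); print name }' "$1"
}

# Return 0 if both files hold the same read names in the same order.
pairs_in_sync() {
    names_fwd=$(mktemp)
    names_rev=$(mktemp)
    read_names "$1" > "$names_fwd"
    read_names "$2" > "$names_rev"
    if cmp -s "$names_fwd" "$names_rev"; then
        echo "paired: read names match in order"
        status=0
    else
        echo "NOT paired: read names differ or are out of order"
        status=1
    fi
    rm -f "$names_fwd" "$names_rev"
    return "$status"
}

# Usage (file names taken from the posts above):
#   pairs_in_sync forward.fastq reverse.fastq
```

If the names match, pairing is unlikely to be the cause and the single-file run suggested above is the next thing to try.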

**Zapages** · 08-11-2014, 06:25 PM

Thank you for the advice on removing --paired. Unfortunately, I still get the same result: everything is unmapped, which is strange.

Code:

fastq_screen --threads 8 --aligner bowtie2 --conf=/Users/ZainA/Downloads/Dmel_520/Dmel5_20.conf forward.fastq reverse.fastq --outdir output_single

Results:

Code:

Using fastq_screen v0.4.4
Reading configuration from '/Users/ZainA/Downloads/Dmel_520/Dmel5_20.conf'
Using '/Users/ZainA/Downloads/bowtie2-2.2.3' as bowtie2 path
Adding database Dmel520_index_bowtie2
Processing forward.fastq
Searching forward.fastq against Dmel520_index_bowtie2
Bowtie/Bowtie2 warning: sh: /Users/ZainA/Downloads/bowtie2-2.2.3: is a directory
Perl module GD::Graph::bars not installed, skipping charts
Processing reverse.fastq
Searching reverse.fastq against Dmel520_index_bowtie2
Bowtie/Bowtie2 warning: sh: /Users/ZainA/Downloads/bowtie2-2.2.3: is a directory
Perl module GD::Graph::bars not installed, skipping charts
Processing complete

I then ran bowtie2 directly to check that it works correctly, and it does. I have used these control sequences before with the Tuxedo suite (TopHat2 > Cufflinks2 > Cuffdiff2 > cummeRbund) and everything worked fine.

Code:

bowtie2 -p 8 -x  /Users/ZainA/Downloads/Dmel_520/Genomes/Dmel520_Bowtie2/Dmel520_index_bowtie2 -1 forward.fastq -2 reverse.fastq -S Dmel_test.sam

Results as expected:

Code:

27704187 reads; of these:
  27704187 (100.00%) were paired; of these:
    4441889 (16.03%) aligned concordantly 0 times
    21592908 (77.94%) aligned concordantly exactly 1 time
    1669390 (6.03%) aligned concordantly >1 times
    ----
    4441889 pairs aligned concordantly 0 times; of these:
      330640 (7.44%) aligned discordantly 1 time
    ----
    4111249 pairs aligned 0 times concordantly or discordantly; of these:
      8222498 mates make up the pairs; of these:
        5063321 (61.58%) aligned 0 times
        2980532 (36.25%) aligned exactly 1 time
        178645 (2.17%) aligned >1 times
90.86% overall alignment rate
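
As a cross-check on the summary above, the overall alignment rate can be recomputed from two of its numbers: the 27704187 read pairs and the 5063321 mates that aligned 0 times. This is only a sketch of the arithmetic; `overall_rate` is a hypothetical helper name.

```shell
# Overall alignment rate = 100 * (1 - unaligned mates / (2 * read pairs)).
# Both inputs are copied from the bowtie2 summary above.
overall_rate() {
    awk -v pairs="$1" -v unaligned_mates="$2" \
        'BEGIN { printf "%.2f\n", 100 * (1 - unaligned_mates / (2 * pairs)) }'
}

overall_rate 27704187 5063321   # prints 90.86
```

Since the same reads and index align at about 91% when bowtie2 is called directly, the all-unmapped result is more likely to come from how fastq_screen invokes the aligner than from the data.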

Also, I used this guide to add bowtie2, bowtie, and other bioinformatics tools to the PATH in OS X:

Add to the PATH on Mac OS X 10.8 Mountain Lion and up | Architect Ryan

http://architectryan.com/2012/10/02/add-to-the-path-on-mac-os-x-mountain-lion/#.U-l6Zv2z65O

If you have any advice on what is happening here and how to make FastQ Screen work properly, I would really appreciate it.

Thank you in advance,

-Zapages

**StevenW** · 08-12-2014, 12:46 AM

No Hits Problem

Hi,

I am also responsible for developing FastQ Screen. I believe the problem is caused by the path to Bowtie2 in your configuration file being incorrect.

The Fastq Screen output states:

Bowtie/Bowtie2 warning: sh: /Users/ZainA/Downloads/bowtie2-2.2.3: is a directory

I believe /Users/ZainA/Downloads/bowtie2-2.2.3 is the path to the folder where Bowtie2 is kept. You need the path to the executable file, e.g.

/Users/ZainA/Downloads/bowtie2-2.2.3/bowtie2

(or something similar).

Regards,

Steven
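
A quick way to catch this kind of mistake before running a screen is to check what the BOWTIE2 line in the config file points at. The sketch below assumes the config has a line of the form `BOWTIE2 <path>` with no spaces in the path; `check_aligner_path` is a hypothetical helper, not something FastQ Screen provides.

```shell
# Report whether the BOWTIE2 entry in a fastq_screen config file points at an
# executable file rather than a directory.
check_aligner_path() {
    conf="$1"
    # Second field of the first line whose first field is BOWTIE2.
    aligner=$(awk '$1 == "BOWTIE2" { print $2; exit }' "$conf")
    if [ -z "$aligner" ]; then
        echo "no BOWTIE2 line in $conf"
        return 1
    elif [ -d "$aligner" ]; then
        echo "$aligner is a directory; point BOWTIE2 at the executable inside it"
        return 1
    elif [ -x "$aligner" ]; then
        echo "$aligner looks usable"
    else
        echo "$aligner is missing or not executable"
        return 1
    fi
}

# Usage (config path taken from the posts above):
#   check_aligner_path /Users/ZainA/Downloads/Dmel_520/Dmel5_20.conf
```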

**Zapages** · 08-12-2014, 05:39 PM

Thank you Steven, that worked perfectly.

I really appreciate the help.

Are the no-hit output reads (paired end) still in the proper order, or do I have to re-match them into pairs? If so, what program do you recommend for this task? Thank you in advance.

I was wondering whether this could be included in a future release of FastQ Screen as a method to remove only contaminated reads, unless it is already possible with the current version.

For example:

You have single or paired-end reads and are heading towards a de novo assembly, for either a whole genome or a transcriptome.

If we use the --nohits option in FastQ Screen with the contaminating species as the database, the reads that match it will include both true and false positives.

Now suppose we create indexes (bowtie/bowtie2) for a bunch of species closely related to our de novo reads. I really wish there was an option to retain hits based on a specific database.

An example would be:

We could state which set of organisms to keep reads for while eliminating reads from the contaminating organism. But when a read matches both the contaminating organism and the other set of organisms, keep the read. It is sort of like a metagenomics approach to eliminating contamination.

This way only the contaminated reads will be removed and the good reads will be kept.

Is this possible with the current version of the application?

Thank you for creating this awesome program and being so helpful in the whole process.

**StevenW** · 08-12-2014, 11:46 PM

Fastq_screen

Hi,

Glad that worked.

In paired-end mode the program writes the forward and reverse reads to two separate 'nohits' output files. The reads will be in order with respect to one another in the input and output files.

There is no feature that does specifically what you requested, but perhaps you could create two configuration files? One setup would map against all genomes and the other against just the contaminants (with --nohits selected).

i.e.

A: all libraries in config file, --subset 100000 (only some of the reads analysed, which is quicker)

B: contaminant libraries only in config file, --nohits, and all reads analysed

Regards,

Steven
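
To make the A/B setup concrete, here is a rough sketch that writes the two config files. Every path and database name in it is a placeholder, `write_screen_configs` is a hypothetical helper, and the trailing BOWTIE2 column on the DATABASE lines is an assumption modelled on the BOWTIE column used in a config posted later in this thread.

```shell
# Write two fastq_screen config files into the directory given as $1.
# All paths below are placeholders; replace them with real locations.
write_screen_configs() {
    outdir="$1"
    # A: every library (target genome plus contaminants).
    cat > "$outdir/all_libraries.conf" <<'EOF'
BOWTIE2 /path/to/bowtie2-2.2.3/bowtie2
DATABASE Dmel /path/to/indexes/Dmel520_index_bowtie2 BOWTIE2
DATABASE Arabidopsis /path/to/indexes/Arabidopsis_index_bowtie2 BOWTIE2
EOF
    # B: contaminant libraries only.
    cat > "$outdir/contaminants.conf" <<'EOF'
BOWTIE2 /path/to/bowtie2-2.2.3/bowtie2
DATABASE Arabidopsis /path/to/indexes/Arabidopsis_index_bowtie2 BOWTIE2
EOF
}

# Run A on a subset for a quick overview of what the library contains:
#   fastq_screen --aligner bowtie2 --conf=all_libraries.conf --subset 100000 --paired forward.fastq reverse.fastq
# Run B on all reads and keep the reads with no contaminant hit:
#   fastq_screen --aligner bowtie2 --conf=contaminants.conf --nohits --paired forward.fastq reverse.fastq
```

Note that B alone cannot rescue reads that hit both the contaminant and the target; A only tells you how large that overlap is.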

**GenoMax** · 08-13-2014, 03:19 AM

Originally posted by Zapages

We could state which set of organisms to keep reads for while eliminating reads from the contaminating organism. But when a read matches both the contaminating organism and the other set of organisms, keep the read. It is sort of like a metagenomics approach to eliminating contamination.

This way only the contaminated reads will be removed and the good reads will be kept.

It is possible to do this now with BBMap. See this thread for an example: http://seqanswers.com/forums/showthread.php?t=45661

**Zapages** · 08-27-2014, 10:42 AM

Hey guys,

Thank you GenoMax and everyone. Please let me know whether this sounds reasonable for removing contaminant reads.

I think I have figured out a method for what I was discussing earlier with FastQ Screen. Maybe this will be helpful for everyone here.

1) Conduct a metagenomic-style analysis: search sequences from different mammals, fish species, species closely related to your experimental genome, or a list of known conserved genes (e.g. beta actin, cytochrome) against the contaminant genomes (Arabidopsis and maize are examples). This will be done with megaBLAST and nBLAST.

2) Wherever megaBLAST and nBLAST agree, remove those sequences from the contaminant genomes (FASTA files). This removes any conserved genes that are found across the different plants, mammals, and fish (false positives).

3) Run FastQ Screen and take the reads that do not map to the contaminants (Arabidopsis and maize are examples) as the contamination-free reads. The reads that map to Arabidopsis and/or maize are the true contaminated reads (true positives), which will not be output.

Then continue with the bioinformatic analysis, as your reads are no longer contaminated with Arabidopsis, maize, or any other possible contaminants.

Hopefully, this will get users as close as possible to contamination-free reads.

**Brian Bushnell** · 08-27-2014, 10:55 AM

Zapages,

Have you considered BBSplit? It is based on BBMap, but designed for a slightly different role; specifically, decontaminating or binning reads from multiple organisms. It maps simultaneously to all references and outputs reads to one file per reference. Each output file will only get reads that map best to that reference. Depending on your ambiguity settings, reads from conserved regions will either be written to the files of ALL references they map to equally well, or just one, or discarded. The output is fasta or fastq.

**abmmki** · 11-08-2014, 07:59 PM

configure fastq_screen.config

Hi,

I would like to use fastq_screen against the Drosophila, human, mouse, and E. coli genomes. I have downloaded the Bowtie pre-built index files and the corresponding genome sequences (single FASTA files).

I have prepared the config file below and ran the following command, but got an error:

#-------- Config file:

BOWTIE /data/users/bin/bowtie
BOWTIE2 /data/users/bin/bowtie2-2.2.4

THREADS 12
DATABASE Drosophila /data/users/Bowtie-Prebuilt-Index/dme_ucsc BOWTIE
DATABASE Human /data/users/Bowtie-Prebuilt-Index/hg19 BOWTIE
DATABASE Mouse /data/users/Bowtie-Prebuilt-Index/mm9 BOWTIE
DATABASE Ecoli /data/users/Bowtie-Prebuilt-Index/e_coli BOWTIE

#--------------- Command

fastq_screen --threads 12 --aligner bowtie --bowtie "-m 2 -g 1 --butterfly-search" $fq/MT1.fq $fq/MT2.fq $fq/MT3.fq $fq/MT4.fq $fq/MT5.fq $fq/MT6.fq $fq/MT7.fq $fq/MT8.fq

#-------------- Error

Using fastq_screen v0.4.4

Reading configuration from '/data/users/bin/fastq_screen_v0.4.4/fastq_screen.conf'

Using '/data/users/bin/bowtie/bowtie' as bowtie path

Using 12 threads for searches

Skipping DATABASE 'Drosophila' since no bowtie index was found at '/data/users/Bowtie-Prebuilt-Index/dme_ucsc'

Skipping DATABASE 'Human' since no bowtie index was found at '/data/users/Bowtie-Prebuilt-Index/hg19'

Skipping DATABASE 'Mouse' since no bowtie index was found at '/data/users/Bowtie-Prebuilt-Index/mm9'

Skipping DATABASE 'Ecoli' since no bowtie index was found at '/data/users/Bowtie-Prebuilt-Index/e_coli'

No search libraries were configured at /data/users/bin/fastq_screen_v0.4.4/fastq_screen line 124.

## But I can see that the Bowtie pre-built index files are present in the paths mentioned above, for example:

ls /data/users/Bowtie-Prebuilt-Index/hg19

hg19.1.ebwt
hg19.2.ebwt
hg19.3.ebwt
hg19.4.ebwt
hg19.fa
hg19.rev.1.ebwt
hg19.rev.2.ebwt

# The final directory names are the same as the prefixes of the pre-built index names, so this is not the issue already discussed.

# The Bowtie index and corresponding genome sequence files are present in the directory. I have also already used these index files for mapping without problems.

# I have GD::Graph installed properly.

thanks

**simonandrews** · 11-10-2014, 01:14 AM

I sent you a direct mail about this, but just so the information stays in the post, I think the problem here is that you are only specifying the path to the directory which contains your indices, and not the full path to the actual database. In this case it's a little confusing in that the name of the database and the name of the folder it's in are the same (which makes sense, but since it doesn't have to be like that you need to explicitly tell the program).

I think the fix is simply to append the database name to the end of the paths, so instead of:

/data/users/khademul/Bowtie-Prebuilt-Index/hg19

..you'd have

/data/users/khademul/Bowtie-Prebuilt-Index/hg19/hg19
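
To check a DATABASE path before running the screen, you can test whether it works as an index prefix, meaning that `<prefix>.1.ebwt` (bowtie) or `<prefix>.1.bt2` (bowtie2) exists. This is a sketch only: `index_prefix_ok` is a hypothetical helper, and it does not look for the large-index variants of those files.

```shell
# Test whether a path is a usable bowtie/bowtie2 index prefix.
index_prefix_ok() {
    prefix="$1"
    if [ -f "$prefix.1.ebwt" ] || [ -f "$prefix.1.bt2" ]; then
        echo "index found for prefix $prefix"
    elif [ -d "$prefix" ]; then
        # A directory was given; the prefix is usually directory/basename.
        echo "$prefix is a directory; append the index basename, e.g. $prefix/$(basename "$prefix")"
        return 1
    else
        echo "no index files found for prefix $prefix"
        return 1
    fi
}

# Usage (paths taken from the config posted above):
#   index_prefix_ok /data/users/Bowtie-Prebuilt-Index/hg19        # directory, fails
#   index_prefix_ok /data/users/Bowtie-Prebuilt-Index/hg19/hg19   # prefix, succeeds if hg19.1.ebwt is there
```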

**cjdoherty** · 06-27-2015, 02:05 PM

Citing FastQ Screen

Just want to make sure I'm not missing a publication, is there a preferred way to cite FastQ screen?
The program was so helpful we really appreciate it.
Thanks!

**simonandrews** · 06-29-2015, 12:49 AM

There isn't a publication for fastq_screen. We recommend just citing the project URL.

**cjdoherty** · 06-29-2015, 05:43 AM

Thank you. Will do!

**touchsk** · 08-14-2015, 11:55 AM

Remove only 'one-hit/one-library' hits

I am trying to use FASTQ Screen to remove contaminated sequences from my data and have a question. I was looking at the options provided with the tool and was wondering how I could set up something like this:
Screen my human data against potential contaminants (EColi, Yeast, Adapters,..) and only remove the hits that are classified as 'one-hit/one-library' AND 'multiple-hits/one-library'. I see that this feature is built-in as part of the plots, but was not clear if it could be (and how to) set up.

Thanks
SK
