SEQanswers

05-09-2015, 06:15 PM   #1
yul
Junior Member
 
Location: Atlanta

Join Date: May 2015
Posts: 1
How do I know whether my fastq file is ready for alignment?

I received my fastq files from a ChIP-Seq experiment. The sequencing core told me that the raw data were converted from .bcl format to .fastq format using CASAVA v1.8.2. They ran a standard paired-end sequencing reaction to generate 50 bp of sequence in each direction on the Illumina HiSeq 2000 platform.
How do I know whether the fastq files are ready for alignment with Bowtie to generate the BAM files? Specifically, how can I tell from the fastq file whether the adaptors/barcodes have been removed from the reads? If the fastq files still contain these sequences, how do I detect that and how do I trim them?
Below are a couple of reads from the fastq files:
@D5VG2KN1:206:C3LG1ACXX:8:1101:1216:2098 1:N:0:GCCAAT
CTTGACAAGCGCTTTCTTCAGAGTGCCCTCGCTCGTCCTATCTACAAAGCT
+
CCCFFFFFHHHHHJJIJJJJJJJFHIJJJJJJIJDHIJJIJIJJJJJJJJJ
@D5VG2KN1:206:C3LG1ACXX:8:1101:1437:2084 1:N:0:GCCAAT
TTTACCTTGTGTTAATTTTATTCAAAGCCAGAAACAATATGCATCCGGTTG
+
CCCFFFFFHHHHHJIIJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJHJJ

This is kind of basic, but I am a newbie in NGS analysis.
Thanks for any clues.
05-09-2015, 06:53 PM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

The easiest way is to trim adapters and see if anything happens. The tool will report whether or not there were adapters.
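Before running a real trimmer, a quick scan can give a rough idea of whether adapter sequence is present at all. The sketch below is only an illustration, not what any particular trimming tool does internally. It assumes a TruSeq-style library, for which `AGATCGGAAGAGC` is the commonly used start of the adapter as it appears in a read; the function names and `SAMPLE_FASTQ` are made up for this example, and only exact matches are counted.

```python
import io

# Assumption: TruSeq-style library. This is the commonly used prefix of the
# Illumina adapter as it appears at the 3' end of a read.
ADAPTER_PREFIX = "AGATCGGAAGAGC"

# The two reads quoted in the first post.
SAMPLE_FASTQ = """\
@D5VG2KN1:206:C3LG1ACXX:8:1101:1216:2098 1:N:0:GCCAAT
CTTGACAAGCGCTTTCTTCAGAGTGCCCTCGCTCGTCCTATCTACAAAGCT
+
CCCFFFFFHHHHHJJIJJJJJJJFHIJJJJJJIJDHIJJIJIJJJJJJJJJ
@D5VG2KN1:206:C3LG1ACXX:8:1101:1437:2084 1:N:0:GCCAAT
TTTACCTTGTGTTAATTTTATTCAAAGCCAGAAACAATATGCATCCGGTTG
+
CCCFFFFFHHHHHJIIJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJHJJ
"""


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line) stops the iteration
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def count_adapter_reads(handle, adapter=ADAPTER_PREFIX):
    """Return (total_reads, reads_containing_an_exact_adapter_match)."""
    total = 0
    hits = 0
    for _header, seq, _qual in read_fastq(handle):
        total += 1
        if adapter in seq:
            hits += 1
    return total, hits


if __name__ == "__main__":
    total, hits = count_adapter_reads(io.StringIO(SAMPLE_FASTQ))
    print(f"{hits} of {total} reads contain {ADAPTER_PREFIX}")
```

Neither of the two example reads contains that prefix, but two reads say nothing about the whole file. For a real file, pass an open file handle instead of the `io.StringIO` wrapper (and `gzip.open(path, "rt")` for `.fastq.gz`).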
05-09-2015, 06:56 PM   #3
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

Your file is ready for alignment now, but it is always a good idea to pass it through a trimming program to remove any extraneous sequence (adapters, etc.) that may be present.
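On the barcode part of the question, the read headers themselves are informative. In the CASAVA 1.8 header layout, the part after the space is `read:is_filtered:control_number:index_sequence`, so `1:N:0:GCCAAT` means read 1, not flagged as filtered, with index `GCCAAT`. The index is recorded in the header rather than in the read itself, which suggests the file has already been demultiplexed. It says nothing about adapter read-through, which is what trimming addresses. A minimal parsing sketch (the function name is made up for this example):

```python
def parse_casava18_header(line):
    """Split a CASAVA 1.8-style FASTQ header line into named fields."""
    if not line.startswith("@"):
        raise ValueError("FASTQ header lines start with '@'")
    # Location fields and read description are separated by a single space.
    location, _, description = line[1:].strip().partition(" ")
    instrument, run, flowcell, lane, tile, x, y = location.split(":")
    read, is_filtered, control, index = description.split(":")
    return {
        "instrument": instrument,
        "run": int(run),
        "flowcell": flowcell,
        "lane": int(lane),
        "tile": int(tile),
        "x": int(x),
        "y": int(y),
        "read": int(read),                    # 1 or 2 for paired-end data
        "failed_filter": is_filtered == "Y",  # 'N' means the read was not filtered out
        "control_bits": int(control),
        "index": index,                       # barcode used for demultiplexing
    }


if __name__ == "__main__":
    header = "@D5VG2KN1:206:C3LG1ACXX:8:1101:1216:2098 1:N:0:GCCAAT"
    fields = parse_casava18_header(header)
    print(fields["read"], fields["failed_filter"], fields["index"])
```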
05-11-2015, 06:42 AM   #4
TonyBrooks
Senior Member
 
Location: London

Join Date: Jun 2009
Posts: 298

FastQC (http://www.bioinformatics.babraham.a...ojects/fastqc/) has a module that checks for adapter contamination. It looks for the universal adapter sequences (used in TruSeq RNA/DNA and NEBNext among others), the Nextera sequences and Illumina small RNA sequences.
It looks for both adapter read-through and adapter dimers (which show up as over-represented sequences).
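To make the read-through versus dimer distinction concrete: in a dimer there is no insert, so the read starts with adapter at or near position 0, while read-through shows adapter starting further into the read, after a short insert. The sketch below tallies where an adapter prefix starts in each read. It is an illustration only and not FastQC's implementation; the `AGATCGGAAGAGC` prefix assumes a TruSeq-style library, and the function name and toy reads are made up.

```python
from collections import Counter

# Assumption: TruSeq-style library; commonly used adapter prefix.
ADAPTER_PREFIX = "AGATCGGAAGAGC"


def adapter_start_positions(sequences, adapter=ADAPTER_PREFIX):
    """Count, per 0-based position, how many reads have the adapter starting there."""
    positions = Counter()
    for seq in sequences:
        start = seq.find(adapter)  # -1 when the adapter is absent
        if start != -1:
            positions[start] += 1
    return positions


if __name__ == "__main__":
    # Toy reads, not real data: a dimer-like read, a read-through read, a clean read.
    toy_reads = [
        "AGATCGGAAGAGCACACGT",
        "ACGTACGTAGATCGGAAGAGC",
        "ACGTACGT",
    ]
    for position, n in sorted(adapter_start_positions(toy_reads).items()):
        print(position, n)
```

A pile-up at position 0 points to dimers; hits spread over later positions point to read-through from short inserts.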

In general, it's always a good idea to run your fastq through FastQC to check that it looks as expected. HOWEVER, please be aware that the tests implemented assume you are sequencing highly diverse genomic material. You may get failed modules on perfectly good data because the assumptions behind the test do not hold for your library. Check the documentation and ask yourself, "Is this test applicable to my data set?" For example, you may see a high level of duplication when sequencing highly enriched samples (such as ChIP and/or RNA-Seq).
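For a feel of what a "high level of duplication" means, here is a rough sketch that counts exact duplicate sequences. It is a simplification, not FastQC's actual method, and the function name and toy input are made up for illustration.

```python
from collections import Counter


def duplication_summary(sequences):
    """Return (total, distinct, percent_duplicated) for an iterable of reads.

    percent_duplicated is the share of reads that are extra copies of a
    sequence already seen, i.e. 100 * (total - distinct) / total.
    """
    counts = Counter(sequences)
    total = sum(counts.values())
    distinct = len(counts)
    if total == 0:
        return 0, 0, 0.0
    percent_duplicated = 100.0 * (total - distinct) / total
    return total, distinct, percent_duplicated


if __name__ == "__main__":
    # Toy input: one sequence seen three times, one seen once.
    print(duplication_summary(["ACGT", "ACGT", "ACGT", "TTGA"]))
```

In an enriched library such as ChIP, many reads legitimately pile up on the same positions, so a high value here is not by itself a sign of a bad run.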