**Wallysb01** · 12-11-2011, 06:16 PM

What do your sequence headers look like and how are your files split up, if at all? I've run into this problem before, and it just required getting the formatting right.

**marb** · 12-12-2011, 02:32 AM

Do you mean the header of the BAM file?
I obtained 28 fastq files from Casava: 14 right-end (R1) and 14 left-end (R2).
I processed them with tophat.

**Wallysb01** · 12-12-2011, 03:23 PM

Are you sure tophat treated them as paired-end and not single-end? What do the ends of your sequence headers look like in the fastq files? If they come out as:

@XXXX 1:N:0 @XXXX 2:N:0
AGC.. GCT
+XXXX 1:N:0 +XXXX 2:N:0
.... .....

a lot of programs won't recognize those as paired-end reads. You need to convert the headers to:

@XXXX/1 @XXXX/2
AGC.. GCT
+XXXX/1 +XXXX/2
.... .....

I came on here with the same kinds of issues and a friendly commenter made this post to help people like me out:

Newbler input III: a quick fix for the new Illumina fastq header

http://contig.wordpress.com/2011/09/01/newbler-input-iii-a-quick-fix-for-the-new-illumina-fastq-header/

One unfortunate drawback of working with Illumina sequences is the many changes to the format of their fastq readfiles. The quality scoring has been changed several times since the first Solexa rea…
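For readers who want to try the header conversion described above, here is a minimal sketch in shell with awk. It is not the script from the linked post. The function name `fix_fastq_headers` is made up for illustration, and it assumes plain 4-line fastq records whose read names end in a space followed by `1:N:0` or `2:N:0` (optionally followed by `:` and an index sequence).

```shell
#!/bin/sh
# fix_fastq_headers (hypothetical helper): read fastq on stdin, write fastq on stdout.
# Assumes strict 4-line records and headers ending in " <read>:<Y|N>:<number>[:<index>]".
fix_fastq_headers() {
    awk '
        # Lines 1 and 3 of each 4-line record can carry the read name.
        NR % 4 == 1 || NR % 4 == 3 {
            # Turn "<name> 1:N:0..." into "<name>/1" (same for read 2).
            if (match($0, / [12]:[YN]:[0-9]+/)) {
                read = substr($0, RSTART + 1, 1)
                $0 = substr($0, 1, RSTART - 1) "/" read
            }
        }
        # Sequence and quality lines pass through untouched.
        { print }
    '
}
```

It could be used along the lines of `gzip -dc sample_R1.fastq.gz | fix_fastq_headers | gzip > sample_R1.fixed.fastq.gz`, where the file names are placeholders. A bare `+` on the third line of a record is left as it is.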

**marb** · 12-13-2011, 04:56 AM

I tested cufflinks on other data, and there it recognised the reads correctly as 57bp x 57bp.
So I know that the mistake happens at the stage where tophat processes the fastq files.

I know that the R1 and R2 (paired-end) files all have to be listed in the same order, so I used the following command:

Code:

tophat /path/to/genome $(printf "%s," ./*.gz | sed 's/,$/\n/')

Do you think this way of passing the arguments (fastq files) is incorrect?
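One thing worth checking here: tophat's documented usage takes the read-1 files as one comma-separated list and the read-2 files as a second comma-separated list, with a space between the two. The command above joins every .gz file into a single list, so the reads would most likely be treated as unpaired. A minimal sketch of building the two lists separately follows. The helper names are made up, and the `_R1_` / `_R2_` file name pattern is an assumption based on typical Casava output, so adjust it to the real names.

```shell
#!/bin/sh
# Join all arguments into one comma-separated string.
join_by_comma() {
    (IFS=,; printf '%s\n' "$*")
}

# paired_tophat_cmd (hypothetical helper): print, but do not run, a paired-end
# tophat command for the fastq files found in a directory.
# Assumes file names contain _R1_ or _R2_ and otherwise match, so that the
# shell's sorted glob expansion lists both sets in the same order.
paired_tophat_cmd() {
    index=$1
    dir=$2
    r1=$(join_by_comma "$dir"/*_R1_*.fastq.gz)
    r2=$(join_by_comma "$dir"/*_R2_*.fastq.gz)
    printf 'tophat %s %s %s\n' "$index" "$r1" "$r2"
}
```

Calling `paired_tophat_cmd /path/to/genome .` prints the command so the two lists can be checked by eye before it is run for real.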


Cufflinks doesn't recognize read type
