SEQanswers

05-05-2015, 02:38 AM   #1
adrian
Member
 
Location: baltimore

Join Date: Oct 2009
Posts: 89
Converting paired-end (PE) BAM file to single-end (SE) fastq

Hi:
While working with COAD TCGA BAM files, I find it very annoying to find PE reads. These files are mashed up and inconsistent.
For example:
1. Read lengths are not consistent: some reads are 34, some are 76.
2. Many reads are missing their mate.
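A minimal sketch of how the two issues above could be quantified, assuming the BAM has first been dumped to SAM text (tab-separated, standard column order). The function name `summarize_sam` and the toy records are made up for illustration; only the SAM column layout and FLAG bits come from the SAM specification.

```python
from collections import Counter

def summarize_sam(lines):
    """Tally read lengths and count read names that appear twice vs. once."""
    lengths = Counter()
    names = Counter()
    for line in lines:
        if not line.strip() or line.startswith("@"):  # skip blanks and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x900:  # skip secondary (0x100) and supplementary (0x800) records
            continue
        names[fields[0]] += 1
        if fields[9] != "*":  # "*" means the sequence is not stored
            lengths[len(fields[9])] += 1
    with_mate = sum(1 for n in names.values() if n == 2)
    orphans = sum(1 for n in names.values() if n == 1)
    return lengths, with_mate, orphans

# Toy records, invented for illustration only
demo = [
    "@HD\tVN:1.4",
    "r1\t99\tchr1\t100\t60\t6M\t=\t200\t106\tACGTAC\tIIIIII",
    "r1\t147\tchr1\t200\t60\t6M\t=\t100\t-106\tTTGACA\tIIIIII",
    "r2\t0\tchr1\t300\t60\t4M\t*\t0\t0\tGGCA\tIIII",
]
lengths, with_mate, orphans = summarize_sam(demo)
print(dict(lengths), "names with mate:", with_mate, "orphans:", orphans)
```

On a real file the lines could come from piping `samtools view` output into the script's standard input instead of the `demo` list.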

I want to identify novel splicing differences; however, the TCGA BAM files are mapped to known transcripts (known exon pairings from a GTF of known isoforms), which limits the discovery of novel isoforms.

I decided to convert the BAM files to fastq and realign to the full genome.

While doing this, because so many mates are missing from the BAM files, I converted them to single-end fastq.
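The conversion described here can be sketched roughly as follows. This is not the tool adrian used, just an illustration under the assumption that the BAM has been dumped to SAM text. The name `sam_to_single_end_fastq` is hypothetical. Reads aligned to the reverse strand are stored reverse-complemented in SAM, so they are flipped back before writing.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def sam_to_single_end_fastq(lines):
    """Yield one FASTQ record per primary SAM record, ignoring pairing."""
    for line in lines:
        if not line.strip() or line.startswith("@"):  # skip blanks and header lines
            continue
        f = line.rstrip("\n").split("\t")
        name, flag, seq, qual = f[0], int(f[1]), f[9], f[10]
        # Skip secondary/supplementary records and records without stored bases/qualities
        if flag & 0x900 or seq == "*" or qual == "*":
            continue
        if flag & 0x10:  # reverse strand: restore the original read orientation
            seq = seq.translate(COMPLEMENT)[::-1]
            qual = qual[::-1]
        # Keep mate identity in the name so the two mates do not collide
        if flag & 0x40:
            name += "/1"
        elif flag & 0x80:
            name += "/2"
        yield f"@{name}\n{seq}\n+\n{qual}\n"

# Toy records, invented for illustration only
demo = [
    "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tABCD",
    "r1\t147\tchr1\t200\t60\t4M\t=\t100\t-104\tAACG\tEFGH",
]
fastq_records = list(sam_to_single_end_fastq(demo))
print("".join(fastq_records), end="")
```

Appending `/1` and `/2` is a design choice rather than a requirement: it keeps the option of re-pairing the reads later.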

Any ideas on whether converting a paired-end BAM to single-end fastq poses any problems in principle?

thanks
05-05-2015, 03:18 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479

Yes, you can expect your mapping efficiency to decrease a bit, since one mate can act as an anchor to rescue the other. Further, it's much easier to find isoforms with paired-end reads, since you're then not relying solely on alignments over a splice junction.
05-05-2015, 03:37 AM   #3
adrian
Member
 
Location: baltimore

Join Date: Oct 2009
Posts: 89

Yes, that's a disadvantage, I agree.

Unfortunately, the BAM file does not have enough PE reads.

When I used bamtofastq to produce PE fastq files, interestingly, I obtained 0 fastq reads.
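The thread does not explain why the paired-end conversion produced 0 reads. One way to investigate is to compare what the FLAG field claims about pairing with which mates are actually present. The following is a rough sketch on SAM text; `pairing_report` is a hypothetical helper, not part of any tool named in this thread.

```python
from collections import defaultdict

def pairing_report(lines):
    """Count records not flagged as paired, complete pairs, and incomplete pairs."""
    slots_by_name = defaultdict(set)  # read name -> mate slots seen (1 and/or 2)
    not_flagged_paired = 0
    for line in lines:
        if not line.strip() or line.startswith("@"):  # skip blanks and header lines
            continue
        f = line.split("\t")
        name, flag = f[0], int(f[1])
        if flag & 0x900:  # skip secondary and supplementary records
            continue
        if not flag & 0x1:  # 0x1 unset: the record is not marked as paired at all
            not_flagged_paired += 1
            continue
        slots = slots_by_name[name]
        if flag & 0x40:  # first mate
            slots.add(1)
        if flag & 0x80:  # second mate
            slots.add(2)
    complete = sum(1 for s in slots_by_name.values() if s == {1, 2})
    return {
        "not_flagged_paired": not_flagged_paired,
        "complete_pairs": complete,
        "incomplete_pairs": len(slots_by_name) - complete,
    }

# Toy records, invented for illustration only
demo = [
    "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
    "r1\t147\tchr1\t200\t60\t4M\t=\t100\t-104\tAACG\tIIII",
    "r2\t73\tchr1\t300\t60\t4M\t=\t300\t0\tGGCA\tIIII",
    "r3\t0\tchr1\t400\t60\t4M\t*\t0\t0\tCCAT\tIIII",
]
report = pairing_report(demo)
print(report)
```

A file where nearly everything lands in `not_flagged_paired` or `incomplete_pairs` would be consistent with a paired-end extraction finding little or nothing to write, though other causes are possible.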
05-05-2015, 10:00 AM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Adrian,

You can try running repair.sh to split the file into paired and unpaired reads, and then map twice, once for the paired and once for the unpaired, and then merge the bam files. That will allow maximal use of the available information.
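To illustrate the idea of splitting reads into paired and unpaired sets, here is a toy sketch. It is not how repair.sh is implemented. It assumes mates are named with `/1` and `/2` suffixes and that everything fits in memory. The function name `split_paired_unpaired` is made up for this example.

```python
def split_paired_unpaired(records):
    """Split (name, seq, qual) records into mate pairs and leftover single reads."""
    by_base = {}
    for name, seq, qual in records:
        base, _, slot = name.rpartition("/")
        if slot not in ("1", "2") or not base:
            base, slot = name, "0"  # no mate suffix: treat as a single read
        by_base.setdefault(base, {})[slot] = (name, seq, qual)

    paired, unpaired = [], []
    for slots in by_base.values():
        if "1" in slots and "2" in slots:
            paired.append((slots["1"], slots["2"]))
            # A suffix-less read sharing the same name still counts as unpaired
            unpaired.extend(rec for s, rec in slots.items() if s == "0")
        else:
            unpaired.extend(slots.values())
    return paired, unpaired

# Toy records, invented for illustration only
demo = [
    ("r1/1", "ACGT", "IIII"),
    ("r2/1", "GGCA", "IIII"),
    ("r1/2", "TTGA", "IIII"),
    ("r3", "CCAT", "IIII"),
]
paired, unpaired = split_paired_unpaired(demo)
print(len(paired), "pair(s);", [rec[0] for rec in unpaired], "unpaired")
```

The two outputs would then be mapped separately, in paired-end and single-end mode respectively, and the resulting BAM files merged as described above.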

All times are GMT -8.