SEQanswers

12-20-2011, 09:12 AM   #1
rahul
Split a SAM file

Hi All,

I am looking for a tool that can split a SAM file into smaller sub-SAM files. I want to process the pieces (annotate reads) in parallel and then join them again. Do you think this is possible?

Thank you,

Rahul
12-20-2011, 09:30 AM   #2
swbarnes2

With samtools, you can generate sub-BAMs by specifying the chromosome and positions you want in each sub-file (this needs a coordinate-sorted, indexed BAM). Then you can convert them back to SAM.
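
As a rough illustration of the region-based approach, the Python sketch below only builds the `samtools view` command lines, one per region, without running them. The file names, the `region_commands` helper and the `sub` output prefix are made up for the example. It assumes the BAM is coordinate-sorted and indexed, and it passes `-h` so that each piece carries the header.

```python
def region_commands(bam_path, regions, out_prefix="sub"):
    """Build one `samtools view` command per region (nothing is executed here).

    bam_path   -- hypothetical coordinate-sorted, indexed BAM file
    regions    -- region strings such as "chr1" or "chr2:1-1000000"
    out_prefix -- hypothetical prefix for the per-region SAM files
    """
    commands = []
    for i, region in enumerate(regions):
        out_path = f"{out_prefix}.{i}.sam"
        # -h keeps the header, so each piece is a complete SAM file;
        # without -b the output format is SAM.
        commands.append(
            ["samtools", "view", "-h", "-o", out_path, bam_path, region]
        )
    return commands
```

Each returned list can be handed to `subprocess.run(cmd, check=True)`, one call per worker, if samtools is on the PATH.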
12-20-2011, 10:13 AM   #3
rahul

Thanks for the immediate reply. I want to break the SAM file into smaller files based on the number of reads rather than chromosomal location. Still, your method should do the job.

Thanks a lot.

Rahul
12-20-2011, 10:30 AM   #4
severin
linux split

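The Linux split command (split -l) should do this for you. In a SAM file each alignment is a single line, so any line count gives whole records; the multiple-of-4 (single-end) or multiple-of-8 (paired-end) rule is for FASTQ files. If the data is paired end and mates need to stay together, cut only between read names in a name-grouped file.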
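
To make the chunk-boundary point concrete, here is a minimal Python sketch that cuts a SAM stream into pieces of a fixed number of reads, breaking only between read names. The `chunk_by_read_name` function and its parameters are hypothetical. It assumes that all records of one read (both mates and any extra hits) sit on adjacent lines, as in a name-sorted file, and it skips header lines rather than copying them.

```python
from itertools import groupby


def chunk_by_read_name(lines, reads_per_chunk):
    """Yield lists of alignment lines, cutting only between read names.

    Assumes records that share a read name (column 1) are adjacent.
    Header lines start with '@' and are skipped here, so they have to
    be re-attached to each piece separately.
    """
    records = (ln for ln in lines if not ln.startswith("@"))
    chunk, n_names = [], 0
    # groupby collects consecutive lines with the same read name
    for _, group in groupby(records, key=lambda ln: ln.split("\t", 1)[0]):
        chunk.extend(group)
        n_names += 1
        if n_names == reads_per_chunk:
            yield chunk
            chunk, n_names = [], 0
    if chunk:  # last, possibly smaller, piece
        yield chunk
```

Passing an open file handle as `lines` streams the input, so only one chunk is held in memory at a time.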
12-20-2011, 10:55 AM   #5
rahul

I have tried the split command and it does break up the file, but samtools is unable to read the resulting subfiles. I get the following errors when I try to sort one of the resulting sub-SAM files.


[bam_header_read] EOF marker is absent.
[bam_sort_core] truncated file. Continue anyway.
Segmentation fault


Please let me know if you have seen these errors before.

Thank you,

Rahul
12-20-2011, 11:08 AM   #6
severin
header

You have to be sure the header information (the lines starting with @) is contained in each of the split files. You could save the header to a separate file and then add it to the top of each split file.
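
A small Python sketch of that idea follows: read the header from the original SAM, write it at the top of each headerless piece, and later join the processed pieces while keeping a single header. The function names and file paths are hypothetical, and the sketch assumes plain-text SAM in which header lines are exactly the leading lines that start with `@`.

```python
def read_header(sam_path):
    """Return the leading '@' lines of a SAM file as a list."""
    header = []
    with open(sam_path) as handle:
        for line in handle:
            if not line.startswith("@"):
                break  # first alignment line: the header is over
            header.append(line)
    return header


def add_header(header, piece_path, out_path):
    """Write header + alignments of one split piece to out_path."""
    with open(piece_path) as src, open(out_path, "w") as dst:
        dst.writelines(header)
        for line in src:
            # the first piece from `split` may already hold the header
            if not line.startswith("@"):
                dst.write(line)


def join_pieces(piece_paths, out_path):
    """Concatenate pieces, keeping the header of the first piece only."""
    with open(out_path, "w") as dst:
        for i, path in enumerate(piece_paths):
            with open(path) as src:
                for line in src:
                    if i == 0 or not line.startswith("@"):
                        dst.write(line)
```

The error text quoted in this thread suggests samtools was reading the piece as BAM. With samtools releases of that time, a SAM piece would likely also need converting first, for example with `samtools view -bS`, before sorting.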

12-20-2011, 11:12 AM   #7
rahul

Sure, that makes sense. Thanks a lot for all the help.

Rahul
All times are GMT -8.

