Old 04-15-2011, 09:56 AM   #1
Location: Boston

Join Date: May 2009
Posts: 11
What is the file size for a 30X human genome sequencing file, raw and BAM?

I am moving from RNA-seq into whole-genome DNA sequencing and am wondering how big the files are: the Illumina fastq file, the SOLiD csfasta file, and the aligned results in SAM and BAM format. What would be the minimum system requirements for dealing with this kind of data?
Your help is really appreciated.
RNA-seq is offline   Reply With Quote
Old 04-15-2011, 10:55 AM   #2
Richard Finney
Senior Member
Location: bethesda

Join Date: Feb 2009
Posts: 700

Assume the genome is 3 Gb (3 billion bases) and you want 30x coverage.

3 Gb * 30 = 90 Gb of sequence (30 bases per genomic position), which is about 90GB uncompressed at one byte per base.
Since a fastq also needs a quality value for every base, multiply by 2 = 180GB uncompressed (ignoring the fastq "@" header lines and carriage returns). So budget 220GB for fastqs. They typically compress to about 25% of original size.
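
For anyone who wants to redo this arithmetic with their own numbers, here is a rough sketch in Python. It only restates the estimate above; the function name and the 40GB allowance for header lines and line endings (the gap between 180GB and the 220GB budget) are my own assumptions for illustration, not measured values.

```python
# Back-of-the-envelope FASTQ size estimate, following the reasoning in the post.
# Assumptions (illustrative only): 1 byte per base call, 1 byte per quality
# value, and a flat allowance for header lines and line endings.

GENOME_BASES = 3_000_000_000  # ~3 billion bases for a human genome
COVERAGE = 30                 # target depth


def fastq_estimate_gb(genome_bases=GENOME_BASES, coverage=COVERAGE,
                      overhead_gb=40.0, compress_ratio=0.25):
    """Return rough FASTQ sizes in GB (1 GB taken as 1e9 bytes)."""
    sequence_gb = genome_bases * coverage / 1e9   # base calls only
    with_quality_gb = sequence_gb * 2             # plus one quality char per base
    budget_gb = with_quality_gb + overhead_gb     # plus headers and newlines
    compressed_gb = budget_gb * compress_ratio    # typical compression, per the post
    return {
        "sequence_gb": sequence_gb,
        "with_quality_gb": with_quality_gb,
        "budget_gb": budget_gb,
        "compressed_gb": compressed_gb,
    }


if __name__ == "__main__":
    for name, value in fastq_estimate_gb().items():
        print(f"{name}: {value:.0f} GB")
```

Real files will differ with read length, header format and the compressor used, so treat the output as a planning figure only.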

A resulting 30x bam should be about 100GB.

Example: the TCGA file TCGA-AB-2977-11A-01D-0739-09_whole.bam (Illumina), with 75-base reads providing 28x coverage, is 88GB.
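
As a quick sanity check on the 100GB figure, the TCGA example can be scaled to 30x. This is a sketch that assumes BAM size grows roughly linearly with coverage at a fixed read length, which is my simplification; the function name is made up for illustration.

```python
# Scale an observed BAM size to a different coverage.
# Assumption (illustrative): size is proportional to coverage when read
# length, quality encoding and compression settings stay the same.


def scale_bam_size_gb(observed_gb, observed_coverage, target_coverage):
    """Linearly rescale a BAM file size from one coverage to another."""
    return observed_gb * target_coverage / observed_coverage


if __name__ == "__main__":
    # TCGA example from the post: 88 GB at 28x coverage.
    estimate = scale_bam_size_gb(88, 28, 30)
    print(f"Estimated 30x BAM size: {estimate:.1f} GB")
```

That gives a little over 94GB, which is in line with budgeting about 100GB.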

To actually run it, you'll need some temp file space. I'd budget about 500GB (1/2 TB) for end to end processing.

3TB drives are $175. Warning: not all motherboards support them, so if your computer's BIOS doesn't, make sure you have a free PCI slot for a RAID card that does. Otherwise stick to 2TB drives.

Corrections on this comment are welcome!
Old 04-15-2011, 11:27 AM   #3
Location: Boston

Join Date: May 2009
Posts: 11

Thank you very much for your reply.
RNA-seq is offline   Reply With Quote
