Hi all,
I would like to retrieve a particular genomic region from all 1103 mapped genomes that are available today. I used this line in a Python script to extract the region:
os.system('samtools view -b ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/{0} 2:69,130,000-69,160,000 > {1}'.format(sample, fn))
It works about 20% of the time, but the rest of the time I get responses like these:
[kftp_connect_file] 331 Anonymous login ok, send your complete email address as your password.
[main_samview] fail to open "ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/HG00611/alignment/HG00611.mapped.ILLUMINA.bwa.CHS.low_coverage.20101123.bam" for reading.
OR
[kftp_connect_file] 227 Entering Passive Mode (130,14,250,10,195,247).
[main_samview] fail to open "ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/HG00628/alignment/HG00628.mapped.ILLUMINA.bwa.CHS.low_coverage.20101123.bam" for reading.
OR
[kftp_connect_file] 331 Anonymous login ok, send your complete email address as your password.
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from "ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/HG00614/alignment/HG00614.mapped.ILLUMINA.bwa.CHS.low_coverage.20101123.bam".
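Since the failures look intermittent, one possible workaround is to retry each download a few times. This is only a minimal sketch, assuming samtools exits with a non-zero status when it prints these errors (not confirmed for every case). The helper names `fetch_region` and `samtools_cmd` are hypothetical, and the retry count and delay are arbitrary choices.

```python
import os
import subprocess
import time


def fetch_region(cmd, out_path, attempts=5, delay=10.0):
    """Run cmd with stdout written to out_path, retrying on failure.

    Returns True on success, False if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        with open(out_path, "wb") as out:
            result = subprocess.run(cmd, stdout=out, stderr=subprocess.PIPE)
        # Assumption: a failed download gives a non-zero exit status
        # or leaves an empty output file.
        if result.returncode == 0 and os.path.getsize(out_path) > 0:
            return True
        if attempt < attempts:
            time.sleep(delay)  # wait before contacting the FTP server again
    # Remove the empty or partial file left by the last failed attempt.
    if os.path.exists(out_path):
        os.remove(out_path)
    return False


def samtools_cmd(sample):
    # Same URL prefix and region as in the os.system call above.
    url = "ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/{0}".format(sample)
    return ["samtools", "view", "-b", url, "2:69,130,000-69,160,000"]
```

In the script, `fetch_region(samtools_cmd(sample), fn)` would take the place of the `os.system` call. Passing the command as a list avoids going through the shell, and the return value makes it possible to log which samples still failed.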
Has anyone previously encountered these types of errors? Are there any easy fixes?
Thanks in advance for any help!
-giror
P.S. I have also tried the new 1000genomes slicer tool on some of these BAM files, only to get a 404 error.