Hello,
I aligned some human samples with Casava, and then converted the alignments to SAM using samtools:
illumina_export2sam.pl --read1=file_R1.export.txt.gz --read2=file_R2.export.txt.gz > file.sam
Then I tried to convert it to BAM:
samtools view -bt genome.fa.fai file.sam > file.bam
but I get a lot of messages like:
[sam_read1] reference 'chr8.fa' is recognized as '*'.
[sam_read1] reference 'chr11.fa' is recognized as '*'.
etc.
There was a similar thread that suggested changing the header as follows:
The first line needs to be the @HD header line.
The second line needs to be a dummy read group (@RG) line.
The next lines (@SQ) need to contain the chromosomes and their lengths.
An example of a good header is as follows:
@HD VN:1.0 SO:unsorted
@RG ID:unknownReadGroup SM:unknownSample
@SQ SN:chrI AS:ce6_32r_index LN:15072421
@SQ SN:chrII AS:ce6_32r_index LN:15279323
@SQ SN:chrIII AS:ce6_32r_index LN:13783681
@SQ SN:chrIV AS:ce6_32r_index LN:17493785
@SQ SN:chrM AS:ce6_32r_index LN:13794
@SQ SN:chrV AS:ce6_32r_index LN:20919568
@SQ SN:chrX AS:ce6_32r_index LN:1771885
Where do I get all of that information? I might also have to do this over and over again, so I am looking for software, or a precise procedure that I can turn into a script.
Thanks in advance,
Ramiro
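
One possible way to script the header, as a rough sketch rather than a tested recipe: the first two tab-separated columns of a FASTA index such as genome.fa.fai are the sequence name and its length, which is what each @SQ line needs. The function name fai_to_sam_header and the sample values below are made up for illustration, the @HD and @RG lines simply copy the dummy values from the example header, and the optional AS tag is left out. Note that the SN names have to match the reference names used in the alignment lines of the SAM file.

```python
def fai_to_sam_header(fai_lines,
                      read_group_id="unknownReadGroup",
                      sample="unknownSample"):
    """Build SAM header text from the lines of a FASTA index (.fai).

    Hypothetical helper: only columns 1 (name) and 2 (length) of the
    .fai are used. SAM header fields are separated by tabs.
    """
    header = [
        "@HD\tVN:1.0\tSO:unsorted",
        "@RG\tID:%s\tSM:%s" % (read_group_id, sample),
    ]
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, length = fields[0], int(fields[1])
        header.append("@SQ\tSN:%s\tLN:%d" % (name, length))
    return "\n".join(header) + "\n"


# Made-up index lines (not real chromosome lengths), just to show the shape.
example_fai = [
    "chrA\t1000\t6\t50\t51\n",
    "chrB\t2500\t1032\t50\t51\n",
]
print(fai_to_sam_header(example_fai), end="")
```

For a real run, the same function could be fed an open file handle for genome.fa.fai instead of the example list, and the result written in front of the alignment lines.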