I have filtered SAM files to keep only the aligned reads with samtools view -F 4.
My output is a file in SAM format. The header is lost, but I can still convert the file to BAM and then use Picard's "AddOrReplaceReadGroups" to add a header.
This time I have a list of read names in a txt file, and I used awk to pull the matching rows out of the SAM file. Of course the header is lost again, but now I cannot convert the file to BAM, so I cannot add the header back, index the file, and visualize the alignments.
I can open both files in a spreadsheet, and they look and behave exactly the same.
Is there some kind of identifier that is lost when awk writes the new file? Or will I have to take the matching FASTQ reads and redo the alignment?
I am trying to find regions in a virus that have matching sequences in a plant genome. So I did the following: align all reads to the virus genome and keep only the aligned reads, then align those virus reads to the plant genome and filter again. Now I need to see where these reads align in the virus, and since they are already contained in the first alignment file, I thought I could just separate them out.
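To make the read-name filtering step concrete, here is a minimal sketch of it in Python rather than the awk command I actually ran. It assumes the input SAM still has its header, that header lines start with "@" (read names in SAM cannot contain "@"), and that the read name is the first tab-separated column. The function name filter_sam_by_names and the example records are made up for illustration.

```python
def filter_sam_by_names(sam_lines, wanted_names):
    """Yield SAM header lines plus alignment lines whose read name is wanted.

    sam_lines: iterable of SAM lines (for example an open file handle).
    wanted_names: iterable of read names, one per item (blank items ignored).
    """
    # A set gives fast membership tests; strip() drops newlines from a txt list.
    wanted = {name.strip() for name in wanted_names if name.strip()}
    for line in sam_lines:
        if line.startswith("@"):
            # Header line (@HD, @SQ, @RG, @PG, @CO): always pass through.
            yield line
        elif line.split("\t", 1)[0] in wanted:
            # Alignment line: column 1 is the read name (QNAME).
            yield line


# Tiny in-memory example; these records are invented, not real data.
example_sam = [
    "@HD\tVN:1.6\tSO:coordinate\n",
    "@SQ\tSN:virus\tLN:1000\n",
    "read1\t0\tvirus\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII\n",
    "read2\t0\tvirus\t20\t60\t5M\t*\t0\t0\tCCGTA\tIIIII\n",
    "read3\t16\tvirus\t30\t60\t5M\t*\t0\t0\tGGGTA\tIIIII\n",
]
kept = list(filter_sam_by_names(example_sam, ["read1\n", "read3\n"]))
```

With real files, the same function could be given two open file handles (the SAM file and the txt list) and the yielded lines written to a new SAM file. Note that this only keeps a header that is still present in the input; if the first filtering step already dropped it, samtools view has an -h option that includes the header in its output.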