07-21-2016, 06:12 AM   #43
jmwhitha
Senior Member
 
Location: NC State, Raleigh, NC

Join Date: Mar 2013
Posts: 107
Multi-fasta

Thank you very much for your fast reply! Yes, each contig/chromosome (i.e., each @SQ line) is actually a gene. The format of the multi-FASTA file I used for my bowtie2 alignment is shown below. It's not a transcriptome (http://www.ncbi.nlm.nih.gov/nuccore/NC_017304). Why do you think that? My goal is to count reads per gene, not per transcript. I thought that by replacing "lcl|NC............................_" with the gene name, I could get htseq-count to count my reads. What should I do instead? Thank you again!

>lcl|NC_017304.1_cds_WP_003516465.1_1 [gene=CLO1313_RS00010] [protein=/inference=EXISTENCE: similar to AA sequence:SwissProt:A3DHZ4.1] [protein_id=WP_003516465.1] [location=212..1543]
ATGAATACTCAGTTGAATGAAATATGGCAAAAAACTTTAGGACTGCTTAAAAATGAGCTTACAGAAATCA
GTTTTAACACCTGGATCAAGACCATCGATCCATTGTCCTTGACAGGCAATACTATAAACCTGGCTGTCCC
GGCGGAATTCAACAAGGGAATTCTTGAGTCCAGGTATCAAACTCTGATTAAAAATGCCATTAAGCAAGTT
ACTTTTAAGGAATACGAGATTGCATTTATCGTGCCTTCACAGGAAAATTTAAACAAGCTGACGAAGCAGA
CCGAGTCCGCCGGCAATGAGGATTCTCCTTTGTCAGTATTAAACCCGAAGTACACGTTTGACACTTTTGT
CATAGGAAACAGCAACAGATTTGCACACGCAGCCGCACTGGCCGTGGCCGAGGCACCGGGAAAAGCATAC
AATCCCTTGTTCATATATGGCGGAGTGGGACTTGGGAAGACTCATCTTATGCATGCCATCGGGCACTACA
TTCTGGAACAGAATTCTTCCCAAAGGGTTTTGTATGTTTCATCTGAAAAATTTACCAACGAACTTATCAA
TGCCATTAAAGACAACAGAAATGAAGAATTCAGATCCAAATACAGAAATATTGACGTACTGCTTATAGAC
GACATACAATTCATTGCCGGAAAGGAAAGAACGGAGGAGGAGTTCTTCCATACCTTCAATGCTCTTTACG
AAGCAAACAAACAGATAATCCTGTCAAGCGACAAGCCTCCGAAAGAAATTTCTCTTGAGGACCGCCTGAG
ATCCAGGTTTGAATGGGGCTTGATTGCGGACATGCAGGCACCGGATCTGGAAACCAGGATAGCAATACTA
AGGAAAAAAGCCCAGCTTGAAAACCTTACTGTTCCAAATGAAGTAATTGTATTCATTGCAGACAAGATAG
CATCAAACATCAGAGAACTTGAAGGTGCCTTAAACAGAGTAATAGCATATTCATCGCTTACGGAAAACGA
AATTACCGTCGAACTCGCCAGCGAAGCATTAAAAGACATACTGTCAGCAAACAAGGCGAAAGTTTTAAAC
TGCACCACAATCCAGGAAGCAGTGGCCAGATACTTTGACATAAGACCGGAAGAATTTAAATCAAAGAAGA
GGACAAGGGACATCGCATTCCCAAGACAAATTGCAATGTACCTGTGCAGAGAACTTACCGAAATGTCCCT
CCCAAAAATCGGCGAGGAATTCGGCGGAAGAGATCATACTACTGTAATACATGCATGTGAAAAGATAAGT
GAAGAAATCGAAAGCAACTCCGAAACCAGGAGGGCCGTAAGTGAAATAAAGAGGAACCTGCTGGGAAAAT
AA
>lcl|NC_017304.1_cds_WP_003513339.1_2 [gene=CLO1313_RS00015] [protein=/inference=EXISTENCE: similar to AA sequence:RefSeq:WP_003513339.1] [protein_id=WP_003513339.1] [location=1793..2893]
ATGAAAATAGTTTGTTCCAAAGAACAGCTAATGGAAGGAATCAACGTCGTGCAAAAAGCAGTGCCGACAA
AAGCCACTCTAACCATACTGGAAGGAATATTGCTGGAAGCATACGACAATTTTAAAATGACCGGAAATGA
TTTGGAACTGGGAATAGAATGCCTTATAGATGCAGACATTCTGGAAAAAGGATCTATAGTCTTAAATTCA
AAAATGTTCGGAGACATAGTAAGAAGACTTCCCGACTCAGAGGTACTTATTGAAGTTAAAGAGAACAATA
CAGTTATCATTGAATGTGACAACTCTCACTTTGAGTTAAGGGGTATGCCTTCTGACAGCTTTCCGTCACT
GCCTTCAATTGAAAAAGAGAACATGATCAAAGTCAGCCAAAAGGCAATCAGGGATATGATAAGACAAACA
CTTTTTGCCGTAAGTATGGAAGGAACCAGACCGATACTTACCGGTTCACTTATTGAATGTGCAGGAAACG
AAATTACCTTCGTTTCAATAGACGGATTCAGAATGGCTCTGAGAAAAAACTTTAACAACGAAGGATTTTC
CGAATTCAGTGTTGTCGTACCCGCAAAAACCCTCAGCGAGATAGGCAAAATCTTACAGCCGGTTGATGAA
GATATTTACATATACAGTTCTCAAAACCAGATACTGTTTGAAATTGGAAATTGCAAAGTTGTATCAAGAC
TTTTAGAGGGTGAATATCTAAACTATAAAAGTATTATACCACCGGAATATGAAACCAGCGTAAGACTTAG
AACCGAGGACCTTTTGTCCAGCCTTGAAAGGGCGTCATTGATTACTTCGGACGAAAAGAAATACCCGGTT
AAATTTAATATTATAGACGATAAAATCATAATTACCTCCAACACTGAAATAGGAGCAGTAAGGGAAGAAA
TCAGAGTCGAAGTAAACGGCAGCAACATGGAAGTGGGCTTCAACCCCAGATATTTTATCGAAGCGCTCAG
GGTCATAGATGACGAGCTGGTTGACATATACTTCAATTCAAGTGTCGGTCCGTGTACAATAAGACCTCTT
GAAGGCGACAGTTTTGCATACATGATACTTCCGGTAAGAATAAATAAATAA
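
For illustration, the header replacement I have in mind could be sketched roughly like this in Python. This is only a sketch: `rename_header` and the `GENE_TAG` regex are hypothetical names of my own, not part of bowtie2 or HTSeq, and it assumes every header carries a `[gene=...]` tag like the two shown above.

```python
import re

# Assumed pattern: NCBI CDS FASTA headers carry a "[gene=NAME]" tag.
GENE_TAG = re.compile(r"\[gene=([^\]]+)\]")


def rename_header(line):
    """Return a FASTA header that uses the gene tag as the sequence name.

    Sequence lines, and headers without a [gene=...] tag, are returned
    unchanged. A trailing newline, if present, is preserved.
    """
    if not line.startswith(">"):
        return line
    match = GENE_TAG.search(line)
    if match is None:
        return line
    newline = "\n" if line.endswith("\n") else ""
    return ">" + match.group(1) + newline


def rename_fasta(lines):
    """Apply rename_header to every line of a FASTA file (hypothetical helper)."""
    return [rename_header(line) for line in lines]
```

With headers shortened this way, the @SQ names in the SAM file would be the gene IDs themselves (the index would have to be rebuilt from the renamed file).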

Commands:

#Index
bowtie2-build -o 2 --threads 12 -f NC_017304.1.nucleotide.fa cipA_b2index

#Mapped 3 fastq files to the index
bowtie2 --threads 12 --very-sensitive-local -x cipA_b2index -U 1.fastq.gz,2.fastq.gz,3.fastq.gz -S Project_mapped.sam
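
Since each reference sequence in this index is a gene, one thing I could try is tallying the reference-name column (RNAME) of the SAM file directly. A minimal sketch, assuming a plain-text SAM like Project_mapped.sam above; `count_reads_per_reference` is a hypothetical helper of my own, not an HTSeq or bowtie2 function, and it makes no attempt to handle multi-mapping reads the way htseq-count does.

```python
from collections import Counter


def count_reads_per_reference(sam_lines):
    """Count primary, mapped alignments per reference name (SAM column 3)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header lines such as @HD, @SQ, @PG
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # not a valid alignment record
        flag = int(fields[1])
        rname = fields[2]
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x4 or flag & 0x100 or flag & 0x800 or rname == "*":
            continue
        counts[rname] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (assumed file name): python count_sam.py Project_mapped.sam
    with open(sys.argv[1]) as handle:
        for name, n in sorted(count_reads_per_reference(handle).items()):
            print(name, n, sep="\t")
```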