SEQanswers

Old 06-14-2013, 10:20 AM   #1
prs321
Member
 
Location: US

Join Date: Jun 2013
Posts: 96
Default After using FASTQC and Trim_Galore on my data, I used BWA with my first paired end...

sequence.

This is what the SAM file looked like. Is it supposed to look like this? I tried to google the SAM format, and I wasn't sure whether something went wrong or whether I did it right.

Code:
2010	147	Serratia	4602318	54	74M	=	4601933	-459	GGCATCGGCGACCGCACCCTCGACGTTGTGCGCCAGGCGGCGCGCGATCGCCAACTGACGTTGTGGCGGGCGAC	@@9)):D@=0+<)D:3))<)3<5C+BCCCDCDE5+*>555+55+CE+C7EC>A86++A<@9<5+5++++<55<<	NM:i:5	AS:i:49	XS:i:0
M00532:8:000000000-A17VF:1:1101:15815:2019	99	Serratia	5016228	60	227M	=	5016443	403	GTGCTGGCCGCCGCCGGCGCGCGCGTGATCCTCAACGGCTTCGGCGATGTGGAAGCGGCGAAGACGCAGGTTGCCCGGCTGGGCGCCGCGCCGGGGTATCACGGCGCCGATCTCGGCGATGCGGCCCAGATAGCGGACATGATGCAGTATGCCGAACGTGAGTTCGGCGGCGTGGACATTCTGGTGAACAACGCCGGCATTCAGCACGTGGCGCCGCTGGATCAGTT	??<??<9?BB@BBB<BCCCCCCHHAC7EF;FDGHHHEHEHHFC:CC:DDD@;;@@DEE@@7??EE8:6;;CEE?EEE?;;?AAE62;;;2;;?;??4?EEEE?8;????;??EEAEE;?;882;???EEEEEEE;?;8'??CEAEEEA:A:?C88;?;8ACE?:C)'8.2;2''48A*//?C::/:??AECE;;'';828CEE****.*.)'5?2;;;4;C?EEA0A	NM:i:16	AS:i:147	XS:i:0
M00532:8:000000000-A17VF:1:1101:15815:2019	147	Serratia	5016443	60	188M	=	5016228	-403	GCTGGATCAGTTCCCGGTGGAGAAATGGAACGCCATCCTCGCCATCAACCTGTCGGCGGTGTTCCACACTTGCCGATTGGCGCTGCCGGGCATGCGCGAGCGCCACTGGGGGCGCATCATCAACGTAGCGTCGGTGCACGGGCTGGTGGCGTCGAAAGACAAGTCGGCCTATGTGGCGGCCAAGCACG	CEE:?*:**.';8'8AC?0E?*::*??EDE>:)CA2;4'5:::*?:CC?2DD?D?DDEEEE?:):C?:82?;EECE8E8D>?'DD?EC:?))''8EEE<EBECEEEBEEEEEEHHHHFHHHHHHHDDHEECHHDFE@HHHFHFFHHEHEADFDHHGGHF?HCCCEFC>AFFFDD@@7DBDBB??????	NM:i:6	AS:i:158	XS:i:19
M00532:8:000000000-A17VF:1:1101:13390:2020	99	Serratia	4718589	60	252M	=	4719007	534	AAATTGCTGCAGGGGCGTTGGATGCAGGGCGAGGTGCAAACCTGCGACGGCCAAAGCATGAAACCGGGACTGGATGCCGCCTCCATCGTCTGGATCGAGAAGCGTGCCCGCAGCAGCAGCCGGCCGGTGAGCGTCGCCTGGCTGGAAGCGCCGGAAGGCAGCGAACTGCTGCTGGTGGCGAACGACGATTTCTGCAGCTGGTGACCGACAGAAGACCCACTATAAACAAGACCCCGCGCTGCGGAGCCTCTT	?????BBBD<B9B@DBFCCFFFFHFHF->EECAC+CCFHHHHH,?>CCHHBCHHHFHHFDFFHDFCEB>@EF@DDB??)@BEEFF;CAAA?AAEEFEEFFECEDEEEEFEDD8DFCCCE*?'8;DDD4'?*:;;EDDE88?AECEEEFF>DDDDD8AEEFE?DD>8C:C:**:*::*??>?D?;?;>;8AEFFFF?A***0**110;;'.'1??*?A*..00:**?*1?0*00..5'4;''.''....:*1:	NM:i:24	AS:i:132	XS:i:0
M00532:8:000000000-A17VF:1:1101:13390:2020	147	Serratia	4719007	60	116M	=	4718589	-534	GCCGCTGGTGACGACGTCATCGCCCTGCAGGCGCGTACCGGCGTGGAAGACCTTACCGTCGGCGCTTTCCTGCTGCGGTAGCCCCTGGATAACCTCACCGTTGCGATAGTCGCCCG	<;8<(=;<96''666;6(2<<?29;E9<<;83@@@:8:=DEDEED@9ED;+D@CD5CCCC=CDEEEEEEDEC>CCCE@ECCCCEFGEEEEC8,CC+>@CC@@@@@@7@=<7==,9=	NM:i:7	AS:i:83	XS:i:0
M00532:8:000000000-A17VF:1:1101:17000:2031	77	*	0	0	*	*	0	0	AGGGCAACCACAACCCTCTGATGCAAATGCTCTAATGATCGTCCCTCATCCGATTTAAGAGCATTGATTAAGAGAGGTAGCTCAAGAGACTTGTTAAGAGGACCACCTTCGGGATCTTC	????????DDDDEEDDGGGGGGIIHIIHIIIIHHIIGHHIIHIHHIIIIHIHHHHHHHIHIHIIIFGHHIIIIIHHHHHDHHIHDGFFEEGHIIIIIIHHDFFHHHHGHHHHHHHGGGG	AS:i:0	XS:i:0
M00532:8:000000000-A17VF:1:1101:17000:2031	141	*	0	0	*	*	0	0	AACTCCGAAATCAGAGAATCGTTCAAAGGTCATGTTGCCGACCGACTTAAACGAATATCGACCCATGCGGGCTATGCAGTGTATCAGGATGTCTGGAGAGAGGTGCTGAGACATTGGGGTAACCCAGCACCTCAAGTTGATACAGAGTCACACCTAATAGATCTGTTTGAAATCGCTATCAATCGTGCTCGTTCACAAAAAAGGTTAT	?A??AB?ADDDBDDBBFGGGGGHHHIIHIEFFHHHIIIHHHHHHHHHHIHEFHHHIIFHIHHHHHIHHEHEHBFHHHHHD,4CFGEFGGGDFGFGEDDD@E@D4A>C->ACGGGGGGGG@CEEGCC8CEGGGCE??C*0:?CEGECCCEGCEGCEGG?CGECEEG:CEGGCC?*8C9CCCE:**.0CCC:C289??CGGCC288::::	AS:i:0	XS:i:0
M00532:8:000000000-A17VF:1:1101:14261:2037	83	Serratia	3418147	60	148M106S	=	3418067	-228	TGATTTCTCACCAATCAATACCTCTGGGATCACCTACTCTAGAGAGATGGCGTGCAGGAAACTACCGCCGAATCGCAAGAGTTCTGCTCCAAATGCAACCAGCTGTAAATTTCCCGCGATCTGCTGTAATAACTCAATGAAACTTAAACCTTCCCGCAGCGACGAAAATAAATATAACAACGACGAGCCAGTGACCCAGACGTGAAATCTTCACTCATCGCGCGCTGACCTCGACCAGGCAGCTCATGGCGTTC	:1*:1*0*)***:*A:CA::C8??:A1::***00110*::EAA?AEEE?2)A0**EFEC:182..'DE?FFFD>EAFEEECFEFEA??*CEFFEFEFFFFEC:?AEFFEA8?D>DEFFFFEEECEEEFFEEEEA?CEC:CCAEECCC=BE@@EBE@:EEFEFADDFHHHHHHHHFFFBEDED=HHHHHHHHHFFHHFHHHHHGFHHHFHHHFGHHHHHEHEHHHFHHHHHHHFFFFFFDBBBDBBD?@??????	NM:i:3	AS:i:133	XS:i:0
M00532:8:000000000-A17VF:1:1101:14261:2037	163	Serratia	3418067	60	149M	=	3418147	228	ACTATATGGCAGGCAAAAAAAAACCTACGCATCCGCGTAGGTTGGTGCAATTGAAAATGGCTTCAACATACAGAGTATGCTGATTTCTCACCAATCAATACCTCTGGGATCACCTACTCTAGAGAGATGGCGTGCAGGAAACTACCGCC	?????BBBDDDDDDDEGGGGGGHHHIHFEHHEHIEHE>FHDCGFCH;GHFHHGFIIIIIIBFFFHFDDC,@FDFDCFHHHHHHHHHFHFGGGGGFFFGEGGG.4D>D=EGGGGGGGGCEGGCCCE8;A->C8C2CE?CGCCGGGG:28A	NM:i:1	AS:i:144	XS:i:0
M00532:8:000000000-A17VF:1:1101:14313:2053	99	Serratia	293636	60	15S208M	=	294091	552	TTTGCGTGCAGCTGATGAGGTTGCATTTTATTACAACTGTGTCTGCCGCTTTCGGAATCATGTTAATGATTACTTAAGAAATTCGGCTCACATTGAGGGCTTAACCCAAGGAGGCCTCAATGTTAAATGCGACCCGGCTGCAACTGATGAATCACTTCGCTTACCTGCAGCAATTTATGGCTTCACCGCGCACCGTCGGTACGCTGGCACCTTCTTCCCCGTG	?????@=?D-5<BDBBFCFFC;CFFF;>EAEFFFGHHHFBD?EGFGGC+>EFH@7EDFHHFGHFHHHHHDEHFHHHGGHHHHHGFDEHHGCEEHGHHHEEHFHHHHHHHFDD;D:FF,B?4DDFEEEEBDE@@<@BEBEBEEEE:EEC:A**::AAAEEAEEEEEE??CEECE:*::?AAC:8*0?*).8;8;''8?2>8?))58A2):C)?0::EE0:C8?A	NM:i:4	AS:i:188	XS:i:0
M00532:8:000000000-A17VF:1:1101:14313:2053	147	Serratia	294091	60	97M	=	293636	-552	CTGCCTCTGCTGTCGATCCCGGTCAGGATCAGCGTGCGCATTCTTCAGCAGGCCCGGCAGCGGCTGCTGGCGCGCAACGGCACGCTGGTGCTGTTCC	9?::*7EE@:::*:8*@@@8+<+;+@:DC9;CCDDEEEEEEDEECA=EEDDC>C5ECC7CC@E;DA+@CCCC79EC+CA+E@<-@@@@@>>9===<=	NM:i:3	AS:i:85	XS:i:0
M00532:8:000000000-A17VF:1:1101:11897:2065	99	Serratia	4185170	60	243M11S	=	4185331	352	TTTCCCTGAAAAGATAACGTATTGAGGATTCACCATGAGCATCAAAAATATTTTACCCGGCAAGATCGGTTTGGGCGGCGCGCCGCTCGGCAATATGTACCGCGCCATTCCAGAAGAAGAAGCGCGGGCTACCGTAACCAGCGCCTGGGACTTGGGCATCCGCTACTTCGACACCGCGCCGCTTTACGGTTCCGGCCTGTCGGAAATTCGCATGGGCGAAGCGCTTTCTCAGTACCCACGCGATGAGTTCGTAC	AAAAABBBDDDDDDDDGFEFGFFHHHFFHIFCFGHHFFHIIIHIIHHIHHHIIIIHHIHHHHHHFHIFFCFHEEHHHHHHEGGGEGGGGGGGGGGGGEGCCC'8>>EGGGGGCCGGGGGGGGEGGGGGGGCEC>EGGEGEEGGGDEEGGGGEEG?EGGGGGGGEDDGG?EGECAG8>A>DDGGGGGGEECECEC<>AG8:*8.48<*?EEEGECCEEA<24<G<)8??EGCC:::?C?C8C8<8CEG1:*08:C	NM:i:28	AS:i:103	XS:i:0
M00532:8:000000000-A17VF:1:1101:11897:2065	147	Serratia	4185331	60	191M	=	4185170	-352	GCTACTTCGACACCGCGCCGCTTTACGGGTCCGGCCTGTCGGAAATTCGCATGGGCGAAGCGCTTTCTCAGTACCCACGCGATGAGTTCGTACTGAGCACTAAAGTGGGCCGCATCATGCTGGACGAAATGGAAGATCCCGCCGCCCGCGATCTGGGTGAGAAAGGCGGCCTGTTCGAACACGGTTTGAAA	::C?ECCC:88.428>D<C?C8)>8<85'>D<C80'88??CCEEADGGEEEEGGE>GGGGEECGGEEECECE>DGG>DEGGGGGGGGEEEC:GGGC?CEGGGGGGGGGG>DCGEGGGGGEGEE?ECC;GEGGGGGGGGGHDHHHHDHHHHHHHIIIIHIHGIHHFEEIIGGGGGGDDDDDDDDBBB?????	NM:i:22	AS:i:85	XS:i:0
M00532:8:000000000-A17VF:1:1101:18758:2082	121	Serratia	2219275	60	247M	=	2219275	0	GTAGCTCCATGCCAACCCCAAGGCCAGAAAGCCGTTGTACAGCCCCTGATTGGCGGCCAACNCCCGCGTGGCTGCGGCGAACTCCGCCGTTGTGCCGAATGCGCGCCTGCCGAGTGGCGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNATGNAGCGCTGCGATGAGCAGCAGCAANATATCNGCAAAGAACTTCATGGTGGTNCCAGTCGTTTNTTGAAGATGTCATCAATTATAAACT	:CCCEC?9?C:'2..88::CCC:0*:E:.4'..*C:CCECAGDC:**GE><4'<?:?.000#5DDE<EGE<A>D>CDC:?8GGGG>GGGGGGDGGGGGGGGGGGGGGGGGEGEC??444#####################################F?6#HHHHHHHIIIIIIIHIIIHFCA5#EFEA5#IHHIIHHHIHHIHIHFFFA7#HHEHHHFFA7#HFHGGGGGGDDDDDDDDBBBA????	NM:i:56	AS:i:100	XS:i:0

Last edited by prs321; 06-14-2013 at 11:12 AM.
Old 06-14-2013, 11:10 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Default

It is hard to see since you are not using the Quote/Code tags. You will find those under the "Go advanced" tab when you are editing a post: highlight the text that you want to quote/code and then use the "#" button in the two rows of icons.

It does look right (have you clipped off some lines from the top?). Here is a simple explanation of SAM: http://genome.sph.umich.edu/wiki/SAM
Actual format specification: http://samtools.sourceforge.net/SAM1.pdf
SAM flag meaning: http://picard.sourceforge.net/explain-flags.html
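
To make the flag column (the second field) in the output above easier to read, here is a minimal Python sketch that decodes a SAM flag into its named bits. The bit values follow the SAM specification; the function name `decode_flag` and the short labels are only illustrative choices, not part of any tool.

```python
# Bit values defined by the SAM specification; the labels are informal.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the names of all bits that are set in a SAM flag."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


if __name__ == "__main__":
    # Flag values that appear in the SAM lines posted above.
    for flag in (99, 147, 77, 141, 83, 163, 121):
        print(flag, ", ".join(decode_flag(flag)))
```

Read this way, 99 and 147 are the two mates of a properly paired alignment, while 77 and 141 are a pair in which neither read mapped, which matches the `*` and `0` placeholders on those lines.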

Last edited by GenoMax; 06-14-2013 at 11:14 AM.
Old 06-14-2013, 11:13 AM   #3
prs321
Member
 
Location: US

Join Date: Jun 2013
Posts: 96
Default

It has been fixed. What is the next step? I have 20 other sequences. Do I map them to the same sam file or will I end up having 21 separate sam files that I end up mapping together?
Old 06-14-2013, 11:15 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Default

Quote:
Originally Posted by prs321
It has been fixed. What is the next step? I have 20 other sequences. Do I map them to the same sam file or will I end up having 21 separate sam files that I end up mapping together?

What are you trying to do exactly?
Old 06-14-2013, 11:26 AM   #5
prs321
Member
 
Location: US

Join Date: Jun 2013
Posts: 96
Default

Trying to map the 21 paired end sequences of Serratia over the reference. Then I think I'm supposed to do some sort of analysis.
Old 06-14-2013, 11:27 AM   #6
prs321
Member
 
Location: US

Join Date: Jun 2013
Posts: 96
Default

Reference Genome *
Old 06-14-2013, 11:34 AM   #7
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Default

Quote:
Originally Posted by prs321
Trying to map the 21 paired end sequences of Serratia over the reference.

If these are separate samples then you need to do the alignments separately. Creating these SAM files is the mapping step: each file holds one sample's reads aligned to the reference sequence.

Quote:
Originally Posted by prs321
Then I think I'm supposed to do some sort of analysis.

Are you looking to identify SNPs/SVs, or are you going to do some sort of phylogenetic analysis?
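
As a rough illustration of "one alignment per sample", the Python sketch below builds one `bwa mem` command per pair of FASTQ files and writes each result to its own SAM file. Everything named here is an assumption: the sample names, the reference file `Serratia.fa`, the output directory, and the `_R1_val_1.fq` / `_R2_val_2.fq` suffixes (modeled on Trim Galore's paired-end output naming, so adjust them to the real files). It also assumes `bwa mem` is the aligner in use and that the reference has already been indexed with `bwa index`.

```python
import subprocess
from pathlib import Path


def build_bwa_commands(reference, samples, out_dir="sam"):
    """Return one (argv, output_path) job per sample.

    File naming is hypothetical; change the suffixes to match your data.
    """
    jobs = []
    for name in samples:
        r1 = f"{name}_R1_val_1.fq"  # assumed trimmed forward reads
        r2 = f"{name}_R2_val_2.fq"  # assumed trimmed reverse reads
        out_path = f"{out_dir}/{name}.sam"
        jobs.append((["bwa", "mem", reference, r1, r2], out_path))
    return jobs


def run_jobs(jobs, dry_run=True):
    """Print each command, or run it and capture stdout as the SAM file."""
    for argv, out_path in jobs:
        if dry_run:
            print(" ".join(argv), ">", out_path)
            continue
        Path(out_path).parent.mkdir(parents=True, exist_ok=True)
        with open(out_path, "w") as sam:
            subprocess.run(argv, stdout=sam, check=True)


if __name__ == "__main__":
    # 21 hypothetical sample names: sample01 ... sample21.
    samples = [f"sample{i:02d}" for i in range(1, 22)]
    run_jobs(build_bwa_commands("Serratia.fa", samples), dry_run=True)
```

With `dry_run=True` nothing is executed, so the commands can be checked by eye before any alignment is started. Keeping one SAM file per sample means each sample can later be compared against the ancestral one on its own.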
Old 06-14-2013, 11:36 AM   #8
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667
Default After using FASTQC and Trim_Galore on my data, I used BWA with my first paired end..

What do you mean by 21 sequences?

Do you have 21 fastq files from one sample, or do you have files that are from 21 different samples?

How many reads are in each file?
Old 06-14-2013, 11:39 AM   #9
prs321
Member
 
Location: US

Join Date: Jun 2013
Posts: 96
Default

Sorry, let me clarify.

I have 21 pairs of fastq files. The first 20 pairs (40 fastq files) are the paired end files for the specimens examined. The last pair is the ancestral one.

I think the read lengths vary from about 20 to 255 bases.
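
For the question of how many reads are in each file, and to check the length range rather than estimate it, a small Python sketch like the following could be used. It assumes plain, uncompressed FASTQ with exactly four lines per record; the function names are made up for illustration.

```python
import io


def fastq_read_lengths(handle):
    """Collect sequence lengths, assuming strict 4-line FASTQ records."""
    lengths = []
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of each record is the sequence
            lengths.append(len(line.rstrip("\r\n")))
    return lengths


def summarize(handle):
    """Return the read count plus the shortest and longest read length."""
    lengths = fastq_read_lengths(handle)
    if not lengths:
        return {"reads": 0, "min": None, "max": None}
    return {"reads": len(lengths), "min": min(lengths), "max": max(lengths)}


if __name__ == "__main__":
    # Tiny in-memory example; for a real file use open("reads.fq") instead.
    demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nAC\n+\nII\n")
    print(summarize(demo))
```

Running this on both files of a pair is also a quick sanity check, since the two files should normally contain the same number of reads.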
Old 06-14-2013, 11:41 AM   #10
prs321
Member
 
Location: US

Join Date: Jun 2013
Posts: 96
Default

And I'm not really sure what I'm supposed to do after aligning and mapping, it's up to the person who is higher up from me.

I was supposed to first clean the files, which I did using Trim_Galore.

Now I'm supposed to learn how to align and map.

The final step is some sort of analysis; I'm not too sure what. Probably looking at SNPs and mutations and such, and then seeing if the results are consistent with the paper.
Old 06-17-2013, 05:21 AM   #11
prs321
Member
 
Location: US

Join Date: Jun 2013
Posts: 96
Default

bump 10char
Old 06-17-2013, 05:55 AM   #12
chadn737
Senior Member
 
Location: US

Join Date: Jan 2009
Posts: 392
Default

It's impossible to help you when you can't even answer basic questions about what you are trying to do. You should talk to your boss/professor and get a clear idea of what the goal is. Then make use of Google, Google Scholar, and PubMed and read papers on the subject. Go to the websites of the actual tools you use and study the manuals. After you have done your homework, ask the specific questions that you still have. As a researcher, one of the most important skills (perhaps the most important) is being able to find information on your own.

All times are GMT -8.