SEQanswers

Old 01-08-2015, 05:41 PM   #1
arkilis
Senior Member
 
Location: Australia

Join Date: Jul 2013
Posts: 119
How to use MAFFT to find KIR genes?

I recently got a batch of data from a MiSeq. Is there a way to use MAFFT to find the KIR genes in it?

I searched online for MAFFT usage, but there seems to be nowhere to specify a KIR database.

http://mafft.cbrc.jp/alignment/software/

Many thanks!
Old 01-08-2015, 06:10 PM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,079

By KIR, do you mean killer cell immunoglobulin-like receptor (KIR) genes? Why not get the sequences you are interested in, use BBSplit to fish out the MiSeq reads that map to them, and then run MAFFT or some other multiple sequence alignment (MSA) tool on the sequences of interest?
Old 01-11-2015, 04:04 PM   #3
arkilis
Senior Member
 
Location: Australia

Join Date: Jul 2013
Posts: 119

Thanks GenoMax.

I tried the whole weekend. BBSplit/BBMap looks like an alignment tool. I have the KIR gene database and need to find out which of its genes (KIR) are present in my sequence files. Any clue on this?

Thanks again!
Old 01-11-2015, 05:00 PM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,079

Can you elaborate on what "database" means here? You have KIR gene sequences, but in what format (plain text, a BLAST database, or one of the NGS aligner index formats)?

BBSplit uses alignments to identify reads that map to the sequences of interest and then separates them from the rest. This can also work the other way around, i.e., you discard the reads that map as uninteresting and collect the unmapped reads for further processing.
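
To make the idea of separating reads concrete, here is a toy sketch in shell and awk. It is not how BBSplit works internally: BBSplit aligns reads, whereas this sketch only checks whether a read shares one exact k-mer with a reference sequence on the forward strand. The function name `split_reads`, the default k of 8, and the file names in the usage line are all made up for illustration.

```shell
#!/usr/bin/env bash
# Toy read splitter (illustration only, not the BBSplit algorithm).
# A read is "matched" if it shares at least one exact k-mer with any
# reference sequence; reverse complements and mismatches are ignored.
# Usage: split_reads ref.fa reads.fq matched.fq unmatched.fq [k]
split_reads() {
    : > "$3"    # start with empty output files
    : > "$4"
    awk -v k="${5:-8}" -v matched="$3" -v unmatched="$4" '
        # Store every k-mer of a reference sequence in the kmers set.
        function index_seq(s,    i) {
            for (i = 1; i + k - 1 <= length(s); i++) kmers[substr(s, i, k)] = 1
        }
        # Return 1 if any k-mer of the read is in the kmers set.
        function has_kmer(s,    i) {
            for (i = 1; i + k - 1 <= length(s); i++)
                if (substr(s, i, k) in kmers) return 1
            return 0
        }
        # First file: reference FASTA (sequence lines may be wrapped).
        NR == FNR {
            if ($0 ~ /^>/) { index_seq(seq); seq = "" }
            else           { seq = seq toupper($0) }
            next
        }
        # Second file: FASTQ, assumed to have exactly 4 lines per read.
        FNR == 1 { index_seq(seq) }          # index the last reference record
        { rec[(FNR - 1) % 4] = $0 }
        FNR % 4 == 0 {
            out = has_kmer(toupper(rec[1])) ? matched : unmatched
            print rec[0] "\n" rec[1] "\n" rec[2] "\n" rec[3] >> out
        }
    ' "$1" "$2"
}
```

A call such as `split_reads KIR.fa reads.fq hits.fq rest.fq` would then write reads sharing a k-mer with the reference to `hits.fq` and everything else to `rest.fq`. Swapping which output file you keep gives the "other way around" case described above.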
Old 01-11-2015, 05:19 PM   #5
arkilis
Senior Member
 
Location: Australia

Join Date: Jul 2013
Posts: 119

Sorry for the confusion.

The KIR genes are from www.ebi.ac.uk/ipd/kir, in FASTA format. I would like to use this as the database to find out how many of them are present in my current sequence files. I read some papers online which say that all the sequences (FASTQ) need to be assembled first: http://www.nature.com/nrg/journal/v1...l/nrg3174.html

I am new to this area. Thanks for your advice!

Ben

Last edited by arkilis; 01-11-2015 at 05:39 PM.
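
As a quick sanity check on a downloaded FASTA file, it can help to see how many reference sequences it holds and how they are named. The sketch below uses only `grep` and `head`; the function names are invented for this example, and `KIR.fa` in the usage line is a placeholder for whatever the downloaded file is called.

```shell
#!/usr/bin/env bash
# Count FASTA records: each record starts with a ">" header line.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Print the first few header lines (default 5) to see how sequences are named.
show_fasta_headers() {
    grep '^>' "$1" | head -n "${2:-5}"
}
```

For example, `count_fasta_records KIR.fa` prints the number of sequences in the file, and `show_fasta_headers KIR.fa 10` lists the first ten headers. This only describes the reference file; it says nothing yet about which of those sequences occur in the reads.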
Old 01-11-2015, 05:41 PM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,079

If your KIR files are in FASTA format, then here is what you do (the commands are for reference only; adjust them as needed):

If you have single-end reads:
Code:
$ bbsplit.sh in=reads.fq ref=KIR.fa outu=things_that_do_not_match.fq outm=reads_match_KIR.fq
Reads that match sequences in KIR.fa are collected in the "reads_match_KIR.fq" file; the rest are put in the other file.

If you have paired-end reads:
Code:
$ bbsplit.sh in1=reads1.fq in2=reads2.fq ref=KIR.fa outu1=no_match_1.fq outu2=no_match_2.fq outm1=matched_to_KIR1.fq outm2=matched_to_KIR_2.fq
Note: This should definitely work with a single KIR sequence file (it may also work with a multi-FASTA file). Give it a try and see what happens.
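
Once the split has run, one simple way to see what happened is to count the reads in each output file. The sketch below assumes plain (uncompressed) FASTQ with exactly four lines per read; `count_reads` is a name made up for this example, and the files to check are passed as arguments rather than hard-coded.

```shell
#!/usr/bin/env bash
# Count reads in a FASTQ file, assuming the usual 4-lines-per-read layout
# (no wrapped sequence or quality lines, not gzip-compressed).
count_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Report the read count for every file named on the command line.
for f in "$@"; do
    if [ -f "$f" ]; then
        printf '%s\t%s\n' "$f" "$(count_reads "$f")"
    fi
done
```

Saved as, say, `count_reads.sh` (a hypothetical name), it could be run as `bash count_reads.sh reads_match_KIR.fq things_that_do_not_match.fq`. An empty or nearly empty "match" file would suggest the reference or the input paths need a second look.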
Old 01-12-2015, 09:47 PM   #7
arkilis
Senior Member
 
Location: Australia

Join Date: Jul 2013
Posts: 119

In the end I used the command you suggested:

bbsplit.sh in1=read1.fastq in2=read2.fastq ref=KIR_nuc.fasta outu1=no_match_1.fq outu2=no_match_2.fq outm1=matched_to_KIR1.fq outm2=matched_to_KIR2.fq

I will discuss the results with colleagues and get back to you. Thanks!
Tags
kir, mafft
