SEQanswers

Go Back   SEQanswers > Bioinformatics > Bioinformatics



Similar Threads
Thread Thread Starter Forum Replies Last Post
EDGE-pro into DESeq, DESeq run error? epistatic Bioinformatics 5 09-02-2015 04:32 AM
DESeq error clemen Bioinformatics 22 03-27-2014 04:59 AM
DESeq anil KR Illumina/Solexa 3 08-02-2013 06:38 PM
DESeq question drdna Bioinformatics 2 11-24-2012 11:06 AM
DESeq and edgeR papori Bioinformatics 3 05-15-2012 06:29 PM

Reply
 
Thread Tools
Old 01-13-2014, 09:16 AM   #1
emalandrak
Member
 
Location: Greece

Join Date: Jan 2014
Posts: 11
Default DESeq help

Hello everybody
I am new in RNAseq analysis, so I have some trouble trying to study differential gene expression. DEseq manual is really informative to work with but my question is about creating the input file. I found HT count but it needs a gff file and a SAM file with the aligned reads. If there is not a reference genome or transcriptome to download a gff file which is the way to construct it? I suppose by changing a tab delimited file from blastx but which exactly is the format for a gff as an input to HT count? Is there any automated software to annotate the constructed transcripts (output of e.g. Oases or Trinity) and create a gff?
Thanks in advance.
emalandrak is offline   Reply With Quote
Old 01-14-2014, 12:30 AM   #2
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

It might be easier to say what you do have and then work from there. Ultimately DESeq just needs a matrix of count values for different genes and it doesn't really care where they come from, so if you have count data already you can just make a matrix out of it and start from that point in DESeq.
simonandrews is offline   Reply With Quote
Old 01-14-2014, 04:38 AM   #3
emalandrak
Member
 
Location: Greece

Join Date: Jan 2014
Posts: 11
Default DE seq

Dear Simon
Thank you very much for your reply.
The fact is that I don't have the count data. I have the BAM files from tophat, where the reads are mapped on the contructed transcriptome from Trinity. The problem is how to produce the count table for DESeq. I tryed HTseq but I had problems on installing it (and also needs a gff). So i need the way to produce the count table for DEseq
emalandrak is offline   Reply With Quote
Old 01-14-2014, 04:47 AM   #4
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

If the reads are mapped to assembled transcripts already then your count table is simply going to be the number of hits to each different accession in the set of transcripts. You just need to collate the number of hits to each different sequence id.

There may well be a program around which collates this already, but if not then it's a very small script to generate this data from a set of BAM files. If no one knows of a pre-built solution for this and you're not confident in having a go yourself then I can put something together which does this pretty quickly.
simonandrews is offline   Reply With Quote
Old 01-14-2014, 04:57 AM   #5
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Having thought about this I found I did have a script which almost does what you want. This was something we used for miRNA mapping, but it's essentially the same idea you have in that you're mapping to a library of different sequences. The script below simply collates the hits to each different sequence in a set of SAM files (you could pipe the open of your bam files through "samtools view" to make it work from BAM files.

Code:
#!/usr/bin/perl
use warnings;
use strict;

my @sam_files = <*sam>;

my %counts;

for my $index (0..$#sam_files) {
    my $sam_file = $sam_files[$index];

    warn "Processing $sam_file\n";

    open (IN,$sam_file) or die $!;

    while (<IN>) {
	next if (/^@/);

	my (undef,undef,$mirna) = split(/\t/);
	next if ($mirna eq '*');

	$counts{$mirna}->[$index]++;
    }
}

open (OUT,'>','summarised_mirna_counts.txt') or die $!;

print OUT join("\t",("miRNA",@sam_files)),"\n";

foreach my $mirna (keys %counts) {

    my @counts = @{$counts{$mirna}};

    for (0..$#counts) {
	$counts[$_] = 0 unless ($counts[$_]);
    }

    for ($#counts+1..$#sam_files) {
	$counts[$_] = 0;
    }

    print OUT join("\t",($mirna,@counts)),"\n";
} 


close OUT or die $!;
simonandrews is offline   Reply With Quote
Old 01-14-2014, 05:54 AM   #6
twotwo
Member
 
Location: Ohio

Join Date: May 2012
Posts: 40
Default

Hi,
I used HTseq from python. First you need to install python, then HTseq.... Just follow the instruction.
twotwo is offline   Reply With Quote
Old 01-14-2014, 06:00 AM   #7
emalandrak
Member
 
Location: Greece

Join Date: Jan 2014
Posts: 11
Default DEseq help

Dear Simon
Thank you very much for your script.
But I can't understand the script, since I have no background on programming. Anyway, I ll read some Pearl and I will give it a try. Thank you very much.
emalandrak is offline   Reply With Quote
Old 01-14-2014, 06:01 AM   #8
emalandrak
Member
 
Location: Greece

Join Date: Jan 2014
Posts: 11
Default DEseq help

Thank you twotwo

I will give it a try again....
emalandrak is offline   Reply With Quote
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 04:50 PM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2019, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO