  • DESeq help

    Hello everybody,
    I am new to RNA-seq analysis and am having some trouble studying differential gene expression. The DESeq manual is really informative, but my question is about creating the input file. I found htseq-count, but it needs a GFF file and a SAM file with the aligned reads. If there is no reference genome or transcriptome from which to download a GFF file, how do I construct one? I suppose I could convert a tab-delimited file from blastx, but what exactly is the GFF format that htseq-count expects as input? Is there any automated software to annotate the assembled transcripts (the output of e.g. Oases or Trinity) and create a GFF?
    Thanks in advance.

  • #2
    It might be easier to say what you do have and then work from there. Ultimately DESeq just needs a matrix of count values for different genes and it doesn't really care where they come from, so if you have count data already you can just make a matrix out of it and start from that point in DESeq.
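
As a rough illustration of the matrix described above: one row per gene or transcript, one column per sample, raw integer counts in the cells. The Python sketch below only shows the layout. The sample names, transcript ids and the `write_count_matrix` helper are all made up for the example, and the tab-delimited output is simply one format that R can load with `read.table(..., header=TRUE, row.names=1)`.

```python
import io


def write_count_matrix(per_sample_counts, out):
    """Write a tab-delimited matrix: one row per feature, one column per sample."""
    samples = sorted(per_sample_counts)
    # Union of every feature id seen in any sample
    features = sorted({f for counts in per_sample_counts.values() for f in counts})
    out.write("\t".join(["id"] + samples) + "\n")
    for feature in features:
        # A feature with no hits in a sample gets an explicit 0
        row = [str(per_sample_counts[s].get(feature, 0)) for s in samples]
        out.write("\t".join([feature] + row) + "\n")


# Hypothetical counts for two samples (ids are illustrative only)
example = {
    "control_1": {"comp1_c0_seq1": 12, "comp2_c0_seq1": 3},
    "treated_1": {"comp1_c0_seq1": 30},
}
buffer = io.StringIO()
write_count_matrix(example, buffer)
matrix_text = buffer.getvalue()
print(matrix_text, end="")
```

Missing entries are written as 0 rather than left blank, so every row has the same number of columns.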



    • #3
      DE seq

      Dear Simon
      Thank you very much for your reply.
      The fact is that I don't have the count data. I have the BAM files from TopHat, where the reads are mapped to the transcriptome constructed with Trinity. The problem is how to produce the count table for DESeq. I tried HTSeq but I had problems installing it (and it also needs a GFF). So I need a way to produce the count table for DESeq.



      • #4
        If the reads are mapped to assembled transcripts already then your count table is simply going to be the number of hits to each different accession in the set of transcripts. You just need to collate the number of hits to each different sequence id.

        There may well be a program around which collates this already, but if not then it's a very small script to generate this data from a set of BAM files. If no one knows of a pre-built solution for this and you're not confident in having a go yourself then I can put something together which does this pretty quickly.
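
One pre-built option, offered as a suggestion rather than something tested on this data: `samtools idxstats` prints one tab-delimited line per reference sequence in a sorted, indexed BAM file, giving the sequence name, its length, and the numbers of mapped and unmapped reads. The Python sketch below parses that output into a dictionary. The `parse_idxstats` helper and the sample text are hypothetical.

```python
def parse_idxstats(text):
    """Parse `samtools idxstats` output into {reference_name: mapped_count}."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Columns: reference name, sequence length, mapped, unmapped
        name, _length, mapped, _unmapped = line.split("\t")
        if name == "*":  # final line summarises reads with no reference
            continue
        counts[name] = int(mapped)
    return counts


# Hypothetical idxstats output for two assembled transcripts
sample_output = (
    "comp1_c0_seq1\t1500\t42\t0\n"
    "comp2_c0_seq1\t800\t0\t0\n"
    "*\t0\t0\t17\n"
)
idx_counts = parse_idxstats(sample_output)
print(idx_counts)
```

In practice the text would come from running `samtools idxstats` on each BAM file after `samtools sort` and `samtools index`, for example through `subprocess.run(..., capture_output=True, text=True)`.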



        • #5
          Having thought about this I found I did have a script which almost does what you want. This was something we used for miRNA mapping, but it's essentially the same idea, in that you're mapping to a library of different sequences. The script below simply collates the hits to each different sequence in a set of SAM files (you could pipe the open of your BAM files through "samtools view" to make it work from BAM files).

          Code:
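          #!/usr/bin/perl
          use warnings;
          use strict;

          # Collate the number of hits to each reference sequence across
          # every SAM file in the current directory.
          my @sam_files = <*sam>;

          # Reference name => array of counts, one slot per SAM file
          my %counts;

          for my $index (0..$#sam_files) {
              my $sam_file = $sam_files[$index];

              warn "Processing $sam_file\n";

              # For BAM input, open a pipe from samtools instead, e.g.
              # open (my $in, '-|', 'samtools', 'view', $sam_file) or die $!;
              open (my $in, '<', $sam_file) or die "Can't read $sam_file: $!";

              while (<$in>) {
                  next if (/^@/);    # skip header lines

                  # Column 3 of a SAM record is the reference sequence name
                  my (undef,undef,$mirna) = split(/\t/);
                  next if ($mirna eq '*');    # unmapped read

                  $counts{$mirna}->[$index]++;
              }

              close $in or die $!;
          }

          open (my $out, '>', 'summarised_mirna_counts.txt') or die $!;

          # Header row: one column per SAM file
          print $out join("\t",("miRNA",@sam_files)),"\n";

          foreach my $mirna (sort keys %counts) {

              my @counts = @{$counts{$mirna}};

              # Fill in zeros for files with no hits to this sequence
              for (0..$#sam_files) {
                  $counts[$_] = 0 unless ($counts[$_]);
              }

              print $out join("\t",($mirna,@counts)),"\n";
          }

          close $out or die $!;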
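
For anyone more comfortable in Python, here is a minimal sketch of the same tally, including the "pipe through samtools view" idea for BAM input. It assumes `samtools` is on the PATH. The `count_hits` and `count_bam` names and the demo records are hypothetical, and the demo uses in-memory SAM lines so it runs without samtools.

```python
import subprocess
from collections import Counter


def count_hits(sam_lines):
    """Tally alignment records per reference name (SAM column 3)."""
    hits = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[2] == "*":  # malformed or unmapped
            continue
        hits[fields[2]] += 1
    return hits


def count_bam(bam_path):
    """Stream a BAM file through `samtools view` and tally its alignments."""
    with subprocess.Popen(["samtools", "view", bam_path],
                          stdout=subprocess.PIPE, text=True) as proc:
        return count_hits(proc.stdout)


# Hypothetical SAM records (ids and sequences are illustrative only)
demo = [
    "@SQ\tSN:comp1_c0_seq1\tLN:1500\n",
    "read1\t0\tcomp1_c0_seq1\t10\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read2\t16\tcomp1_c0_seq1\t90\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
]
demo_counts = count_hits(demo)
print(dict(demo_counts))
```

Like the Perl script, this counts alignment lines, so a read reported at several locations is counted once per location.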



          • #6
            Hi,
            I used HTSeq from Python. First you need to install Python, then HTSeq. Just follow the installation instructions.



            • #7
              DEseq help

              Dear Simon
              Thank you very much for your script.
              But I can't understand the script, since I have no background in programming. Anyway, I'll read up on Perl and give it a try. Thank you very much.



              • #8
                DEseq help

                Thank you twotwo

                I will give it a try again....
