Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • HTseq:Adding GTF annotation to SAM alignment

    Please help,

    I mapped my RNA-seq (Illumina) reads to E. coli index (index built from UCSC obtained FASTA file) using BowTie. Now I'm trying to assign gene annotation to the mapped reads. I decided to use HTSeq for this purpose. I use SAM file i obtained after running Bowtie (also used TopHat to get the SAM file) with GTF files from either UCSC Microbial or Ensembl Bacterial (I also tried converting them to GFF3 format). In either case, when I run HTseq-count using the command:
    $ HTseq-count <file.sam> <file.gff (or .gtf)>

    I get an error message such as:
    Warning: Skipping read 'HWI-ST330_0103:1:1:19499:9909#CGATGT/1', because chromosome 'gi|49175990|ref|NC_000913.2|', to which it has been aligned, did not appear in the GFF file.

    I also tried to provide GTF or GFF files while running TopHat, but it failed in that case also.

    Can someone one make a suggestion or has previously encountered such problem and knows how to solve it? Please help?!

    Also if anyone has done RNA-seq for bacteria, what is your workflow? My plan for now is BowTie (or TopHat) -> HTseq -> DEseq

    Appreciate any feedback, thank you!

    Max Bobrovskyy
    University of Illinois

  • #2
    So, does chromosome 'gi|49175990|ref|NC_000913.2|' appear in your GFF file or does it not? That would have been the first thing to check.

    Obviously, chromosomes need to be identified with the same name in you Bowtie index and in your GFF file. Your Bowtie index seems to use these strange lengthy chromosome identifiers that the FASTA files from UCSC always use.

    (A quick search on RefSeq tell me that NC_000913.2 is "Escherichia coli str. K-12 substr. MG1655 chromosome, complete genome". Is this what you want?)

    Now look into your GFF file, how is it called there? Not the same, I'd guess.

    You need to the reference sequence names in one of two file to what is used in the other. However, be very careful to make sure there that both files base their coordinates on the same built (assembly).

    Comment


    • #3
      Dear Simon,

      So, I downloaded "Escherichia coli str. K-12 substr. MG1655" genome in fasta (NC_000913.fna) and gff (NC_000913.gff) formats, from RefSeq ftp, so both files should be made from the same genome built (NC_000913.2). Here are the first 10 lines of each.
      FNA

      gi|49175990|ref|NC_000913.2| Escherichia coli str. K-12 substr. MG1655 chromosome, complete genome
      AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
      TTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAA
      TATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACC
      ATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAG
      CCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA
      GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC
      AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTG
      AAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT
      GACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCGCAATTGAAAACTTTCGTCGATCAGGAATTT


      GFF

      ##gff-version 3
      #!gff-spec-version 1.14
      #!source-version NCBI C++ formatter 0.2
      ##Type DNA NC_000913.2
      NC_000913.2 RefSeq source 1 4639675 . + . organism=Escherichia coli str. K-12 substr. MG1655;mol_type=genomic DNA;strain=K-12;sub_strain=MG1655;db_xref=taxon:511145
      NC_000913.2 RefSeq gene 190 255 . + . ID=NC_000913.2:thrL;locus_tag=b0001;gene_synonym=ECK0001%3B JW4367;db_xref=ECOCYC:EG11277;db_xref=EcoGene:EG11277;db_xref=GeneID:944742
      NC_000913.2 RefSeq CDS 190 252 . + 0 ID=NC_000913.2:thrL:unknown_transcript_1;Parent=NC_000913.2:thrL;locus_tag=b0001;gene_synonym=ECK0001%3B JW4367;function=leader%3B Amino acid biosynthesis: Threonine;function=1.5.1.8 metabolism%3B building block biosynthesis%3B amino acids%3B threonine;GO_process=GO:0009088 - threonine biosynthetic process;transl_table=11;product=thr operon leader peptide;protein_id=NP_414542.1;db_xref=GI:16127995;db_xref=ASAP:ABE-0000006;db_xref=UniProtKB%2FSwiss-Prot:P0AD86;db_xref=ECOCYC:EG11277;db_xref=EcoGene:EG11277;db_xref=GeneID:944742;exon_number=1
      NC_000913.2 RefSeq start_codon 190 192 . + 0 ID=NC_000913.2:thrL:unknown_transcript_1;Parent=NC_000913.2:thrL;locus_tag=b0001;gene_synonym=ECK0001%3B JW4367;function=leader%3B Amino acid biosynthesis: Threonine;function=1.5.1.8 metabolism%3B building block biosynthesis%3B amino acids%3B threonine;GO_process=GO:0009088 - threonine biosynthetic process;transl_table=11;product=thr operon leader peptide;protein_id=NP_414542.1;db_xref=GI:16127995;db_xref=ASAP:ABE-0000006;db_xref=UniProtKB%2FSwiss-Prot:P0AD86;db_xref=ECOCYC:EG11277;db_xref=EcoGene:EG11277;db_xref=GeneID:944742;exon_number=1
      NC_000913.2 RefSeq stop_codon 253 255 . + 0 ID=NC_000913.2:thrL:unknown_transcript_1;Parent=NC_000913.2:thrL;locus_tag=b0001;gene_synonym=ECK0001%3B JW4367;function=leader%3B Amino acid biosynthesis: Threonine;function=1.5.1.8 metabolism%3B building block biosynthesis%3B amino acids%3B threonine;GO_process=GO:0009088 - threonine biosynthetic process;transl_table=11;product=thr operon leader peptide;protein_id=NP_414542.1;db_xref=GI:16127995;db_xref=ASAP:ABE-0000006;db_xref=UniProtKB%2FSwiss-Prot:P0AD86;db_xref=ECOCYC:EG11277;db_xref=EcoGene:EG11277;db_xref=GeneID:944742;exon_number=1
      NC_000913.2 RefSeq gene 337 2799 . + . ID=NC_000913.2:thrA;locus_tag=b0002;gene_synonym=ECK0002%3B Hs%3B JW0001%3B thrA1%3B thrA2%3B thrD;db_xref=ECOCYC:EG10998;db_xref=EcoGene:EG10998;db_xref=GeneID:945803



      I made the index from FNA file using bowtie-build. Then I mapped my Trimmed and Groomed Illumina reads to this index. Next I tried to use htseq-count again inputting SAM file from bowtie with the GFF file described above. I get the following Warning again:

      mbobro2:tophat bobrovskyy$ htseq-count -t -h ./output/pHDB3_5_bowtie.sam ./genomes/RefSeq_Genomes/NC_000913.gff
      52211 GFF lines processed.
      Warning: No features of type '-h' found.
      Warning: Skipping read 'HWI-ST330_0103:1:1:1355:2205#CGATGT/1', because chromosome 'gi|49175990|ref|NC_000913.2|', to which it has been aligned, did not appear in the GFF file.
      Warning: Skipping read 'HWI-ST330_0103:1:1:1355:2205#CGATGT/1', because chromosome 'gi|49175990|ref|NC_000913.2|', to which it has been aligned, did not appear in the GFF file.


      ...and keeps printing this warning over and over.

      I'm having hard time figuring out what part of FNA or GFF does not match and how to fix it, as I'm new to the unix script as well as bioinformatics. I really appreciate your help and I would love to make it work. Several other people I know have never used Bowtie and HTseq with e. coli RNAseq, so I think figuring out the way would be extremely useful for our e.coli community. Thank you very much and appreciate your help!


      Max Bobrovskyy
      University of Illinois
      Last edited by mbobro2; 05-05-2011, 02:21 PM.

      Comment


      • #4
        In your GTF file, the chromosome is called "NC_000913.2", in the FASTA file, it is called "gi|49175990|ref|NC_000913.2|". You cannot expect a script to recognize these as the same.

        I usually download my data from Ensembl, which uses shorter identidiers and is more consistent between FASTA and GTF, so I didn't address this issue so far in HTSeq.

        You could edit your FASTA file to remove the extra stuff from the sequence name and start over.

        Alternatively you could write a little script to remove the extra characters from the SAM files. This Python script here might do the trick:

        Code:
        import HTSeq
        
        for a in HTSeq.SAM_Reader( "myfile.sam" ):
           if a.aligned:
              a.iv.chrom = a.iv.chrom.split("|")[3]
           print a.get_sam_line()
        It splits each chromosome name at the vertical bars and then retains only the part between the third and fourth bar. As I don't have your SAM file, I couldn't test it, but it should be simple to adjust.

        Simon

        Comment


        • #5
          Dear Simon,

          I have tried to use Ensembl FASTA file for bowtie index and GFF file for HTseq-count, nevertheless chromosome names did not match. I slightly modified FASTA name to obtain matching chromosome name "Chromosome" in the SAM file. GTF calls it "Chromosome" too. Still, HTseq gives an error. Here are the headers of SAM and GTF files with the error I get when I run HTseq:

          SAM
          @HD VN:1.0 SO:unsorted
          @SQ SN:Chromosome LN:4639675
          @PG ID:Bowtie VN:0.12.7 CL:"./bowtie -S -q -a -t -v 1 ./indexes/Ensembl_Indexes_2/ecoli_k12_index ./rnaseq/1_CV104_pHDB3_5.fastqsanger.txt"
          HWI-ST330_0103:1:1:1210:2132#NGATGT/1 4 * 0 0 * NCCCATTCGGAAATCGCCGGTTATAACGGTTCATATCACCTTACCGACGCTTATCGCAGATTAGCACGTCCTTCATCGCCTCTGACAGA #96828<598@@@@@DDDDDDDDDD<<<><>====DDDDD??????=?7??=?;?;??????==?5<787:::;:66599<<<<<8479 XM:i:0
          HWI-ST330_0103:1:1:1202:2211#NGATGT/1 4 * 0 0 * CTGGCAGTCAGAGGCGATGAAGGACGTGCTAATCTGCNATAAGNGTCGGTAAGGTGATATGAACCGTTATAACCGGCGATTTCCTAATG DDDD@BEEEEDEDCE?DEEE<5?@@:549;@?A######################################################## XM:i:0
          HWI-ST330_0103:1:1:1268:2150#NGATGT/1 4 * 0 0 * CTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAACGGTAACAGGAAGAAGCTTGCTTCTTTGCTTACGAGTGGCGGGCG FBFDDEBEAEGEGEFAGFCEFFFFBDAAC.???0;A?D,?DC5DCCC<C;A:@7>?################################# XM:i:0
          HWI-ST330_0103:1:1:1355:2205#CGATGT/1 0 Chromosome 224030 255 89M * 0 0 GTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGG FF:CFEDEEEFGGGGEHFHDFHFFHGGBF?D5@C?GFEBECEDE;>ED?D=DEE8ED<ACCF>9CA(>AAA:=6?DD=AA;?D1@#### XA:i:0 MD:Z:89NM:i:0
          HWI-ST330_0103:1:1:1355:2205#CGATGT/1 0 Chromosome 4164941 255 89M * 0 0 GTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGG FF:CFEDEEEFGGGGEHFHDFHFFHGGBF?D5@C?GFEBECEDE;>ED?D=DEE8ED<ACCF>9CA(>AAA:=6?DD=AA;?D1@#### XA:i:0 MD:Z:89NM:i:0
          HWI-ST330_0103:1:1:1355:2205#CGATGT/1 0 Chromosome 4033813 255 89M * 0 0 GTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGG FF:CFEDEEEFGGGGEHFHDFHFFHGGBF?D5@C?GFEBECEDE;>ED?D=DEE8ED<ACCF>9CA(>AAA:=6?DD=AA;?D1@#### XA:i:0 MD:Z:89NM:i:0
          HWI-ST330_0103:1:1:1355:2205#CGATGT/1 0 Chromosome 4206429 255 89M * 0 0 GTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGG FF:CFEDEEEFGGGGEHFHDFHFFHGGBF?D5@C?GFEBECEDE;>ED?D=DEE8ED<ACCF>9CA(>AAA:=6?DD=AA;?D1@#### XA:i:0 MD:Z:89NM:i:0


          GTF
          Chromosome protein_coding exon 148 5020 . + . gene_id "EBESCG00000001716"; transcript_id "EBESCT00000002097"; exon_number "1"; gene_name "thrB"; transcript_name "thrB-1";
          Chromosome protein_coding CDS 2801 3730 . + 0 gene_id "EBESCG00000001716"; transcript_id "EBESCT00000002097"; exon_number "1"; gene_name "thrB"; transcript_name "thrB-1"; protein_id "EBESCP00000002097";
          Chromosome protein_coding start_codon 2801 2803 . + 0 gene_id "EBESCG00000001716"; transcript_id "EBESCT00000002097"; exon_number "1"; gene_name "thrB"; transcript_name "thrB-1";
          Chromosome protein_coding stop_codon 3731 3733 . + 0 gene_id "EBESCG00000001716"; transcript_id "EBESCT00000002097"; exon_number "1"; gene_name "thrB"; transcript_name "thrB-1";
          Chromosome protein_coding exon 148 5020 . + . gene_id "EBESCG00000000900"; transcript_id "EBESCT00000001087"; exon_number "1"; gene_name "thrL"; transcript_name "thrL-1";
          Chromosome protein_coding CDS 190 252 . + 0 gene_id "EBESCG00000000900"; transcript_id "EBESCT00000001087"; exon_number "1"; gene_name "thrL"; transcript_name "thrL-1"; protein_id "EBESCP00000001087";
          Chromosome protein_coding start_codon 190 192 . + 0 gene_id "EBESCG00000000900"; transcript_id "EBESCT00000001087"; exon_number "1"; gene_name "thrL"; transcript_name "thrL-1";
          Chromosome protein_coding stop_codon 253 255 . + 0 gene_id "EBESCG00000000900"; transcript_id "EBESCT00000001087"; exon_number "1"; gene_name "thrL"; transcript_name "thrL-1";
          Chromosome protein_coding exon 148 5020 . + . gene_id "EBESCG00000002850"; transcript_id "EBESCT00000003475"; exon_number "1"; gene_name "thrC"; transcript_name "thrC-1";
          Chromosome protein_coding CDS 3734 5017 . + 0 gene_id "EBESCG00000002850"; transcript_id "EBESCT00000003475"; exon_number "1"; gene_name "thrC"; transcript_name "thrC-1"; protein_id "EBESCP00000003475";


          HTseq error
          Maksym-Bobrovskyys-MacBook-Pro:tophat bobrovskyy$ htseq-count -t -h ./output/pHDB3_5/pHDB3_5_bowtie.sam ./genomes/Ensembl_Genomes/e_coli_k12.EB1_e_coli_k12.60.gtf
          20586 GFF lines processed.
          Warning: No features of type '-h' found.
          Warning: Skipping read 'HWI-ST330_0103:1:1:1355:2205#CGATGT/1', because chromosome 'Chromosome', to which it has been aligned, did not appear in the GFF file.
          Warning: Skipping read 'HWI-ST330_0103:1:1:1355:2205#CGATGT/1', because chromosome 'Chromosome', to which it has been aligned, did not appear in the GFF file.
          Warning: Skipping read 'HWI-ST330_0103:1:1:1355:2205#CGATGT/1', because chromosome 'Chromosome', to which it has been aligned, did not appear in the GFF file.
          Warning: Skipping read 'HWI-ST330_0103:1:1:1355:2205#CGATGT/1', because chromosome 'Chromosome', to which it has been aligned, did not appear in the GFF file.
          Warning: Skipping read 'HWI-ST330_0103:1:1:1355:2205#CGATGT/1', because chromosome 'Chromosome', to which it has been aligned, did not appear in the GFF file.
          Warning: Skipping read 'HWI-ST330_0103:1:1:1355:2205#CGATGT/1', because chromosome 'Chromosome', to which it has been aligned, did not appear in the GFF file.
          Warning: Skipping read 'HWI-ST330_0103:1:1:1355:2205#CGATGT/1', because chromosome 'Chromosome', to which it has been aligned, did not appear in the GFF file.
          Warning: Skipping read 'HWI-ST330_0103:1:1:1303:2244#CGATGT/1', because chromosome 'Chromosome', to which it has been aligned, did not appear in the GFF file.


          I tried different combinations of FASTA and GTF files and it wouldn't work, I would still get the same message. I really appreciate your help on this!

          I tried the script you provided me with in order to change SAM file from my previous example, but because my SAM file is 35Gb (~25 Million reads mapped) It was problematic for me to finish the job on my MacBook as it takes most of CPU (we are getting access to a local cluster/server, so this job will become easier and faster). But I figured it would be easier to provide appropriate FASTA from the beginning as I have other 5 datasets that require same sort of analysis.

          Please let me know what you think! I'm sure there is a simple solution that I just don't see. Thank you in advance!


          Max Bobrovskyy
          University of Illinois
          Last edited by mbobro2; 05-10-2011, 11:02 AM.

          Comment


          • #6
            Your problem is here:

            Originally posted by mbobro2 View Post
            Maksym-Bobrovskyys-MacBook-Pro:tophat bobrovskyy$ htseq-count -t -h ./output/pHDB3_5/pHDB3_5_bowtie.sam ./genomes/Ensembl_Genomes/e_coli_k12.EB1_e_coli_k12.60.gtf
            20586 GFF lines processed.
            Warning: No features of type '-h' found.
            You instructed HTSeq-count to only look at lines in the GFF files with feature type (i.e., third column) "-h". As such lines don't exist, it skips the whole GFF file. Omit the "-t -h" in your command line.

            Comment


            • #7
              Yay, thank you very much! It seems to be working! Highly appreciate your help! I might have some more questions on how to generate a tab-delimited table with multiple samples but I know there is a section on that in the manual. Hopefully I will be able to figure it out.

              Once again, thank you for your help and wonderful software!

              Max Bobrovskyy
              University of Illinois

              Comment


              • #8
                Dear Simon,

                So I ran HTseq and it seems like it did count some of the reads and what I obtained is this:


                EBESCG00000102254 40
                EBESCG00000102255 1
                EBESCG00000102256 1
                EBESCG00000102257 343
                EBESCG00000102258 7
                EBESCG00000102260 0
                EBESCG00000195128 0
                EBESCG00000195129 0
                EBESCG00000210071 0
                EBESCG00000210072 188
                EBESCG00000210073 0
                EBESCG00000210074 0
                EBESCG00000210075 67
                no_feature 55630126
                ambiguous 2951860
                too_low_aQual 0
                not_aligned 3760807
                alignment_not_unique 0


                What I don't understand is why I get Gene_ID instead of Gene_Name in the first column! Can you suggest a way to fix this!? (Unless I have to use TopHat with a gtf file for annotation first, which I thought was not necessary for bacteria due to lack of splice junctions).

                Also, I'm not sure but it seams like get a lot of reads that have no_feature, ambiguous and not_aligned. Is there a way to improve or is this a common thing?

                Max Bobrovskyy
                University of Illinois

                Comment


                • #9
                  Simon,

                  I think all I need is to use -i gene_name option so that the final tab_delimited file will contain gene_name attribute instead of gene_id! I'm trying it right now. Sorry for unnecessary question! Let me know in case I'm wrong!

                  Nevertheless, number of no_feature, ambiguous and not_aligned reads worries me! What are your thoughts on it?

                  PS: Is it ok to make a tab_delimited table with multiple samples simply by pasting final result for two samples into excel and saving as tab delimited! If not what would be an easier way to do it? This table is intended to be used in DEseq. Thank you!

                  Max Bobrovskyy
                  University of Illinois

                  Comment


                  • #10
                    bacterial genome diff expression

                    Finally did u manage with the analysis. Even I am working with bacterial genome and your input will be very helpful

                    Thank you

                    Comment


                    • #11
                      I have managed to get differential expression values using DEseq. Nevertheless, I have a small problem, some of my genes show 0 reads, whereas I know for a fact they are expressed. One example is sRNA SgrS, which shows up to have 0 reads but I know it is expressed because a Northern blot was performed on that sample. Also SgrT (small prot. encoded in SgrS is not expressed). Another example is manXYZ genes, which should be expressed but also show up as 0 reads. FASTA and GTF files obtained from Ensembl Bacteria. Single end priming was used in RNAseq.

                      I used bowtie for mapping with the following options:

                      ./bowtie -S -q -a -t -v 2 ./indexes/Ensembl_Indexes_2/ecoli_k12_index ./rnaseq/pHDB3_5.fastqsanger.txt > ./output/rnaseq_2/pHDB3_5_bowtie.sam

                      I used HTseq for annotation using the following options:

                      htseq-count -m intersection-nonempty -i gene_name ./output/pLCV1_20_bowtie.sam ./genomes/Ensembl_Genomes/e_coli_k12.EB1_e_coli_k12.60.gtf > ./output/rnaseq_3/pLCV1_20_htseq.txt

                      Can anyone make any suggestions as to why I obtain 0 reads for some obviously expressed transcripts and how to modify my workflow in order to fix this?! Thank you!

                      Comment


                      • #12
                        Have you looked into you SAM file at the locus of the gene with a gene browser (e.g., IGV)?

                        Comment


                        • #13
                          I am using the following commands, I get lot of reads as "no_feature", I tried union as well as intersection strict, but no much change in the nubmer of no_feature. only the ambiguous reads nubmer has change. My total number of reads is around 18 million. So I am worried in this regard.

                          By the by what is the no_feature mean: are these number of reads not mapped?

                          Please suggest what might be wrong with my htseq command. Thank you

                          ===
                          htseq-count -m union -s yes -t CDS -i locus_tag -o WT-1_geneid_htseq_out3 sorted.sam /home/sun/Rna_seq_data/Xylella/chrComplete.gff > PD_out_union &
                          no_feature 14867530
                          ambiguous 4453
                          too_low_aQual 0
                          not_aligned 1913205
                          alignment_not_unique 0

                          htseq-count -m intersection-strict -s yes -t CDS -i locus_tag -o WT-1_geneid_htseq_out2 sorted.sam /home/sun/Rna_seq_data/Xylella/chrComplete.gff > PD_out &
                          no_feature 14905057
                          ambiguous 13
                          too_low_aQual 0
                          not_aligned 1913205
                          alignment_not_unique 0

                          Comment


                          • #14
                            mbobro2, may be do u think we need to sort the sam file before HTSeq count, sometimes if we dont sort it can give odd results

                            Comment


                            • #15
                              Originally posted by sunkorner View Post
                              By the by what is the no_feature mean: are these number of reads not mapped?
                              No, it means that the read does not overlap with any feature in your GTF file, i.e., that it falls onto intergenic or intronic space.

                              To investigate, first display your SAM file and you GTF file alongside in a genome browser (e.g., IGV) and inspect visually, whether there are indeed many reads that fall between features.

                              On the other hand: Is this your complete output? Not a single feature? Or did you cut them off.

                              If not, are you sure your GFF file fits your options? You have instructed htseq-count to only use those lines in your GFF file that have the string "CDS" in its third field, and have a suitable attribute called "locus_tag" in the last field. Maybe post an extract of the GFF file.

                              Comment

                              Latest Articles

                              Collapse

                              • seqadmin
                                Current Approaches to Protein Sequencing
                                by seqadmin


                                Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
                                04-04-2024, 04:25 PM
                              • seqadmin
                                Strategies for Sequencing Challenging Samples
                                by seqadmin


                                Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
                                03-22-2024, 06:39 AM

                              ad_right_rmr

                              Collapse

                              News

                              Collapse

                              Topics Statistics Last Post
                              Started by seqadmin, 04-11-2024, 12:08 PM
                              0 responses
                              17 views
                              0 likes
                              Last Post seqadmin  
                              Started by seqadmin, 04-10-2024, 10:19 PM
                              0 responses
                              22 views
                              0 likes
                              Last Post seqadmin  
                              Started by seqadmin, 04-10-2024, 09:21 AM
                              0 responses
                              16 views
                              0 likes
                              Last Post seqadmin  
                              Started by seqadmin, 04-04-2024, 09:00 AM
                              0 responses
                              46 views
                              0 likes
                              Last Post seqadmin  
                              Working...
                              X