#1 |
Junior Member
Location: Malaysia Join Date: Jan 2016
Posts: 5
Dear All,
I ran RNA-seq on my bacterial (Salmonella) samples for a differential expression study, using the Cufflinks pipeline. When I checked the cuffdiff results, they showed 3752 genes, but the Salmonella reference gene list shows a total of 4949 genes. Why is there such a large difference in the gene numbers? As I understand it, some genes may not be detected because they are inactive or are hypothetical genes. However, this is not a satisfactory answer for me, and I would like to know if anyone has an explanation. I may be searching with the wrong keywords, so could someone please point me in the right direction to read up on this? Thank you.
#2 |
Member
Location: France Join Date: Dec 2015
Posts: 39
Hi,
I think it's a good result! I'm not a specialist in bacteria, but if I remember right, there is a lot of diversity within populations and the genome dynamics are intense (I'm speaking about bacteria in general; I don't know whether this holds for Salmonella). If I'm right, once you add up genomic diversity, temporal expression, and extraction bias, ending up with about 75% of the described genes is not bad, don't you think? In the end it depends on what you are looking for. Another point: I don't know which databases you use, but you have to be careful, because they contain two types of information:
- Described genes (expressed protein, function, etc.)
- Predicted genes, based on functional domains described in other proteins and on genomic structure/patterns
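To see which genes are actually missing instead of only comparing totals, one option is to diff the gene IDs in the annotation against those in the cuffdiff table. Below is a minimal Python sketch, assuming the reference annotation is a GTF with `gene_id "..."` attributes and the cuffdiff table is tab-delimited with a `gene_id` header column. The function names and the file names in the usage comments are hypothetical, not part of Cufflinks.

```python
import csv
import re

# Matches the gene_id attribute in column 9 of a GTF line.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def ids_from_gtf(lines):
    """Collect the distinct gene_id values found in GTF lines."""
    ids = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        match = GENE_ID_RE.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def ids_from_cuffdiff(lines, column="gene_id"):
    """Collect the distinct values of one column of a tab-delimited table.

    Comma-separated entries are split, and "-" placeholders are skipped.
    """
    ids = set()
    for row in csv.DictReader(lines, delimiter="\t"):
        for name in (row.get(column) or "").split(","):
            name = name.strip()
            if name and name != "-":
                ids.add(name)
    return ids


def summarize(reference_ids, reported_ids):
    """Return (fraction of reference genes reported, sorted missing IDs)."""
    if not reference_ids:
        return 0.0, []
    found = reference_ids & reported_ids
    missing = sorted(reference_ids - reported_ids)
    return len(found) / len(reference_ids), missing


# Hypothetical usage (file names are assumptions):
# with open("reference.gtf") as gtf, open("gene_exp.diff") as diff:
#     fraction, missing = summarize(ids_from_gtf(gtf), ids_from_cuffdiff(diff))
#     print(f"{fraction:.1%} of reference genes reported; {len(missing)} missing")
```

For the totals in this thread, 3752 / 4949 is about 75.8%. If the IDs in the cuffdiff table were reassigned upstream (for example, to locus-style IDs by a merge step), comparing on the `gene` column via `column="gene"` may be the better match. The list of missing IDs can then be checked against the annotation to see whether they are, for instance, mostly hypothetical proteins.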
#3 |
Junior Member
Location: Malaysia Join Date: Jan 2016
Posts: 5
Hi Tristan,
Thanks for the input. I had not taken the RNA extraction bias or the sequencing bias into consideration at all. Thanks for your help. I was using the RefSeq complete genome from NCBI as my reference.