SEQanswers

Old 04-06-2013, 08:38 AM   #1
dsmarcoantonio
Junior Member
 
Location: Brazil, Ribeirão Preto

Join Date: Sep 2012
Posts: 5
Redundancy in TCGA data

MAF files in TCGA have redundant rows for most genes (see the example below). Can anyone help me with this issue? It happens not only in mutation annotation format (MAF) files but also in the RNA-seq data. I don't know what this means.
Why is the same gene repeated across several rows?

Ex:
A4GNT 51146 broad.mit.edu 37 3 137843364 137843364 + Silent SNP T C C TCGA-CJ-4882-01A-02D-1429-08 TCGA-CJ-4882-11A-01D-1429-08 Somatic Phase_I Capture Illumina GAIIx
A4GNT 51146 broad.mit.edu 37 3 137843364 137843364 + Silent SNP T C C TCGA-CJ-4882-01A-02D-1429-08 TCGA-CJ-4882-11A-01D-1429-08 Somatic Phase_I Capture Illumina GAIIx
A4GNT 51146 broad.mit.edu 37 3 137843364 137843364 + Silent SNP T C C TCGA-CJ-4882-01A-02D-1429-08 TCGA-CJ-4882-11A-01D-1429-08 Somatic Phase_I Capture Illumina GAIIx
A4GNT 51146 broad.mit.edu 37 3 137843364 137843364 + Silent SNP T C C TCGA-CJ-4882-01A-02D-1429-08 TCGA-CJ-4882-11A-01D-1429-08 Somatic Phase_I Capture Illumina GAIIx
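
One way to confirm that rows like these are byte-for-byte duplicates inside a single file is to count repeated lines. This is only a minimal sketch, assuming a tab-delimited MAF; the temporary file and its shortened rows are made up for illustration (the `GENEX` row is hypothetical):

```shell
# Hypothetical toy file standing in for a real MAF (tab-delimited, columns shortened).
maf=$(mktemp)
printf 'A4GNT\t51146\t137843364\tTCGA-CJ-4882-01A-02D-1429-08\n' >  "$maf"
printf 'A4GNT\t51146\t137843364\tTCGA-CJ-4882-01A-02D-1429-08\n' >> "$maf"
printf 'GENEX\t0\t100\tTCGA-CJ-4882-01A-02D-1429-08\n'           >> "$maf"

# List every line that occurs more than once, prefixed by its count.
dup_report=$(sort "$maf" | uniq -cd)
echo "$dup_report"

# De-duplicate while keeping the first occurrence and the original row order.
dedup=$(awk '!seen[$0]++' "$maf")
echo "$dedup"

rm -f "$maf"
```

If `uniq -cd` prints nothing for a real file, the repeats are probably coming from somewhere else, for example from combining several files.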

Thanks a lot
Old 04-06-2013, 09:33 AM   #2
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700

There are multiple versions of the KIRC MAFs in various directories.
I'm not sure which one is gospel and which ones are apocryphal.

____ my notes ... _____

-bash-3.00$ grep -n "A4GNT" *.ma* | grep 137843364 | grep "TCGA.CJ.4882" | cut -f1-3
An_KIRC_Freeze_1.4_Broad_184_Capture.release.cleaned.maf:10685:A4GNT 51146 broad.mit.edu
An_KIRC_Freeze_1.4_Broad_184_Capture.release.cleaned.maf.1:10685:A4GNT 51146 broad.mit.edu
BI_and_BCM_1.4.aggregated.tcga.somatic.maf:21903:A4GNT 51146 broad.mit.edu
BI_and_BCM_1.4.aggregated.tcga.somatic.maf.1:21903:A4GNT 51146 broad.mit.edu
BI_and_BCM_1.4.aggregated.tcga.somatic.maf.2:21903:A4GNT 51146 broad.mit.edu
BI_and_BCM_1.4.aggregated.tcga.somatic.maf.3:21903:A4GNT 51146 broad.mit.edu
PR_TCGA_KIRC_PAIR_Capture_All_Pairs_QCPASS.aggregated.capture.tcga.uuid.somatic.maf:21924:A4GNT 51146 broad.mit.edu
-bash-3.00$ ls -l An_KIRC_Freeze_1.4_Broad_184_Capture.release.cleaned.maf
-rw-rw-r-- 1 finneyr finneyr 2703818 Jan 30 2012 An_KIRC_Freeze_1.4_Broad_184_Capture.release.cleaned.maf
-bash-3.00$ ls -l BI_and_BCM_1.4.aggregated.tcga.somatic.maf
-rw-rw-r-- 1 finneyr finneyr 21113158 Aug 8 2012 BI_and_BCM_1.4.aggregated.tcga.somatic.maf
-bash-3.00$ ls -l BI_and_BCM_1.4.aggregated.tcga.somatic.maf
-rw-rw-r-- 1 finneyr finneyr 21113158 Aug 8 2012 BI_and_BCM_1.4.aggregated.tcga.somatic.maf
-bash-3.00$ ls -l PR_TCGA_KIRC_PAIR_Capture_All_Pairs_QCPASS.aggregated.capture.tcga.uuid.somatic.maf
-rw-rw-r-- 1 finneyr finneyr 32848977 Dec 6 12:10 PR_TCGA_KIRC_PAIR_Capture_All_Pairs_QCPASS.aggregated.capture.tcga.uuid.somatic.maf
-bash-3.00$ grep -n "A4GNT" *.ma* | grep 137843364 | grep "TCGA.CJ.4882" | cut -f1 -d":" | awk '{print "ls -l "$1}' | bash
-rw-rw-r-- 1 finneyr finneyr 2703818 Jan 30 2012 An_KIRC_Freeze_1.4_Broad_184_Capture.release.cleaned.maf
-rw-rw-r-- 1 finneyr finneyr 3747830 Oct 23 16:24 An_KIRC_Freeze_1.4_Broad_184_Capture.release.cleaned.maf.1
-rw-rw-r-- 1 finneyr finneyr 21113158 Aug 8 2012 BI_and_BCM_1.4.aggregated.tcga.somatic.maf
-rw-rw-r-- 1 finneyr finneyr 23056811 Oct 19 16:55 BI_and_BCM_1.4.aggregated.tcga.somatic.maf.1
-rw-rw-r-- 1 finneyr finneyr 23056811 Nov 5 12:00 BI_and_BCM_1.4.aggregated.tcga.somatic.maf.2
-rw-rw-r-- 1 finneyr finneyr 23056811 Jan 9 12:48 BI_and_BCM_1.4.aggregated.tcga.somatic.maf.3
-rw-rw-r-- 1 finneyr finneyr 32848977 Dec 6 12:10 PR_TCGA_KIRC_PAIR_Capture_All_Pairs_QCPASS.aggregated.capture.tcga.uuid.somatic.maf
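
Several of the copies listed above have the same byte count (the `.maf.1`, `.maf.2` and `.maf.3` files, for instance), which suggests, but doesn't prove, that they are the same file downloaded more than once. A rough sketch of how one might check with `cksum` and `cmp`; the file names and contents here are made-up stand-ins in a temp directory, not the real TCGA files:

```shell
# Hypothetical stand-ins for several downloaded copies of a MAF.
dir=$(mktemp -d)
printf 'A4GNT\t51146\tbroad.mit.edu\n' > "$dir/a.maf"
cp "$dir/a.maf" "$dir/a.maf.1"                                # identical copy
printf 'A4GNT\t51146\tbroad.mit.edu\nextra\n' > "$dir/b.maf"  # different content

# Print checksum, size and name; identical files share checksum and size.
(cd "$dir" && cksum *.maf*) | sort -n

# cmp -s prints nothing; its exit status tells whether two files are byte-identical.
if cmp -s "$dir/a.maf" "$dir/a.maf.1"; then same_copy=yes; else same_copy=no; fi
if cmp -s "$dir/a.maf" "$dir/b.maf"; then same_other=yes; else same_other=no; fi
echo "a.maf vs a.maf.1 identical: $same_copy"
echo "a.maf vs b.maf identical: $same_other"

rm -rf "$dir"
```

Copies that turn out identical can be ignored; the ones that differ are the versions you actually have to choose between.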

