SEQanswers

Splinter479 07-12-2017 05:11 AM

Reconstructing a de Bruijn Graph from Velvet's LastGraph
 
Hi,

Could someone explain, or give me some hints on, how to recover a de Bruijn graph from the nodes and arcs in Velvet's LastGraph file?

I created an example with hash length k=7 and told velvetg to "keep all data" by setting cov_cutoff=1 and min_contig_lgth=1:

Code:

>SEQUENCE_0
TTTTTTCATGCATGCTAGCGTGTGTGTGT
>SEQUENCE_1
GTGTGTGTGTTAGCTGCAGTATGCGGAAC
>SEQUENCE_2
ACACACACACATCGATTGTTATGCGGAAC
>SEQUENCE_3
TATGCGGAACGCATTGCTAACTCGGGGGG

This results in the following LastGraph:

Code:

5        4        7        1
NODE        1        12        13        13        0        0
GCTAGCATGCAT
GCTAGCGTGTGT
NODE        2        1        6        6        0        0
G
C
NODE        3        1        7        7        0        0
T
A
NODE        4        15        15        15        0        0
TAGCTGCAGTATGCG
CTGCAGCTAACACAC
NODE        5        14        14        14        0        0
TCGATTGTTATGCG
ACAATCGATGTGTG
ARC        1        -1        2
ARC        -1        2        1
ARC        2        3        6
ARC        -2        -3        4
ARC        3        4        1
ARC        -3        5        1

I don't understand exactly how to read the sequence out of a traversal of the nodes, e.g. (1) --> (-1), (-1) --> (2) and so on.
Once I know how to read the sequence, I could rebuild a simple de Bruijn graph from it. Or is there a simpler (back-)transformation from LastGraph to a de Bruijn graph?
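
As a starting point, here is a minimal Python sketch that reads the LastGraph text above into nodes and arcs. It rests on assumptions that I have not confirmed against the Velvet source: the header is read as (number of nodes, number of sequences, hash length, ...), the number after the node ID is taken as the node length, the two lines after each NODE record are taken as the node's and its twin's sequence lines, and each ARC line is taken to imply a second arc from -B to -A. The names parse_lastgraph and with_twin_arcs are made up for this example.

Code:

```python
def parse_lastgraph(text):
    """Parse the header, NODE and ARC records of a LastGraph-style text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = [int(x) for x in lines[0].split()]
    graph = {"n_nodes": header[0], "n_seqs": header[1], "k": header[2],
             "nodes": {}, "arcs": []}
    i = 1
    while i < len(lines):
        fields = lines[i].split()
        if fields[0] == "NODE":
            # Assumption: the next two lines belong to this node and its twin.
            graph["nodes"][int(fields[1])] = {
                "length": int(fields[2]),  # assumed to be the node length
                "ends": lines[i + 1].strip(),
                "twin_ends": lines[i + 2].strip(),
            }
            i += 3
        elif fields[0] == "ARC":
            start, end, mult = (int(x) for x in fields[1:4])
            graph["arcs"].append((start, end, mult))
            i += 1
        else:
            i += 1  # record types this sketch does not handle
    return graph


def with_twin_arcs(arcs):
    """Add the assumed implicit twin arc (-B -> -A) for every arc (A -> B)."""
    out = set()
    for start, end, mult in arcs:
        out.add((start, end, mult))
        out.add((-end, -start, mult))
    return out


example = """5 4 7 1
NODE 1 12 13 13 0 0
GCTAGCATGCAT
GCTAGCGTGTGT
NODE 2 1 6 6 0 0
G
C
NODE 3 1 7 7 0 0
T
A
NODE 4 15 15 15 0 0
TAGCTGCAGTATGCG
CTGCAGCTAACACAC
NODE 5 14 14 14 0 0
TCGATTGTTATGCG
ACAATCGATGTGTG
ARC 1 -1 2
ARC -1 2 1
ARC 2 3 6
ARC -2 -3 4
ARC 3 4 1
ARC -3 5 1
"""

g = parse_lastgraph(example)
print(g["k"], sorted(g["nodes"]))
for arc in sorted(with_twin_arcs(g["arcs"])):
    print(arc)
```

This only gives the graph structure; turning the node lines back into k-mers is a separate step.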

The (very old) Velvet manual page does not help much :(

Any help is highly appreciated.

Thanks!


------------------------------
EDIT
------------------------------

Another example:
Let's take a sequence s=ACTGGACTGAA
As I recall, the de Bruijn graph (dBG) of s for k=3 is:
Code:

ACT --> CTG --> TGG --> GGA --> GAC --> (ACT...)
        |
        |--> TGA --> GAA
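
To double-check the drawing, here is a small Python sketch that builds this k-mer graph directly from s. It is a single-strand toy version (the helper name de_bruijn is made up here), and it does not model the reverse-complement twin nodes that Velvet keeps.

Code:

```python
from collections import defaultdict


def de_bruijn(seq, k):
    """Map each k-mer of seq to the set of k-mers that directly follow it."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    graph = defaultdict(set)
    for left, right in zip(kmers, kmers[1:]):
        graph[left].add(right)  # consecutive k-mers overlap by k-1 bases
    return graph


graph = de_bruijn("ACTGGACTGAA", 3)
for kmer in sorted(graph):
    print(kmer, "-->", ", ".join(sorted(graph[kmer])))
```

CTG is the only k-mer with two successors (TGG and TGA), which is the branch shown in the drawing.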

Velvet's result (k=3, velvetg cov_cutoff=1 min_contig_lgth=1) is:
Code:

1 1 3 1
NODE 1 5 7 7 0 0
TGGAC
CCAGT
ARC 1 1 1

This means I would be able to recover:
Code:

(Node 1 upper seq.)              TGGAC
                                |||    --> ACTGGAC  (and an arbitrary amount of concatenations of this seq. since ARC (1) --> (1) )
(Node 1 lower seq. rev.comp.)  ACTGG

BUT it seems to me that Velvet is missing the entire alternative path "TGAA" of the dBG of s.
I would have expected another node and an additional arc that would let me find that sequence.

Am I right so far? Am I missing something, or should I change a parameter?
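
The overlap step above can be sketched in Python as follows. The assumption, which is my reading of the format and not something I have verified, is that each node line holds the last base of each k-mer in the node, so the first k-1 bases of the node come from the reverse complement of the twin line. The names revcomp, node_sequence and kmers_of are made up for this example, and the sketch refuses nodes shorter than k-1 because the twin line alone does not cover the missing bases there.

Code:

```python
def revcomp(seq):
    """Reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def node_sequence(ends, twin_ends, k):
    """Rebuild a node's full sequence under the assumption stated above."""
    if len(ends) < k - 1:
        raise ValueError("node shorter than k-1: twin line is not enough")
    # First k-1 bases from the twin, remaining bases from the node line.
    return revcomp(twin_ends)[:k - 1] + ends


def kmers_of(seq, k):
    """All k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


full = node_sequence("TGGAC", "CCAGT", 3)
node_kmers = kmers_of(full, 3)
print(full)
print(node_kmers)
print("TGA" in node_kmers, "GAA" in node_kmers)
```

Under that reading, Node 1 expands to ACTGGAC, whose k-mers are exactly the cycle ACT, CTG, TGG, GGA, GAC, and neither TGA nor GAA appears anywhere in the node.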

