SEQanswers

Thread URL: http://seqanswers.com/forums/showthread.php?t=90486

guineverelee 08-13-2019 03:27 PM

de novo assembly using Velvet to reconstruct a PCR amplicon
 
Hi all. I'm a newbie in de novo assembly. I have successfully installed Velvet and was able to get it to read my fastq and run on my test set, but it is not giving me the expected result.

This sample came from a 100% clonal PCR amplicon that is ~9000 bp in length. The amplicon was tagmented using Nextera XT and sequenced on an Illumina MiSeq (multiplexed run of 96 samples). My fastq contains single-end (non-paired-end) MiSeq reads of length 142 bp. After de novo assembly, I am hoping to get a single contig that reconstructs the original 9000 bp PCR amplicon. But for whatever reason I am unable to get Velvet to produce contigs longer than about 500 bp.

Please see an example of my commands below:

$ ./velveth ./assembly 21 -fastq S4411.fastq
$ ./velvetg ./assembly

This gave me 629 tiny contigs, all <500 bp. There should be only one contig, and the software Geneious was able to produce that one contig.
Any pointers on how to work with Velvet?

You can download the fastq file here.
https://www.dropbox.com/s/ypb8233krv...411.fastq?dl=0

Thank you so much in advance.
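
Note: when comparing runs like the one above, it can help to summarise the contigs Velvet wrote instead of counting them by eye. The following is a minimal sketch, assuming the assembly directory is `./assembly` as in the commands above and that velvetg wrote its contigs to `contigs.fa` in that directory. The helper name `contig_stats` is made up for illustration.

```shell
#!/bin/sh
# Sketch: count the contigs in a FASTA file and report the longest one.
# contig_stats is a hypothetical helper name, not part of Velvet.
contig_stats() {
    awk '
        # A header line closes the previous record (if it had any sequence).
        /^>/ { if (len > 0) { n++; if (len > max) max = len }; len = 0; next }
        # Any other line is sequence; add its length to the current record.
        { len += length($0) }
        END {
            if (len > 0) { n++; if (len > max) max = len }
            printf "contigs=%d longest=%d\n", n, max
        }
    ' "$1"
}

# Only report if the assumed output file is actually present.
if [ -f ./assembly/contigs.fa ]; then
    contig_stats ./assembly/contigs.fa
fi
```

Running the same helper on the output of each attempt gives a quick way to see whether a parameter change moved the result toward one long contig.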

GenoMax 08-13-2019 03:55 PM

I suggest you give "tadpole.sh" from the BBMap suite a try. It works well with small genomes. The Tadpole guide is here.

If you are looking to make Velvet work then this is a moot point.

Are you letting Velvet go through the entire range of k-mers?

guineverelee 08-13-2019 04:07 PM

Thank you GenoMax!
I'll give Tadpole a try.
As for Velvet: Please excuse my newbie question. How do I make Velvet go through an entire range of k-mers?

GenoMax 08-13-2019 04:57 PM

See this thread: https://www.biostars.org/p/78315/
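
Note: one simple way to try a range of k-mers is to loop over odd k values and give each its own output directory. This is a minimal sketch, not the method from the linked thread. The range 21 to 31, the `assembly_k<k>` directory names, and the `VELVET_DIR`, `READS` and `DRY_RUN` variables are all assumptions for illustration. With the default build, k cannot exceed 31.

```shell
#!/bin/sh
# Sketch: run velveth/velvetg once per odd k, one output directory per k.
# Assumptions: the binaries live in VELVET_DIR and the reads are in READS.
VELVET_DIR="${VELVET_DIR:-.}"
READS="${READS:-S4411.fastq}"
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the odd k values from $1 to $2 inclusive, stepping by 2.
# An even start is rounded up to the next odd number.
kmer_range() {
    k=$1
    if [ $(( k % 2 )) -eq 0 ]; then k=$(( k + 1 )); fi
    while [ "$k" -le "$2" ]; do
        echo "$k"
        k=$(( k + 2 ))
    done
}

# Run (or print) one velveth + velvetg pair for every k in the range.
sweep() {
    for kk in $(kmer_range "$1" "$2"); do
        out="assembly_k${kk}"
        if [ "$DRY_RUN" = 1 ]; then
            echo "velveth $out $kk -fastq $READS"
            echo "velvetg $out"
        else
            "$VELVET_DIR/velveth" "$out" "$kk" -fastq "$READS" &&
                "$VELVET_DIR/velvetg" "$out"
        fi
    done
}

sweep 21 31
```

Each directory then holds its own `contigs.fa`, so the assemblies can be compared across k values.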

SNPsaurus 08-14-2019 06:31 AM

Have you thought about using PacBio for long amplicons instead of tagmenting and trying to re-assemble? Each amplicon could get multiple polymerase passes for a high-quality consensus, and be sequenced multiple times for an even better merged consensus. It would definitely be cheaper than Nextera XT preps and a MiSeq run.

guineverelee 08-14-2019 01:15 PM

Thank you GenoMax! The thread you pointed to is super useful.

guineverelee 08-14-2019 02:10 PM

SNPsaurus, thanks for the suggestion. We don't have immediate access to PacBio, but I'll keep that in mind!

guineverelee 08-16-2019 11:25 AM

Hi GenoMax and all

I am unable to redefine MAXKMERLENGTH when compiling Velvet. It is currently at the default of 31, and the build still uses 31 after I pass a new value. Please see the command and output below. Any clue what's going on? Thanks again!!

$ cd ./Velvet_1.2.10
$ make ’MAXKMERLENGTH=150’
rm obj/*.o obj/dbg/*.o
rm: obj/dbg/*.o: No such file or directory
make: [cleanobj] Error 1 (ignored)
mkdir -p obj
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/tightString.c -o obj/tightString.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/run.c -o obj/run.o
In file included from src/run.c:31:
In file included from src/run.h:35:
src/scaffold.h:21:9: warning: '_SSCAFFOLD_H_' is used as a header guard here,
followed by #define of a different macro [-Wheader-guard]
#ifndef _SSCAFFOLD_H_
^~~~~~~~~~~~~
src/scaffold.h:22:9: note: '_SCAFFOLD_H_' is defined here; did you mean
'_SSCAFFOLD_H_'?
#define _SCAFFOLD_H_
^~~~~~~~~~~~
_SSCAFFOLD_H_
1 warning generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/splay.c -o obj/splay.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/splayTable.c -o obj/splayTable.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/graph.c -o obj/graph.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/run2.c -o obj/run2.o
In file included from src/run2.c:26:
In file included from src/run.h:35:
src/scaffold.h:21:9: warning: '_SSCAFFOLD_H_' is used as a header guard here,
followed by #define of a different macro [-Wheader-guard]
#ifndef _SSCAFFOLD_H_
^~~~~~~~~~~~~
src/scaffold.h:22:9: note: '_SCAFFOLD_H_' is defined here; did you mean
'_SSCAFFOLD_H_'?
#define _SCAFFOLD_H_
^~~~~~~~~~~~
_SSCAFFOLD_H_
1 warning generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/fibHeap.c -o obj/fibHeap.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/fib.c -o obj/fib.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/concatenatedGraph.c -o obj/concatenatedGraph.o
In file included from src/concatenatedGraph.c:25:
src/graph.h:173:26: warning: inline function 'getShortReadMarkerPosition' is not
defined [-Wundefined-inline]
extern inline Coordinate getShortReadMarkerPosition(ShortReadMarker * marker);
^
src/concatenatedGraph.c:56:14: note: used here
position = getShortReadMarkerPosition(marker);
^
In file included from src/concatenatedGraph.c:25:
src/graph.h:174:20: warning: inline function 'setShortReadMarkerPosition' is not
defined [-Wundefined-inline]
extern inline void setShortReadMarkerPosition(ShortReadMarker * marker,
^
src/concatenatedGraph.c:59:4: note: used here
setShortReadMarkerPosition(marker, position);
^
2 warnings generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/passageMarker.c -o obj/passageMarker.o
src/passageMarker.c:50:1: warning: unused function 'PM_I2P' [-Wunused-function]
DECLARE_FAST_ACCESSORS (PM, PassageMarker, markerMemory)
^
src/allocArray.h:69:21: note: expanded from macro 'DECLARE_FAST_ACCESSORS'
static inline type* name##_I2P(ArrayIdx idx) \
^
<scratch space>:412:1: note: expanded from here
PM_I2P
^
1 warning generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/graphStats.c -o obj/graphStats.o
src/graphStats.c:1961:69: warning: format specifies type 'long' but the argument
has type 'long long' [-Wformat]
..."PLACEHLDR.%ld PLACEHOLDER000", (int64_t) refIndex + 1);
~~~ ^~~~~~~~~~~~~~~~~~~~~~
%lld
/usr/include/secure/_stdio.h:47:56: note: expanded from macro 'sprintf'
__builtin___sprintf_chk (str, 0, __darwin_obsz(str), __VA_ARGS__)
^~~~~~~~~~~
In file included from src/graphStats.c:28:
src/graph.h:173:26: warning: inline function 'getShortReadMarkerPosition' is not
defined [-Wundefined-inline]
extern inline Coordinate getShortReadMarkerPosition(ShortReadMarker * marker);
^
src/graphStats.c:848:19: note: used here
starts[index] = getShortReadMarkerPosition(marker);
^
In file included from src/graphStats.c:28:
src/graph.h:177:27: warning: inline function 'getShortReadMarkerOffset' is not
defined [-Wundefined-inline]
extern inline ShortLength getShortReadMarkerOffset(ShortReadMarker * marker);
^
src/graphStats.c:849:34: note: used here
stops[index] = starts[index] - getShortReadMarkerOffset(...
^
3 warnings generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/correctedGraph.c -o obj/correctedGraph.o
In file included from src/correctedGraph.c:26:
src/graph.h:173:26: warning: inline function 'getShortReadMarkerPosition' is not
defined [-Wundefined-inline]
extern inline Coordinate getShortReadMarkerPosition(ShortReadMarker * marker);
^
src/correctedGraph.c:776:15: note: used here
position = getShortReadMarkerPosition(shortMarker);
^
In file included from src/correctedGraph.c:26:
src/graph.h:174:20: warning: inline function 'setShortReadMarkerPosition' is not
defined [-Wundefined-inline]
extern inline void setShortReadMarkerPosition(ShortReadMarker * marker,
^
src/correctedGraph.c:799:4: note: used here
setShortReadMarkerPosition(shortMarker, position);
^
2 warnings generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/dfib.c -o obj/dfib.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/dfibHeap.c -o obj/dfibHeap.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/recycleBin.c -o obj/recycleBin.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/readSet.c -o obj/readSet.o
src/readSet.c:641:21: warning: incompatible pointer types assigning to 'gzFile'
(aka 'struct gzFile_s *') from 'AutoFile *' [-Wincompatible-pointer-types]
file.gzFile = file.autoFile = NULL;
^ ~~~~~~~~~~~~~~~~~~~~
src/readSet.c:680:22: warning: incompatible pointer types assigning to 'gzFile'
(aka 'struct gzFile_s *') from 'AutoFile *' [-Wincompatible-pointer-types]
file1.gzFile = file1.autoFile = NULL;
^ ~~~~~~~~~~~~~~~~~~~~~
src/readSet.c:681:22: warning: incompatible pointer types assigning to 'gzFile'
(aka 'struct gzFile_s *') from 'AutoFile *' [-Wincompatible-pointer-types]
file2.gzFile = file2.autoFile = NULL;
^ ~~~~~~~~~~~~~~~~~~~~~
src/readSet.c:632:1: warning: unused function 'kseq_rewind' [-Wunused-function]
KSEQ_INIT(FileGZOrAuto, fileGZOrAuto_read)
^
src/kseq.h:220:2: note: expanded from macro 'KSEQ_INIT'
__KSEQ_BASIC(type_t) \
^
src/kseq.h:152:21: note: expanded from macro '__KSEQ_BASIC'
static inline void kseq_rewind(kseq_t *ks...
^
4 warnings generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/binarySequences.c -o obj/binarySequences.o
src/binarySequences.c:304:69: warning: format specifies type 'unsigned long' but
the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
...location 0x%lx for seq %ld beyond end 0x%lx\n", (uint64_t) tmp, (uint64_...
~~~ ^~~~~~~~~~~~~~
%llx
src/binarySequences.c:304:85: warning: format specifies type 'long' but the
argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
...seq %ld beyond end 0x%lx\n", (uint64_t) tmp, (uint64_t) sequenceIndex, (...
~~~ ^~~~~~~~~~~~~~~~~~~~~~~~
%llu
src/binarySequences.c:304:111: warning: format specifies type 'unsigned long'
but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
...0x%lx\n", (uint64_t) tmp, (uint64_t) sequenceIndex, (uint64_t) arrayEnd);
~~~ ^~~~~~~~~~~~~~~~~~~
%llx
src/binarySequences.c:389:45: warning: format specifies type 'long' but the
argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
velvetLog("CnySeq bufIdx %ld too large\n", bufIdx);
~~~ ^~~~~~
%llu
4 warnings generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/shortReadPairs.c -o obj/shortReadPairs.o
In file included from src/shortReadPairs.c:34:
src/scaffold.h:21:9: warning: '_SSCAFFOLD_H_' is used as a header guard here,
followed by #define of a different macro [-Wheader-guard]
#ifndef _SSCAFFOLD_H_
^~~~~~~~~~~~~
src/scaffold.h:22:9: note: '_SCAFFOLD_H_' is defined here; did you mean
'_SSCAFFOLD_H_'?
#define _SCAFFOLD_H_
^~~~~~~~~~~~
_SSCAFFOLD_H_
In file included from src/shortReadPairs.c:27:
src/graph.h:173:26: warning: inline function 'getShortReadMarkerPosition' is not
defined [-Wundefined-inline]
extern inline Coordinate getShortReadMarkerPosition(ShortReadMarker * marker);
^
src/shortReadPairs.c:590:14: note: used here
position = getShortReadMarkerPosition(marker);
^
In file included from src/shortReadPairs.c:27:
src/graph.h:174:20: warning: inline function 'setShortReadMarkerPosition' is not
defined [-Wundefined-inline]
extern inline void setShortReadMarkerPosition(ShortReadMarker * marker,
^
src/shortReadPairs.c:593:4: note: used here
setShortReadMarkerPosition(marker, position);
^
3 warnings generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/locallyCorrectedGraph.c -o obj/locallyCorrectedGraph.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/graphReConstruction.c -o obj/graphReConstruction.o
src/graphReConstruction.c:1407:65: warning: format specifies type 'long' but the
argument has type '__darwin_suseconds_t' (aka 'int') [-Wformat]
...=== Ghost-Threaded in %ld.%06ld s\n", diff.tv_sec, diff.tv_usec);
~~~~~ ^~~~~~~~~~~~
%06d
src/graphReConstruction.c:1436:59: warning: format specifies type 'long' but the
argument has type '__darwin_suseconds_t' (aka 'int') [-Wformat]
velvetLog(" === Threaded in %ld.%06ld s\n", diff.tv_sec, diff.tv_usec);
~~~~~ ^~~~~~~~~~~~
%06d
2 warnings generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/roadMap.c -o obj/roadMap.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/preGraph.c -o obj/preGraph.o
In file included from src/preGraph.c:35:
In file included from src/run.h:35:
src/scaffold.h:21:9: warning: '_SSCAFFOLD_H_' is used as a header guard here,
followed by #define of a different macro [-Wheader-guard]
#ifndef _SSCAFFOLD_H_
^~~~~~~~~~~~~
src/scaffold.h:22:9: note: '_SCAFFOLD_H_' is defined here; did you mean
'_SSCAFFOLD_H_'?
#define _SCAFFOLD_H_
^~~~~~~~~~~~
_SSCAFFOLD_H_
src/preGraph.c:165:17: warning: taking address of packed member 'preArcRight' of
class or structure 'preNode_st' may result in an unaligned pointer value
[-Waddress-of-packed-member]
preArcPtr = &(preNode->preArcRight);
^~~~~~~~~~~~~~~~~~~~
src/preGraph.c:167:17: warning: taking address of packed member 'preArcLeft' of
class or structure 'preNode_st' may result in an unaligned pointer value
[-Waddress-of-packed-member]
preArcPtr = &(preNode->preArcLeft);
^~~~~~~~~~~~~~~~~~~
src/preGraph.c:285:17: warning: taking address of packed member 'preArcRight' of
class or structure 'preNode_st' may result in an unaligned pointer value
[-Waddress-of-packed-member]
preArcPtr = &(preNode->preArcRight);
^~~~~~~~~~~~~~~~~~~~
src/preGraph.c:287:17: warning: taking address of packed member 'preArcLeft' of
class or structure 'preNode_st' may result in an unaligned pointer value
[-Waddress-of-packed-member]
preArcPtr = &(preNode->preArcLeft);
^~~~~~~~~~~~~~~~~~~
src/preGraph.c:520:27: warning: unused function 'mergeDescriptors_pg'
[-Wunused-function]
static inline Descriptor *mergeDescriptors_pg(Descriptor * descr,
^
src/preGraph.c:635:27: warning: unused function 'mergeDescriptorsH2H_pg'
[-Wunused-function]
static inline Descriptor *mergeDescriptorsH2H_pg(Descriptor * descr,
^
src/preGraph.c:760:27: warning: unused function 'mergeDescriptorsF2F_pg'
[-Wunused-function]
static inline Descriptor *mergeDescriptorsF2F_pg(Descriptor * descr,
^
8 warnings generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/preGraphConstruction.c -o obj/preGraphConstruction.o
src/preGraphConstruction.c:525:58: warning: format specifies type 'long' but the
argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
...velvetLog("readIndex %ld beyond string len %ld\n", (uint64_t) readIndex...
~~~ ^~~~~~~~~~~~~~~~~~~~
%llu
src/preGraphConstruction.c:525:80: warning: format specifies type 'long' but the
argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
...string len %ld\n", (uint64_t) readIndex, (uint64_t) tString->length);
~~~ ^~~~~~~~~~~~~~~~~~~~~~~~~~
%llu
2 warnings generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/concatenatedPreGraph.c -o obj/concatenatedPreGraph.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/readCoherentGraph.c -o obj/readCoherentGraph.o
In file included from src/readCoherentGraph.c:25:
src/graph.h:173:26: warning: inline function 'getShortReadMarkerPosition' is not
defined [-Wundefined-inline]
extern inline Coordinate getShortReadMarkerPosition(ShortReadMarker * marker);
^
src/readCoherentGraph.c:367:14: note: used here
position = getShortReadMarkerPosition(marker);
^
In file included from src/readCoherentGraph.c:25:
src/graph.h:174:20: warning: inline function 'setShortReadMarkerPosition' is not
defined [-Wundefined-inline]
extern inline void setShortReadMarkerPosition(ShortReadMarker * marker,
^
src/readCoherentGraph.c:369:3: note: used here
setShortReadMarkerPosition(marker, position);
^
2 warnings generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/utility.c -o obj/utility.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/kmer.c -o obj/kmer.o
src/kmer.c:28:23: warning: unused variable 'longLongLeftFilter'
[-Wunused-const-variable]
static const uint64_t longLongLeftFilter = (uint64_t) 3 << 62;
^
src/kmer.c:29:23: warning: unused variable 'longLeftFilter'
[-Wunused-const-variable]
static const uint32_t longLeftFilter = (uint32_t) 3 << 30;
^
src/kmer.c:30:23: warning: unused variable 'intLeftFilter'
[-Wunused-const-variable]
static const uint16_t intLeftFilter = (uint16_t) 3 << 14;
^
src/kmer.c:31:22: warning: unused variable 'charLeftFilter'
[-Wunused-const-variable]
static const uint8_t charLeftFilter = (uint8_t) 3 << 6;
^
4 warnings generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/scaffold.c -o obj/scaffold.o
In file included from src/scaffold.c:39:
src/scaffold.h:21:9: warning: '_SSCAFFOLD_H_' is used as a header guard here,
followed by #define of a different macro [-Wheader-guard]
#ifndef _SSCAFFOLD_H_
^~~~~~~~~~~~~
src/scaffold.h:22:9: note: '_SCAFFOLD_H_' is defined here; did you mean
'_SSCAFFOLD_H_'?
#define _SCAFFOLD_H_
^~~~~~~~~~~~
_SSCAFFOLD_H_
In file included from src/scaffold.c:32:
src/graph.h:173:26: warning: inline function 'getShortReadMarkerPosition' is not
defined [-Wundefined-inline]
extern inline Coordinate getShortReadMarkerPosition(ShortReadMarker * marker);
^
src/scaffold.c:455:7: note: used here
getShortReadMarkerPosition(shortMarker);
^
In file included from src/scaffold.c:32:
src/graph.h:177:27: warning: inline function 'getShortReadMarkerOffset' is not
defined [-Wundefined-inline]
extern inline ShortLength getShortReadMarkerOffset(ShortReadMarker * marker);
^
src/scaffold.c:457:7: note: used here
getShortReadMarkerOffset(shortMarker);
^
3 warnings generated.
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/kmerOccurenceTable.c -o obj/kmerOccurenceTable.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/allocArray.c -o obj/allocArray.o
gcc -Wall -m64 -O3 -D MAXKMERLENGTH=31 -D CATEGORIES=2 -c src/autoOpen.c -o obj/autoOpen.o
src/autoOpen.c:56:19: warning: duplicate 'const' declaration specifier
[-Wduplicate-decl-specifier]
static const char const* decompressors[] = {"","pigz", "gunzip", "pbunzi...
^
1 warning generated.
gcc -Wall -m64 -O3 -o velveth obj/tightString.o obj/run.o obj/recycleBin.o obj/splay.o obj/splayTable.o obj/readSet.o obj/binarySequences.o obj/utility.o obj/kmer.o obj/kmerOccurenceTable.o obj/autoOpen.o -lz -lm
gcc -Wall -m64 -O3 -o velvetg obj/tightString.o obj/graph.o obj/run2.o obj/fibHeap.o obj/fib.o obj/concatenatedGraph.o obj/passageMarker.o obj/graphStats.o obj/correctedGraph.o obj/dfib.o obj/dfibHeap.o obj/recycleBin.o obj/readSet.o obj/binarySequences.o obj/shortReadPairs.o obj/scaffold.o obj/locallyCorrectedGraph.o obj/graphReConstruction.o obj/roadMap.o obj/preGraph.o obj/preGraphConstruction.o obj/concatenatedPreGraph.o obj/readCoherentGraph.o obj/utility.o obj/kmer.o obj/kmerOccurenceTable.o obj/allocArray.o obj/autoOpen.o -lz -lm
Guineveres-MacBook-Pro:Velvet_1.2.10 guin$
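
Note: every gcc line above still shows `-D MAXKMERLENGTH=31`, so make did not pick up the override. One plausible cause, offered as a guess rather than a confirmed diagnosis, is that the make command was typed with typographic quotes (’ ’) instead of plain ASCII quotes. The shell does not treat typographic quotes as quoting, so they stay part of the argument and make would see a variable with a different name. The sketch below demonstrates this with a made-up helper, `var_name`. The `rebuild_velvet` function and the value 141 are also only illustrative. A k-mer longer than the 142 bp reads cannot occur in them, so a limit of 150 would not be usable for this data anyway.

```shell
#!/bin/sh
# Print the text before the first '=', which is what make takes as the
# variable name in a NAME=value command-line argument.
var_name() { printf '%s\n' "${1%%=*}"; }

var_name 'MAXKMERLENGTH=150'   # ASCII quotes are removed by the shell
var_name ’MAXKMERLENGTH=150’   # typographic quotes stay part of the argument

# Hypothetical rebuild helper (defined but not run here): remove the old
# objects and binaries, then rebuild with plain quoting.
# 141 is only an illustrative odd value below the 142 bp read length.
rebuild_velvet() {
    make clean &&
        make 'MAXKMERLENGTH=141'
}
```

The first call prints `MAXKMERLENGTH`, while the second prints the name with a leading ’ attached. If this is the cause, calling `rebuild_velvet` from the Velvet source directory should produce gcc lines showing `-D MAXKMERLENGTH=141`.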


All times are GMT -8.