SEQanswers

Old 05-07-2018, 07:02 PM   #1
lin
Junior Member
 
Location: Utah

Join Date: May 2018
Posts: 1
transcriptome assembly got huge contigs

Hi, I am a beginner at transcriptome assembly. I assembled an algal transcriptome using SPAdes, and I don't understand why I got contigs that are over 25 kb in length. Is that possible?
Length range (bp)   Number of contigs
100:199 149494
200:299 709026
300:399 83948
400:499 31883
500:599 16756
600:699 10221
700:799 6982
800:899 4988
900:999 3538
1000:1099 2663
1100:1199 2098
1200:1299 1627
1300:1399 1368
1400:1499 1045
1500:1599 879
1600:1699 690
1700:1799 581
1800:1899 452
1900:1999 404
2000:2099 341
2100:2199 312
2200:2299 274
2300:2399 213
2400:2499 179
2500:2599 174
2600:2699 140
2700:2799 132
2800:2899 132
2900:2999 96
3000:3099 93
3100:3199 83
3200:3299 91
3300:3399 85
3400:3499 73
3500:3599 75
3600:3699 65
3700:3799 87
3800:3899 69
3900:3999 67
4000:4099 55
4100:4199 48
4200:4299 49
4300:4399 48
4400:4499 56
4500:4599 61
4600:4699 47
4700:4799 53
4800:4899 49
4900:4999 44
5000:5099 42
5100:5199 30
5200:5299 46
5300:5399 37
5400:5499 41
5500:5599 32
5600:5699 27
5700:5799 35
5800:5899 30
5900:5999 44
6000:6099 34
6100:6199 35
6200:6299 22
6300:6399 25
6400:6499 28
6500:6599 24
6600:6699 39
6700:6799 33
6800:6899 28
6900:6999 28
7000:7099 22
7100:7199 22
7200:7299 24
7300:7399 19
7400:7499 30
7500:7599 36
7600:7699 27
7700:7799 14
7800:7899 16
7900:7999 23
8000:8099 16
8100:8199 21
8200:8299 14
8300:8399 17
8400:8499 17
8500:8599 14
8600:8699 18
8700:8799 13
8800:8899 11
8900:8999 12
9000:9099 12
9100:9199 10
9200:9299 8
9300:9399 11
9400:9499 14
9500:9599 13
9600:9699 16
9700:9799 6
9800:9899 9
9900:9999 11
10000:10099 8
10100:10199 5
10200:10299 10
10300:10399 8
10400:10499 7
10500:10599 4
10600:10699 6
10700:10799 5
10800:10899 4
10900:10999 7
11000:11099 6
11100:11199 3
11200:11299 3
11300:11399 6
11400:11499 7
11500:11599 8
11600:11699 6
11700:11799 13
11800:11899 6
11900:11999 4
12000:12099 4
12100:12199 7
12200:12299 4
12300:12399 4
12400:12499 5
12500:12599 4
12600:12699 6
12700:12799 8
12800:12899 10
12900:12999 3
13000:13099 8
13100:13199 3
13200:13299 6
13300:13399 1
13400:13499 4
13500:13599 6
13600:13699 4
13700:13799 4
13800:13899 3
13900:13999 1
14000:14099 3
14100:14199 3
14200:14299 3
14300:14399 3
14400:14499 2
14500:14599 3
14600:14699 3
14700:14799 2
14800:14899 1
14900:14999 1
15000:15099 2
15100:15199 3
15200:15299 2
15300:15399 2
15400:15499 1
15500:15599 2
15600:15699 1
15700:15799 2
15800:15899 2
15900:15999 2
16000:16099 3
16100:16199 2
16200:16299 1
16300:16399 1
16400:16499 2
16500:16599 4
16600:16699 1
16700:16799 1
16800:16899 2
17000:17099 1
17300:17399 2
17400:17499 1
17500:17599 1
17700:17799 1
17800:17899 1
17900:17999 3
18200:18299 1
18300:18399 1
18500:18599 1
18700:18799 1
18800:18899 1
18900:18999 1
19100:19199 1
19200:19299 1
19300:19399 1
19500:19599 2
19800:19899 1
19900:19999 1
20100:20199 1
20300:20399 2
20500:20599 1
20900:20999 1
21100:21199 1
21500:21599 1
21700:21799 2
21900:21999 1
22000:22099 1
23000:23099 1
23100:23199 1
24600:24699 1
25200:25299 1
25500:25599 1
26500:26599 1
26900:26999 1
27800:27899 1
28800:28899 1
31500:31599 1
38800:38899 1

Total length of sequence: 298114914 bp
Total number of sequences: 1033416
N25 stats: 25% of total sequence length is contained in the 76269 sequences >= 437 bp
N50 stats: 50% of total sequence length is contained in the 327373 sequences >= 238 bp
N75 stats: 75% of total sequence length is contained in the 657535 sequences >= 214 bp
Total GC count: 137442543 bp
GC %: 46.10 %
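
For readers who want to reproduce a summary like the one above, here is a minimal sketch in plain Python, not the script that produced these numbers. The helper names (`fasta_lengths`, `length_histogram`, `nx_stat`) are made up for illustration, and the sketch assumes the assembly is an ordinary FASTA file.

```python
from collections import Counter


def fasta_lengths(lines):
    """Return the length of each record in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def length_histogram(lengths, bin_width=100):
    """Count sequences per bin, keyed by the lower edge of each bin."""
    return Counter((n // bin_width) * bin_width for n in lengths)


def nx_stat(lengths, fraction=0.5):
    """Return (count, min_length) such that the `count` longest sequences,
    all >= min_length, hold at least `fraction` of the total length.
    fraction=0.5 gives N50, 0.25 gives N25, 0.75 gives N75."""
    ordered = sorted(lengths, reverse=True)
    target = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return count, length
    return 0, 0  # empty input
```

With a real file this could be called as `fasta_lengths(open("contigs.fasta"))`, where the file name is a placeholder. An N50 close to the shortest bins, as in the stats above, is what this calculation returns when most of the total length sits in very short contigs.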
Old 05-10-2018, 06:14 AM   #2
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,283

It is unlikely that contigs that long derive from RNA. My guess is that if you took a few kb of sequence from the largest contig and ran a blastn search against nt at GenBank, you would get a hit to chloroplast. Speculation on my part, of course...

--
Phillip
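
One way to pull out the sequence for the check suggested above is sketched below. It only extracts the first few kb of the longest contig so it can be pasted into a blastn search; it does not run BLAST itself. The function names and the 3000 bp default are illustrative assumptions, not part of SPAdes or BLAST.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longest_contig_prefix(lines, prefix_len=3000):
    """Return (header, first prefix_len bases) of the longest record.
    Returns ("", "") if the input holds no records."""
    header, seq = max(read_fasta(lines), key=lambda rec: len(rec[1]),
                      default=("", ""))
    return header, seq[:prefix_len]
```

For example, `longest_contig_prefix(open("contigs.fasta"))` (placeholder file name) would return the header and first 3 kb of the 38.8 kb contig in an assembly like the one posted, ready to paste into the NCBI blastn form.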
All times are GMT -8.

