
**ffinkernagel** · 09-28-2010, 07:21 AM

Sure, any scripting language (such as Python) will do this, especially since neither the HindIII nor the EcoRI recognition site contains any ambiguous bases.
The only problem might be that your files are in ABI format, so you either need to convert them to something the script can read (FASTA) or find an adaptor. There is one for Python: http://www.bioinformatics.org/groups/?group_id=497
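For the conversion step, here is a minimal sketch that assumes Biopython is installed and uses its `Bio.SeqIO.convert` function with the `"abi"` input format. This is a swapped-in alternative, not the adaptor linked above. The helper names `fasta_path_for` and `convert_folder` and the accepted file extensions are assumptions made for illustration.

```python
import os


def fasta_path_for(abi_path, target_directory):
    """Map e.g. 'traces/clone1.ab1' to '<target_directory>/clone1.fasta'."""
    stem = os.path.splitext(os.path.basename(abi_path))[0]
    return os.path.join(target_directory, stem + ".fasta")


def convert_folder(directory, target_directory):
    """Convert every .ab1/.abi trace in `directory` to a FASTA file."""
    # Imported here so the path helper above works without Biopython installed.
    from Bio import SeqIO

    for filename in sorted(os.listdir(directory)):
        # Assumption: trace files carry an .ab1 or .abi extension.
        if not filename.lower().endswith((".ab1", ".abi")):
            continue  # skip anything that is not a trace file
        source = os.path.join(directory, filename)
        # "abi" reads the base calls stored in the trace; "fasta" writes them out.
        SeqIO.convert(source, "abi", fasta_path_for(source, target_directory), "fasta")
```

Calling `convert_folder('path/to/traces', 'path/to/data')` would then fill the input folder that a FASTA-based script expects. The base calls are taken as stored in the trace; no quality trimming is done here.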

Here's a simple script off the top of my head that will do it for a folder of FASTA files...

Code:

#!/usr/bin/env python3
import os

directory = 'path/to/data'
target_directory = 'path/to/output'
for filename in os.listdir(directory):
    # read one single-record FASTA file (text mode, so this also runs on Python 3)
    with open(os.path.join(directory, filename)) as op:
        fasta = op.read().splitlines()
    name = fasta[0][1:]  # cut off >
    sequence = "".join(fasta[1:]).upper()  # transform into one line of uppercase bases...
    hindIIIpos = sequence.find("AAGCTT")  # search for first hindIII site
    if hindIIIpos == -1:
        raise ValueError("%s did not contain a hindIII site" % filename)
    ecoRIpos = sequence.rfind("GAATTC")  # search for last ecoRI site
    if ecoRIpos == -1:
        raise ValueError("%s did not contain an ecoRI site" % filename)
    # compensate for actual cutting position: both enzymes cut after the first base (A^AGCTT, G^AATTC)
    cut = sequence[hindIIIpos + 1: ecoRIpos + 1]
    with open(os.path.join(target_directory, filename), 'w') as op:
        op.write('>%s\n%s\n' % (name, cut))
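To make the slice arithmetic easier to check, here is a small sketch of the same trimming logic as a standalone function with a worked example. The function name `trim_between_sites` and the toy sequence are made up for illustration, and the extra check that the EcoRI site lies downstream of the HindIII site is an addition not present in the script above.

```python
def trim_between_sites(sequence):
    """Return the bases between the first HindIII cut and the last EcoRI cut."""
    sequence = sequence.upper()
    hind_pos = sequence.find("AAGCTT")   # first HindIII site
    eco_pos = sequence.rfind("GAATTC")   # last EcoRI site
    if hind_pos == -1:
        raise ValueError("no HindIII site found")
    if eco_pos == -1:
        raise ValueError("no EcoRI site found")
    if eco_pos < hind_pos:
        # Added safeguard (assumption): sites in the wrong order give no insert.
        raise ValueError("EcoRI site lies upstream of the HindIII site")
    # Both enzymes cut after the first base of their site (A^AGCTT, G^AATTC),
    # so keep everything from hind_pos + 1 up to and including eco_pos.
    return sequence[hind_pos + 1: eco_pos + 1]


# Toy read: vector (lowercase c/t), HindIII site, insert, EcoRI site, vector.
example = "ccccAAGCTTggggGAATTCtttt"
print(trim_between_sites(example))  # prints AGCTTGGGGG
```

The result starts with the `AGCTT` left behind by the HindIII cut and ends with the single `G` left behind by the EcoRI cut, which is what the `+ 1` offsets in the script are for.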

**raymoniz** · 09-28-2010, 07:38 AM

Thanks for the help. I should have also disclosed that I am not versed in any computer language - but am fortunate to have some friends who are. I will take this to them. Thank you again.

**sklages** · 09-29-2010, 05:37 AM

You might want to download 'lucy' at

http://sourceforge.net/projects/amos/files/

(AMOS is a collection of tools for genome assembly)

or, if you prefer a GUI, Lucy2 from the ISU Complex Computation Lab:

http://www.complex.iastate.edu/download/Lucy2/index.html

cheers,
Sven

**raymoniz** · 09-29-2010, 11:16 AM

Thanks very much Sven, I will give those a try.

**Richard Finney** · 09-29-2010, 12:12 PM

> I am not versed in any computer language

Doing next-gen sequencing without knowing Perl, bash, C, Java, sed, awk, and/or Python is like backpacking through South America without knowing any Spanish. You're liable to wind up lost and sick. You need to make the time to learn some basic text stream manipulation.

**sklages** · 09-29-2010, 12:28 PM

Well, the OP's question was a bit off-topic, as he still uses PGS data (previous generation sequencing ;-) ), some 1000 ABI traces ... lucy2 should do the job. Otherwise I totally agree: without basic knowledge of e.g. Perl it gets pretty hard ...

Sven


batch editing of ABI files
