**aarthi.talla** · 05-17-2011, 12:32 PM

Guys, I would be glad if anyone could help me out! Thanks!

**sklages** · 05-17-2011, 01:31 PM

Originally posted by aarthi.talla:

Hi everyone,

I am working on a 454 assembly of the Norton genome. I haven't got the original data yet.

But I have been testing the MIRA2 software, followed by BAMBUS, to build scaffolds with test data.
I have a question regarding the .mates file.

Do we have to create it?

If yes, what exactly is the naming convention followed by 454 mate pairs? Is it fixed?
What is the Perl regular expression for it?

I know it goes something like this:
Library threeKB 700 3000 (..............).*
pair (.*)\.f (.*)\.r

But is it fixed for 454? I know the max and min sizes vary, but I want to know about the regex.

Thanks.
I would appreciate it if anyone could help me out.

You're way too impatient .. :-)

- hopefully you mean MIRA3
- yes, you have to create the mates file on your own; it is tab-separated
- your regular expression looks good if you use 'sff_extract' to generate input for MIRA
- there is no fixed naming convention; the 454 reads just have a linker between the f and r reads. The read names you end up with depend on the downstream conversion, and the format of that conversion depends on the software that has to deal with the data (here MIRA)

Just give it a try and you'll see if the regex works ..

hth,
Sven
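For anyone who wants to try the regex before running BAMBUS, here is a minimal Python sketch of the check. It assumes the two expressions on the `pair` line are applied to forward and reverse read names respectively, that the captured group is the template name shared by both mates, and that each expression has to match the whole read name; BAMBUS itself may match differently, so treat this only as a sanity check. The function name `pair_reads` is made up for illustration.

```python
import re
from collections import defaultdict

# Forward and reverse patterns taken from the 'pair' line of the .mates file.
FORWARD = re.compile(r"(.*)\.f")
REVERSE = re.compile(r"(.*)\.r")


def pair_reads(read_names):
    """Group read names by captured template name.

    Returns (pairs, unpaired): pairs maps template -> (forward, reverse),
    unpaired lists names that matched neither pattern or lack a mate.
    """
    templates = defaultdict(dict)
    unpaired = []
    for name in read_names:
        fwd = FORWARD.fullmatch(name)
        rev = REVERSE.fullmatch(name)
        if fwd:
            templates[fwd.group(1)]["f"] = name
        elif rev:
            templates[rev.group(1)]["r"] = name
        else:
            unpaired.append(name)

    pairs = {}
    for template, ends in templates.items():
        if "f" in ends and "r" in ends:
            pairs[template] = (ends["f"], ends["r"])
        else:
            unpaired.extend(ends.values())
    return pairs, unpaired
```

Feeding it the read names from your sff_extract output shows quickly whether every template ends up with both an `.f` and an `.r` read.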

**aarthi.talla** · 05-17-2011, 01:40 PM

Thanks sklages

Would this be the correct output after running sff_extract on the mate-paired SFF files?
The FASTQ files are clipped, devoid of linkers and each mate pair is separated as follows:

@GSDFVHG01BWBMB.r
AACCCGAGCCAAACTACTCAAAGAAA
+
IIHHHIIIIIHHHIIIIIIHHHIF?;
@GSDFVHG01BWBMB.f
ATAGGTTATGAGTACACGGGCTCGTAATTGGCGTATACACCATCTGCAAGAAAACAAAAGAAGGCA
+
IIIIIIIIIIIIIIIIIHHHIIIIIIIIIGGGIIIIIGHHIIIIEEEEEE9999C:===I@@IIII

So would that .mates file be the correct one to create for these reads?
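A rough way to check this yourself, sketched in Python: pull the read names out of the FASTQ, confirm that the library expression matches them, and write the two .mates lines tab-separated. The helper names (`fastq_read_names`, `library_matches`, `mates_text`) are made up for this sketch, the lowercase `library` keyword and the whole-name matching are assumptions about what BAMBUS expects, and the FASTQ is assumed to have exactly four lines per record.

```python
import re

SAMPLE = """@GSDFVHG01BWBMB.r
AACCCGAGCCAAACTACTCAAAGAAA
+
IIHHHIIIIIHHHIIIIIIHHHIF?;
@GSDFVHG01BWBMB.f
ATAGGTTATGAGTACACGGGCTCGTAATTGGCGTATACACCATCTGCAAGAAAACAAAAGAAGGCA
+
IIIIIIIIIIIIIIIIIHHHIIIIIIIIIGGGIIIIIGHHIIIIEEEEEE9999C:===I@@IIII
"""


def fastq_read_names(fastq_text):
    """Return read names from 4-line FASTQ records (header, seq, '+', qual)."""
    lines = fastq_text.strip().splitlines()
    names = []
    # Step by four so a quality line starting with '@' is never taken as a header.
    for i in range(0, len(lines), 4):
        header = lines[i]
        if not header.startswith("@"):
            raise ValueError(f"expected FASTQ header, got: {header!r}")
        names.append(header[1:].split()[0])
    return names


def library_matches(names, library_regex):
    """True if the library expression matches every read name in full."""
    pattern = re.compile(library_regex)
    return all(pattern.fullmatch(name) for name in names)


def mates_text(library, min_size, max_size, library_regex):
    """Build the tab-separated library and pair lines of a .mates file."""
    library_line = "\t".join(
        ["library", library, str(min_size), str(max_size), library_regex]
    )
    pair_line = "\t".join(["pair", r"(.*)\.f", r"(.*)\.r"])
    return library_line + "\n" + pair_line + "\n"


if __name__ == "__main__":
    names = fastq_read_names(SAMPLE)
    print(names, library_matches(names, "(..............).*"))
    print(mates_text("threeKB", 700, 3000, "(..............).*"), end="")
```

The 14 dots in the library expression line up with the 14-character read name prefix in this sample (`GSDFVHG01BWBMB`); the insert size limits 700 and 3000 are simply the values from the original question.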

**sklages** · 05-17-2011, 01:47 PM

Looks good. Give it a try and see what happens.
If you want MIRA to use read pair information, you should provide an XML file (and fasta/qual, AFAIR) with the template info.

Sven

454 Mate pair naming convention, for input to BAMBUS
