**aarthi.talla** · 05-17-2011, 12:32 PM

Guys, I would be glad if anyone could help me out! Thanks!

**sklages** · 05-17-2011, 01:31 PM

Originally posted by aarthi.talla

Hi everyone,

I am working on a 454 assembly of the Norton Genome. I haven't got the original data yet.

But I have been trying to test the MIRA2 software followed by BAMBUS to build scaffolds with test data.
I have a question regarding the .mates file.

Do we have to create it?

If yes, what exactly is the naming convention followed by 454 mate pairs? Is it fixed?
What is the Perl regular expression for it?

I know it goes something like this:
Library threeKB 700 3000 (..............).*
pair (.*)\.f (.*)\.r

But is it fixed for 454? I know the max and min size numbers vary, but I want to know about the regex.

Thanks.
I would appreciate it if anyone could help me out.

You're way too impatient .. :-)

- hopefully you mean MIRA3
- yes, you have to create the mates file on your own, tab separated
- your regular expression looks good if you use 'sff_extract' to generate input for MIRA
- there is no fixed naming convention; the 454 reads just have a linker between the f and r reads. The format of the downstream conversion depends on the software that has to deal with the data (here MIRA)

Just give it a try and you'll see if the regex works ..

hth,
Sven
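One way to "give it a try" before running the scaffolder is to apply the `pair` patterns to a few read names by hand. The sketch below is only an illustration: it copies the two-line mates layout from the question, writes it tab separated, and assumes that two reads are mated when the first capture group of the forward and reverse patterns is identical. The helper names and the read `READ_A.f` are made up for the example; the other read name is taken from this thread.

```python
import re

# Two-line .mates body, tab separated, copied from the question above.
MATES_TEXT = (
    "Library\tthreeKB\t700\t3000\t(..............).*\n"
    "pair\t(.*)\\.f\t(.*)\\.r\n"
)


def pair_patterns(mates_text):
    """Return the compiled forward and reverse patterns from the 'pair' line."""
    for line in mates_text.splitlines():
        fields = line.split("\t")
        if fields[0] == "pair":
            return re.compile(fields[1]), re.compile(fields[2])
    raise ValueError("no 'pair' line found")


def pair_reads(read_names, forward, reverse):
    """Pair reads whose first capture group (assumed template name) is equal."""
    fwd, rev = {}, {}
    for name in read_names:
        match = forward.fullmatch(name)
        if match:
            fwd[match.group(1)] = name
            continue
        match = reverse.fullmatch(name)
        if match:
            rev[match.group(1)] = name
    return {t: (fwd[t], rev[t]) for t in fwd if t in rev}


forward, reverse = pair_patterns(MATES_TEXT)
# READ_A.f is a hypothetical read without a mate.
names = ["GSDFVHG01BWBMB.r", "GSDFVHG01BWBMB.f", "READ_A.f"]
pairs = pair_reads(names, forward, reverse)
print(pairs)
```

Reads that match neither pattern, or that have no partner with the same captured prefix, simply drop out of the result, which makes naming problems easy to spot.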

**aarthi.talla** · 05-17-2011, 01:40 PM

Thanks, sklages.

Would this be the correct output after running sff_extract on the mate-paired SFF files?
The FASTQ files are clipped, devoid of linkers, and each mate pair is separated as follows:

@GSDFVHG01BWBMB.r
AACCCGAGCCAAACTACTCAAAGAAA
+
IIHHHIIIIIHHHIIIIIIHHHIF?;
@GSDFVHG01BWBMB.f
ATAGGTTATGAGTACACGGGCTCGTAATTGGCGTATACACCATCTGCAAGAAAACAAAAGAAGGCA
+
IIIIIIIIIIIIIIIIIHHHIIIIIIIIIGGGIIIIIGHHIIIIEEEEEE9999C:===I@@IIII

So would that be the correct corresponding .mates file to create?

**sklages** · 05-17-2011, 01:47 PM

Originally posted by aarthi.talla

Thanks, sklages.

Would this be the correct output after running sff_extract on the mate-paired SFF files?
The FASTQ files are clipped, devoid of linkers, and each mate pair is separated as follows:

@GSDFVHG01BWBMB.r
AACCCGAGCCAAACTACTCAAAGAAA
+
IIHHHIIIIIHHHIIIIIIHHHIF?;
@GSDFVHG01BWBMB.f
ATAGGTTATGAGTACACGGGCTCGTAATTGGCGTATACACCATCTGCAAGAAAACAAAAGAAGGCA
+
IIIIIIIIIIIIIIIIIHHHIIIIIIIIIGGGIIIIIGHHIIIIEEEEEE9999C:===I@@IIII

So would that be the correct corresponding .mates file to create?

Looks good. Give it a try and see what happens.
If you want MIRA to use read pair information you should provide an XML file (and fasta/qual AFAIR) with the template info.

Sven
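A quick sanity check on the extracted FASTQ is to confirm that every template has both a `.f` and a `.r` read. The following is a minimal sketch, assuming four lines per FASTQ record and the `.f`/`.r` suffixes shown in the post; the function names are hypothetical and the record is the one quoted above. Header lines are located by position rather than by a leading `@`, because `@` also occurs inside the quality strings.

```python
import io

# The two records quoted in the post above.
FASTQ = """@GSDFVHG01BWBMB.r
AACCCGAGCCAAACTACTCAAAGAAA
+
IIHHHIIIIIHHHIIIIIIHHHIF?;
@GSDFVHG01BWBMB.f
ATAGGTTATGAGTACACGGGCTCGTAATTGGCGTATACACCATCTGCAAGAAAACAAAAGAAGGCA
+
IIIIIIIIIIIIIIIIIHHHIIIIIIIIIGGGIIIIIGHHIIIIEEEEEE9999C:===I@@IIII
"""


def fastq_read_names(handle):
    """Yield read names, assuming exactly four lines per FASTQ record."""
    for index, line in enumerate(handle):
        if index % 4 == 0:
            # Drop the leading '@' of the header line.
            yield line.strip()[1:]


def unpaired_templates(names):
    """Return templates that do not have exactly one .f and one .r suffix."""
    suffixes = {}
    for name in names:
        template, _, suffix = name.rpartition(".")
        suffixes.setdefault(template, set()).add(suffix)
    return sorted(t for t, s in suffixes.items() if s != {"f", "r"})


names = list(fastq_read_names(io.StringIO(FASTQ)))
missing = unpaired_templates(names)
print(names)
print(missing)
```

For a real file, the `io.StringIO(FASTQ)` argument would be replaced by an open file handle.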

454 Mate pair naming convention, for input to BAMBUS