
**ThePresident** · 06-30-2012, 02:47 PM

Oh yeah, another thing: I'm not sure whether I had to trim the 5' or 3' ends of my reads. We performed single-read 50 bp RNA-seq and it wasn't directional (so I guess the same adaptors are on the 5' and 3' ends). For the alignment, I used the sequence files exactly as I received them from our Seq Department, so I'm not sure whether the trimming has already been done or whether I have to do it. Perhaps that's the problem...

**JChase** · 07-01-2012, 11:22 PM

I'm no expert, but I think the first thing you should do is find out to what degree your sequencing facility processes the reads before they're handed over to you. Regardless of what they do, I think it's probably a good idea to take a look at the quality scores with something like FastQC and trim low-quality bases and adapter contamination with something like cutadapt before aligning.

Now, as for your specific problem, you should be able to take a look at the fastq file that your reads are in and see what's going on with this one particular read that's giving you an error. (Look up the grep command if you're not sure how to do that.) If it's just one read causing a problem, and you can't easily see what the issue is, I would just delete that read.
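As a rough illustration of the grep suggestion, here is a minimal sketch on a toy file. The file name example.fastq and the read names are made up, and the -A option assumes GNU or BSD grep (it is not in POSIX).

```shell
# Build a tiny FASTQ file for illustration (file and read names are hypothetical).
printf '%s\n' \
  '@read1' 'ACGT' '+' 'IIII' \
  '@read2' 'TTGA' '+' 'II#I' \
  '@read3' 'GGCC' '+' 'IIII' > example.fastq

# -n prints the line number of the match; -A 3 also prints the three lines
# after it, i.e. the rest of the four-line FASTQ record.
# Anchoring with ^ and $ keeps the pattern from matching other read names
# that merely contain the same text.
grep -n -A 3 '^@read2$' example.fastq
```

grep streams through the file instead of loading it into an editor, so this should also work on a multi-gigabyte FASTQ file.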

**ThePresident** · 07-02-2012, 07:06 AM

JChase, thanks a lot for your answer. I will check with my sequencing facility about their output file, just to be sure how they handle it. I've already run FastQC and the results seem okay to me. I'll look into trimming my sequences.

Finally, I do agree that the easiest way of dealing with this problem is to delete that particular read from the Fastq file. However, the Fastq file is about 3 GB, so it takes hours to open with gedit, and it usually crashes before it finishes opening. Is there a simple way to search within a file without actually opening it?

On the other hand, I tried to ignore this error and just continue with the SAM to BAM conversion and sorting, but got another error... I'll post another thread for that one.

Thanks again!

**JChase** · 07-02-2012, 09:07 AM

I would use grep -n to get the line number (http://www.unix.com/shell-programmin...-filename.html) and then use sed to delete those line numbers from the file (http://snipplr.com/view/6152/). Be sure to delete the line that the read name is on as well as the lines associated with it, because a fastq read is not on a single line: each record normally spans four lines (read name, sequence, "+" separator, quality string). Sed can be slow on big files, because it will go through the entire file, but it's better than trying to edit the file manually.

**manianslab** · 07-02-2012, 09:15 AM

I once had to delete some lines from a file like this. I used the sed command. See this thread: http://www.unix.com/shell-programmin...xtra-line.html

**ThePresident** · 07-03-2012, 06:13 AM

Thank you so much guys. I'm working on it right now, hope it will work!

Cheers

**ThePresident** · 07-05-2012, 10:53 AM

It worked perfectly. Commands:

grep -n STRING_TO_SEARCH *file (prints the line number of each matching line)

and then:

sed -i 2,+3d *file (with GNU sed, this starts at line 2 and deletes that line plus the 3 lines after it; in this example it deletes lines 2, 3, 4 and 5, i.e. one complete four-line read)
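For anyone finding this later, here is a minimal sketch of the same grep-then-sed idea on a toy file. The file names and read names are made up. It writes to a new file instead of using sed -i, so the original stays intact if the line number turns out to be wrong, and it computes the end line explicitly because the N,+3 address form is a GNU sed extension. The -m option assumes GNU or BSD grep.

```shell
# Toy FASTQ file for illustration (file and read names are hypothetical).
printf '%s\n' \
  '@read1' 'ACGT' '+' 'IIII' \
  '@read2' 'TTGA' '+' 'II#I' \
  '@read3' 'GGCC' '+' 'IIII' > reads.fastq

# Line number of the header of the read to remove.
# -x matches the whole line, -F treats the pattern as a fixed string,
# -m 1 stops at the first match.
start=$(grep -n -m 1 -x -F '@read2' reads.fastq | cut -d: -f1)

# A FASTQ record is four lines: header, sequence, '+', qualities.
end=$((start + 3))

# Delete lines start..end and write the result to a new file.
sed "${start},${end}d" reads.fastq > reads.cleaned.fastq

# The cleaned file should hold the two remaining reads.
cat reads.cleaned.fastq
```

Matching the whole header line matters because quality strings can also begin with "@", so a loose pattern could land on the wrong line and shift every record after it.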

Thanks again!


Bowtie error
