  • Bowtie error

    Hello everybody,

    I'm new here and already have a question. We recently performed some bacterial RNA-seq on an Illumina HiSeq. I got my data and went on to align it with Bowtie2.

    Honestly, I've never touched Linux before; however, I downloaded Ubuntu and started from scratch. One week later, I was able to run test alignments and everything seemed to work perfectly. A few hours ago, I ran the full alignment of my first replicate and things didn't go that well.

    Here is my command line:

    bowtie2 -q -t -p 6 -D 20 -x IndexFile -U FastqFile -S OutputFile

    It took 2 hours to align on a 6-core desktop with 8 GB of RAM, and it generated an output file of 6 GB. However, the alignment reported an error:

    Error: Read HWI-ST766:125:COPRMACXX:6:1306:11865:38045 1:N:0:ATNACG has more quality values than read characters!

    I looked for posts about this error here but found nothing, and googling it turned up nothing either. Does anybody know what this error is about? Can I still trust my alignment?

    Thanks a lot people!

  • #2
    Oh yeah, another thing: I'm not sure whether I have to trim the 5' or 3' ends of my reads. We performed single-read 50 bp RNA-seq and it wasn't directional (so I guess the same adaptors on 5' and 3'). For the alignment, I used the sequence files directly as I received them from our sequencing department, so I'm not sure whether the trimming has already been done or I have to do it. Perhaps that's the problem...


    • #3
      I'm no expert, but I think the first thing you should do is find out how much processing your sequencing facility does on the reads before handing them over to you. Regardless of what they do, it's probably a good idea to look at the quality scores with something like FastQC and to trim low-quality bases or adapter contamination with something like Cutadapt before aligning.

      Now, as for your specific problem, you should be able to take a look at the fastq file that your reads are in and see what's going on with this one particular read that's giving you an error. (Look up the grep command if you're not sure how to do that.) If it's just one read causing a problem, and you can't easily see what the issue is, I would just delete that read.
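
A minimal sketch of that inspection, assuming a standard four-line FASTQ layout. The file name toy.fastq and the read names are made up for illustration; on real data you would search for the read name from the Bowtie2 error message instead.

```shell
# Build a toy FASTQ (hypothetical file name) with three 4-line records.
# The second record has 6 quality characters for 5 bases, like the error.
printf '%s\n' \
  '@read1' 'ACGTA' '+' 'IIIII' \
  '@read2' 'ACGTA' '+' 'IIIIII' \
  '@read3' 'TTGCA' '+' 'IIIII' > toy.fastq

# Print the header line (with its line number) plus the 3 lines after it.
# -F treats the read name as a literal string, not a regular expression.
grep -n -A 3 -F '@read2' toy.fastq > read2_record.txt
cat read2_record.txt
```

grep streams through the file rather than loading it into an editor, so this should stay practical on a multi-gigabyte FASTQ.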


      • #4
        JChase, thanks a lot for your answer. I will check with my sequencing facility how they handle their output files, just to be sure. I've already run FastQC and the results look okay to me. I'll look into trimming my sequences.

        Finally, I agree that the easiest way of dealing with this problem is to delete that particular read from the FASTQ file. However, the FASTQ file is about 3 GB, so it takes hours to open in gedit, and it usually crashes before it finishes opening. Is there a simple way to search within a file without actually opening it?

        On the other hand, I tried ignoring this error and just continuing with the SAM-to-BAM conversion and sorting, but got another error... I'll post another thread for that one.

        Thanks again!


        • #5
          I would use grep -n to get the line number (http://www.unix.com/shell-programmin...-filename.html) and then use sed to delete those lines from the file (http://snipplr.com/view/6152/); be sure you delete the line that the read name is on as well as the three lines that follow it, because a FASTQ read spans four lines, not one. Sed can be slow on big files, because it will go through the entire file, but it's better than trying to edit the file manually.
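
The grep-then-sed approach could look roughly like this. It is a sketch on a toy file, and toy.fastq, toy.cleaned.fastq and the read names are hypothetical. It writes the result to a new file instead of editing in place, so the original stays available if the wrong range gets deleted.

```shell
# Toy FASTQ (hypothetical name); record 2 has one quality value too many.
printf '%s\n' \
  '@read1' 'ACGTA' '+' 'IIIII' \
  '@read2' 'ACGTA' '+' 'IIIIII' \
  '@read3' 'TTGCA' '+' 'IIIII' > toy.fastq

# Line number of the bad read's header (first match only).
# grep -n prints "N:line", so field 1 is the line number even when the
# read name itself contains colons, as Illumina names do.
start=$(grep -n -m 1 -F '@read2' toy.fastq | cut -d: -f1)
end=$((start + 3))   # header + sequence + '+' + qualities = 4 lines

# Delete that range, writing to a new file so the original is untouched.
sed "${start},${end}d" toy.fastq > toy.cleaned.fastq
cat toy.cleaned.fastq
```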


          • #6
            I had to delete some lines from a file once like this. I used the sed command. See this thread: http://www.unix.com/shell-programmin...xtra-line.html


            • #7
              Thank you so much guys. I'm working on it right now, hope it will work!

              Cheers


              • #8
                It worked perfectly. Commands:

                grep -n STRING_TO_SEARCH *file (searches for the line and prints its line number)

                and after that:

                sed -i 2,+3d *file (GNU sed: deletes line 2 and the 3 lines after it; in this example, lines 2, 3, 4 and 5)!

                Thanks again!
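
To check whether any other reads have the same length mismatch before re-running the alignment, something along these lines may help. It is only a sketch: it assumes every record is exactly four lines (no wrapped sequences), and the file names are made up.

```shell
# Toy FASTQ (hypothetical name); record 2 has one quality value too many.
printf '%s\n' \
  '@read1' 'ACGTA' '+' 'IIIII' \
  '@read2' 'ACGTA' '+' 'IIIIII' \
  '@read3' 'TTGCA' '+' 'IIIII' > toy.fastq

# Within each 4-line record: remember the header (line 1 of 4) and the
# sequence (line 2 of 4), then on the quality line (line 4 of 4) print the
# header if the sequence and quality strings differ in length.
awk 'NR % 4 == 1 { name = $0 }
     NR % 4 == 2 { seq = $0 }
     NR % 4 == 0 && length(seq) != length($0) { print name }' \
  toy.fastq > bad_reads.txt
cat bad_reads.txt
```

An empty bad_reads.txt would suggest that no record has this particular mismatch; it says nothing about other kinds of corruption.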
