  • paired end data not aligned in NextSeq 500

    Hello!!
    I'm quite new to NGS and I got these FASTQ files (read1 and read2) from an Illumina NextSeq 500.
    The problem is that the two files somehow don't line up with each other: I compared the coordinates in read1 and read2 on the same row, and sometimes they are not the same!
    Since I'm quite new: is this possible? And is there any tool or bash script that can help me bring these files back in line?


    I really hope somebody can help me!

  • #2
    Hi Julia_m,

    Is that genomic data or RNA-seq (transcriptomic) data?

    If the files don't line up, that could mean that for some reads the mate was not found in the other file.

    You probably have trimmed data in which all surviving reads were kept, including those whose mate was dropped. If the data was trimmed, you should keep only the reads that are present in both files. (Check a trimming tool, e.g. cutadapt, and then try mapping.)

    Hope it works.
    Krishna



    • #3
      cutadapt is for adapter trimming, isn't it?
      I mean, my problem is quite different. I have two FASTQ files like these:
      READ1.fastq
      @NB501365:8:HF3HLAFXX:1:11101:22082:1033 1:N:0:CGAGTA
      READ2.fastq
      @NB501365:8:HF3HLAFXX:1:11101:22082:1033 2:N:0:CGAGTA

      Shouldn't the coordinates (22082:1033 for read1 and 22082:1033 for read2) be the same for paired-end reads?



      • #4
        Sorry, big mistake: READ2.fastq has the following header:
        @NB501365:8:HF3HLAFXX:1:11101:64563:1033 2:N:0:CGAGTA



        • #5
          Your data looks as it should. The "1:N:0:CGAGTA" portion, as an example, just says "I'm read 1, I didn't fail quality control filtering on the machine, and my barcode was CGAGTA". Read 2 should look the same (and have the same read name), with the exception that there's a 2 rather than a 1 in the second block of text.
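
To make the header layout described in this reply concrete, here is a minimal Python sketch that splits a header of the kind quoted in the thread into its fields. The names `IlluminaHeader`, `parse_header` and `same_cluster` are hypothetical and chosen for illustration only; the sketch assumes the Casava 1.8+ style layout, with a single space between the read name and the `read:filtered:control:barcode` block.

```python
from typing import NamedTuple


class IlluminaHeader(NamedTuple):
    instrument: str
    run: int
    flowcell: str
    lane: int
    tile: int
    x: int
    y: int
    read: int        # 1 or 2
    filtered: bool   # True means "Y": the read FAILED the on-machine filter
    control: int
    barcode: str


def parse_header(line: str) -> IlluminaHeader:
    """Split one FASTQ header line into its fields (assumed layout, see above)."""
    name, _, comment = line.strip().lstrip("@").partition(" ")
    instrument, run, flowcell, lane, tile, x, y = name.split(":")
    read, is_filtered, control, barcode = comment.split(":")
    return IlluminaHeader(
        instrument, int(run), flowcell, int(lane), int(tile), int(x), int(y),
        int(read), is_filtered == "Y", int(control), barcode,
    )


def same_cluster(h1: IlluminaHeader, h2: IlluminaHeader) -> bool:
    """Mates must agree on everything up to and including the x:y coordinates."""
    return h1[:7] == h2[:7]


if __name__ == "__main__":
    r1 = parse_header("@NB501365:8:HF3HLAFXX:1:11101:22082:1033 1:N:0:CGAGTA")
    r2 = parse_header("@NB501365:8:HF3HLAFXX:1:11101:64563:1033 2:N:0:CGAGTA")
    print(r1)
    print("same cluster:", same_cluster(r1, r2))
```

Comparing only the first seven fields ignores the read number, which is expected to differ between the two files.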

          So go ahead and quality/adapter trim this dataset (e.g., with "Trim Galore!" or trimmomatic) and then use an aligner (bowtie2, bbmap, hisat2, bwa, STAR, etc.).

          Edit: Oh, if the read names are really different (I just now noticed your most recent post) then you'll need to resync the files. There's a tool in BBMap that can do this for you (it has a LOT of convenient tools).



          • #6
            As @Devon said, your R1/R2 files may be out of sync. You can use repair.sh from the BBMap suite to re-sync the files.

            You will find an example of that command line, and lots of other things the BBMap suite can do, in this thread.
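
For readers who want to see what re-syncing involves, the following is a small Python sketch of the general idea: match records by read name and set aside reads whose mate is missing. It is not how repair.sh is implemented, and the function names `read_records` and `resync` are hypothetical. It assumes well-formed four-line FASTQ records, mate names that are identical up to the first whitespace, and a read 2 file small enough to hold in memory.

```python
from itertools import islice


def read_records(lines):
    """Yield (name, record) pairs, where record is a list of 4 FASTQ lines."""
    lines = iter(lines)
    while True:
        record = list(islice(lines, 4))
        if len(record) < 4:          # end of input (or a truncated last record)
            return
        yield record[0].split()[0], record


def resync(r1_lines, r2_lines):
    """Return (paired1, paired2, singletons), paired in read 1 order."""
    r2_by_name = dict(read_records(r2_lines))   # whole read 2 file in memory
    paired1, paired2, singletons = [], [], []
    for name, record in read_records(r1_lines):
        mate = r2_by_name.pop(name, None)
        if mate is None:
            singletons.append(record)           # read 1 without a mate
        else:
            paired1.append(record)
            paired2.append(mate)
    singletons.extend(r2_by_name.values())      # read 2 records never matched
    return paired1, paired2, singletons
```

For real datasets a dedicated tool is the safer choice; the sketch only illustrates the matching step.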



            • #7
              It would still be interesting to know what caused the reads to get out of sync.
              Have these reads been pre-processed with some tool, or is this the data as sent by the sequencing provider?

              Fixing is important; avoiding it, or knowing how to avoid it, is more important ;-)

              Just my 2p.



              • #8
                Problem solved!! It seems that something went wrong during the unzipping process (I don't know why); if you process everything directly from the gzip files, it's fine!
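
A quick way to check whether two files are in sync without unzipping them first could look like the Python sketch below. The function names `read_names` and `first_mismatch` are hypothetical; the sketch assumes four-line FASTQ records and mate names that are identical up to the first whitespace, and it reads `.gz` files directly through the standard `gzip` module.

```python
import gzip
from itertools import islice, zip_longest


def read_names(path):
    """Yield the read name (text before the first space) of every record."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Every 4th line, starting with the first, is a header line.
        for header in islice(handle, 0, None, 4):
            fields = header.split()
            yield fields[0] if fields else ""


def first_mismatch(path1, path2):
    """Return (record_number, name1, name2) for the first out-of-sync record.

    Returns None when both files list the same names in the same order.
    A file that ends early shows up as a name paired with None.
    """
    pairs = zip_longest(read_names(path1), read_names(path2))
    for number, (name1, name2) in enumerate(pairs, start=1):
        if name1 != name2:
            return number, name1, name2
    return None
```

Running `first_mismatch("READ1.fastq.gz", "READ2.fastq.gz")` (file names assumed) on the original compressed files and again on the unzipped copies would show whether the unzipping step is where the files diverge.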
