  • Concatenate several SRA reads to a single fastq file

    Hi all,

    I want to re-analyse a dataset available in GEO:

    I've downloaded all the files, however it seems that each replicate/experiment has several files corresponding to several runs. The question is: how do I concatenate these runs once they are converted into fastq? Would a simple cat command work?

    Also, is this a silly thing to do? (I'm a newbie) Ultimately my goal is to determine gene expression changes.

    Cheers.

  • #2
    Originally posted by krespim
    I've downloaded all the files, however it seems that each replicate/experiment has several files corresponding to several runs. The question is: how do I concatenate these runs once they are converted into fastq? Would a simple cat command work?

    I'm sorry, but why exactly do you want to cat all these files? A simple

    cat *.fastq > consolidated_fastq.fq

    will be fine, but what's your need for doing so? Every single run usually corresponds to a different sample, so why merge them all?

    Originally posted by krespim
    Also, is this a silly thing to do? (I'm a newbie) Ultimately my goal is to determine gene expression changes.

    Cheers.

    Process the files separately, then run comparative analyses on the SAM/BAM files!
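
A minimal sketch of the cat approach discussed above, assuming the runs really are the same sample and are single-end, uncompressed FASTQ. The run accessions and file names are hypothetical stand-ins, and the script writes its own tiny test files in a scratch directory so it can be tried safely:

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so no real data is touched.
workdir="$(mktemp -d)"
cd "$workdir"

# Three tiny stand-in FASTQ files (hypothetical run accessions), one read each.
for run in SRR0000001 SRR0000002 SRR0000003; do
    printf '@%s.1\nACGT\n+\nIIII\n' "$run" > "$run.fastq"
done

# List the runs explicitly so only the intended files are merged.
# '>' truncates the output first; '>>' would append and duplicate
# reads if the command were run a second time.
cat SRR0000001.fastq SRR0000002.fastq SRR0000003.fastq > sample_merged.fq

# Sanity check: the merged file should have as many lines as the inputs
# combined (a FASTQ record is 4 lines).
in_lines=$(( $(cat SRR*.fastq | wc -l) ))
out_lines=$(( $(wc -l < sample_merged.fq) ))
echo "input lines: $in_lines, merged lines: $out_lines"
```

Giving the output a different extension (.fq here) keeps it from matching a *.fastq glob on a later run.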


    • #3
      Originally posted by arkal
      will be fine, but what's your need for doing so? Every single run usually corresponds to a different sample, so why merge them all?
      Well, this is actually my main issue: I don't know whether each run is a different sample or the same sample run in multiple lanes. The sample GEO page lists 3 SRA files.

      The paper does not mention biological or technical replicates.


      • #4
        Originally posted by krespim
        Well, this is actually my main issue: I don't know whether each run is a different sample or the same sample run in multiple lanes. The sample GEO page lists 3 SRA files.

        The paper does not mention biological or technical replicates.
        It seems to be the same sample in different lanes/runs, so you can either merge the fastqs and then align, or align the 3 separately and merge the SAMs! I recommend the latter as it will take less time (provided you have the resources to align them in parallel)!
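
A rough sketch of the "align separately, then merge" route, not a definitive pipeline. The thread does not name an aligner, so HISAT2, the index name genome_index, the run accessions, single-end reads (-U) and the thread count are all assumptions made for illustration. By default the script only prints the commands it would run; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
set -eu

# Assumptions (not from the thread): HISAT2 as the aligner, a prebuilt
# index named "genome_index", single-end reads, hypothetical accessions.
INDEX="genome_index"
RUNS="SRR0000001 SRR0000002 SRR0000003"
THREADS=4
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run_cmd() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Align one run, then coordinate-sort the result into a BAM file.
align_one() {
    run_cmd hisat2 -p "$THREADS" -x "$INDEX" -U "$1.fastq" -S "$1.sam"
    run_cmd samtools sort -o "$1.sorted.bam" "$1.sam"
}

# Start one background job per run, then wait for each of them.
pids=""
for run in $RUNS; do
    align_one "$run" &
    pids="$pids $!"
done
for pid in $pids; do
    wait "$pid"
done

# Merge the per-run BAMs into a single file for the sample.
sorted=""
for run in $RUNS; do
    sorted="$sorted $run.sorted.bam"
done
sorted="${sorted# }"
# $sorted is left unquoted on purpose so each file becomes its own argument.
run_cmd samtools merge sample_merged.bam $sorted
```

With three runs at four threads each, this asks for roughly twelve cores at once, so THREADS may need adjusting to the server.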


        • #5
          Originally posted by arkal
          It seems to be the same sample in different lanes/runs, so you can either merge the fastqs and then align, or align the 3 separately and merge the SAMs! I recommend the latter as it will take less time (provided you have the resources to align them in parallel)!
          I have recently been granted access to a server with a very decent number of processors, so that should be easy.

          Thanks a lot, arkal!


          • #6
            No worries, all the best!
