  • ONT Data Analysis Questions

    Hey Guys,

    I'm collaborating with someone who has the MinION. I just got my first data set, and am stoked to start looking into it! But I was expecting a fast5 file and was sent a fastq file...

    None of the nanopore tools will take a fastq input for their analysis commands, and I can't seem to use the fastx toolkit for normal filtering... I keep getting a ***stack smashing detected*** error.

    Does anybody have any experience they'd like to share?

    Thanks!

  • #2
    Hi 7tbear7,

    I'm surprised you are having trouble with fastq files. Fast5 files were a nightmare to convert to fastq when I first started using the MinION; once in fastq they were relatively easy to analyse. Now I think fastq and fasta are produced automatically by the MinION software, in addition to fast5 files.

    Which tools are you having problems with? Have you had a look at the fastq file in a text editor, just to see if the data is actually in fastq format?

    Vince
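As a complement to eyeballing the file in a text editor, the check suggested above can be roughly automated. This is only a minimal sketch: the function name `check_fastq` is made up for illustration, and it assumes the common four-lines-per-record FASTQ layout (no wrapped sequence lines).

```python
import io

def check_fastq(handle, max_records=1000):
    """Return a list of problems found in the first max_records records.

    Assumes unwrapped FASTQ: header, sequence, '+' line, quality line.
    """
    problems = []
    for n in range(1, max_records + 1):
        lines = [handle.readline() for _ in range(4)]
        if not lines[0]:
            break  # clean end of file
        if any(line == "" for line in lines[1:]):
            problems.append(f"record {n}: truncated (fewer than 4 lines)")
            break
        header, seq, plus, qual = (line.rstrip("\r\n") for line in lines)
        if not header.startswith("@"):
            problems.append(f"record {n}: header does not start with '@'")
        if not plus.startswith("+"):
            problems.append(f"record {n}: third line does not start with '+'")
        if len(seq) != len(qual):
            problems.append(f"record {n}: sequence and quality lengths differ")
    return problems

# Usage with an in-memory example; a real file would be opened with open(path).
print(check_fastq(io.StringIO("@read1\nACGT\n+\nIIII\n")))
```

An empty list means the records that were checked look structurally like FASTQ; it says nothing about whether a particular tool can handle very long reads.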



    • #3
      Thanks for your reply Vince! I have looked at the fastq file and it looks normal to me. I keep getting the segmentation fault or stack smashing -- both having to do with memory limitations? I have a large amount of memory available though. I tried using FastQC to filter the data and that also froze and was unable to process it.

      It's 7,000,000 lines long. I attached a screen shot of the first few lines.



      • #4
        Certainly looks OK. Have you tried converting the fastq to fasta to see if that would work? If it's a memory issue that might help.

        Not sure I can contribute much more - wet lab person, not bioinformatics. Anyone else like to pitch in?



        • #5
          Just one more suggestion. Copy a small part of the fastq file (for example, 50 - 100 sequences) and run that truncated file through the analysis programs.
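One way to make such a truncated test file, as a hedged sketch: a FASTQ record normally spans four lines, so the copy should stop on a multiple of four lines or the last record will be incomplete. The function name `head_fastq` and the file names are hypothetical, and unwrapped four-line records are assumed.

```python
from itertools import islice

def head_fastq(src_path, dst_path, n_records=100):
    """Copy the first n_records four-line FASTQ records to a new file.

    Returns the number of complete records written.
    """
    lines = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        # islice reads lazily, so the large source file is never loaded whole.
        for line in islice(src, 4 * n_records):
            dst.write(line)
            lines += 1
    return lines // 4

# Example call (paths are placeholders):
# head_fastq("reads.fastq", "reads_first100.fastq", n_records=100)
```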



          • #6
            Thanks again for your input Vince.

            I copied the first 3 lines (head) into a new file and tried to use the fastx toolkit to quality filter and it gave me the same segmentation fault. So something is clearly wrong and it's hard to imagine it's the file. Must be one of the dependencies? I have a feeling it's Java...

            If I convert to FASTA I lose all my quality scores though, correct?



            • #7
              Originally posted by 7tbear7:

              If I convert to FASTA I lose all my quality scores though, correct?


              Yes, you would lose all your quality data.
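To illustrate why the quality data goes missing, here is a minimal conversion sketch. It is not any particular tool's implementation; the name `fastq_to_fasta` is hypothetical and four-line records are assumed. FASTA has only a header and a sequence, so the '+' and quality lines are read and simply discarded.

```python
import io

def fastq_to_fasta(src, dst):
    """Write each FASTQ record from src to dst as FASTA; return record count."""
    count = 0
    while True:
        header = src.readline()
        if not header:
            break  # end of input
        seq = src.readline()
        src.readline()  # '+' separator line, dropped
        src.readline()  # quality string, dropped: FASTA has no place for it
        dst.write(">" + header[1:].rstrip("\r\n") + "\n")
        dst.write(seq.rstrip("\r\n") + "\n")
        count += 1
    return count

# Usage with in-memory data; real files would be opened with open(path).
out = io.StringIO()
fastq_to_fasta(io.StringIO("@read1\nACGT\n+\nIIII\n"), out)
print(out.getvalue())
```

Keeping the original fastq file alongside the fasta copy preserves the quality scores for later use.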



              • #8
                Originally posted by 7tbear7:
                I copied the first 3 lines (head) into a new file and tried to use the fastx toolkit to quality filter and it gave me the same segmentation fault.
                Er... don't use fastx-toolkit for nanopore files. It was designed more than a few years ago, and is probably not coded to be able to handle lines with tens of thousands of bases.

                And before the question is asked, "Well, what should be used instead?", have a think about what it is that you want to do and turn that desire into a question; ask about your problem, rather than your attempts at a solution.
                Last edited by gringer; 08-01-2017, 11:56 PM.
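To check whether the reads really are as long as suggested above, a quick length summary can help. This is a sketch under assumptions: `read_length_summary` is a made-up name, and it treats every second line of each four-line record as the sequence.

```python
import io

def read_length_summary(handle):
    """Return read count and min/max/mean read length for four-line FASTQ."""
    lengths = []
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the sequence is the second line of each record
            lengths.append(len(line.rstrip("\r\n")))
    if not lengths:
        return None
    return {
        "reads": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": sum(lengths) / len(lengths),
    }

# Usage with in-memory data; a real file would be opened with open(path).
print(read_length_summary(io.StringIO("@a\nACGT\n+\nIIII\n@b\nAC\n+\nII\n")))
```

A maximum in the tens of thousands of bases would be consistent with the idea that tools written for short reads are failing on line length rather than on total file size.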



                • #9
                  Hi @7tbear7,
                  Since you have a MinION device, I am sure you have access to the Nanopore Community. Look at the "Community-developed data-analysis tools" topic to find out which tools are designed for nanopore reads.
                  Regarding FastQC, you can find the following option:

                  --nano Files come from nanopore sequences and are in fast5 format. In
                  this mode you can pass in directories to process and the program
                  will take in all fast5 files within those directories and produce
                  a single output file from the sequences found in all files.

                  However, I think it is still under development, and someone else here will be better placed to answer questions about it.

                  Hope this helps!



                  • #10
                    Hi! I am a bioinformatics student in Mali working on autism genes. I now need to determine the frequency of some autism genes in other regions, e.g. Europe, America and so on. I need to know whether a genomics database exists that can help me. I would also welcome help with another issue, if possible. Thank you.



                    • #11
                      If you find a functional gene in one human population, it is highly likely that that gene exists in all human populations. Perhaps you could post your question as a separate thread; it doesn't seem to be related to ONT Data Analysis.



                      • #12
                        Autism gene mutation identification

                        Thank you for your answer.

                        Now I need to know the mutation frequency of the major autism gene MEF2C in European, African and American autistic individuals. Thank you.
