
  • #16
    1 is the lane, 2782 the x-coordinate, 993 the y-coordinate, etc. There's no standard for read names; you could name them anything you want (in fact, they could all be the same if you wanted).

    Comment


    • #17
      Originally posted by dpryan View Post
      1 is the lane, 2782 the x-coordinate, 993 the y-coordinate, etc. There's no standard for read names; you could name them anything you want (in fact, they could all be the same if you wanted).
      Hmm, I guess that depends on the version of the Illumina reads. I haven't seen five numeric fields before, so I got confused!
      But it does mean this is FASTQ format, that's for sure.

      Comment


      • #18
        Originally posted by ron128 View Post
        Dear Sir/Madam, I have no clue which platform was used to sequence this data. The only thing I have been told is that the data is from NIH3T3 cell lines :/ The sequencing company has shut up shop, and unfortunately there is no way to be sure about this. I already tried the phred64 option in prinseq, but I got the same error, which says that the input file is not in FASTQ format. Thanks for your thoughts on this.

        http://www.biostars.org/p/911/
        Maybe this will help.

        Comment


        • #19
          Actually, this is somewhere between Illumina 1.4 and 1.7, since the multiplex tag is included. I should also mention that it's lane 5, not 1 (the 1 is the tile). I'm used to seeing the 1101+ tile numbers from the HiSeq...
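
For anyone who wants to pull those fields apart programmatically, here is a minimal Python sketch. It assumes the pre-1.8 Illumina layout `@instrument:lane:tile:x:y#index/pair`. The regular expression, the function name `parse_read_name`, and the example read name are made up for illustration and are not taken from the data discussed in this thread.

```python
import re

# Assumed layout for Illumina pipeline 1.4-1.7 headers (an assumption, not from this thread's file):
#   @<instrument>:<lane>:<tile>:<x>:<y>#<index>/<pair>
OLD_ILLUMINA = re.compile(
    r"^@(?P<instrument>[^:\s]+):(?P<lane>\d+):(?P<tile>\d+):"
    r"(?P<x>\d+):(?P<y>\d+)(?:#(?P<index>[^/\s]+))?(?:/(?P<pair>[12]))?$"
)


def parse_read_name(header):
    """Return the header fields as a dict, or None if the layout does not match."""
    match = OLD_ILLUMINA.match(header.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be compared or sorted as numbers.
    for key in ("lane", "tile", "x", "y"):
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    # Hypothetical read name, for illustration only.
    print(parse_read_name("@HWUSI-EAS100R:5:1:2782:993#ATCACG/1"))
```

A header that does not fit this layout returns `None`, which is a cheap hint that the file comes from a different pipeline version.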

          Comment


          • #20
            Originally posted by paa6 View Post
            http://www.biostars.org/p/911/
            Maybe this will help.
            Thanks a ton for this! I tried SolexaQA as well, but guess what: another weird error. I have written to the SolexaQA mailing list and am awaiting a response.

            I cannot know for certain whether the file is corrupt. My intuition is that it is not, as I can open the file perfectly fine in a text editor like emacs or vi.
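
Opening cleanly in an editor does not rule out structural problems, so a record-by-record check can be more informative. The following is a rough Python sketch, assuming a plain 4-line-per-record FASTQ file (no wrapped sequence lines). The function name `check_fastq` and the specific checks are my own choices and may not match what prinseq or any other tool actually tests.

```python
import io


def check_fastq(handle, max_errors=5):
    """Return (line_number, problem) tuples for 4-line FASTQ records in handle."""
    problems = []
    line_no = 0  # number of lines consumed before the current record
    while len(problems) < max_errors:
        raw = [handle.readline() for _ in range(4)]
        if raw[0] == "":
            break  # clean end of file
        if raw[3] == "":
            problems.append((line_no + 1, "truncated record at end of file"))
            break
        # CRLF is only visible if the file was opened with newline="".
        if any(line.endswith("\r\n") for line in raw):
            problems.append((line_no + 1, "Windows (CRLF) line endings"))
        header, seq, plus, qual = (line.rstrip("\r\n") for line in raw)
        if not header.startswith("@"):
            problems.append((line_no + 1, "header does not start with '@'"))
        if not plus.startswith("+"):
            problems.append((line_no + 3, "separator does not start with '+'"))
        if len(seq) != len(qual):
            problems.append((line_no + 4, "sequence and quality lengths differ"))
        line_no += 4
    return problems


if __name__ == "__main__":
    # Tiny in-memory example; for a real file use open(path, newline="").
    demo = io.StringIO("@r1\nACGT\n+\nIII\n")
    for line_number, problem in check_fastq(demo):
        print(line_number, problem)
```

The scan stops shortly after `max_errors` problems, since one early defect (for example a stray blank line) tends to shift every record after it.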

            Comment


            • #21
              Originally posted by dpryan View Post
              Actually, this is somewhere between Illumina 1.4 and 1.7, since the multiplex tag is included. I should also mention that it's lane 5, not 1 (the 1 is the tile). I'm used to seeing the 1101+ tile numbers from the HiSeq...
              Dear Mr Ryan, thanks a ton for all your insights. I am trying to run the grep command to see whether the data is what you are suggesting. I should have some answers by tomorrow, once I get access to my server and have grepped the data. I will post back then with what I find.

              Comment


              • #22
                The mystery deepens

                @ Mr Ryan: I tried what you suggested. Turns out it IS pre-1.8.

                I tried SolexaQA as well. It reports CASAVA 1.3 as the pipeline used to generate the data.

                Now here is where things start to get intriguing. I managed to run FastQC on this data, and its report says that CASAVA 1.5 was used to generate the data, which disagrees with SolexaQA. There might be a minor difference between 1.3 and 1.5 that neither tool is able to pick up. What intrigues me is that I still cannot run this stupid data through bowtie (which is format independent) or BWA. And the data is not corrupt, because I have managed to run FastQC on it. I am looking at a few things here and will keep everyone posted.

                Is there a website someone knows where I can host these reads? Google Drive maybe? Or is there something NGS-specific? I think this would be a great dataset for people troubleshooting NGS programs, or wanting to learn about NGS in general, to get started on.
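
On the 1.3 versus 1.5 disagreement: both pipeline versions use an ASCII offset of 64 for quality scores, so tools generally have to guess the version from the range of quality characters they see. A minimal Python sketch of that kind of guess follows. The thresholds are the commonly cited ones (33 for Sanger / Illumina 1.8+, 59 as the lowest Solexa character, 64 for Illumina 1.3+), and the function name `guess_quality_offset` is hypothetical; this is not how SolexaQA or FastQC are actually implemented.

```python
import io


def guess_quality_offset(handle, max_records=10000):
    """Guess the quality encoding from the range of quality characters seen."""
    lowest, highest = 255, 0
    for i, line in enumerate(handle):
        if i // 4 >= max_records:
            break
        if i % 4 == 3:  # the 4th line of each record is the quality string
            codes = [ord(char) for char in line.rstrip("\r\n")]
            if codes:
                lowest = min(lowest, min(codes))
                highest = max(highest, max(codes))
    if highest == 0:
        return "no quality lines found"
    if lowest < 59:
        return "offset 33 (Sanger / Illumina 1.8+)"
    if lowest >= 64:
        # Could also be offset 33 data with only very high scores; this is a guess.
        return "offset 64 (Illumina 1.3-1.7)"
    return "offset 64, Solexa-scaled (pre-1.3)"


if __name__ == "__main__":
    # Made-up record, for illustration only.
    print(guess_quality_offset(io.StringIO("@r1\nACGT\n+\nBBhh\n")))
```

Because the guess depends only on the characters sampled, two tools reading different numbers of records could plausibly report different versions for the same file.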

                Comment


                • #23
                  Google Drive, Dropbox, copy.com: there are a few options out there for sharing bigger files. What sort of error do you get when you run bowtie? It has an option to tell it how the Phred scores were encoded (and anyway, one could always change them with a simple script).
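
As an illustration of the "simple script" idea, here is a minimal Python sketch that shifts quality strings from offset 64 to offset 33. It assumes 4-line FASTQ records and true Phred+64 scores (Illumina 1.3 to 1.7); Solexa-scaled scores from older pipelines would need a different conversion. The function names are hypothetical.

```python
import io


def phred64_to_phred33(quality):
    """Shift one quality string from ASCII offset 64 to offset 33."""
    shifted = []
    for char in quality:
        score = ord(char) - 64
        if score < 0:
            raise ValueError(f"{char!r} is below offset 64; input may not be Phred+64")
        shifted.append(chr(score + 33))
    return "".join(shifted)


def convert_fastq(src, dst):
    """Copy 4-line FASTQ records from src to dst, re-encoding each quality line."""
    for i, line in enumerate(src):
        text = line.rstrip("\r\n")
        if i % 4 == 3:  # the 4th line of each record is the quality string
            text = phred64_to_phred33(text)
        dst.write(text + "\n")


if __name__ == "__main__":
    # Made-up record, for illustration only.
    source = io.StringIO("@r1\nACGT\n+\nhhhh\n")
    target = io.StringIO()
    convert_fastq(source, target)
    print(target.getvalue(), end="")
```

The `ValueError` is deliberate: a character below `@` means the file was probably not offset 64 to begin with, and silently converting it would corrupt the scores.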

                  Comment


                  • #24
                    For a bit more understanding of the different FASTQ formats, have a look at the Wikipedia page:



                    The example sequence in that section looks pretty close to your format.

                    Comment
