  • Quality of the reads

    Hi

    I have recently received my first whole-genome data from Complete Genomics. The file format version is 1.5. I have read the documentation and realized that the read quality is encoded in ASCII-33. How can I convert this quality format to standard FASTQ quality so that it can be used with BWA or Bowtie2?

    This is an example of what the quality string in the TSV file looks like:

    93::499'2888521408):;%;*:7*81+3090.577774.6259;'82*=<;%48,7435%77;;&%-

    Thanks

  • #2
    I think BWA supports both quality conventions. The manual at http://bio-bwa.sourceforge.net/bwa.shtml says: "-I The input is in the Illumina 1.3+ read format (quality equals ASCII-64)." If your data are not in that format, you simply don't pass "-I".
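
    To make the two conventions concrete, here is a minimal Python sketch. It assumes the quality string in the TSV file is Phred+33, as the CG documentation is said to describe. The function names decode_phred and phred33_to_phred64 are made up for this illustration and are not part of BWA or cgatools.

```python
def decode_phred(quality: str, offset: int = 33) -> list[int]:
    """Turn an ASCII-encoded quality string into integer Phred scores."""
    scores = [ord(ch) - offset for ch in quality]
    if scores and min(scores) < 0:
        # A negative score means the string cannot use this offset.
        raise ValueError(f"quality string is not valid for offset {offset}")
    return scores


def phred33_to_phred64(quality: str) -> str:
    """Re-encode a Phred+33 string as Phred+64 (the Illumina 1.3+ convention)."""
    return "".join(chr(ord(ch) - 33 + 64) for ch in quality)


# Quality string quoted in the first post of this thread.
cg_quality = "93::499'2888521408):;%;*:7*81+3090.577774.6259;'82*=<;%48,7435%77;;&%-"

scores = decode_phred(cg_quality, offset=33)
print(min(scores), max(scores))  # 4 28

# The same scores survive a round trip through the Phred+64 encoding.
as_phred64 = phred33_to_phred64(cg_quality)
assert decode_phred(as_phred64, offset=64) == scores
```

    Phred+33 is also the encoding of standard (Sanger) FASTQ, so if that assumption holds, no conversion should be needed and "-I" would be left off.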


    • #3
      Originally posted by thedamian
      I think BWA supports both quality conventions. The manual at http://bio-bwa.sourceforge.net/bwa.shtml says: "-I The input is in the Illumina 1.3+ read format (quality equals ASCII-64)." If your data are not in that format, you simply don't pass "-I".

      OK, but my qualities are already ASCII-33, so why would I use -I? I don't know exactly which quality encoding I have. I have worked with FASTQ quality before, and it looks completely different from what I have here.

      This is what the quality should look like (just an example):

      SXXX<NDUETSUBTMW]#\Z


      • #4
        Try http://www.bioinformatics.babraham.a...ojects/fastqc/


        • #5
          Could you explain why you want to re-align reads that have already been aligned? If you re-align CG reads with BWA or Bowtie2, you will likely produce worse alignments, as these aligners are not aware of the CG read structure (sub-reads).
          Bioinformatics Applications, Europe
          Lifetech Inc. http://www.lifetech.com/


          • #6
            Originally posted by gtyrelle
            Could you explain why you want to re-align reads that have already been aligned? If you re-align CG reads with BWA or Bowtie2, you will likely produce worse alignments, as these aligners are not aware of the CG read structure (sub-reads).

            OK, I realized later that I can simply convert the CG reads/mapping data to SAM without going through FASTQ, so you are right.

            Basically, I need to call SNPs and indels from the SAM file and annotate the variants using the latest dbSNP and 1000 Genomes (1000g) releases.


            • #7
              If you have CG data, then SNPs and indels have already been called. Why do you need to do it again? Put simply, going down this path will result in poor SNP and indel calls. In fact, it is unlikely that you will get past the alignment stage with tools that are not CG-aware.

              Basically, the CG read structure makes the data incompatible with most third-party tools for alignment and SNP calling.
              Bioinformatics Applications, Europe
              Lifetech Inc. http://www.lifetech.com/


              • #8
                Originally posted by gtyrelle
                If you have CG data, then SNPs and indels have already been called. Why do you need to do it again? Put simply, going down this path will result in poor SNP and indel calls. In fact, it is unlikely that you will get past the alignment stage with tools that are not CG-aware.

                Basically, the CG read structure makes the data incompatible with most third-party tools for alignment and SNP calling.

                So why do they have the map2sam tool in the cgatools package? According to cgatools-methods.pdf, you can convert your CG data to SAM and sort the results using samtools.

                If that isn't going to work, what do you suggest we do to annotate our CG variants with the latest dbSNP and 1000g versions?


                • #9
                  Originally posted by fuad193
                  So why do they have the map2sam tool in the cgatools package?

                  Why indeed.

                  For clarity, I work for CG, in the App Sci team. Also, if you are already a CG customer, you can get direct help by contacting customer support.

                  If you want to re-annotate the provided small variant calls, you could try snpEff: convert your masterVar file to VCF and then use that as input. The VCF conversion tool is on the CG community website. There are numerous other options for annotation, such as ANNOVAR and SeattleSeq.
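
                  As a rough illustration of what annotating against dbSNP amounts to once the calls are in VCF, here is a minimal Python sketch that fills in the ID column by exact match on chromosome, position, reference and alternate allele. It is not how snpEff, ANNOVAR or SeattleSeq work internally. The function names, the toy records and the "rsTOY" identifiers are all invented for this example, and real data would need allele normalization and proper VCF parsing.

```python
def variant_keys(line):
    """Split one tab-separated VCF data line; return (fields, match keys)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alts = fields[:5]
    # One key per alternate allele, since ALT may be a comma-separated list.
    keys = [(chrom, int(pos), ref, alt) for alt in alts.split(",")]
    return fields, keys


def load_known_ids(db_lines):
    """Map (chrom, pos, ref, alt) -> ID for every record of a dbSNP-style VCF."""
    known = {}
    for line in db_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields, keys = variant_keys(line)
        for key in keys:
            known[key] = fields[2]
    return known


def annotate(sample_lines, known):
    """Return the sample lines with the ID column filled in where a match exists."""
    out = []
    for line in sample_lines:
        if line.startswith("#") or not line.strip():
            out.append(line.rstrip("\n"))
            continue
        fields, keys = variant_keys(line)
        ids = sorted({known[k] for k in keys if k in known})
        if ids:
            fields[2] = ";".join(ids)  # VCF separates multiple IDs with ';'
        out.append("\t".join(fields))
    return out


# Toy records invented for this example (not real dbSNP entries).
toy_db = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t1000\trsTOY1\tA\tG",
    "chr2\t2000\trsTOY2\tC\tT",
]
toy_sample = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t1000\t.\tA\tG",
    "chr1\t1500\t.\tT\tC",
]

annotated = annotate(toy_sample, load_known_ids(toy_db))
print("\n".join(annotated))
```

                  Only the first toy variant matches, so it picks up the known ID, while the second keeps "." in the ID column.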
                  Bioinformatics Applications, Europe
                  Lifetech Inc. http://www.lifetech.com/
