  • 454 fasta qual description format NGS QC toolkit Quality Trimming

    Hello, I am using NGS QC Toolkit for quality trimming, specifically the 454QC_PE.pl tool for 454 paired-end reads. It works perfectly with the example input but not with my data. When I compared the FASTA files, the header lines look like this:
    my data:
    >G7I7FDK01EB4WS rank=0000056 x=1661.0 y=1354.0 length=73
    TCGTGTACTCGTATATGTATGCTATACGAGTATGCACCTCGTATACTCGTATA

    the example input data:
    >E6PIHNP01B74B0 length=157 xy=0795_1886 region=1 run=R_2008_03_05_15_54_49_
    TCAGAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCT

    So I guess these differences in the description format are what is causing the problem. Does anyone know how I can convert my data into the same format, or how I can extract the reads directly from the SFF file with the description in that format?
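    One possible way to do the conversion is sketched below in Python (not the toolkit's Perl). It rewrites a header of the form `>id rank=... x=... y=... length=N` as `>id length=N xy=XXXX_YYYY`, assuming that `xy` is simply the integer x and y values zero-padded to four digits, as the example header suggests. The function names `convert_header` and `convert_fasta` are made up for this sketch. The `region` and `run` values cannot be read from the original description, so they are optional arguments you would have to supply yourself. Whether 454QC_PE.pl actually requires them, or whether the header format is the real cause of the failure, is not confirmed here.

```python
def convert_header(header, region=None, run=None):
    """Rewrite '>id rank=... x=... y=... length=N' as '>id length=N xy=XXXX_YYYY'.

    region and run are optional, caller-supplied values (assumptions);
    they are not recoverable from the original description line.
    """
    read_id, _, description = header.lstrip(">").strip().partition(" ")
    # Parse the key=value pairs of the description into a dictionary.
    fields = dict(item.split("=", 1) for item in description.split() if "=" in item)
    if not {"x", "y", "length"} <= fields.keys():
        # Unknown header layout: return it unchanged rather than guess.
        return header.rstrip("\n")
    x = int(float(fields["x"]))
    y = int(float(fields["y"]))
    parts = [f">{read_id}", f"length={fields['length']}", f"xy={x:04d}_{y:04d}"]
    if region is not None:
        parts.append(f"region={region}")
    if run is not None:
        parts.append(f"run={run}")
    return " ".join(parts)


def convert_fasta(lines, region=None, run=None):
    """Yield lines with headers converted; sequence or quality lines pass through."""
    for line in lines:
        if line.startswith(">"):
            yield convert_header(line, region=region, run=run)
        else:
            yield line.rstrip("\n")


if __name__ == "__main__":
    print(convert_header(">G7I7FDK01EB4WS rank=0000056 x=1661.0 y=1354.0 length=73"))
```

    Because only the `>` lines are touched, the same `convert_fasta` function can be run over the matching .qual file so that both files keep identical headers.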

  • #2
    I just noticed something else. I checked the script and it says
    my $linker = "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC";

    That is the linker for FLX, but I used Titanium. Does anyone know how I can modify the script to include the two Titanium linkers?

    >titlinker1
    TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG
    >titlinker2
    CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA

    I am not a programmer, and I need the script to accept one linker or the other, something like:

    my $linker = "TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG" or "CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA"
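    A note on the line above: in Perl, `"A" or "B"` between two non-empty strings just evaluates to the first string, so `$linker` would only ever hold the first linker. The usual way to express "either sequence" is a regular-expression alternation. The sketch below shows the idea in Python rather than as a patch to 454QC_PE.pl, because how the script uses `$linker` internally is not shown in this thread. The name `split_on_linker` is made up for illustration, and the matching is exact, whereas real linker detection normally has to tolerate sequencing errors.

```python
import re

# The two Titanium linker sequences quoted in the post above.
TITANIUM_LINKERS = (
    "TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG",
    "CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA",
)

# Alternation: the pattern matches if either linker occurs in the read.
LINKER_RE = re.compile("|".join(TITANIUM_LINKERS))


def split_on_linker(seq):
    """Return (left, right) around the first exact linker match, or None."""
    match = LINKER_RE.search(seq)
    if match is None:
        return None
    return seq[:match.start()], seq[match.end():]


if __name__ == "__main__":
    read = "ACGTACGT" + TITANIUM_LINKERS[1] + "TTGGCCAA"
    print(split_on_linker(read))
```

    If the script compares reads against `$linker` with a plain string search rather than a regular expression, an alternation like this would not drop in directly and the search itself would need changing.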
