  • BFAST + GATK -> strand bias?

    Hi All,

    I have HiSeq exome data, 75 bp paired-end. I've used bfast to align, which, if I am not mistaken, converts one end to its reverse complement so that both members of a pair are attributed to the same strand, the positive one, in the resulting BAM file.

    If this is true, it leads to a problem in GATK when filtering calls: one cannot use the strand bias tests (they are all highly significant, since all reads are on the same strand). It would also create a problem in the read position rank tests, as the 'best' end of every other read is annotated as its 'good' end.

    So the question is: Am I missing something, or are the above conclusions correct?
    Also, is there a way to circumvent this?

    Thanks a bunch,
    Boel

  • #2
    With the newest version of BFAST, you can have the input reads be on opposite strands to properly represent paired-end reads. BFAST will not reverse complement one end, but the conversion to FASTQ may do so (a new "-k" option avoids this).

    What is your input data, FASTQ files?

    You may get more traction at [email protected].

    • #3
      Originally posted by nilshomer
      With the newest version of BFAST, you can have the input reads be on opposite strands to properly represent paired-end reads. BFAST will not reverse complement one end, but the conversion to FASTQ may do so (a new "-k" option avoids this).

      What is your input data, FASTQ files?

      You may get more traction at [email protected].
      Thanks.

      I started with qseq files, converted them with ill2fastq.pl, and then aligned with bfast (bfast+bwa-0.6.5). I did not use the -k option.

      I'm not sure that this creates a bias; that's what I am trying to figure out. Do you know?
      According to the SAM format, all reads are on the same strand as the reference, which might make this a non-issue. But I must say that I am confused right now.
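
For what it's worth, the SAM format stores SEQ on the reference forward strand but records the read's own orientation in FLAG bit 0x10, so the strand information is only lost if that bit is never set. A minimal sketch for tallying strands from SAM text lines follows; the function name `strand_counts` and the example records are made up for illustration, and in practice the lines would come from something like `samtools view`.

```python
def strand_counts(sam_lines):
    """Count mapped reads per strand using FLAG bit 0x10 (read reverse strand)."""
    counts = {"forward": 0, "reverse": 0}
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:  # 0x4 = read unmapped, strand is meaningless
            continue
        key = "reverse" if flag & 0x10 else "forward"
        counts[key] += 1
    return counts


# Hypothetical records: only the FLAG column matters for this tally.
example = [
    "@HD\tVN:1.0",
    "r1\t99\tchr1\t100\t60\t75M\t=\t150\t125\t*\t*",
    "r1\t147\tchr1\t150\t60\t75M\t=\t100\t-125\t*\t*",
    "r2\t67\tchr1\t300\t60\t75M\t=\t360\t135\t*\t*",
]
print(strand_counts(example))
```

If the reverse count comes out at or near zero for a paired-end library, that would be consistent with one end having been reverse complemented before alignment.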

      • #4
        The ill2fastq.pl script will reverse complement one of the ends, so they are mapped onto the same strand. You can try with the newest release and the "-k" option, as well as supplying the proper pairing information (there is a new "postprocess" pairing option).
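
To make "reverse complement one of the ends" concrete: the sequence is complemented and reversed, and the quality string is reversed along with it. The sketch below only illustrates that operation; it is not the code from ill2fastq.pl, and the function name `reverse_complement_record` is hypothetical.

```python
# Translation table for the complement of each base (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_record(seq, qual):
    """Return (sequence, qualities) of a read as seen from the opposite strand."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    # Complement each base, then reverse; qualities are reversed to stay aligned.
    return seq.translate(_COMPLEMENT)[::-1], qual[::-1]


rc_seq, rc_qual = reverse_complement_record("AACGTN", "IIHG#!")
print(rc_seq, rc_qual)
```

Note that the low-quality tail of the original read ends up at the start of the converted record, which is why position-based tests can be affected.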

        • #5
          Originally posted by nilshomer
          The ill2fastq.pl script will reverse complement one of the ends, so they are mapped onto the same strand. You can try with the newest release and the "-k" option, as well as supplying the proper pairing information (there is a new "postprocess" pairing option).
          Am I correct in assuming that using bfast the way I have does create a bias?

          Further, when looking at my reads in IGV, most (70%) have an insert size that is positive, while my correct mean insert size should be around -12 (overlapping). Does this suggest that something has gone wrong in the mapping?
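
A rough way to put numbers on this, assuming the insert size shown in IGV corresponds to the SAM TLEN field and that "-12" refers to the gap between the two reads rather than TLEN itself: in SAM, TLEN is positive for the leftmost read of a pair and negative for the rightmost, so a 50/50 sign split is expected per read, and the gap can be approximated as |TLEN| minus twice the read length. The function name `tlen_summary` and the example values are made up, and the approximation ignores clipping and indels.

```python
def tlen_summary(tlens, read_length=75):
    """Summarise TLEN values: share of positive signs and mean gap between mates."""
    # TLEN is 0 when the template length is unavailable, so drop those.
    nonzero = [t for t in tlens if t != 0]
    if not nonzero:
        raise ValueError("no reads with a non-zero TLEN")
    positive = sum(1 for t in nonzero if t > 0)
    # Approximate gap between the mates: negative values mean they overlap.
    gaps = [abs(t) - 2 * read_length for t in nonzero]
    return {
        "fraction_positive": positive / len(nonzero),
        "mean_inner_gap": sum(gaps) / len(gaps),
    }


# Hypothetical TLEN values for two overlapping 75 bp pairs.
summary = tlen_summary([138, -138, 140, -140])
print(summary)
```

A fraction of positive values far from one half would hint that pairing information is being recorded unevenly, though this alone does not show where the problem lies.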

          • #6
            Originally posted by Boel
            Am I correct in assuming that using bfast the way I have does create a bias?

            Further, when looking at my reads in IGV, most (70%) have an insert size that is positive, while my correct mean insert size should be around -12 (overlapping). Does this suggest that something has gone wrong in the mapping?
            I'm not sure, as I don't have enough information. I have mapped pairs that should have a mean insert of zero without problems. Please try the newest version of BFAST and post your results, along with enough information for us to debug.
