Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • MAQGene -> SAMTools -> IGV

    Hi everybody !!!!

    We want to use this 3 programs (MAQGene -> SAMTools -> IGV) to display a lot of fragments of a mutant sequence of C. elegans aligned with a reference sequence.
    First we run MAQGene to have all the difference between our fragments and the reference. We have a txt file with all the informations like this :

    Code:
    variant_id	mutant_strain	dna	start	reference_base	sample_base	consensus_score	loci_multiplicity	mapping_quality	neighbor_quality	number_wildtype_reads	number_variant_reads	sequencing_depth	sample_reads	variant_type	indel_size	classes	descriptions	parent_features
    14	fr6_34	I	200948	A	T	186	0.81	63	62	0	8	8	@TttTttTt	point	0	SNP	none	{haw293}
    15	fr6_34	I	200949	C	T	191	0.81	63	62	0	8	8	@TttTttTt	point	0	SNP	none	{haw294}
    Then we get the map file and convert it into a sam file to be able to use SAMTools to generate an alignment.

    First I index the reference fasta file
    Code:
    [Elegans@localhost MaqToSam]$ ../samtools faidx c_elegans.WS200.dna.fa
    [Elegans@localhost MaqToSam]$ head c_elegans.WS200.dna.fa.fai
    I       15072421        3       50      51
    ---> Now I have a doubt, the fasta file have the reference sequence of 5 chromosom each with a name (I,II,III,...) It seems to take only I ? no ?

    So I try to convert the sam into a bam
    Code:
    [Elegans@localhost MaqToSam]$ ../samtools view -bS -T c_elegans.WS200.dna.fa fr6_34.sam -o fr6_34.bam
    [samopen] no @SQ lines in the header.                                                             
    [main_samview] random alignment retrieval only works for indexed BAM files.
    But more ofently I have this error message
    Code:
    [Elegans@localhost MaqToSam]$ ../samtools view -S fr6_34.sam
    [main_samview] fail to open file for reading.
    [Elegans@localhost MaqToSam]$ ll
    total 7677116
    -rw-rw-r-- 1 Elegans root      15373873 2009-12-01 11:28 c_elegans.WS200.dna.fa
    -rw-rw-r-- 1 Elegans Elegans         19 2009-12-09 15:03 c_elegans.WS200.dna.fa.fai
    -rwxrwxrwx 1 Elegans root    1705085078 2009-12-02 13:54 fr6_34.map
    -rwxrwxrwx 1 Elegans Elegans 6133132406 2009-12-09 13:27 fr6_34.sam
    Certainly rights problems.

    What is going wrong with the sam file ? Is it in a good format ? Is it the conversion map->sam that is not good ?

    Code:
    [Elegans@localhost MaqToSam]$ head fr6_34.sam
    HWI-EAS337:7:59:98:1562#0       65      I       1       0       76M     *       0       0       GCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCT :ACA6BB@CA9?7752A?<A<@>*4<B?8@==@A??=/758=8=?667B;<=A@815(:3&<;58A3%%%%%%%%%    MF:i:32 AM:i:0  SM:i:0  NM:i:0       UQ:i:0  H0:i:85 H1:i:85
    HWI-EAS337:7:1:605:1394#0       115     I       1       0       76M     *       0       -67     GCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCT 8??A?,,,B;8=C;7;6:@<>@<@CA>@A>@B@C>B@BBCAACBBA>C65:7@B?B@C?@AA>BBBB46@<AA69A    MF:i:20 AM:i:0  SM:i:0  NM:i:0       UQ:i:0  H0:i:85 H1:i:85
    HWI-EAS337:7:19:1100:983#0      115     I       1       0       76M     *       0       -75     GCCTAAGCCTATGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCT =98>:<<?<A<!8=AA::A@=?AB?AACBAABCB?BBB?B?BBBBBBC@BCBB@BBBCABCBBBCCCBBCBBCBBB    MF:i:20 AM:i:0  SM:i:0  NM:i:1       UQ:i:0  H0:i:85 H1:i:85
    HWI-EAS337:7:45:645:884#0       115     I       1       0       76M     *       0       -75     GCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCT :.:;33-=6?=5=<??;9ABB>3=BBA:4:@B=A;9@AAC?BBBBBBBCBBBBBAABBBBBBBBCBB>@B@BBABB    MF:i:20 AM:i:0  SM:i:0  NM:i:0       UQ:i:0  H0:i:85 H1:i:85
    HWI-EAS337:7:74:1510:749#0      179     I       1       0       76M     *       0       -75     GCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCT %%%%%%66=;6/;6;4-/7=:64<;=A@AA?AAABBB@BAAB@A@BA@@B??BBBBBBBB=@BBBBBBBBBBBBBB    MF:i:20 AM:i:0  SM:i:0  NM:i:0       UQ:i:0  H0:i:85 H1:i:85
    HWI-EAS337:7:74:1510:749#0      67      I       2       0       76M     *       0       75      CCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGNCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTA BCCCCBACCCCCBCCCCCACCCCCBBCCCB@=CCBBCBCCC?!<CCCA@=CCB@@?B@B=?=ACCA>>BB?=47>B    MF:i:20 AM:i:0  SM:i:0  NM:i:1       UQ:i:0  H0:i:85 H1:i:60
    HWI-EAS337:7:19:1100:983#0      131     I       2       0       76M     *       0       75      CCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTNAGCCTA BABBB@BB@BBABB@BB??A@BB@@@=BB?=A@@@;?<6A?<<@;>:7>732>%%%%%%%%%%%%%%%%!%%%%%%    MF:i:20 AM:i:0  SM:i:0  NM:i:1       UQ:i:0  H0:i:85 H1:i:85
    HWI-EAS337:7:43:1192:1944#0     163     I       2       0       76M     *       0       79      CCTAAGCCTAAGCCTAAGCCTAAGNCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCCAAGCCCAAGCCTCAGTCCA BBCBBBABA>9=BB@?9??@7>@3!;:2/6%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%    MF:i:18 AM:i:0  SM:i:0  NM:i:6       UQ:i:20 H0:i:0  H1:i:85
    HWI-EAS337:7:64:839:1206#0      99      I       2       0       76M     *       0       81      CCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTA BBBBC:@BBBCBBBABB@BAABB;BB@BBABBABCB@A?>?>@B<@B@A;>9A;?=<AB7?>39>:5;5=94:A/;    MF:i:18 AM:i:0  SM:i:0  NM:i:0       UQ:i:0  H0:i:85 H1:i:62
    HWI-EAS337:7:45:645:884#0       131     I       2       0       76M     *       0       75      CCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCTAAGCCCAAGCCTAAGCCTA BABB=<@<@B?=BA?B=?@A?=A;@A@AB<8@;<5;??89959;27=69%%%%%%%%%%%%%%%%%%%%%%%%%%%    MF:i:20 AM:i:0  SM:i:0  NM:i:1       UQ:i:4  H0:i:85 H1:i:85
    If someone could help me it will be very good because I'm lost
    If you want more informations tell me !

    Bye

  • #2
    Download SAM tools for free. SAM (Sequence Alignment/Map) is a flexible generic format for storing nucleotide sequence alignment. SAMtools provide efficient utilities on manipulating alignments in the SAM format.


    For most unix commands, you MUST put all switches/options before the main arguments. Instead of doing this:

    Code:
    samtools view -bT ref.fa aln.sam -o aln.bam
    do this:

    Code:
    samtools view -bT ref.fa -o aln.bam aln.sam

    Comment


    • #3
      Hi Heng Li,
      OK thanks I don't know why I did that ! But I have the same error messages ....


      Code:
      [Elegans@localhost MaqToSam]$ ../samtools view -bST c_elegans.WS200.dna.fa -o fr6_34.bam ./fr6_34.sam
      [main_samview] fail to open file for reading.
      What kind of error send this type of message ?
      Rights problems ? I made a chmod 777 on the file and try to launch the command with root. --> no changes
      Not a good sam format ? So the maq2sam-long have a problem. Update MAQgene (I didn't make the last update) ? Rights problem when running maq2sam (map file are owned by apache but have 777 rights mode) ?
      Code:
      lrwxrwxrwx 1 apache apache      49 2009-09-02 19:12 fr6_34.map -> /var/www/html/maqgene/maqgene/work/1506135156.map
      Last edited by Fred13; 12-11-2009, 02:11 AM.

      Comment


      • #4
        Hi !!

        So I found the script lines that return this error message :

        Code:
        // open file handlers
              if ((in = samopen(argv[optind], in_mode, fn_list)) == 0) {
                    fprintf(stderr, "[main_samview] fail to open file for reading.\n");
                    goto view_end;
              }
        What are the possibility to make the function samopenn == 0 ??

        I found the samopen function !

        Code:
        samfile_t *samopen(const char *fn, const char *mode, const void *aux)
        {
              samfile_t *fp;
              fp = (samfile_t*)calloc(1, sizeof(samfile_t));
              if (mode[0] == 'r') { // read
                    fp->type |= TYPE_READ;
                    if (mode[1] == 'b') { // binary
                          fp->type |= TYPE_BAM;
                          fp->x.bam = strcmp(fn, "-")? bam_open(fn, "r") : bam_dopen(fileno(stdin), "r");
                          if (fp->x.bam == 0) goto open_err_ret;
                          fp->header = bam_header_read(fp->x.bam);
        There is no restriction with size file ???
        It's very strange I think I'm looking in the wrong way because
        Code:
        [Elegans@localhost MaqToSam]$ ll                                  
        total 15339124                                                    
        -rw-rw-r-- 1 Elegans root      15373873 2009-12-01 11:28 c_elegans.WS200.dna.fa
        -rw-rw-r-- 1 Elegans Elegans         19 2009-12-09 15:03 c_elegans.WS200.dna.fa.fai
        -rwxrwxrwx 1 Elegans root    1705085078 2009-12-02 13:54 fr6_34.map
        [COLOR="Red"]-rwxrwxrwx 1 Elegans Elegans 6133132406 2009-12-09 13:27 fr6_34.sam[/COLOR]
        [Elegans@localhost MaqToSam]$ ../samtools view -S fr6_34.sam
        [COLOR="red"][main_samview] fail to open file for reading.[/COLOR]
        Without any options (just sam input) I have the error. What could be problems with maq2sam-long ?

        help ...
        Last edited by Fred13; 12-11-2009, 02:38 AM.

        Comment


        • #5
          Error found- too late but might help other users

          Too late but might help other users

          It is "-t" option and not "-T" in the samtools view command.
          And also, you have used ref.fa instead of ref.fa.fai in the -t option of your samtools view command.


          Code:
          [Elegans@localhost MaqToSam]$ [B]../samtools view -bS [U][I][B]-T[/B][/I][/U] c_elegans.WS200.dna.[I][U]fa[/U][/I] fr6_34.sam -o fr6_34.bam[/B]
          [samopen] no @SQ lines in the header.                                                             
          [main_samview] random alignment retrieval only works for indexed BAM files.
          Last edited by naveennav; 01-17-2013, 12:05 PM.

          Comment


          • #6
            Thanks !
            To late for me but might help

            Comment

            Latest Articles

            Collapse

            • seqadmin
              Current Approaches to Protein Sequencing
              by seqadmin


              Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
              04-04-2024, 04:25 PM
            • seqadmin
              Strategies for Sequencing Challenging Samples
              by seqadmin


              Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
              03-22-2024, 06:39 AM

            ad_right_rmr

            Collapse

            News

            Collapse

            Topics Statistics Last Post
            Started by seqadmin, 04-11-2024, 12:08 PM
            0 responses
            24 views
            0 likes
            Last Post seqadmin  
            Started by seqadmin, 04-10-2024, 10:19 PM
            0 responses
            25 views
            0 likes
            Last Post seqadmin  
            Started by seqadmin, 04-10-2024, 09:21 AM
            0 responses
            21 views
            0 likes
            Last Post seqadmin  
            Started by seqadmin, 04-04-2024, 09:00 AM
            0 responses
            52 views
            0 likes
            Last Post seqadmin  
            Working...
            X