Sequences with pairwise BLASTN matches of <1e-20. The frequencies of G+C in fourfold degenerate positions (GC3s) and Nc', a measure of the effective number of codons used in a gene that takes the background nucleotide composition into account [41], were calculated using INCA [62]. A correspondence analysis of the relative synonymous codon usage (RSCU) values was performed using the software CodonW [63] to examine the variation of codon usage among genes. Of these 682 groups, 245 were found to have BLASTP matches to putative proteins from S. salmonicida [14] with E values <1e-40 spanning 100 or more aligned amino acid positions. In twelve cases two different S. barkhanus orfs matched the same S. salmonicida orf; only the pair with the lowest E value was retained for further analyses. The amino acid and nucleotide sequences for the aligned homologous regions of the remaining 233 putative orthologous pairs of S. barkhanus and S. salmonicida sequences were extracted. GC3s and Nc' were calculated for the orthologs using INCA [62]. For each orthologous pair the amino acid sequences were aligned using ClustalW [64], and this alignment was used as a guide to align the nucleotide sequences using the software [65]. Synonymous (dS) and nonsynonymous (dN) substitution rates were calculated using the Yang and Nielsen method [66] as implemented in the PAML program package [67].

Allelic sequence variation

Phylogenetic analyses were performed on a putative selenophosphate synthetase (SelD) homolog. An automatic phylogenetic tree was generated using the Phylogenie package [58], as described previously [14]. Twenty-eight sequences from the obtained tree were selected to represent the diversity of SelD in the three domains of life. A total of 160 unambiguously aligned amino acid positions were identified by eye. Using Modelgenerator [59], BLOSUM62 + was identified as the optimal substitution model. Maximum likelihood analysis with 500 bootstrap replicates was performed with RAxML, version 7.0.4 [60].
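To make the RSCU values used in the codon usage analysis above concrete, the following is a minimal Python sketch, not the CodonW implementation. It uses the common definition of RSCU: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function name `rscu`, the example sequence, and the use of the standard genetic code are assumptions made for illustration; the organisms studied here may require a different translation table.

```python
from collections import Counter, defaultdict

# Standard genetic code (NCBI translation table 1).
# ASSUMPTION for illustration only: the species analysed may use another table.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def rscu(cds):
    """Return {codon: RSCU} for every synonymous codon family observed in cds.

    RSCU = observed codon count / (family total / number of synonymous codons).
    Hypothetical helper, not part of CodonW or INCA.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if len(codons) < 2 or total == 0:
            continue  # single-codon amino acids (Met, Trp) and unseen families
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy coding sequence: Ala (GCT, GCT, GCC, GCA) followed by Lys (AAA, AAG).
example = rscu("GCTGCTGCCGCAAAAAAG")
```

An RSCU of 1 indicates no bias for that codon within its family, values above 1 indicate over-representation, and values below 1 indicate under-representation. A correspondence analysis such as the one described above would take a genes-by-codons matrix of these values as input.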
Bayesian analyses with two independent runs of 500,000 generations were performed with MrBayes, version 3.1.2 [61], using the default settings, except for the optimal [...]

Roxström-Lindquist et al. BMC Genomics 2010, 11:258

[...] is software designed to identify single nucleotide polymorphisms (SNPs) within assemblies of sequences produced using Sanger technology, and it takes quality values into account [42]. This tool was used to quantify the sequence heterogeneity within assembled clusters. The genes encoding enolase, ribosomal protein S2, glutamate dehydrogenase, heat shock protein 70 and pyruvate kinase were selected for further analyses of the observed intragenomic sequence heterogeneity. Verification and discovery of polymorphisms in fragments of the genes encoding enolase, ribosomal protein S2, glutamate dehydrogenase, cytoplasmic heat shock protein 70 and pyruvate kinase in S. barkhanus and S. salmonicida were accomplished by cloning and sequencing individual PCR clones. Low-degeneracy primer pairs targeting 500 bp regions in both species were designed based on alignments of the genes (Additional file 9) and the recommendations in the Phusion HotStart polymerase instruction manual. PCR reactions were performed in 1× Phusion HF buffer with 1.5 mM MgCl2, 200 µM dNTPs, 0.5 µM of forward and reverse primers, 40 ng S. barkhanus or S. salmonicida genomic DNA and 0.8 U Phusion HS DNA polymerase (Finnzymes) in a total volume of 40 µl. The.