Metagenomics uncovers a new group of low GC and ultra-small marine Actinobacteria
We describe a deep-branching lineage of marine Actinobacteria with very low GC content (33%) and the smallest free-living cells described to date (cell volume ca. 0.013 μm³), even smaller than the cosmopolitan marine photoheterotroph 'Candidatus Pelagibacter ubique'. These microbes are closely related to 16S rRNA sequences retrieved by PCR from the Pacific and Atlantic oceans 20 years ago. Metagenomic fosmids allowed a virtual genome reconstruction, which also indicated very small genomes of less than 1 Mb. A new kind of rhodopsin was detected, indicating a photoheterotrophic lifestyle. These microbes are estimated to account for about 4% of the total number of cells at the study site (the Mediterranean deep chlorophyll maximum), and similar numbers were estimated in all available metagenomes from tropical and temperate photic zones. Their geographic distribution mirrors that of picocyanobacteria, and there appears to be an association between these microbial groups. A new subclass, 'Candidatus Actinomarinidae', is proposed to designate these microbes.
DNA from ~6000 metagenomic fosmids (each ~40 kb) was extracted and pooled in 24 batches of approximately 250 fosmids each. The pooled fosmids were sequenced using Illumina PE 300 bp reads (HiSeq 2000, Macrogen, Korea) in a single lane (total output 42 Gb). This is expected to provide nearly 175× coverage per fosmid. Sequences were quality trimmed and vector sequences were clipped.
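The coverage figure can be checked with a short calculation. The sketch below assumes reads are spread evenly over all fosmids and uses only the approximate numbers quoted in the text; the function name `expected_coverage` is illustrative, not part of any tool used in the study.

```python
def expected_coverage(total_bases: float, n_fosmids: int, fosmid_len: float) -> float:
    """Mean sequencing depth if reads are spread evenly over all fosmids."""
    return total_bases / (n_fosmids * fosmid_len)


# Values taken from the text (all approximate).
total_output = 42e9   # 42 Gb from a single lane
n_fosmids = 6000      # ~6000 fosmids in total
fosmid_len = 40e3     # ~40 kb per fosmid

cov = expected_coverage(total_output, n_fosmids, fosmid_len)
print(f"expected coverage per fosmid: ~{cov:.0f}x")
```

In practice the realized depth per fosmid will vary with pooling imbalance and with the reads lost to quality trimming and vector clipping.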
Each batch was assembled separately using Velvet. Gene prediction on the assembled fosmids was performed using Prodigal in metagenomic mode, and tRNAs were predicted using tRNAscan-SE. Ribosomal RNA genes were identified using ssu-align and meta_rna. Functional annotation was performed by comparing the predicted protein sequences against the NCBI NR database (from ftp://ftp.ncbi.nih.gov/blast/db/). Domain predictions for the fosmids described in this work were curated manually using the NCBI CD-Search and HHpred servers. Local BLAST searches against the latest NCBI-NR database were performed whenever necessary. Tetranucleotide frequencies were computed using the wordfreq program in the EMBOSS package. Principal component analysis was performed using the FactoMineR package in R.
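To illustrate what a tetranucleotide frequency profile is, here is a minimal Python sketch. It is not the EMBOSS program used in the study. It assumes overlapping 4-base windows on a single strand and skips windows containing ambiguous bases; both choices are assumptions made for this example.

```python
from collections import Counter
from itertools import product


def tetranucleotide_freqs(seq: str) -> dict:
    """Return the relative frequency of each of the 256 tetranucleotides.

    Windows overlap (step of 1) and any window containing a character
    other than A, C, G or T is ignored.
    """
    seq = seq.upper()
    all_kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = Counter()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    return {k: (counts[k] / total if total else 0.0) for k in all_kmers}


# Toy sequence; a real input would be an assembled fosmid.
freqs = tetranucleotide_freqs("ACGTACGT")
print(freqs["ACGT"])
```

One such 256-value vector per fosmid gives the rows of the matrix that a principal component analysis would take as input.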
To examine the phylogenetic relatedness of the low GC actinobacterial sequences, 16S rRNA sequences were collected from four sources: all major actinobacterial lineages, defined using 178 type strains; all known uncultured freshwater actinobacterial lineages (72 sequences); the closest BLAST hits to the Mediterranean actinobacterial sequences in a reference database (available from) (27 sequences); and the Global Ocean Sampling (GOS) dataset (available from) (255 sequences). All sequences were filtered and trimmed using ssu-align, and only sequences longer than 800 bp were retained. Sequences were aligned using MUSCLE, and a maximum likelihood tree was constructed with FastTree2 using a GTR+CAT model and a gamma approximation. Bootstrapping (1000 bootstraps) was performed using the seqboot program in the PHYLIP package. Assembled site-specific GOS scaffolds were screened for the presence of 16S genes, and strict cutoffs of >98% identity and >800 bp length were used to select scaffolds belonging to the same lineage as the Mediterranean actinobacterial 16S sequences assembled from fosmids. In addition, an alignment was constructed using the 16S rRNA secondary-structure-aware ssu-align and the phylogenetic tree was reconstructed, yielding results similar to those described above.
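The scaffold selection step amounts to a two-threshold filter. The sketch below shows that filter on made-up hits; the `Hit` record, its field names, and the scaffold identifiers are hypothetical and do not correspond to any particular BLAST output format.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    scaffold_id: str   # hypothetical scaffold name
    identity: float    # percent identity of the 16S alignment
    length: int        # alignment length in bp


def select_same_lineage(hits: Iterable[Hit],
                        min_identity: float = 98.0,
                        min_length: int = 800) -> List[str]:
    """Keep scaffolds whose 16S hit exceeds both cutoffs (strictly greater)."""
    return [h.scaffold_id for h in hits
            if h.identity > min_identity and h.length > min_length]


# Made-up example hits, for illustration only.
example_hits = [
    Hit("scaffold_A", 99.1, 1200),
    Hit("scaffold_B", 97.5, 1400),   # fails the identity cutoff
    Hit("scaffold_C", 99.6, 650),    # fails the length cutoff
]
selected = select_same_lineage(example_hits)
print(selected)
```

Requiring both cutoffs together guards against short, highly conserved stretches of the 16S gene passing on identity alone.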
For the rhodopsin tree, sequences were selected based on existing literature, PFAM domain searches, and BLAST searches against NCBI-NR and metagenomic reads from the GOS dataset. Sequences were aligned using MUSCLE, and a maximum likelihood tree was constructed using RAxML with a JTT model, a gamma approximation and 100 rapid bootstrap inferences.
Because there were several overlaps among the 43 actinobacterial contigs, some genes were represented more than once. Before comparison with the acI genome, the 1452 proteins from the 43 actinobacterial contigs were clustered at 90% identity using USEARCH. Clustering yielded a smaller dataset of 1177 proteins representing a non-redundant protein set of the marine actinobacterium. Orthologs were identified using reciprocal best BLAST hit analysis, comparing this set with the 1244 proteins from the acI genome. Of the 1177 marine actinobacterial genes, 418 shared orthologs with the freshwater actinobacterium.
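A reciprocal best hit analysis can be sketched as follows. This is a generic illustration, not the authors' pipeline. It assumes each BLAST result has already been reduced to a (query, subject, bit score) tuple and that the best hit is the one with the highest bit score; the protein names are invented.

```python
from typing import Dict, Iterable, List, Tuple

HitTuple = Tuple[str, str, float]  # (query, subject, bit score)


def best_hits(hits: Iterable[HitTuple]) -> Dict[str, str]:
    """Map each query to the subject with the highest bit score."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {q: s for q, (s, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[HitTuple],
                         b_vs_a: Iterable[HitTuple]) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Invented example: marine proteins (m*) searched against acI proteins (f*) and back.
marine_vs_fresh = [("m1", "f1", 250.0), ("m1", "f2", 90.0), ("m2", "f2", 120.0)]
fresh_vs_marine = [("f1", "m1", 245.0), ("f2", "m1", 95.0), ("f2", "m2", 80.0)]
pairs = reciprocal_best_hits(marine_vs_fresh, fresh_vs_marine)
print(pairs)
```

Here `m2` is not paired because its best hit `f2` points back to `m1`, which is the kind of one-way match that the reciprocal criterion excludes.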
Two methods were used to estimate genome size. In the first, a previously described set of 35 orthologous marker genes was used. We identified 30 of these genes in the 43 contigs, which suggests that the genome is 85% complete. In the second method, the 4203 TIGRFAMs (from ftp://ftp.jcvi.) were searched in all known complete actinobacterial genomes (n = 232). A set of 71 TIGRFAMs was identified in all known actinobacteria, forming a core set of genes. This core set was tested against the nearly complete genome of the freshwater actinobacterium SCGC AAA027-L06, which had been estimated to be 97.5% complete using 138 complete actinobacterial genomes. We found 69 core TIGRFAMs in this genome, giving an estimate of 97.1%, consistent with the previous estimate. 'Candidatus Actinomarina' contained 48 core TIGRFAMs, indicating that 67.6% of the genome was recovered.
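Both completeness estimates are the fraction of expected marker genes that were found. The short sketch below reproduces the arithmetic from the counts given in the text; the function name `completeness` is illustrative.

```python
def completeness(found: int, expected: int) -> float:
    """Percentage of expected marker genes that were detected."""
    return 100.0 * found / expected


# Counts taken from the text.
marker_based = completeness(30, 35)     # 35 orthologous marker genes
aci_core = completeness(69, 71)         # core TIGRFAMs in SCGC AAA027-L06
actinomarina_core = completeness(48, 71)

for label, value in [("35-marker set", marker_based),
                     ("acI genome, core TIGRFAMs", aci_core),
                     ("Actinomarina, core TIGRFAMs", actinomarina_core)]:
    print(f"{label}: {value:.1f}%")
```

The computed values (about 85.7%, 97.2% and 67.6%) agree with the reported 85%, 97.1% and 67.6% to within rounding. The small differences suggest the first two reported figures were truncated rather than rounded, although that reading is an assumption.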
Recruitment was performed using TBLASTX; a hit was considered only if it was at least 50 amino acids (aa) long with an e-