RRC ID 77065
Author Hata T, Satoh S, Takada N, Matsuo M, Obokata J.
Title Kozak Sequence Acts as a Negative Regulator for De Novo Transcription Initiation of Newborn Coding Sequences in the Plant Genome.
Journal Mol Biol Evol
Abstract The manner in which newborn coding sequences and their transcriptional competency emerge during the process of gene evolution remains unclear. Here, we experimentally simulated eukaryotic gene origination processes by mimicking horizontal gene transfer events in the plant genome. We mapped the precise position of the transcription start sites (TSSs) of hundreds of newly introduced promoterless firefly luciferase (LUC) coding sequences in the genome of Arabidopsis thaliana cultured cells. The systematic characterization of the LUC-TSSs revealed that 80% of them occurred under the influence of endogenous promoters, while the remainder underwent de novo activation in the intergenic regions, starting from pyrimidine-purine dinucleotides. These de novo TSSs obeyed unexpected rules; they predominantly occurred ∼100 bp upstream of the LUC inserts and did not overlap with Kozak-containing putative open reading frames (ORFs). These features were the output of the immediate responses to the sequence insertions, rather than a bias in the screening of the LUC gene function. Regarding the wild-type genic TSSs, they appeared to have evolved to lack any ORFs in their vicinities. Therefore, the repulsion by the de novo TSSs of Kozak-containing ORFs described above might be the first selection gate for the occurrence and evolution of TSSs in the plant genome. Based on these results, we characterized the de novo type of TSS identified in the plant genome and discuss its significance in genome evolution.
Volume 38(7)
Pages 2791-2803
Published 2021-06-25
DOI 10.1093/molbev/msab069
PII 6168432
PMID 33705557
PMC PMC8233501
MeSH Arabidopsis; Epigenesis, Genetic; Gene Expression Regulation, Plant*; Gene Transfer, Horizontal*; Genome, Plant*; Models, Genetic*; Open Reading Frames; TATA Box; Transcription Initiation Site*
IF 11.062
Arabidopsis / Cultured plant cells, genes rpc00008