| We have recently reported a genome-scale catalog of human protein-coding genes that contain “exceptionally
long” STRs (≥6-repeats) in their core promoter, which may be of selective advantage in this species. At the top
of that list, SCGB2B2 (also known as SCGBL) contains one of the longest CA-repeat STRs identified in a human
gene core promoter, at 25-repeats. In the study reported here, we analyzed the conservation status of this
CA-STR across evolution. The effect of this STR on gene expression activity was also analyzed
in the HEK-293 cell line. We report that the SCGB2B2 core promoter CA-repeat reaches exceptional lengths,
ranging from 9- to 25-repeats, across Apes (Hominoids) and the Old World monkeys (CA >2-repeats were not
detected in any other species). The longest CA-repeats and highest identity in the SCGB2B2 protein sequence
were observed between human and bonobo. A trend for increased gene expression activity was observed from
the shorter to the longer CA-repeats (p < 0.009), and the CA-repeat per se increased gene expression activity
(p < 0.02). We propose that the SCGB2B2 gene core promoter CA-repeat functions as an expression code for
the evolution of Apes and the Old World monkeys. |