Saturday, 14 December 2013

T7 promoter - What determines successful protein expression in E. coli?

I found a very nice paper, "Designing Genes for Successful Protein Expression", which covers most of the factors that determine protein expression. I am posting excerpts from it here because I am sure they will be useful to some of you.




Translation can be controlled at the level of initiation and elongation. Initiation of translation is primarily dependent on the sequence of the ribosome binding site (RBS) and early mRNA secondary structure. Other determinants of protein expression are less well understood but equally potent.



1. Initiation of translation



A key component affecting initiation of translation in prokaryotes is the RBS that occurs between 5 and 15 bases upstream of the open reading frame (ORF) AUG start codon. Binding of the ribosome to the Shine–Dalgarno (SD) sequence within the RBS localizes the ribosome to the initiation codon... Affinity of the RBS for the ribosome is a critical factor controlling the efficiency with which new polypeptide chains are initiated. This interaction is in competition with possible base-pairing interactions involving the RBS region that may form within the mRNA itself. Thus, SD sequences with weaker base pairing to the ribosome are more susceptible to interference from mRNA structure. However, some experiments suggest that SD sequences with too strong affinity can be deleterious, particularly at lower temperatures, by stalling initial elongation. Also critical is the distance between the RBS and the start codon, with 5-7 bases from the consensus SD AGGAGG being optimal.
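To make the spacing rule concrete, here is a minimal Python sketch that looks for the best AGGAGG-like site upstream of a start codon and reports how many bases separate it from the start codon. The function name `find_sd_spacing`, the 20-base search window, the minimum of 4 matching bases, and the definition of the spacer (bases between the end of the site and the first base of the start codon) are my own assumptions for illustration, not something taken from the paper.

```python
def find_sd_spacing(seq, start_index, sd="AGGAGG", window=20, min_match=4):
    """Return (matching_bases, spacer_length, site) for the best SD-like
    site upstream of the start codon, or None if nothing scores well enough.

    seq         -- DNA or RNA sequence containing the upstream region and ORF
    start_index -- 0-based index of the first base of the start codon
    window      -- how far upstream to search (assumed value, not from the paper)
    min_match   -- minimum identical bases to count as SD-like (assumed value)
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    best = None
    lowest = max(0, start_index - window)
    for pos in range(lowest, start_index - len(sd) + 1):
        site = seq[pos:pos + len(sd)]
        score = sum(a == b for a, b in zip(site, sd))
        # Spacer: bases between the end of the site and the start codon.
        spacer = start_index - (pos + len(sd))
        if score >= min_match and (best is None or score > best[0]):
            best = (score, spacer, site)
    return best


# Made-up example sequence: 6 filler bases, a consensus SD, a 6-base spacer, then the ORF.
example = "GGATCC" + "AGGAGG" + "TATACA" + "ATGGCTAGC"
print(find_sd_spacing(example, start_index=18))  # (6, 6, 'AGGAGG')
```

A spacer of 5 to 7 would fall in the range the excerpt calls optimal. This only counts identical bases against the consensus; it does not estimate the actual binding affinity to the ribosome or any competing mRNA structure.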



Numerous lines of evidence suggest that the initial 15–25 codons of the ORF deserve special consideration in gene optimization. Studies have shown that the impact of rare codons on translation rate is particularly strong in these first codons, for expression in both Escherichia coli and Saccharomyces cerevisiae. In E. coli, peptidyl-tRNA drop-off during translation of the initial codons appears to be accentuated by the presence of rare NGG codons. These effects appear to be independent of local mRNA secondary structure. It is also true that expression may be recovered by 5' sequence replacement even for sequences that do not show especially strong mRNA structure or contain rare codons or other obvious deleterious elements in this region.
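A quick way to check a gene for this is to list the rare codons that fall within the first codons of the ORF. The sketch below is only an illustration: the function name `rare_codons_in_head` is hypothetical, and the small set of rare codons in the example is a placeholder that should be replaced with a codon usage table for your actual host.

```python
def rare_codons_in_head(orf, rare, head_codons=25):
    """Return (codon_number, codon) pairs for rare codons found in the
    first `head_codons` codons of the ORF (codon 1 is the start codon)."""
    seq = orf.upper().replace("U", "T")  # accept RNA or DNA letters
    usable = len(seq) - len(seq) % 3     # ignore a trailing partial codon
    hits = []
    for i in range(0, min(usable, head_codons * 3), 3):
        codon = seq[i:i + 3]
        if codon in rare:
            hits.append((i // 3 + 1, codon))
    return hits


# Placeholder set for illustration only; use a real usage table for your host.
example_rare = {"AGG", "AGA", "CTA", "ATA"}
print(rare_codons_in_head("ATGAGGCTAGCTAAA", example_rare))  # [(2, 'AGG'), (3, 'CTA')]
```

Setting `head_codons` to 15 or 25 matches the range mentioned in the excerpt. As the excerpt notes, a clean result here does not guarantee good expression, since 5' sequence replacement can help even when no rare codons are present.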



2. Codon bias



The second way in which host codon frequencies can be used is to match the host codon frequencies in the designed gene. This can be done simply by choosing each codon with a probability that matches the host codon frequency... Using sets of genes broadly varied in gene design features, Welch et al. found that variation in synonymous codon usage frequencies had a profound effect on the amount of protein produced in E. coli, independent of local 5' sequence effects. Variation of at least two orders of magnitude in expression was seen due to substitution beyond the initial 15 codons of the ORF. This variation was strongly correlated with the global codon usage frequencies of the genes, although the codon frequencies found in the highest expressed variants did not correspond to those found in the genome or in highly expressed endogenous genes of E. coli. Multivariate analysis showed that the frequencies of specific codons for about six amino acids could predict the observed differences in expression. It is not clear what the biochemical basis is for this correlation.
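The idea of choosing each codon with a probability that matches a frequency table can be sketched in a few lines of Python. Everything named here is an assumption for illustration: `sample_gene` is a hypothetical helper, and the numbers in `CODON_FREQS` are made up and cover only a few amino acids, so they are not real E. coli frequencies.

```python
import random

# Made-up relative frequencies for illustration only (NOT real E. coli values).
CODON_FREQS = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.75, "AAG": 0.25},
    "L": {"CTG": 0.5, "TTA": 0.15, "TTG": 0.15, "CTT": 0.1, "CTC": 0.1},
    "*": {"TAA": 0.6, "TGA": 0.3, "TAG": 0.1},
}


def sample_gene(protein, freqs, rng=None):
    """Back-translate a protein by drawing each codon at random,
    weighted by the frequency table entry for that amino acid."""
    rng = rng or random.Random()
    codons = []
    for aa in protein:
        table = freqs[aa]  # raises KeyError for amino acids missing from the table
        codon = rng.choices(list(table), weights=list(table.values()), k=1)[0]
        codons.append(codon)
    return "".join(codons)


# A seeded generator makes the draw repeatable.
print(sample_gene("MKLLK*", CODON_FREQS, random.Random(0)))
```

The frequency table is passed in as a parameter on purpose. According to the excerpt, the best-expressing variants did not follow the genome-wide frequencies or those of highly expressed endogenous genes, so it may be worth trying more than one table.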



3. mRNA structure and translational elongation



While much evidence suggests that mRNA structure can interfere with translational initiation in both prokaryotes and eukaryotes, the effects of structure on elongation are less well understood. This in part may be due to intrinsic helicase activity of ribosomes, which allows translation through even very strong hairpins and may preclude many structures from limiting the translation rate in either prokaryotes or eukaryotes. Perhaps more importantly, mRNA structure is difficult to predict, particularly for actively translated messages which are in continuous flux between various folded and unfolded states.



4. Protein-specific factors providing additional complexity



The protein may be particularly unstable in the host, especially if it is poorly folded due to inherent instability, lack of sufficient prosthetic factors, or improper post-translational modification... Expression of secreted and membrane proteins may be limited by mechanisms for directing these proteins to the membrane. It is even possible that the protein amino acid sequence may limit translational efficiency. For example, proline is thought to be slowly translated in E. coli, regardless of which codon is used.



Expression of the protein may be toxic to the cell, leading to instability of the expression vector or host suppression of protein synthesis... A common strategy to reduce toxicity is to lower expression to tolerable levels. Promoters varied in strength can be valuable tools for finding an optimal expression rate for maximal yield... One potential way to avoid toxicity of some proteins is to direct expression to the periplasm or media. This may be accomplished by N-terminal fusion of a secretion signal sequence.




For more information, please read the whole paper. I also recommend reading "Design parameters to control synthetic gene expression in Escherichia coli".
