protein engineering


  • However, its major drawback is that detailed structural knowledge of a protein is often unavailable, and, even when available, it can be very difficult to predict the effects
    of various mutations since structural information most often provides a static picture of a protein structure.

  • While the sequence-conformation space that needs to be searched is large, the most challenging requirement for computational protein design is a fast, yet accurate, energy
    function that can distinguish optimal sequences from similar suboptimal ones.

  • [4] Computational protein design algorithms seek to identify novel amino acid sequences that are low in energy when folded to the pre-specified target structure.

  • Multiple sequence alignment: Without structural information about a protein, sequence analysis is often useful in elucidating information about the protein.

  • Approaches. Rational design: In rational protein design, a scientist uses detailed knowledge of the structure and function of a protein to
    make desired changes.

  • Multiple sequence alignment uses databases to cross-reference target protein sequences with known sequences.

  • [5] Fragment based: These methods use database information regarding structures to match homologous structures to the created protein sequences.

  • This model can be created using either molecular mechanics potentials or protein structure derived potential functions.

  • The first step in homology modeling is generally the identification of template sequences of known structure which are homologous to the query sequence.

  • Methods for protein structure prediction fall under one of the four following classes: ab initio, fragment based methods, homology modeling, and protein threading.

  • Next, sequences are clustered using the mBed and k-means methods.

  • This method has been shown to be 5–10% more accurate than Clustal W.[5] Coevolutionary analysis: Coevolutionary analysis is also known as correlated mutation,
    covariation, or co-substitution.
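
One simple way to look for correlated mutations is to measure the mutual information between two columns of a multiple sequence alignment. The sketch below is only an illustration under that assumption: the function name and the toy alignment are hypothetical, and the source does not say which statistic is used in practice.

```python
import math
from collections import Counter

def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    counts_i = Counter(seq[i] for seq in alignment)
    counts_j = Counter(seq[j] for seq in alignment)
    counts_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi

# Hypothetical toy alignment: columns 1 and 3 always change together,
# while column 0 never changes.
toy_alignment = ["AKLE", "AKLE", "ARLD", "ARLD"]
```

In this toy alignment, columns 1 and 3 score one full bit, while any pair involving the invariant column 0 scores zero, which is the pattern a covariation analysis looks for.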

  • [5] T-Coffee: This method uses tree-based consistency objective functions for alignment evolution.

  • [5] Multiple sequence comparison by log expectation (MUSCLE): This method uses k-mer and Kimura distances to generate multiple sequence alignments.
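
To make the two distance measures concrete, here is a minimal sketch of a k-mer distance (computed on unaligned sequences) and of Kimura's corrected distance (computed on aligned sequences). The function names, the default k=3, and the exact normalization are assumptions made for illustration; this is not the MUSCLE implementation.

```python
import math
from collections import Counter

def kmer_distance(seq_a, seq_b, k=3):
    """1 minus the fraction of shared k-mers; needs no alignment.
    Assumes both sequences are at least k residues long."""
    kmers_a = Counter(seq_a[i:i + k] for i in range(len(seq_a) - k + 1))
    kmers_b = Counter(seq_b[i:i + k] for i in range(len(seq_b) - k + 1))
    shared = sum((kmers_a & kmers_b).values())  # min count per k-mer
    possible = min(len(seq_a), len(seq_b)) - k + 1
    return 1.0 - shared / possible

def kimura_distance(aligned_a, aligned_b):
    """Kimura's correction for multiple substitutions, applied to the
    fraction D of differing residues: d = -ln(1 - D - D^2 / 5).
    Columns with a gap in either sequence are skipped; the formula is
    only defined while 1 - D - D^2 / 5 stays positive."""
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    d = sum(a != b for a, b in pairs) / len(pairs)
    return -math.log(1.0 - d - d * d / 5.0)
```

The k-mer distance is cheap because it needs no alignment, which is why it suits a first rough pass; the Kimura distance needs an existing alignment and is therefore the natural choice for a later, more accurate pass.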

  • These are scored based upon potential energy models of both query and template sequence.

  • This alignment is then subjected to manual refinement that involves removal of highly gapped sequences, as well as sequences with low sequence identity.
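
The refinement step described above can be approximated by a simple filter. This is a sketch only: the thresholds `max_gap_fraction` and `min_identity` are hypothetical values chosen for illustration, and in practice the refinement is described as manual.

```python
def gap_fraction(seq):
    """Fraction of alignment columns in which this sequence has a gap."""
    return seq.count("-") / len(seq)

def identity_to_query(query, seq):
    """Fraction of identical residues over columns where neither has a gap."""
    pairs = [(q, s) for q, s in zip(query, seq) if q != "-" and s != "-"]
    if not pairs:
        return 0.0
    return sum(q == s for q, s in pairs) / len(pairs)

def refine_alignment(query, sequences, max_gap_fraction=0.5, min_identity=0.3):
    """Drop highly gapped sequences and sequences with low identity to the query."""
    kept = []
    for seq in sequences:
        if gap_fraction(seq) > max_gap_fraction:
            continue  # too many gaps
        if identity_to_query(query, seq) < min_identity:
            continue  # too dissimilar to the query
        kept.append(seq)
    return kept
```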

  • Following the development of a potential model, energy search techniques including molecular dynamics simulations, Monte Carlo simulations, and genetic algorithms are applied
    to the protein.
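
As an illustration of a Monte Carlo energy search, the sketch below runs a Metropolis search over sequences against a deliberately crude, hypothetical energy function (hydrophobic residues are rewarded at buried positions, polar residues at exposed ones). Real potentials, whether molecular mechanics or structure derived, are far more detailed; every name and parameter here is an assumption.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AILMFVW")

def toy_energy(sequence, buried):
    """Hypothetical energy: -1 when a residue's hydrophobicity matches the
    burial state of its position, +1 otherwise."""
    energy = 0
    for residue, is_buried in zip(sequence, buried):
        energy += -1 if (residue in HYDROPHOBIC) == is_buried else 1
    return energy

def monte_carlo_search(buried, steps=5000, temperature=0.5, seed=0):
    """Metropolis Monte Carlo over sequences for a fixed burial pattern."""
    rng = random.Random(seed)
    current = [rng.choice(AMINO_ACIDS) for _ in buried]
    current_e = toy_energy(current, buried)
    best, best_e = list(current), current_e
    for _ in range(steps):
        pos = rng.randrange(len(current))
        old = current[pos]
        current[pos] = rng.choice(AMINO_ACIDS)  # propose a point mutation
        delta = toy_energy(current, buried) - current_e
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_e += delta
            if current_e < best_e:
                best, best_e = list(current), current_e
        else:
            current[pos] = old  # reject and restore
    return "".join(best), best_e
```

Accepting occasional uphill moves is what lets the search escape local minima; lowering `temperature` makes it greedier.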

  • [5] Ab initio: These methods involve free modeling without using any structural information about the template.

  • [10] Generally, directed evolution may be summarized as an iterative two-step process that involves generating protein mutant libraries and using high-throughput screening
    to select for variants with improved traits.
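
The two-step cycle can be sketched as a loop that alternates library generation and screening. Everything below is a toy model: `mutate` stands in for a mutagenesis method, `screen` for an assay, and the rate and library size are arbitrary assumptions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(sequence, rng, rate=0.05):
    """Step 1 helper: random point mutations at a per-residue rate."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else residue
        for residue in sequence
    )

def directed_evolution(parent, screen, rounds=10, library_size=200, seed=0):
    """Alternate library generation and screening for a fixed number of rounds."""
    rng = random.Random(seed)
    history = [screen(parent)]
    for _ in range(rounds):
        # Step 1: generate a mutant library from the current parent.
        library = [mutate(parent, rng) for _ in range(library_size)]
        library.append(parent)  # keep the parent so the score never drops
        # Step 2: screen the library and carry the best variant forward.
        parent = max(library, key=screen)
        history.append(screen(parent))
    return parent, history
```

A usage example with a hypothetical screen that counts matches to an arbitrary target sequence: `directed_evolution("AAAAAAAAAA", lambda s: sum(a == b for a, b in zip(s, "MKTAYIAKQR")))`. Note that the loop itself never inspects structure; it only needs the screening score.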

  • While more conservative than direct selection from deep sequence space, redesign of existing proteins by random mutagenesis and selection/screening is a particularly robust
    method for optimizing or altering extant properties.

  • Advantages of directed evolution are that it requires no prior structural knowledge of a protein, nor is it necessary to be able to predict what effect a given mutation will have.

  • Library size has also been reduced to more screenable sizes by the identification of key beneficial residues using algorithms for systematic recombination.

  • Finally, a significant step toward efficient reengineering of enzymes has been made with the development of more accurate statistical models and algorithms quantifying
    and predicting coupled mutational effects on protein functions.

  • Random mutagenesis, e.g. by error-prone PCR or sequence saturation mutagenesis, is applied to a protein, and a selection regime is used to select variants having desired traits.

  • Indeed, the results of directed evolution experiments are often surprising in that desired changes are often caused by mutations that were not expected to have any effect.

  • MnCl2 is added into the reaction mixture to promote random point mutations in the DNA strands.

  • This technique does not require prior knowledge of the protein structure and function relationship.

  • An added process, termed DNA shuffling, mixes and matches pieces of successful variants to produce better results.

  • Directed evolution makes it possible to identify undiscovered protein sequences which have novel functions.

  • Computational approaches have also made large advances in reducing the innumerably large sequence space to more manageable, screenable sizes, thus creating smart libraries of mutants.

  • [5] Directed evolution methods can be broadly categorized into two strategies: asexual and sexual methods.

  • Single genes are used to create mutant libraries using various mutagenic techniques.

  • [5] Single primer reactions in parallel (SPRINP): This site saturation mutagenesis method involves two separate PCR reactions.

  • [5] Random mutagenesis methods altering the target DNA length: These methods involve altering gene length via insertion and deletion mutations.

  • This method has been shown to produce proteins with new functionalities via introduction of new restriction sites, specific codons, four base codons for non-natural amino acids.

  • This technique results in the generation of tandem repeats of random fragments of the target gene via rolling circle amplification and concurrent incorporation of these repeats
    into the target gene.

  • Site saturation mutagenesis: Site saturation mutagenesis is a PCR-based method used to target amino acids with significant roles in protein function.
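
Saturating a site usually means placing a degenerate codon at the targeted position. The sketch below enumerates the NNK codon (N = any base, K = G or T), a commonly used scheme; the source does not say which degenerate codon is used, so NNK is an assumption chosen for illustration.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC degenerate symbols needed for this example only.
DEGENERATE = {"N": "ACGT", "K": "GT"}

def expand_degenerate_codon(codon):
    """List every concrete codon encoded by a degenerate codon."""
    choices = (DEGENERATE.get(base, base) for base in codon)
    return ["".join(bases) for bases in product(*choices)]

nnk_codons = expand_degenerate_codon("NNK")
encoded = [CODON_TABLE[c] for c in nnk_codons]
amino_acids_covered = set(encoded) - {"*"}
```

NNK expands to 32 codons that cover all 20 amino acids with a single stop codon (TAG), which is why it is often preferred over the fully degenerate NNN with its 64 codons and three stops.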

  • The two most common techniques for performing this are whole plasmid single PCR, and overlap extension PCR.

  • This method begins with the generation of variable length DNA fragments tailed with universal bases via the use of template transferases at the 3′ termini.

  • [5] Overlap extension PCR requires the use of two pairs of primers.

  • [5] The dual approach to random chemical mutagenesis is an iterative two-step process.

  • These techniques require an understanding of the sequence-function relationship for the protein of interest.

  • [5] Focused mutagenesis: Focused mutagenic methods produce mutations at predetermined amino acid residues.

  • [5] Random priming in vitro recombination (RPR): This in vitro homologous recombination method begins with the synthesis of many short gene fragments exhibiting
    point mutations using random sequence primers.

  • These fragments are reassembled to full length parental genes using primer-less PCR.

  • The chimeric DNA of parental size is then amplified using end terminal primers in regular PCR.

  • Finally, this method is independent of the length of the DNA template sequence and requires only a small amount of parental DNA.

  • [5] Staggered extension process (StEP): This in vitro method is based on template switching to generate chimeric genes.

  • [5] Truncated metagenomic gene-specific PCR: This method generates chimeric genes directly from metagenomic samples.

  • Next, specific primers are designed and used to amplify the homologous genes from different environmental samples.

  • This PCR involves homologous fragments from different parental genes priming for each other, resulting in chimeric DNA.

  • [5] DNA shuffling: This in vitro technique was one of the first techniques in the era of recombination.

  • These techniques are often used to recombine two different parental genes, creating crossovers between them.
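
The effect of recombining two parental genes can be pictured as switching templates at a few crossover points. The sketch below models only that outcome, assuming two pre-aligned parents of equal length; it does not simulate fragmentation, priming, or PCR, and the function name and parameters are hypothetical.

```python
import random

def make_chimera(parent_a, parent_b, crossovers=2, seed=None):
    """Build one chimera by alternating between two aligned parents."""
    if len(parent_a) != len(parent_b):
        raise ValueError("this sketch assumes aligned parents of equal length")
    rng = random.Random(seed)
    # Choose distinct crossover positions strictly inside the gene.
    points = sorted(rng.sample(range(1, len(parent_a)), crossovers))
    parents = (parent_a, parent_b)
    segments, current, start = [], 0, 0
    for point in points + [len(parent_a)]:
        segments.append(parents[current][start:point])
        current = 1 - current  # switch template at each crossover
        start = point
    return "".join(segments), points
```

Calling `make_chimera` repeatedly with different seeds gives a small in silico library of chimeras, each position of which comes from one parent or the other.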

  • Next the products from this cleavage are ligated together, resulting in the insertion of the gene into the target plasmid.

  • This method is carried out until the parental length chimeric gene sequence is obtained.

  • [5] Mutagenic organized recombination process by homologous in vivo grouping (MORPHING): This method introduces mutations into specific regions of genes
    while leaving other parts intact by utilizing the high frequency of homologous recombination in yeast.

  • [5] Incremental truncation for the creation of hybrid enzymes (ITCHY): Fragments of parental genes are created using controlled digestion by exonuclease

  • [5] In vitro non-homologous recombination methods: These methods are based upon the fact that proteins can exhibit similar structural identity while lacking sequence

  • Synthetic shuffling: Shuffling of synthetic degenerate oligonucleotides adds flexibility to shuffling methods, since oligonucleotides containing optimal codons and beneficial
    mutations can be included.

  • These amplification products are then reassembled into full length genes using primer-less PCR.

  • [5] In vivo homologous recombination: Cloning performed in yeast involves PCR-dependent reassembly of fragmented expression vectors.

  • These resulting fragments are then ligated together to form full length genes.

  • [5] Y-Ligation based shuffling (YLBS): This method generates single-stranded DNA strands, which encompass a single block sequence either at the 5′ or 3′
    end, complementary sequences in a stem loop region, and a D branch region serving as a primer binding site for PCR.

  • [5] Phosphorothioate-based DNA recombination method (PRTec): This method can be used to recombine structural elements or entire protein domains.

  • [5] User friendly DNA recombination (USERec): This method begins with the amplification of gene fragments which need to be recombined, using uracil dNTPs.

  • [5] Golden Gate shuffling (GGS) recombination: This method allows at least 9 different fragments to be recombined in an acceptor vector by using a type IIS restriction
    enzyme, which cuts outside of its recognition site.
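
Because the enzyme cuts outside its recognition site, the overhang sequence can be chosen freely by the designer. The sketch below assumes BsaI (recognition site GGTCTC, cutting one nucleotide downstream on the top strand and leaving a 4-nt overhang) purely as an example; the source does not name an enzyme. It only scans the given strand for forward-orientation sites, and the helper names are hypothetical.

```python
BSAI_SITE = "GGTCTC"  # assumed enzyme; not named in the source

def bsai_overhangs(sequence):
    """Return the 4-nt overhangs produced by forward-orientation BsaI sites."""
    overhangs = []
    start = sequence.find(BSAI_SITE)
    while start != -1:
        cut = start + len(BSAI_SITE) + 1  # top strand is cut 1 nt past the site
        overhang = sequence[cut:cut + 4]
        if len(overhang) == 4:
            overhangs.append(overhang)
        start = sequence.find(BSAI_SITE, start + 1)
    return overhangs

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def overhangs_are_orthogonal(overhangs):
    """True if no overhang is palindromic or can pair with a different junction."""
    seen = set()
    for overhang in overhangs:
        rc = reverse_complement(overhang)
        if overhang == rc or overhang in seen or rc in seen:
            return False
        seen.add(overhang)
    return True
```

Checking that junction overhangs are distinct and non-palindromic is one plausible way to confirm that many fragments can only assemble in the intended order.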

  • This method begins with the preparation of single stranded DNA fragments by reverse transcription from target mRNA.

  • [5] Recombined extension on truncated templates (RETT): This method generates libraries of hybrid genes by template switching of uni-directionally growing
    polynucleotides in the presence of single-stranded DNA fragments as templates for chimeras.

  • This cycle is followed by template switching and annealing of the short fragments obtained from the earlier primer extension to other single stranded DNA fragments.

  • [5] Sequence independent site directed chimeragenesis (SISDC): This method results in a library of genes with multiple crossovers from several parental genes.

  • The first step in the process begins with amplification of fragments that need to be recombined along with the vector backbone.

  • These PCR products are converted to single strands via avidin-biotin binding to the 5′ end of the primers containing stem sequences that were biotin labeled.

  • These gene fragments are mixed and ligated in an appropriate order to form chimeric libraries.

  • This is followed by the incorporation of specific tags containing restriction sites followed by the removal of the tags by digestion with Bac1, resulting in genes with cohesive

  • Hybrids with free phosphorylated 5′ end in 3′ half strands are then ligated with free 3′ ends in 5′ half strands using T4 DNA ligase in the presence of 0.1 mM ATP.

  • These chimeras are fused via a linker sequence containing several restriction sites.

  • [14] Examples of engineered proteins: Computational methods have been used to design a protein with a novel fold, named Top7,[15] and sensors for unnatural molecules.

  • [11] Directed evolution will likely not be replaced as the method of choice for protein engineering, although computational protein design has fundamentally changed the way
    protein engineering can manipulate bio-macromolecules.

  • Smaller, more focused and functionally-rich libraries may be generated by using methods that incorporate predictive frameworks for hypothesis-driven protein engineering.

  • [11] Screening and selection techniques: Once a protein has undergone directed evolution, rational design or semi-rational design, the libraries of mutant proteins must be screened
    to determine which mutants show enhanced properties.

  • [19] A protein cage, E. coli bacterioferritin (EcBfr), which naturally shows structural instability and an incomplete self-assembly behavior by populating two oligomerization
    states, is the model protein in this study.

  • To investigate the possibility of engineering EcBfr for modified structural stability, a semi-empirical computational method is used to virtually explore the energy differences
    of the 480 possible mutants at the dimeric interface relative to the wild type EcBfr.

  • [5]: 53  Enzyme engineering: Enzyme engineering is the practice of modifying an enzyme’s structure (and, thus, its function) or modifying the catalytic activity of
    isolated enzymes to produce new metabolites, to allow new (catalyzed) pathways for reactions to occur,[12] or to convert certain compounds into others (biotransformation).

  • [5] Cell free display systems have been developed to exploit in vitro protein translation or cell free translation.

  • Although experimental optimization may be produced using directed evolution, further improvements in the accuracy of structure predictions and greater catalytic ability will
    be achieved with improvements in design algorithms.

  • Semi-rational design: Semi-rational design uses information about a protein's sequence, structure and function, in tandem with predictive algorithms.

  • Integration of sequence and structure based approaches in library design has proven to be a great guide for enzyme redesign.

  • This is done by repeatedly randomly perturbing the structure of the proteins around specified design positions, identifying the lowest energy combination of rotamers, and
    determining whether the new design has a lower binding energy than prior ones.
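
The iterative loop described above (perturb, find the lowest-energy rotamer combination, keep the design only if its binding energy improves) can be sketched as follows. This is a schematic under heavy assumptions: the perturbation is reduced to a single number, rotamers to angles, and the energy function is a hypothetical toy, so it is not the published procedure.

```python
import itertools
import random

def best_rotamer_combination(rotamer_options, energy_fn):
    """Exhaustively search all rotamer combinations for the lowest energy."""
    return min(itertools.product(*rotamer_options), key=energy_fn)

def iterative_redesign(rotamer_options, binding_energy, perturb, rounds=20, seed=0):
    """Repeat: perturb, re-optimize rotamers, keep the design if it improves."""
    rng = random.Random(seed)
    best_design, best_energy = None, float("inf")
    for _ in range(rounds):
        shift = perturb(rng)  # random local perturbation of the structure

        def energy_fn(combo, shift=shift):
            return binding_energy(combo, shift)

        design = best_rotamer_combination(rotamer_options, energy_fn)
        energy = energy_fn(design)
        if energy < best_energy:  # accept only lower binding energies
            best_design, best_energy = design, energy
    return best_design, best_energy

# Hypothetical stand-ins for a real energy function and perturbation.
def toy_binding_energy(combo, shift):
    return sum(abs(angle - 60) for angle in combo) + abs(shift)

def toy_perturb(rng):
    return rng.uniform(-1.0, 1.0)
```

For example, `iterative_redesign([(-60, 60, 180)] * 3, toy_binding_energy, toy_perturb)` searches three design positions with three candidate rotamers each. Exhaustive enumeration is only feasible for a handful of positions; larger problems need a dedicated combinatorial optimizer.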

  • Further functional enhancements may be included in future simulations by integrating protein dynamics.


Works Cited

1. “Protein engineering – Latest research and news | Nature”. Retrieved 2023-01-24.
2. Woodley, John M. (May 2022). “Integrating protein engineering into biocatalytic process scale-up”. Trends in Chemistry. 4 (5): 371–373. doi:10.1016/j.trechm.2022.02.007. S2CID 247489691.
3. “Speeding Up the Protein Assembly Line”. Genetic Engineering and Biotechnology News. 13 February 2015.
4. Farmer, Tylar Seiya; Bohse, Patrick; Kerr, Dianne (2017). “Rational Design Protein Engineering Through Crowdsourcing”. Journal of Student Research. 6 (2): 31–38. doi:10.47611/jsr.v6i2.377. S2CID 57679002.
5. Poluri, Krishna Mohan; Gulati, Khushboo (2017). Protein Engineering Techniques. SpringerBriefs in Applied Sciences and Technology. Springer. doi:10.1007/978-981-10-2732-1. ISBN 978-981-10-2731-4.
6. Liu, Cassie J.; Cochran, Jennifer R. (2014), Cai, Weibo (ed.), “Engineering Multivalent and Multispecific Protein Therapeutics”, Engineering in Translational Medicine, London: Springer, pp. 365–396, doi:10.1007/978-1-4471-4372-7_14, ISBN 978-1-4471-4372-7, retrieved 2021-12-08
7. Holliger, P.; Prospero, T.; Winter, G. (1993-07-15). “"Diabodies": small bivalent and bispecific antibody fragments”. Proceedings of the National Academy of Sciences. 90 (14): 6444–6448. Bibcode:1993PNAS...90.6444H. doi:10.1073/pnas.90.14.6444. ISSN 0027-8424. PMC 46948. PMID 8341653.
8. Brinkmann, Ulrich; Kontermann, Roland E. (2017-02-17). “The making of bispecific antibodies”. mAbs. 9 (2): 182–212. doi:10.1080/19420862.2016.1268307. ISSN 1942-0862. PMC 5297537. PMID 28071970.
9. Jäckel, Christian; Kast, Peter; Hilvert, Donald (June 2008). “Protein Design by Directed Evolution”. Annual Review of Biophysics. 37 (1): 153–173. doi:10.1146/annurev.biophys.37.032807.125832. PMID 18573077.
10. Shivange, Amol V; Marienhagen, Jan; Mundhada, Hemanshu; Schenk, Alexander; Schwaneberg, Ulrich (2009). “Advances in generating functional diversity for directed protein evolution”. Current Opinion in Chemical Biology. 13 (1): 19–25. doi:10.1016/j.cbpa.2009.01.019. PMID 19261539.
11. Lutz, Stefan (December 2010). “Beyond directed evolution—semi-rational protein engineering and design”. Current Opinion in Biotechnology. 21 (6): 734–743. doi:10.1016/j.copbio.2010.08.011. PMC 2982887. PMID 20869867.
12. “‘Designer Enzymes’ Created By Chemists Have Defense And Medical Uses”. ScienceDaily. March 20, 2008.
13. “Enzyme reactors”. Archived from the original on 2012-05-02. Retrieved 2013-11-02. Accessed 22 May 2009.
14. Sharma, Anshula; Gupta, Gaganjot; Ahmad, Tawseef; Mansoor, Sheikh; Kaur, Baljinder (2021-02-17). “Enzyme Engineering: Current Trends and Future Perspectives”. Food Reviews International. 37 (2): 121–154. doi:10.1080/87559129.2019.1695835. ISSN 8755-9129. S2CID 213299369.
15. Kuhlman, Brian; Dantas, Gautam; Ireton, Gregory C.; Varani, Gabriele; Stoddard, Barry L. & Baker, David (2003), “Design of a Novel Globular Protein Fold with Atomic-Level Accuracy”, Science, 302 (5649): 1364–1368, Bibcode:2003Sci...302.1364K, doi:10.1126/science.1089427, PMID 14631033, S2CID 1939390
16. Looger, Loren L.; Dwyer, Mary A.; Smith, James J. & Hellinga, Homme W. (2003), “Computational design of receptor and sensor proteins with novel functions”, Nature, 423 (6936): 185–190, Bibcode:2003Natur.423..185L, doi:10.1038/nature01556, PMID 12736688, S2CID 4387641
17. Khoury, GA; Fazelinia, H; Chin, JW; Pantazes, RJ; Cirino, PC; Maranas, CD (October 2009), “Computational design of Candida boidinii xylose reductase for altered cofactor specificity”, Protein Science, 18 (10): 2125–38, doi:10.1002/pro.227, PMC 2786976, PMID 19693930
18. The iterative nature of this process allows IPRO to make additive mutations to a protein sequence that collectively improve the specificity toward desired substrates and/or cofactors. Details on how to download the software, implemented in Python, and experimental testing of predictions are outlined in this paper: Khoury, GA; Fazelinia, H; Chin, JW; Pantazes, RJ; Cirino, PC; Maranas, CD (October 2009), “Computational design of Candida boidinii xylose reductase for altered cofactor specificity”, Protein Science, 18 (10): 2125–38, doi:10.1002/pro.227, PMC 2786976, PMID 19693930
19. Ardejani, MS; Li, NX; Orner, BP (April 2011), “Stabilization of a Protein Nanocage through the Plugging of a Protein–Protein Interfacial Water Pocket”, Biochemistry, 50 (19): 4029–4037, doi:10.1021/bi200207w, PMID 21488690
20. Chowdhury, Ratul; Ren, Tingwei; Shankla, Manish; Decker, Karl; Grisewood, Matthew; Prabhakar, Jeevan; Baker, Carol; Golbeck, John H.; Aksimentiev, Aleksei; Kumar, Manish; Maranas, Costas D. (10 September 2018). “PoreDesigner for tuning solute selectivity in a robust and highly permeable outer membrane pore”. Nature Communications. 9 (1): 3661. Bibcode:2018NatCo...9.3661C. doi:10.1038/s41467-018-06097-1. PMC 6131167. PMID 30202038.