Self-assembled bionanostructures: proteins following the lead of DNA nanostructures
© Gradišar and Jerala; licensee BioMed Central Ltd. 2014
Received: 10 December 2013
Accepted: 29 January 2014
Published: 3 February 2014
Natural polymers are able to self-assemble into versatile nanostructures based on the information encoded in their primary structure. The structural richness of biopolymer-based nanostructures depends on the information content of the building blocks and on the biological machinery available to assemble and decode polymers with a defined sequence. Natural polypeptides comprise 20 amino acids with very different properties, in comparison to only 4 structurally similar nucleotides, the building elements of nucleic acids. Nevertheless, the ease of synthesizing polynucleotides with a selected sequence and the ability to encode the nanostructural assembly based on the two specific nucleotide pairs underlay the development of techniques to self-assemble almost any selected three-dimensional nanostructure from polynucleotides. Despite more complex design rules, peptides have been successfully used to assemble symmetric nanostructures, such as fibrils and spheres. While earlier designed protein-based nanostructures used linked natural oligomerizing domains, the recent design of new oligomerizing interaction surfaces and the introduction of a platform for topologically designed protein folds may enable polypeptide-based design to follow the track of DNA nanostructures. The advantages of protein-based nanostructures, such as functional versatility and cost-effective and sustainable production methods, provide a strong incentive for further development in this direction.
The versatility of biopolymers can be used to rationally design new molecules and assemblies with structures and functionalities unseen in nature. The ability of biopolymers to self-assemble into complex shapes and structures defined at the nanometer scale, together with our capacity for sustainable large-scale production using cell factories, makes them highly desirable for diverse technological applications. In the rapidly growing research area of modern nanobiotechnology, the natural components polypeptides and nucleic acids have been employed as building blocks for the assembly of newly designed nanostructures and nanomaterials. In the last decades bionanotechnologists have achieved important advances in protein-based and particularly DNA-based responsive nanostructures, which can now be designed to self-assemble into almost any selected shape.
Molecular self-assembly, the main organizing principle of biological systems, is also widely applied in nanotechnology as the driving force for the assembly of artificial nanostructures. In self-assembly the final structure is encoded by the interactions of its building elements, which are defined by their properties and by the order of the building blocks within the linear polymer. The shapes and functions of both DNA- and protein-based nanostructures are encoded by the sequence of their constituents, nucleotides and amino acids, respectively. Additionally, the architecture of both types of nanostructures can also be affected by environmental factors, such as solvent, pH, temperature and building block concentration.
DNA nanostructures are based on Watson-Crick base complementarity. There are only two different base pairs, each based on a specific pairwise interaction, and stacking with neighboring pairs underlies the formation of stable double-helical domains that serve as the nanostructural building blocks. Some of the most spectacular examples of the potential of nanobiotechnology have been demonstrated by DNA-based nanostructures. In nature the primary function of nucleic acids is the storage, processing and mediation of genetic information; however, natural structures such as aptamers, telomeres and, in part, the ribosome, one of the key and most complex nanodevices, are formed by nucleic acids assembled into 3D structures. The relevance of the physiological role of nucleic acids that perform their function in the form of self-assembled noncoding RNA transcripts is still unknown. On the other hand, artificial rationally designed DNA nanostructures, which utilize a narrower subset of the interactions found in aptamers, can adopt a huge diversity of 2D or 3D shapes [1–5].
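The pairing rule described above can be illustrated with a minimal Python sketch. It assumes the standard A-T and G-C pairs and antiparallel strand orientation; the function names and the example segment are hypothetical and are not taken from any of the cited design tools.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the strand that pairs with `strand`, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def forms_duplex(strand_a: str, strand_b: str) -> bool:
    """True if the two 5'->3' strands are fully complementary (antiparallel)."""
    return strand_b.upper() == reverse_complement(strand_a)


# Hypothetical 6-nucleotide segment, chosen only for illustration.
segment = "ATGCGT"
partner = reverse_complement(segment)
```

Because only two base pairs exist, the partner of any segment is fully determined by its sequence, which is what makes the assembly of double-helical domains programmable. The sketch ignores stacking energies, mismatches and secondary structure.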
Significant progress has recently been achieved in the development of strategies for building artificial self-assembled bionanostructures, and the range of both DNA- and protein-based nanostructures has increased rapidly in the last two decades. In this review we mainly focus on protein-based nanostructure strategies, while DNA nanotechnology has been discussed in detail in many recent reviews [6–12].
In 1982, Seeman proposed using DNA as the structural material for bottom-up self-assembly, and he is regarded as the founder of the field of DNA nanotechnology. Since then, DNA-based self-assembly has achieved spectacular results by relying on the base-pairing specificity of nucleotides, DNA synthesis technology, computer-based design and, above all, imaginative design. Over the last three decades self-assembled DNA nanostructures have been extensively studied and several different approaches for building them have been developed. Self-assembled DNA nanostructures range from 3D structures with a well-defined shape [2, 4, 14–17] to a variety of complex dynamic DNA devices [8, 18–20]. This avenue of research also spawned DNA computing [21, 22] and the design of dynamic devices [8, 23, 24], which are, however, beyond the scope of this review.
DNA nanostructures have been built from:
several medium-sized DNA oligonucleotides (a few 10–100 nucleotides) that form finite-sized nanostructures;
a single long DNA scaffold (e.g. several thousand nucleotides of single-stranded phage DNA) that is shaped into the selected structure by the addition of short oligonucleotide clamps, the DNA origami technique invented by Paul Rothemund. This approach can produce complex 2D or 3D shapes such as molecular raster images, boxes and spheres [27–30];
a large number of short DNA bricks (32- or 42-nucleotide-long strands that each form a U-shaped brick) that fill the 2D plane or 3D space, where the selected structure is formed by omitting the appropriate DNA bricks from the assembly mixture. Almost any 2D or 3D shape can be formed by this approach [15, 31].
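The brick-omission idea in the last approach can be sketched in Python as a simple selection over a canvas. This is only an illustration under simplifying assumptions: the canvas is a 2D grid with one brick per position, the target shape is given as a hypothetical text mask, and the function name `select_bricks` is invented here. Real brick designs also have to treat strands at the boundary of the shape, which this sketch does not model.

```python
def select_bricks(shape_mask):
    """Split a rectangular brick canvas into bricks to keep and bricks to omit.

    `shape_mask` is a list of equal-length strings; '#' marks a canvas
    position that belongs to the target shape, any other character marks a
    position whose brick is left out of the assembly mixture.
    """
    keep, omit = [], []
    for row, line in enumerate(shape_mask):
        for col, cell in enumerate(line):
            (keep if cell == "#" else omit).append((row, col))
    return keep, omit


# Hypothetical 3 x 3 canvas whose target shape is the letter "H".
letter_h = ["#.#", "###", "#.#"]
keep, omit = select_bricks(letter_h)
```

The same full set of bricks serves every target shape; only the subset added to the mixture changes.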
An important advantage of DNA-based nanostructures is that selected positions within the 2D or 3D nanostructures can be addressed at approximately 5 nm resolution. Oligonucleotides carrying selected functionalities, such as different organic compounds, fluorophores, metal binding groups or proteins, can be introduced into those positions, thereby functionalizing the DNA nanostructures [9, 32–36].
RNA has the distinct advantage that ssRNA can easily be produced in vivo to promote self-assembly. This property was used to prepare RNA-based scaffolds with attachment sites for functional proteins fused to sequence-specific RNA-binding domains. While those in vivo assembled structures were not well characterized, the scaffold strongly enhanced the reaction yield, similar to DNA-based scaffolded enzymes in which the enzymes were arranged linearly. It is hoped that this approach will be further developed for in vivo applications. ssDNA can also be produced in vivo, as demonstrated by the self-assembly of a tetrahedron. An isothermal DNA nanostructure assembly strategy has been developed that could further facilitate future DNA self-assembly in vivo.
DNA nanostructures have been used to make devices that are functional in the cellular milieu, e.g. a drug delivery container that encapsulates cargo such as therapeutic antibodies. Opening of the container is controlled by aptamer locks, so that it opens only if the triggering signals for both of its two locks are present. DNA origami seems to be stable in vivo, indicating that it is relatively protected against nucleases. There are also reports on the use of DNA nanostructures as constituents of vaccines [42–44]. However, real applications of DNA nanostructures are at the moment quite rare, and essentially all DNA nanostructures are prepared by chemical synthesis, which limits technological applications due to the cost and scale of production.
Proteins provide masterful examples of complex self-assembling nanostructures with properties and functionalities beyond the reach of any human-made materials. It is estimated that there are only a few thousand different protein folds in nature, and recently the discovery of new protein folds has slowed almost to a halt despite the determination of tens of thousands of new protein structures each year. So far the folds of only a few small protein domains can be accurately predicted [45–48], and the design of completely new folds without resemblance to any existing native fold represents an even greater challenge.
Larger natural proteins have evolved through combinations of several smaller, independently folding domains. Protein oligomerization based on symmetric oligomerization domains is an important source of suprastructured proteins. Existing protein oligomerization domains have been recognized as suitable building blocks for the predictable bottom-up design of artificial protein nanostructures. Strategies that used modified natural domains, or genetically or chemically linked secondary structure elements, for self-assembly and that resulted in symmetric intermolecular protein assemblies, lattices and heterogeneous cage-like assemblies are described in reviews [51–53]. Recently we presented a new approach in which a single polypeptide chain composed of concatenated coiled-coil-forming peptides self-assembled into a new topological fold, an asymmetric tetrahedron-like cage. The fold is defined and stabilized by the specific pairing of coiled-coil-forming segments arranged in a precisely defined order rather than by cooperative packing of a hydrophobic protein core.
Oligomerization domains also make it possible to create smart bionanomaterials whose assembly and disassembly can be regulated. Self-assembly of a fusion protein composed of the dimerizing gyrase B domain and a trimerization domain can be driven by the addition of a small molecule. Addition of the pseudo-dimeric gyrase B ligand coumermycin induced the formation of hexagonal assemblies, which dissociated upon subsequent addition of the monomeric ligand novobiocin, which competes with coumermycin for binding to the same gyrase B site.
The extended fusion strategy circumvented the problem of connecting two oligomerization domains in a fixed relative orientation, which assured well-ordered self-assembled protein nanostructures. It was shown that a fusion protein can be made by selecting two or more connections between the adjacent oligomers if the two domains are joined along an axis of symmetry that both oligomerization domains share. Although this symmetry-matching fusion strategy successfully produced linear filaments, two-dimensional lattices and large solid aggregates, it is not suitable for designing defined cage-like structures.
In the strategies described above the range of suitable protein domains is limited by restrictions regarding the symmetry axes of the natural domains. A further step towards the design of artificial protein nanostructures was taken by engineering domain surfaces for weak non-covalent interactions in the self-assembly process. Analysis of natural contact interfaces between protein domains disclosed the rules governing domain association: the contacting surfaces should be complementary and predominantly non-polar, while the contribution of hydrogen bonds and salt bridges at the contact rim is negligible. Employing these rules, it was demonstrated that a given protein can be engineered to form new contact interfaces that produce a number of novel assemblies. The Rosetta algorithm for modeling protein-protein interactions enables de novo design of interacting interfaces that can drive the self-assembly of designed proteins into a desired symmetric architecture [46, 62]. A recent study described the computational design of protein nanostructures with atomic-level accuracy. Protein building blocks based on natural trimeric protein domains were docked together symmetrically into the target packing arrangements, and low-energy protein-protein interaction interfaces were designed between the building blocks to drive self-assembly (Figure 3b). The designed proteins assembled into cage-like nanostructures with either tetrahedral or octahedral point group symmetry, which was confirmed by crystal structures.
The strategies described above, which employ oligomerizing protein domains for designing new protein structures, are limited to homologues of known native protein folds. Next-generation engineering approaches are based on modules that can be considerably smaller than a typical protein domain. The modules comprise interacting de novo designed secondary structure elements that are predictably combined with specified partners to form larger assemblies. De novo protein design refers to attempts to construct completely new protein sequences for prescribed structures based on the principles defining the stability and selectivity of the building modules; in de novo design the polypeptide sequence is selected by the designer.
Modularity and orthogonality are two foundational concepts of the de novo design and engineering of new protein nanostructures. Instead of optimizing the numerous cooperative interactions that underpin the structures of natural proteins, it was proposed to use well-understood structural modules that can be combined into complex nanostructures. α-helices and β-strands are attractive protein folding motifs to serve as building blocks for well-ordered and defined nanostructures with complex architecture [63–67].
The most studied modules for building self-assembled protein nanostructures are interacting helical peptides, particularly coiled-coils. They are ubiquitous facilitators of inter- and intramolecular protein-protein interactions and comprise two or more intertwined α-helices that are encoded by the characteristic heptad sequence repeat, whose residues are labeled abcdefg. The non-covalent interactions that drive the formation of coiled-coils are the hydrophobic interactions between amino acids at positions a and d, which form the hydrophobic core of the coiled-coil, and the electrostatic interactions between oppositely charged residues at positions e and g. The rules governing coiled-coil formation, oligomerization state and interaction partner specificity have been largely established over the last decades [68, 69]. On the basis of those rules, sets of orthogonal designed coiled-coils were developed as a toolkit for designed protein assemblies [70–75]. Engineered coiled-coil polypeptides have been used to assemble different nanomaterials: nanofibres [76, 77], membranes, nanotubes, nanostructured films, spherical structures, responsive hydrogels [82, 83], spheres etc. Homogeneous nanoparticles with regular polyhedral symmetry, about 16 nm in diameter, were prepared from a single type of polypeptide chain in which two coiled-coil modules with different oligomerization states were joined by a short linker. In another study two oligomerizing coiled-coil peptides were tethered via a disulphide bond close to their center. The self-assembled molecules spontaneously curved into spherical cage-like particles about 100 nm in diameter, with a hexagonal pattern on the cage surface. Another example is discrete circular nanostructures of defined stoichiometry: trimers or tetramers of < 10 nm were observed when the linker between two coiled-coil-forming segments comprised 6–10 residues. Larger colloidal-scale assemblies as well as flexible fibers were formed when shorter linkers limited the flexibility between peptides.
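The heptad register described above can be made concrete with a short Python sketch that labels each residue with its position a to g and extracts the core (a, d) and flanking (e, g) residues. The function names and the example sequence are hypothetical; the sequence is an illustrative heptad pattern, not one of the published designed peptides, and the sketch assumes that the register of the first residue is known.

```python
HEPTAD = "abcdefg"


def heptad_register(sequence: str, start: str = "a"):
    """Pair every residue with its heptad position, beginning at `start`."""
    offset = HEPTAD.index(start)
    return [(residue, HEPTAD[(offset + i) % 7])
            for i, residue in enumerate(sequence)]


def residues_at(sequence: str, positions: str, start: str = "a") -> str:
    """Collect the residues that fall on the given heptad positions."""
    return "".join(residue
                   for residue, position in heptad_register(sequence, start)
                   if position in positions)


# Illustrative two-heptad sequence (hypothetical, not a published design):
# hydrophobic I and L at a and d, charged K and E at e and g.
example = "IEALKQE" * 2
core = residues_at(example, "ad")      # residues forming the hydrophobic core
flanking = residues_at(example, "eg")  # residues for electrostatic pairing
```

In design terms, the a and d residues mainly set the stability and oligomerization state, while the charge pattern at e and g is one handle for partner specificity.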
A recent innovative approach to constructing new engineered self-assembled protein nanostructures is based on concatenated interacting dimerizing modules comprising up to 45 amino acid residues. The tetrahedral nanostructure was built from only a single polypeptide chain; this strategy may appropriately be called designed protein origami, as opposed to native protein structures that fold into a defined 3D structure from a single chain.
Rather than folding the structure based on the interactions between residues in the hydrophobic core, as in native proteins, the modular topological design is based on pairwise interactions between concatenated secondary structure elements (coiled-coil-forming segments), whose folding and orthogonality are engineered independently. Orthogonality of the coiled-coil building modules ensures that each segment preferentially binds to its designated partner segment within the same polypeptide chain. The final topology is defined by the sequential order of the coiled-coil segments. The topological fold comprises a cavity bounded by coiled-coil dimers as the edges of the polyhedron. This type of modular self-assembly therefore in many aspects resembles the principles of DNA nanostructures [2, 3, 26], where polyhedra were constructed from complementary DNA segments.
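How the sequential order of segments defines the topology can be sketched in Python. Each edge of the polyhedron is a dimer of two segments from the same chain, so the chain must run along every edge exactly twice; for a tetrahedron with 6 edges this gives 12 segments. The vertex path used below is one hypothetical ordering that satisfies this condition, not the published design, and the function name `pair_segments` is invented for this illustration.

```python
from collections import defaultdict


def pair_segments(vertex_path):
    """Map each chain segment to its partner segment on the same edge.

    `vertex_path` lists the polyhedron vertices visited by the chain;
    segment i runs from vertex_path[i] to vertex_path[i + 1]. Every edge
    must be traversed exactly twice so that it can form a dimer.
    """
    segments_on_edge = defaultdict(list)
    for index, edge in enumerate(zip(vertex_path, vertex_path[1:])):
        segments_on_edge[frozenset(edge)].append(index)

    partner = {}
    for edge, segments in segments_on_edge.items():
        if len(segments) != 2:
            raise ValueError(f"edge {sorted(edge)} is traversed "
                             f"{len(segments)} times, expected 2")
        first, second = segments
        partner[first], partner[second] = second, first
    return partner


# Hypothetical path over tetrahedron vertices 0..3 (12 segments, 6 edges).
TETRAHEDRON_PATH = [0, 1, 2, 3, 0, 2, 1, 3, 2, 0, 3, 1, 0]
partners = pair_segments(TETRAHEDRON_PATH)
```

Assigning one orthogonal coiled-coil pair to each edge and writing the segments in path order yields the segment order of the single chain. The sketch does not model segment orientation, linkers or folding kinetics.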
The recent successes in the design of new bionanostructures based on DNA and proteins demonstrate the potential of this approach to engineer new functional nanostructures.
While DNA-based nanostructures are clearly ahead of designed protein nanostructures in terms of the complexity of the designed structures, they have so far lacked tangible applications. Although it has been demonstrated that DNA-based nanostructures are functional in organisms, the use of nucleic acid-based nanostructures produced and assembled in vivo would represent an important step forward, both for production cost and for new biological applications. Functionalization of nucleic acids could combine structural design with precisely addressed functionalities. However, proteins exhibit much larger conformational variability than nucleic acids and provide more versatile functionality. De novo design of protein nanostructures has been limited to a small number of application cases, which predominantly utilize repurposed natural protein domains. Nevertheless, the design of protein assemblies has matured beyond proof of principle and is ready to face more complex challenges. Newly emerging paradigms such as topological protein folds open completely new avenues that seem not to have been adopted, or perhaps even tested, by nature. Future developments will demonstrate the potential of different strategies, or their combinations, with respect to the precise engineering of nanostructures and the theoretical limitations of different platforms. The next stage will need to focus on application development. The potential applications are numerous, from targeted drug and biomolecule delivery, vaccine design, tissue engineering, sensor design and biocatalysis to bionanomaterials science. The interdisciplinary approach of synthetic biology, combining structural biology, molecular biology, mathematics, engineering and many other disciplines, has the potential to join forces in this exciting opportunity.
We acknowledge Sabina Božič Abram and Iva Hafner Bratkovič for help in preparing structural images.
This work was supported by the EN-FIST Centre of Excellence and program and projects from the Slovenian Research Agency.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.