Preprint/Small beta barrel proteins: structure

Abstract
There are many folds named beta barrels: 53 folds in SCOPe 2.06 [11] carry a definition of barrel or pseudo-barrel and 79 X-groups appear under the architecture of beta barrel in the ECOD classification [12]. While not defined as such in the literature, here we define small beta barrels as domains, typically 60-120 residues long, with a superimposable core of approximately 35 residues, which belong to SCOP (v.2.06) folds b.34 (SH3), b.38 (Sm-like), b.40 (OB), b.136 (stringent starvation protein), and b.137 (RNase P subunit p29). Given the structural and functional plasticity of small beta barrels, we concentrate here on the first three folds (b.34, b.38 and b.40), which contribute the vast majority of structures and functions represented by small beta barrels. This article explores the anatomy of small barrel structure and how functional diversity is achieved.

Why are small beta barrels interesting?
Small beta barrels possess several intriguing features, discussed below. Taken together, these features make this protein structural domain an interesting study from the perspective of structure, function and evolution. Small beta barrels occupy an extremely broad sequence space. In other words, many small barrels with similar structures and functions have little or no detectable sequence similarity, yet the folding process is robust and insensitive to the majority of sequence variations. Consequently, small barrels can be tuned for a variety of functions through variations in sequence and structural modifications to the core structural framework, as described herein. Small beta barrels interact with RNA, DNA and protein, sometimes with two partners simultaneously. Small barrels are evolutionarily ancient, being found in viruses, bacteria, archaea and eukaryotes, and act as fundamental components in diverse biological processes. Examples include RNA biogenesis (including splicing, RNAi, sRNA) [1–4], structural organization of DNA [5], and initiation of signaling cascades through DNA recognition (recombination, replication and repair, inflammation response, telomere biogenesis) [6–8]. The recognition of histone tails by small barrels lies at the heart of chromatin remodeling [9], while recognition of the polyproline signature makes the small barrel an ultimate adaptor domain in regulatory cascades [10]. Small beta barrels function as single-domain proteins or as part of multi-domain proteins. They also function as quaternary structures through toroid rings assembled from individual small beta barrels.

What characterizes the structure of a small beta barrel?
In general, beta-barrels can be thought of as a beta sheet that twists and coils to form a closed structure in which the first strand is hydrogen bonded to the last [13,14]. This type of barrel is often described as consisting of a strongly bent antiparallel beta sheet [15–17], or as a beta sandwich [18]. Classically, barrels are defined by the number of strands n and the shear number S [14,19]. The shear number S determines the extent of the stagger of the beta sheet, or the tilt of the barrel with respect to its main axis. The extent of the stagger defines the degree of twist and coil of the strands and the internal diameter of the barrel [13,14]. It has been proposed that S (and hence the tilt of the barrel) increases throughout evolution [20]. The subset of biologically relevant (n, S) pairs found in nature is rather limited, as was described by Murzin and co-workers: n<=S<=2n. Specifically, the combinations that are observed are S=8, n=4 to 8; S=10, n=5 to 10; S=12, n=12 [13]. Barrels can also be thought of as consisting of 2 beta sheets packed face-to-face and orthogonal to each other [21]. By this definition barrels have higher staggering and are flatter, so the two opposite sides pack together. Small beta barrels are of this orthogonal type, with a low strand number n and a high shear number S: n=4, S=8. In SCOPe version 2.06, b.34 and b.38 are defined as n=4, S=8 with an SH3 topology, while the OB fold b.40 is defined as n=5, S=10. Usually the 4th strand, as defined in SCOP for b.34 and b.38, is interrupted by a 3-10 helix, resulting in a total of 5 β-strands. Here (as in most publications) small beta barrels are defined as containing 5 strands and represented by two orthogonally packed sheets. In the following, however, we will consider the highly bent second strand β2 as composed of two parts, β2N and β2C, as each participates in one of the two orthogonally packed beta sheets of this barrel.
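The bound n<=S<=2n and the observed combinations quoted above can be checked mechanically. The following is a minimal Python sketch under stated assumptions: the names `OBSERVED`, `satisfies_bounds` and `is_observed` are illustrative and introduced here only, and the table simply transcribes the (n, S) combinations listed in the text.

```python
# Observed combinations transcribed from the text: shear number S -> allowed n.
# (Illustrative table; names are hypothetical, not from any published tool.)
OBSERVED = {8: range(4, 9), 10: range(5, 11), 12: range(12, 13)}

def satisfies_bounds(n: int, s: int) -> bool:
    """True if strand number n and shear number s obey n <= s <= 2n."""
    return n <= s <= 2 * n

def is_observed(n: int, s: int) -> bool:
    """True if (n, s) is one of the combinations quoted in the text."""
    return s in OBSERVED and n in OBSERVED[s]

# Every quoted combination should lie inside the bounds.
all_within = all(
    satisfies_bounds(n, s) for s, ns in OBSERVED.items() for n in ns
)
print(all_within)
print(satisfies_bounds(4, 8), satisfies_bounds(5, 10))
```

The two barrel types discussed in this article, (n=4, S=8) and (n=5, S=10), both sit at the upper limit S=2n of the bound.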

What nomenclature is used to describe the structure of small beta barrels?
Over decades, several independent nomenclatures for the loops within sub-groups of small barrels have emerged. The three most prominent nomenclatures are as follows (Table 2). First are the SH3-like barrels involved in signal transduction through binding to polyPro motifs (b.34.2), as well as in chromatin remodeling through recognition of specific modifications on histone tails by Chromo-like (b.34.13) and Tudor-like (b.34.9) domains. Second are the Sm-like barrels involved broadly in RNA biogenesis (b.38.1), and third are the OB-fold barrels involved in cellular signaling through binding to nucleic acids and oligosaccharides (b.40). To be consistent throughout this paper and inclusive of previous work, we cross-map the nomenclatures (Table 2) and use the SH3-like nomenclature throughout for both SH3-like (b.34) and Sm-like folds (b.38, or any other small barrel sharing the same topology, for example b.136, b.137 and b.41). Given the extent of the existing literature, the OB-fold nomenclature is preserved with mapping to SH3-like when appropriate.

The reference structure used throughout this work is that of the Hfq protein with an Sm-like fold (SCOP b.38.1.2), as it represents the simplest version of a small beta barrel (Figure 1A). If one superimposes all small barrels and identifies Structurally Conserved Regions (SCRs), Hfq appears to be the most regular structural representative containing SCRs. Therefore, for simplicity and clarity of presentation, we use Hfq as a prototypical SH3-like fold representative, even though it is not assigned as such by SCOP. Other classifications group many of the folds sharing an SH3-like topology into a single category [12]. Hfq represents a structural framework of Sm-like as well as SH3-like folds. The rationale for using the SH3-like domain nomenclature is, firstly, that it is entrenched in the literature and, secondly, that all small barrels discussed here (with the exception of OB), regardless of their SCOP nomenclature, have the same topology and a highly superimposable structural framework. Remarkably, OB, which has a different topology, is in some cases even more superimposable than some SH3-like folds due to similar positions of the α-helix in the OB-fold and Sm-like fold (discussed in ‘More structural variations’).

Only a few features are specific to the SH3-like small barrel structure. It consists of 5 beta-strands arranged in an antiparallel manner. A conserved Gly in the middle of the second β-strand (usually followed by a beta bulge) causes a strong bend in that strand (β2), dividing it into N-terminal (β2N) and C-terminal (β2C) sections. This defines two orthogonal beta sheets comprising the beta barrel sandwich, converting it into a de facto 6-stranded barrel, with each beta sheet consisting of 3 beta-strands. A short 3-10 helix links strands β4 and β5; the β4 and β5 strands straddle the barrel and belong to different beta sheets (as do β2N and β2C). This arrangement enables oligomerization of the barrels through interactions of β4-β5 of adjacent monomers, a critical feature in toroid formation (see below). Beta sheet A, also referred to as the Meander, is a contiguous 3-strand beta-sheet consisting of strands β2C, β3 and β4. Beta sheet B is non-contiguous and referred to as the N-C sheet, since it connects the C-terminal strand to the N-terminal strand of the protein in an antiparallel fashion. Beta sheet B consists of strands β5, β1 and β2N.
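The two-sheet description above can be written down as a small lookup structure. This is a minimal Python sketch, not part of the original analysis; the names `SHEET_A`, `SHEET_B`, `sheet_of` and `straddles` are illustrative assumptions, and the strand assignments are taken directly from the paragraph above.

```python
# Sheet membership as stated in the text (strand order follows the text).
SHEET_A = ("β2C", "β3", "β4")   # Meander: contiguous in sequence
SHEET_B = ("β5", "β1", "β2N")   # N-C sheet: non-contiguous in sequence

def sheet_of(strand: str) -> str:
    """Return 'A' or 'B' for one of the six elementary strands."""
    if strand in SHEET_A:
        return "A"
    if strand in SHEET_B:
        return "B"
    raise ValueError(f"unknown strand: {strand}")

def straddles(first: str, second: str) -> bool:
    """True when the two strands lie in different sheets."""
    return sheet_of(first) != sheet_of(second)

print(straddles("β4", "β5"))    # the strands flanking the 3-10 helix
print(straddles("β2N", "β2C"))  # the two halves of the bent strand β2
```

Both pairs lie in different sheets, which is the property the text relies on when it describes β4 and β5 (and β2N and β2C) as straddling the barrel.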

Topological descriptions
Over the years, several different topological descriptions have arisen for describing small barrels. These are presented in Fig. 1 relative to the Hfq reference structure (Fig. 1A).

Meander (Fig. 1B) has the barrel subdivided into two beta sheets: Sheet A (Meander), consisting of β2C, β3 and β4, and Sheet B (N-C), consisting of β1, β2N and β5.

Proto-domains (Fig. 1C) reflect the pseudo-symmetry within a protein domain, which was noticed early on, for example as the C2 symmetry in the six-stranded beta barrels of serine proteases [19]. However, to our knowledge it has never been described in smaller barrels. The small barrel is subdivided into two proto-domains related by a C2 symmetry operation. Some domains, such as those of serine or aspartyl proteases, are believed to have arisen from ancient duplications, where the sequence signal may be lost but structural similarity is apparent. In the case of small barrels, proto-domain 1 consists of β1, β2N and β2C; proto-domain 2 consists of β3, β4 and β5. Even if purely geometrical, the C2 symmetry of the barrel is an intrinsic feature of small barrels.

The KOW motif (Fig. 1D) [22] is found in some RNA-binding proteins (mostly small barrels in ribosomal proteins). It consists of β1, β2 and the loops preceding β1 and following β2, covering a total of 27 residues. Its hallmark is alternating hydrophilic and hydrophobic residues with an invariant Gly at position 11 [22].

Functional motifs (Fig. 1E) are described in Sm-like proteins (b.38). The Sm1 motif consists of β1-β3; the Sm2 motif consists of β4-β5 bracketing a short (4 residues) 3-10 helix [23]. The Sm2 motif, with its β4-β5 strands straddling the barrel, is a very significant feature, and possibly a signature of all small barrels with an SH3-like topology. In fact, superimposition of this pattern alone leads to a good structural alignment of the entire structural framework of the small barrels.

The hydrophobic core and structural framework
The hydrophobic core of the small barrel is minimalistic. It comprises the 6 elementary strands forming the conserved structural framework (β1, β2N, β2C, β3, β4, β5) (Figure 2). These strands are short, comprising between 4 and 6 alternating inside/outside residues, unless bulges are present. Only two strands are completely saturated in terms of backbone hydrogen bonds: β1 and β3. The structural framework of β-strands is the key identifier of small barrel proteins, best represented by the Hfq barrel, where all loops are reduced to tight beta turns. It is tolerant to diverse residue replacements, as long as a very small and tight hydrophobic core is preserved. Typically, one or two inward-facing residues are contributed by each beta strand to the hydrophobic core. The two central strands β1 and β3 contribute two residues each (in yellow and magenta in Figure 2), and the four lateral strands β2N, β2C, β4 and β5 usually contribute one residue each to the hydrophobic core. The β2N strand does not contribute consistently to the hydrophobic core, thus the minimum hydrophobic core consists of 7 residues (Figure 2). The hydrophobic residue in β2C follows the Gly (which bends the strand) and is positioned at the beginning of the characteristic beta bulge. The hydrophobic residues in β4 and β5 are adjacent to the 3-10 helix, immediately preceding (β4) and immediately following (β5) the helix. The minimal core is what defines the stable barrel fold, leaving all outwardly facing residues to interact with ligands via their side chains. The hydrophobic core of 7 residues can be extended in a variety of ways (Table S1). Since the barrel is semi-open, various decorations can add hydrophobic residues around the minimal core. For example, the N-terminal helix in the Sm-like barrel extends the β5-β2C side of the otherwise open barrel. Similarly, the RT loop in the SH3-like barrel extends the β2N-β3 side of the barrel.
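The count of 7 core residues follows from the per-strand contributions given above, and can be reproduced with a short tally. This is a minimal Python sketch; the dictionary name `CORE_CONTRIBUTION` is an illustrative assumption, and β2N is entered as 0 because the text says it does not contribute consistently.

```python
# Inward-facing residues contributed to the minimal hydrophobic core,
# per strand, as described in the text. Names are hypothetical.
CORE_CONTRIBUTION = {
    "β1": 2,   # central strand
    "β3": 2,   # central strand
    "β2C": 1,  # residue following the strand-bending Gly
    "β4": 1,   # residue immediately preceding the 3-10 helix
    "β5": 1,   # residue immediately following the 3-10 helix
    "β2N": 0,  # does not contribute consistently, so excluded from the minimum
}

minimal_core = sum(CORE_CONTRIBUTION.values())
print(minimal_core)
```

If β2N does contribute a residue in a particular barrel, the same tally gives 8, which is one of the ways the minimal core can be extended.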



The outward-facing residues on the ‘edge’ strands (β2C and β4 in Sheet A (Meander), β5 and β2N in Sheet B (N-C)) have the potential to form hydrogen bonds with other β-strands, unless they are sterically obstructed by terminal decorations or long loops. Such strand-strand interactions can extend the β-sheet of the barrel (see Distal loop and Figure 3I) and enable formation of quaternary structures (see Oligomerization of the barrels and Figure 7).

Beyond the core: loops, decorations, extra modules
While the structural framework of the small barrel is a common denominator for different folds and the structures are easily superimposable on that framework, the other elements of the structure - loops, modules inserted within the loops, N-term and C-term extensions, are variable (Fig. 3). These elements delineate specific structural families and define the function of the small beta barrel.

Loop variations
Loops that connect the beta strands vary significantly in length and confer functional role(s) (see below). Overall there are 5 loops, the first of which precedes the first beta strand β1. A 6th loop is possible, the C-terminal loop, which connects the last beta strand to the C-terminal extension or to the 6th strand of the barrel, when present. Prior independent studies (Table 1) of each loop extension have led to independent naming schemes being used in the literature. Throughout this work we follow the annotation from the b.34 SH3-like fold.

While there can be up to 6 loops in small beta barrels, the central four loops (RT, N-src, Distal and 3-10 helix, as defined for the SH3-like fold b.34.2) are always present. Of these 4 loops, three (RT, N-src and Distal) show significant changes in length that can be linked to specific functions. The fourth loop is almost always a short 3-10 helix (1 turn); on rare occasions it is a distorted version, as in RPP29 [24], or is replaced by a longer loop, as in TrmB proteins [25]. Elongation of the loops often results in formation of additional secondary structures, as described below.

N-src loop (Fig. 3D, 3G)
The elongated N-src loop is observed in two functional families. In the PAZ domain (b.34.14) of Piwi and Argonaute (RNA interference) the alpha/beta module, inserted into the N-src loop, is part of the aromatic pocket that secures the RNA molecule in place [4]. In the case of the Plus3 domain  (b.34.21) of Rtf1 (elongation of transcription), the elongated N-src loop contains two tiny (3-residue long) beta strands and is involved in binding single stranded DNA [26].

RT loop (Fig. 3H)
A long insert into the RT loop, which connects strands β1 and β2, results in the classical SH3 (b.34.2) domain involved in signal transduction. The SH3 domain binds proline-rich sequences using the elongated RT loop (as well as the N-src loop and 3-10 helix). The loop lies along the side of the barrel and caps one of its ends [27,28]. Various pairs of loops form various pockets. In the PAZ domain (b.34.14) of Piwi and Argonaute (RNA interference), aromatic residues of the elongated RT loop (Fig. 3D) are part of the aromatic pocket formed between it and the alpha/beta module (inserted into the N-src loop, see above); this pocket laterally secures the RNA substrate [4].

Distal loop (Fig. 3I)
Elongation of the distal loop is observed in eukaryotic Sm proteins, which are part of the splicing machinery. An elongation of the distal loop results in elongation of the adjacent strands β3 and β4. These two long beta strands are now bent similarly to β2 and can be seen as β3N and β3C, β4N and β4C [29]. Like β2, they participate in the formation of two sheets simultaneously. This results in a much larger hydrogen-bonded Sheet B, now containing 5 strands: β5, β1, β2N, β3C, β4N. The original Sheet A remains the same.

3-10 helix
The 3-10 helix connects strands β4 and β5 and is short (4 residues) and inflexible. It ultimately determines the relative positions of the β4 and β5 strands, which frequently straddle the barrel. The 3-10 helix is practically invariant in SH3-like folds but is absent in the OB-fold for topological reasons (see below). In the cases of sac7d, sso7d and other histone-like small archaeal proteins, a second 3-10 helix is found in the middle of β2, where typically a strand-bending Gly would be [5].

N- and C-term decorations, capping of the barrel, secondary structures in the loops
Alpha-helices and additional loops at the N- and C-termini are frequently observed in small beta barrels and sometimes termed ‘decorations’. Their position relative to the barrel core varies. In some cases they affect the ability of the barrel to oligomerize. The decorations, as any loop insertion, almost always have a functionally significant role, adapted to specific situations, as demonstrated in the following selected examples.

N-term α-helix (Fig. 3A)
in the Sm-like fold (b.38) is connected to the barrel via a short loop and has multiple interactions with both RNA and proteins. The α-helix stacks on top of the open barrel and lies on the proximal face when the oligomeric ring is formed [30]. In bacterial Hfq (b.38.1.2), the α-helix interacts with sRNA molecules through its three basic residues (Arg16, Arg17, and Arg19) and the polar Gln8 [23]. In lsm proteins the same α-helix interacts with proteins Pat1C in the lsm1-7 ring [31] and with prp24 in the lsm2-8 ring [32]. In the case of Sm proteins (b.38.1.1), the same α-helix interacts with the beta sheet of the adjacent protomers during ring assembly [29]. In the case of SmD2, the long N-term results in an additional helix (h0), which interacts with U1 RNA as it leads it into the lumen of the ring [33,34].

C-term α-helices (Fig. 3C, 3G)
can either augment existing binding, or interact with additional binding partners. In the case of the lsm1-7 ring (b.38.1.1), a long helix formed by the C-term tail of lsm1 lies across the central pore on the distal face of the ring, preventing the 3’-end of RNA from exiting through the distal surface [35].

N-term and C-term α-helices together (Fig. 3C, 3G)
can interact to form a supporting structure/subdomain around the barrel as in the case of the Plus3 (b.34.21) domain of Rtf1 [26], where 3 N-term α-helices and a C-term α-helix form a 4-helical cluster that packs against one side of the barrel. The role of these helices is not clear, but the conservation of many residues points to an unknown functional significance.

C-term tails
have the least spatial constraints among all the decorations. These can remain disordered and can vary in length significantly: over 40 residues in SmD1 and SmD3 and over 150 residues in SmB/B’ [29]. In the case of Sm proteins (b.38.1.1), the C-terminal tails of SmB/B’, SmD1 and SmD3 carry RG-rich repeats which are critical for the assembly of the barrels into the toroid ring [36–38]. Disordered C-term tails of Hfq (b.38.1.2) are proposed to extend out of the ring and be involved in interaction with various RNAs [39].

Small internal modules (Fig. 3D, 3G)
consist of short secondary or super-secondary structures (α/β or purely α) inserted within the loops. These typically form a pocket against the barrel and are an integral part of barrel function. Examples include an αββ module inserted into the N-src loop of the PAZ domain (b.34.14) [4] and a β hairpin extension module appearing in the N-src loop of the Plus3 domain (b.34.21) of Rtf1 [26]. Insertions can be entire modules. For example, some interdigitated Tudors can be seen as a Tudor domain inserted in the N-src loop. An eTudor is a Tudor inserted in the (SH3 equivalent) distal loop of an OB fold.

Extra β-strand: RNase P subunit Rpp29 (Fig. 3C)

The addition of a 6th strand to the barrel has been observed in RNase P subunit p29 (b.137.1), where, after an extra beta turn β5-β6, a 6th beta strand extends the antiparallel Sheet B (N-C) to 4 antiparallel strands: (β6, β5, β1, β2N) [24,40].

Missing β-strands: chromo domain HP1 (Fig 3B)

In at least one case, that of the HP1 chromo domain (b.34.13.2), the complete barrel is formed only upon binding the peptide. HP1 exists as a 3-stranded sheet A (meander); it is the β-strand of the ligand peptide that initiates formation of the second beta sheet (N-C) around it [18].

OB fold (b.40) (Fig. 4): similar architecture, different topology.
Similar to the SH3-like barrel, the OB fold is a 5-stranded barrel, but with a somewhat different topology. One can relate the SH3-like and OB topologies through a (non-circular) permutation observed previously [41]. Our reference Hfq protein (Sm-like fold) lends itself perfectly to comparison with the OB fold, as both topologies lead to the same structural framework (Fig. 4). To avoid confusion, we use OB strand mapping when discussing OB folds; the mapping is indicated in Table 3A.

Figure 4. Comparison of OB and SH3-like folds, mapping strands and loops. Coloring progresses from blue (N-terminus) to red (C-terminus). A. SH3-like fold (Hfq, b.38.1.2). B. OB fold. See S3 for alignment.

The matching between the folds is particularly striking as both folds, OB-fold and Sm-like, have an N-terminal α-helix which is missing in the SH3-like fold. When starting with the Sm-like fold of Hfq, the permutation inserts the N-terminal α-helix and β1 after the Meander [β2C-β3-β4] and before β5; thus the initial topology [α-helix-β1]-[β2N-β2C-β3-β4]-[β5] results in the final topology [β2N-β2C-β3-β4]-[α-helix-β1]-[β5]. The renumbering of strands in this rearranged fold (now OB) will read [β1N-β1C-β2-β3]-[α-helix-β4]-[β5]. The non-circular permutation preserves Sheet A (Meander) in both topologies: [β2C-β3-β4] in Sm-like and [β1C-β2-β3] in OB-fold. The structure alignment of [β2N-β2C-β3-β4-β5] in SH3 and [β1N-β1C-β2-β3]+β5 in OB results in an RMSD of 1.37 Å (based on the alignment of Hfq, pdbid:1KQ1, and verotoxin, pdbid:1C4Q).
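The permutation and renumbering described above can be traced step by step. The following is a minimal Python sketch, not the authors' procedure; the names `SM_LIKE`, `permute_to_ob` and `renumber` are illustrative assumptions, and the block layout follows the bracketed topology strings in the text.

```python
# Sm-like (Hfq) topology as ordered blocks of secondary-structure elements,
# following the bracketed notation in the text. Names are hypothetical.
SM_LIKE = [["α-helix", "β1"], ["β2N", "β2C", "β3", "β4"], ["β5"]]

def permute_to_ob(blocks):
    """Move the [α-helix, β1] block to sit after the second block, before β5."""
    head, middle, tail = blocks
    return [middle, head, tail]

def renumber(blocks):
    """Renumber strands in sequence order; halves of a bent strand share a number."""
    result, counter, last_base = [], 0, None
    for block in blocks:
        new_block = []
        for name in block:
            if not name.startswith("β"):
                new_block.append(name)       # helices keep their label
                continue
            base = name.rstrip("NC")         # "β2N" -> "β2"
            suffix = name[len(base):]        # "N", "C" or ""
            if base != last_base:
                counter += 1
                last_base = base
            new_block.append(f"β{counter}{suffix}")
        result.append(new_block)
    return result

ob_topology = renumber(permute_to_ob(SM_LIKE))
print(ob_topology)
```

Applying the same renumbering to the Meander strands β2C, β3 and β4 gives β1C, β2 and β3, which is how Sheet A is carried over into the OB strand numbering.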

Of the five loops in the OB-fold (Table 3B), L12 can be clearly structurally mapped onto the N-src loop and L23 to the Distal loop of the SH3-like fold. There is no good structural correspondence between the other loops. The RT and  3-10 are unique to SH3-like topologies, while L3α, Lα4 and L45 are unique to the OB fold.
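The loop correspondence stated above amounts to a small lookup table. This is a minimal Python sketch; the names `OB_TO_SH3_LOOP`, `SH3_ONLY_LOOPS` and `sh3_equivalent` are illustrative assumptions, and `None` marks OB loops for which the text reports no good structural counterpart.

```python
# OB-fold loop -> structurally equivalent SH3-like loop, as stated in the text.
# None means no good structural correspondence. Names are hypothetical.
OB_TO_SH3_LOOP = {
    "L12": "N-src",
    "L23": "Distal",
    "L3α": None,
    "Lα4": None,
    "L45": None,
}

# Loops described as unique to SH3-like topologies.
SH3_ONLY_LOOPS = ("RT", "3-10")

def sh3_equivalent(ob_loop: str):
    """Return the SH3-like loop name for an OB loop, or None if unmapped."""
    return OB_TO_SH3_LOOP[ob_loop]

mapped = [loop for loop, target in OB_TO_SH3_LOOP.items() if target is not None]
print(mapped)
```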

The hydrophobic core of the OB fold contains the 7 residues defined for the SH3-like fold, but is typically larger, by virtue of strand elongation (especially around L12/N-src) and the possible formation of a hairpin within L45, which would then extend Beta Sheet A by two strands.

The most notable loop variations in the OB-fold are similar to those of the SH3-like fold.

Insertion into the N-src loop
In the OB2 of BRCA2 there is an insertion of a Tower domain into the L12 loop (corresponding to the N-src loop). The Tower domain is implicated in DNA-binding. It is a 154-residue long insert consisting of two long α-helices and a 3-helix bundle positioned between them [6,42]. In DBD-C of RPA70, a zinc-finger motif consisting of three short β-strands is inserted into the L12 loop [43].

Extension of the Distal loop
The DNA-binding domain of cdc13 contains a unique pretzel-shaped loop L23 (corresponding to the Distal loop), which significantly extends interactions of this barrel with DNA. A 30-residue long loop twists and packs across the side of the barrel and interacts with the L45 loop [44].

Change of internal α-helix and omega loop
In DBD-C of RPA70 the α-helix positioned between β3 and β4 is replaced by a helix-turn-helix, while in DBD-D of RPA32, the same α-helix is missing altogether and is replaced by a flexible loop [43].

Sequence variation and electrostatic charge
In addition to variations in structure, variations in sequence further distinguish different barrels. Because small barrels are extremely insensitive to mutations (see discussion under “Folding of the small β-barrels”), a common evolutionary strategy is to modulate electrostatic interactions by changing the properties of the residues in loops, sheets and decorations. In some cases this results in a switch between positively and negatively charged, or between hydrophobic and polar/charged, patches or entire sheets, leading to the binding of different partners and to different functions.

An interesting case is that of the HIN domains of AIM2 and p202, which are involved in the innate immune response (Figure 5) [45]. Each HIN domain consists of 2 tandem OB-fold barrels, which have been shown to bind ss-, ds- and quadruplex DNA with various affinities. A well-studied case of HIN domains binding dsDNA is that of the innate immune response [7,46]. Even though there is 36% sequence identity between the HIN domains in AIM2 and p202, the binding modes are completely different (Figure 5A,B), due to the variation in the distribution of electrostatic charges. In the case of AIM2, the binding occurs through positive charges on the convex surface of the barrel (Figure 5C,D). In the case of p202, the same surface is negatively charged and thus cannot interact with DNA. Instead, the interaction occurs through the positively charged loops (of the second OB barrel) on the opposite side of the barrel (Figure 5E,F). The same loops carry hydrophobic residues in the case of AIM2 and thus cannot bind DNA [7,46]. The differences in binding surfaces result in different strengths of binding, which allows the two proteins to act antagonistically. Another example of electrostatic variation is provided by the five tandem small barrels containing the KOW motif that are present in transcription elongation factor Spt5 [47]. The distribution of electrostatic charge is very different in these barrels, which reflects their functions. The KOW1-Linker has a very biased surface charge distribution. Its PCP (Positively Charged Patch), containing 6 basic residues, can be mapped onto the KOW1 motif and is responsible for its interaction with DNA. On the other hand, the surfaces of KOW2-KOW3, which act jointly as a rigid body, have an even distribution of charge throughout and do not interact with DNA [47].

Perhaps the most classical case illustrating the electrostatic variation in small barrels is the distribution of charge within the RT loop of polyPro-binding SH3 domains. The acidic residues in the RT loop and the basic residue within the polyproline signal are the key electrostatic interactions in addition to hydrophobic interactions involving prolines themselves. The position of the basic residue (Arg) determines the orientation of polypro peptide binding [48]. The strength of binding can be modulated by the number of acidic residues and their positioning within the RT loop [49]. In at least one case, Nck interaction with CD3epsilon, binding can be switched on or off by simply phosphorylating a key residue (Tyr) which will cause electrostatic repulsion between it and acidic residues in the RT loop [50].

Joining barrels together
Small barrels tend to work together at different structural scales. Interactions between tandem barrels within one protein are common. Individual barrels can interact to form toroidal rings that function in many aspects of RNA biogenesis in all superkingdoms of life. Finally, small barrels can also form fibrils, which consist either of toroidal rings or individual small barrels. Each is described.

Tandem and embedded barrels
Several combinations of beta barrels positioned in tandem, or intertwined, are introduced.

SH3-SH3: tandem Tudors (Fig. 6A)
Two SH3-like barrels positioned in tandem typically form a barrel-to-barrel interface which can be constructed in various ways. In the case of 53BP1, H-bonding is formed between β2N of the first barrel and β5 of the second, thus joining individual 3-stranded β-sheets into an extended 6-stranded β-sheet. The C-terminal α-helix further strengthens the connection by interacting with multiple β-strands of both barrels [51].

In the case of Spt5, which has 5 tandem KOW-containing Tudor domains, interactions between Tudor-2 and Tudor-3, which move as a single body, occur through β5 of Tudor-2 and residues immediately following β5 in Tudor-3 [47]. In the case of KIN17, the interface is formed by N-terminal and C-terminal tails interacting with the linker connecting the two barrels [52]. Various linkers and sequences can lead to various tandem interfaces and different extended sheets, showing a remarkable plasticity. For example, a Tandem Tudor (Agenet) forms a tandem through a β2N-β2N interface (Table 4).

SH3 barrel embedded in OB (eTud) (Fig. 6D)
SND1 contains 5 tandem OB-fold domains with the SH3-fold inserted into the L23 (Distal loop equivalent) of the OB barrel [57]. This combination of domains is typically referred to as an extended Tudor (eTudor or eTud). In Drosophila there are 11 tandem extended Tudors, also referred to as maternal Tudors [58]. The extended Tudor (eTudor, or maternal Tudor) domain consists of the two β-strands from OB, the linker (containing an α-helix) and the 5 β-strands of the SH3-like fold (Tudor) domain. Both parts of the split OB domain are essential for binding sDMA (symmetrically dimethylated arginines) in the protein tails. The OB-fold (SN domain) and SH3-fold (Tudor domain) interact as a unit [57,59].

Oligomerization of the barrels
Single domain small barrel proteins frequently come together to form a quaternary structure. Beta strands of small barrels are typically calibrated (of equal length) and can dock sideways into each other to form backbone hydrogen bonds between lateral strands of adjacent barrels, thus lending themselves to dimerization or further oligomerization. Table 4 characterizes the combinations of beta strands H-bonding to each other that have been observed in forming dimers or oligomers (always in an antiparallel configuration).

The best known cases of oligomerization are the toroidal rings formed by Sm-like proteins. Sm/lsm proteins are the fundamental components of the spliceosome. The bacterial counterpart, Hfq, is broadly involved in RNA biogenesis. The uniquely positioned β4-(3-10)-β5 strands, which straddle the body of the barrel, lead to interactions between the β4 strand of one monomer and the β5 strand of the adjacent monomer (Fig. 7A), ultimately connecting between 5 and 8 monomers into a doughnut-shaped ring (Fig. 7B). This process connects the 3-stranded Sheet A of one monomer with the 3-stranded Sheet B of the adjacent monomer, forming 6-stranded sheets, or blades, which connect the two surfaces of the toroidal ring and can be viewed in a similar way to a β-propeller architecture. The two faces of the toroidal ring are formed by the two beta sheets of individual barrels: Sheet A (Meander) forms the Distal face, while Sheet B (N-C) forms the Proximal face. The lateral region of the ring [30], also referred to descriptively as the outer rim [35], consists of residues connecting the two faces of the ring (Distal and Proximal) and facing outwards. In some cases (Hfq) it has a specific function as well [30,35].
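The β4-β5 pairing between neighbors closes into a ring because the last monomer pairs back with the first. The following is a minimal Python sketch of that bookkeeping only, under stated assumptions: the function name `ring_pairs` and the zero-based monomer indexing are illustrative, and no geometry is modeled.

```python
def ring_pairs(n_monomers: int):
    """List inter-monomer strand pairings for a closed ring.

    Each entry pairs β4 of monomer i with β5 of the next monomer,
    wrapping around so that the last monomer pairs with the first.
    Monomer indices are zero-based (an assumption of this sketch).
    """
    if n_monomers < 2:
        raise ValueError("a ring needs at least two monomers")
    return [
        ((i, "β4"), ((i + 1) % n_monomers, "β5"))
        for i in range(n_monomers)
    ]

# Ring sizes of 5 to 8 monomers are the range quoted in the text.
for size in range(5, 9):
    pairs = ring_pairs(size)
    print(size, len(pairs), pairs[-1])
```

In a closed ring every monomer uses its β4 strand once and its β5 strand once, so an n-membered ring has exactly n such interfaces.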

Further oligomerization of the toroidal rings into long tubes is observed in bacterial Hfq and archaeal Sm-like (SmAP) proteins. This formation arises through stacking of the faces of the rings (Proximal-to-Proximal (check), in the case of SmAP) or through stacking of slabs, each slab consisting of 6 hexameric rings (for Hfq) [15].

Another case of ring formation comes from the OB-fold: a 5-mer ring is formed via hydrogen bonding β1 of one monomer with β5 of the other [62].

Dimers are frequently formed between two tandemly repeated domains, as in the case of Tudor domains via β2-β5 interactions (53BP1) and Agenet domains via β2N-β2N’ interactions (FMRP [60]).

Not all small barrels are able to oligomerize through strand-strand hydrogen bonding of the backbone. Elongation of loops, or addition of N- or C-terminal decorations, often prevents the strand-to-strand interaction necessary for oligomer formation. For example, the RT loop in polyPro-binding SH3 domains (b.34.2) physically covers strand β4 and thus precludes β4-β5’ hydrogen bonding and toroidal ring formation. Indeed, oligomeric structures are completely absent from the SH3-like fold b.34. Similarly, the elongation of the N-src loop in the PAZ domain, and the N-terminal and C-terminal extensions in the Plus3 domain of Rtf1, preclude β4-β5’ formation.

Oligomerization is also possible through side-chain interactions among loop residues, as in tetramer formation by HIN domains (b.40.16) [7], or through side-chain interactions between strands, as in dimer formation by viral integrase (b.34.7) [63,64].

Oligomerization of barrels also occurs in multi-domain proteins, where it contributes to the formation of large structures such as the cell-puncturing device of bacteriophage T4 (trimer; the barrel is an N-terminal domain) [65] and the MscS mechanosensitive channel of E. coli (heptamer; the barrel is a central domain) [66].

Fibril formation
Beta structures are prone to polymerization and fibril formation, giving rise to amyloids that can be either functional or disease-causing. Fibrils can form in several ways, starting either from individual small barrels or from toroidal rings.

A common pathway to fibril formation for polyPro-binding SH3 domains begins with domain swapping between two protomers, in which any loop (RT, N-src or Distal) can function as a hinge to partially open the beta barrel and exchange beta strands with the other protomer. Such open, interacting loop regions become rigid and may contain short β-strands. These β-strands then serve as a nucleation center for amyloid formation [67]. Alternatively, if β5 is disordered, the hydrophobic strand β1 may fail to pair with it and instead form non-native contacts with β1 of other protomers, producing aggregation-prone intermediates that lead to fibril formation [68].

Both of these pathways are strongly tied to the folding process. Mutations that destabilize folding are found predominantly in the open loops/hinges and unpaired strands. Such destabilization ultimately leads to non-native folds with swapped domains, polymerization through beta strands and formation of fibrils.

Fibril formation from toroidal rings proceeds through an entirely different mechanism. In E. coli, Hfq rings self-assemble into slab-like layers, each layer built of 6 hexameric rings. The fibrils are then built out of such layers [15]. The C-terminal fragment of Hfq (constituting 30% of the protein) is intrinsically disordered and was shown to be critical for the assembly of fibrils into higher-order cellular structures [69].

In archaea, fibrils are formed by stacking heptameric rings of SmAP1 in a head-to-head manner, thus forming 14-mers, which ultimately self-assemble into striated bundles of polar tubes [15,70].

Folding of the small β-barrels
In-depth folding studies have been carried out on SH3 domains that bind polyPro (b.34.2) and on OB domains in Cold Shock Proteins (b.40.4.5). The folding of these beta barrels is simple and robust: the same fold is reached by domains with great sequence diversity, and also when the sequences are permuted. For SH3 domains, folding proceeds through two-state kinetics: Unfolded (U) --> Folded (F). The high-energy transition state comprises multiple conformations of a partially collapsed structure and is referred to as the Transition State Ensemble (TSE). It has been consistently found that the partially folded states in the TSE are highly polarized: they contain the hydrophobic nucleus, which includes most of Beta Sheet A (the Meander, β2-β3-β4), while Beta Sheet B (N-C), which includes β1-β5, is disordered in the TSE [68,71–73].
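For reference, the two-state scheme can be written in the standard textbook form below. The symbols k_f and k_u (folding and unfolding rate constants), K_eq and k_obs are introduced here for illustration and are not taken from the cited studies.

```latex
\begin{align}
\mathrm{U} \;&\underset{k_u}{\overset{k_f}{\rightleftharpoons}}\; \mathrm{F} \\
K_{\mathrm{eq}} &= \frac{[\mathrm{F}]}{[\mathrm{U}]} = \frac{k_f}{k_u} \\
\Delta G_{\mathrm{U}\to\mathrm{F}} &= -RT \ln K_{\mathrm{eq}} \\
k_{\mathrm{obs}} &= k_f + k_u
\end{align}
```

As a purely illustrative calculation with made-up rate constants, k_f = 100 s^-1 and k_u = 0.1 s^-1 give K_eq = 1000 and, at 298 K (RT ≈ 0.59 kcal/mol), a folding free energy of about -4.1 kcal/mol. In a two-state system no intermediate accumulates, so a single relaxation rate k_obs describes the approach to equilibrium.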

In the case of the OB-fold (CspA and CspB), an intermediate state has recently been proposed. It too consists of a 3-stranded beta sheet, β1-β2-β3, which structurally corresponds to the Meander in SH3 [74].

The robustness of the folding process is in large part due to cooperativity, which underscores the significance of local interactions during folding: residues that initiate the folding process are local in the sequence (referred to as structure topology) [72,75,76]. The hydrophobic zipper (HZ) model [77], in which folding begins with local interactions and eventually brings more distant residues together to form the hydrophobic core through beta-hairpin formation, supports these observations on small beta barrels. The HZ structure is formed by a group of neighboring residues (cooperativity), eliminating the reliance on specific residues for tertiary folding. Indeed, formation of the 3-stranded β-meander in WW domains, a well-studied system, always initiates within one or both of the loops/turns [78–80]. The folding of CspA/CspB is also initiated within the loops, as would be expected if interactions among local residues drive the folding [74,81].

The significance of local interactions is also supported by circular permutation experiments on the alpha-spectrin SH3 domain [73,82]. In these experiments the C- and N-termini are linked together and the sequence is cut open in one of the three loops, rearranging the linear order of the secondary structure elements. The same fold is reached in all permuted structures; however, the order of folding is different, as the beta-hairpin formed by the linked ends (β1-β5) appears early in the folding process.
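At the sequence level, the circular permutation described above amounts to joining the original termini and opening the chain at a new position. The following minimal sketch shows this operation on a hypothetical toy sequence in which uppercase blocks stand for strands β1 to β5 and lowercase blocks for the connecting regions; the function name, the optional linker argument and the toy sequence are assumptions for illustration, not the actual alpha-spectrin SH3 constructs.

```python
def circular_permutant(sequence, cut_after, linker=""):
    """Return a circularly permuted sequence.

    The original C-terminus is joined to the original N-terminus (optionally
    through `linker`), and new termini are created by cutting after the
    1-based position `cut_after`.
    """
    if not 1 <= cut_after < len(sequence):
        raise ValueError("cut_after must fall inside the sequence")
    return sequence[cut_after:] + linker + sequence[:cut_after]


# Toy layout (hypothetical): β1=AAAA, β2=BBBB, β3=CCCC, β4=DDDD, β5=EEEE,
# with lowercase letters marking the regions that connect the strands.
toy = "AAAArrBBBBnnCCCCddDDDDggEEEE"

# Cut inside the "dd" region between β3 and β4 (after position 17).
permuted = circular_permutant(toy, cut_after=17)
print(permuted)  # dDDDDggEEEEAAAArrBBBBnnCCCCd
```

In the permuted toy sequence the blocks for β5 and β1 are adjacent, so the β1-β5 pairing becomes a local hairpin, consistent with the observation that it forms early in the permuted proteins.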

Perhaps the most striking evidence of the resilience of the small beta barrel structure, as it relates to folding, is the unprecedented case of RfaH. In RfaH the C-terminal domain spontaneously switches from an α-helical hairpin (when bound to the N-terminal domain) to the small beta barrel structure (when released from its interaction with the N-terminal domain). This change in structure has far-reaching functional consequences: RfaH has a role in both transcriptional elongation and initiation of translation [83].

=References=