WikiJournal of Science/Arabinogalactan-proteins

AGP protein backbones and classification


The protein component of AGPs is rich in the amino acids proline (P), alanine (A), serine (S) and threonine (T), collectively known as ‘PAST’, and this amino acid bias is one of the features used to identify them. AGPs are intrinsically disordered proteins, as they contain a high proportion of disordering amino acids such as proline that disrupt the formation of stable folded structures. As is characteristic of intrinsically disordered proteins, AGPs also contain repeat motifs and post-translational modifications. Proline residues in the protein backbone can be hydroxylated to hydroxyproline (O), depending on the surrounding amino acids. The ‘Hyp contiguity hypothesis’ predicts that non-contiguous O residues, as in the sequence ‘SOTO’ found in AGPs, act as a signal for O-linked glycosylation with large, branched type II arabinogalactan (AG) polysaccharides. Sequences that direct AG glycosylation (SO, TO, AO, VO) are called AGP glycomotifs (Figure 1).
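The PAST bias and the glycomotifs described above lend themselves to a simple computational screen. The following Python sketch is illustrative only: the function names `past_fraction` and `find_glycomotifs` are hypothetical, and counting O together with P in the PAST fraction is an assumption made here rather than a published convention.

```python
import re

# Residues counted towards the PAST bias. Hydroxyproline (O) is counted
# together with proline (P); this is an assumption of this sketch.
PAST_RESIDUES = "PASTO"

# Zero-width lookahead so that every SO, TO, AO or VO dipeptide is reported.
GLYCOMOTIF = re.compile(r"(?=([STAV]O))")


def past_fraction(seq: str) -> float:
    """Return the fraction of residues in seq that are P, A, S, T or O."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(seq.count(aa) for aa in PAST_RESIDUES) / len(seq)


def find_glycomotifs(seq: str) -> list:
    """Return (position, motif) pairs for each AGP glycomotif in seq."""
    return [(m.start(), m.group(1)) for m in GLYCOMOTIF.finditer(seq.upper())]
```

Applied to the example sequence ‘SOTO’, `find_glycomotifs` reports an SO motif at position 0 and a TO motif at position 2.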

All AGP protein backbones contain a minimum of three clustered AGP glycomotifs and an N-terminal signal peptide that directs the protein into the endoplasmic reticulum (ER), where post-translational modifications begin. Prolyl hydroxylation of P to O is carried out by prolyl 4-hydroxylases (P4Hs) belonging to the 2-oxoglutarate-dependent dioxygenase family. P4Hs have been identified in both the ER and the Golgi apparatus (GA). Addition of a glycosylphosphatidylinositol (GPI)-anchor occurs in most, but not all, AGPs.
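The requirement for at least three clustered glycomotifs can be written as a simple positional check. The sketch below is illustrative only. The text does not define how close motifs must be to count as clustered, so the `window` of 10 residues is an assumed placeholder, and the function name `has_glycomotif_cluster` is hypothetical. Because sequences predicted from genes carry P rather than O, both [STAV]P and [STAV]O are accepted as candidate motifs. Signal peptide prediction is not attempted here.

```python
import re

# Candidate glycomotifs: S, T, A or V followed by P (unmodified, as in a
# predicted sequence) or O (hydroxylated). Lookahead finds every start.
CANDIDATE_MOTIF = re.compile(r"(?=[STAV][PO])")


def has_glycomotif_cluster(seq: str, min_motifs: int = 3, window: int = 10) -> bool:
    """Return True if min_motifs candidate motifs start within `window` residues.

    The default window of 10 is an assumption for illustration, not a
    value taken from the text.
    """
    starts = [m.start() for m in CANDIDATE_MOTIF.finditer(seq.upper())]
    # Compare each motif start with the start min_motifs - 1 motifs later.
    for i in range(len(starts) - min_motifs + 1):
        if starts[i + min_motifs - 1] - starts[i] < window:
            return True
    return False
```

Both the motif count and the window are parameters so that different clustering thresholds can be compared on the same sequence set.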

AGP family of glycoproteins


AGPs belong to large multigene families and are divided into several sub-groups depending on the predicted protein sequence. "Classical" AGPs include: GPI-AGPs, which consist of a signal peptide at the N-terminus, a PAST-rich sequence of 100-150 aa and a hydrophobic region at the C-terminus that directs addition of a GPI-anchor; non-GPI-AGPs, which lack the C-terminal GPI signal sequence; lysine (K)-rich AGPs, which contain a K-rich region within the PAST-rich backbone; and AG-peptides, which have a short PAST-rich backbone of 10-15 aa (Figure 2). Chimeric AGPs are proteins that have an AGP region and an additional region with a recognised protein family (Pfam) domain. They include fasciclin-like AGPs (FLAs), phytocyanin-like AGPs (PAGs/PLAs, also known as early-nodulin-like proteins, ENODLs) and xylogen-like AGPs (XYLPs), which contain lipid-transfer-like domains. Several other putative chimeric AGP classes have been identified in which AG glycomotifs are associated with protein kinase, leucine-rich repeat, X8, FH2 and other protein family domains. Other non-classical AGPs also exist, such as those containing a cysteine (C)-rich domain (also called a PAC domain) and/or a histidine (H)-rich domain, as well as many hybrid hydroxyproline-rich glycoproteins (HRGPs) that have motifs characteristic of both AGPs and other HRGP members, usually extensin and Tyr motifs. AGPs are evolutionarily ancient and have been identified in green algae as well as in Chromista and Glaucophyta. AGPs are found throughout the entire plant lineage, and land plants are suggested to have inherited and diversified the AGP protein backbone genes already present in algae, generating an enormous number of AGP glycoforms.
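The sub-groups described above can be summarised as a set of decision rules. The following Python sketch is a simplified illustration, not a published classifier. It assumes that the relevant features (length of the PAST-rich region, presence of a C-terminal GPI signal, a K-rich region, and any Pfam domains) have already been predicted by other tools. The names `AgpFeatures` and `classify_agp`, the order in which the rules are tested, and the use of 15 aa as a cut-off for AG-peptides are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class AgpFeatures:
    """Pre-computed features of a candidate AGP (hypothetical container)."""
    past_region_length: int             # length of the PAST-rich region, in aa
    has_gpi_signal: bool = False        # C-terminal GPI signal sequence predicted
    has_lys_rich_region: bool = False   # K-rich region within the backbone
    pfam_domains: list = field(default_factory=list)  # e.g. ["Fasciclin"]


def classify_agp(f: AgpFeatures) -> str:
    """Assign a sub-group label using simplified, illustrative rules."""
    if f.pfam_domains:
        return "chimeric AGP"       # AGP region plus a recognised Pfam domain
    if f.past_region_length <= 15:
        return "AG-peptide"         # short backbone (text gives 10-15 aa)
    if f.has_lys_rich_region:
        return "Lys-rich AGP"
    if f.has_gpi_signal:
        return "GPI-AGP"
    return "non-GPI AGP"
```

For example, `classify_agp(AgpFeatures(120, has_gpi_signal=True))` returns `"GPI-AGP"`. Hybrid HRGPs and other non-classical AGPs are not covered by these rules.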

AGP biosynthesis
After translation, the AGP protein backbones are highly decorated with complex carbohydrates, primarily type II AG polysaccharides. Biosynthesis of the mature AGP involves cleavage of the signal peptide at the N-terminus, hydroxylation of P residues, subsequent glycosylation and, in many cases, addition of a GPI-anchor.

The AG glycans consist of a backbone of β-1,3-linked galactose (Gal) with side chains of β-1,6-linked Gal, and have terminal residues of arabinose (Ara), rhamnose (Rha), Gal, fucose (Fuc) and glucuronic acid (GlcA). Glycosylation of the AGP backbone is suggested to initiate in the ER with the addition of the first Gal by Hyp-O-galactosyltransferases (Hyp-O-GalTs), which are predominantly located in ER fractions. Chain extension then occurs primarily in the GA. The AG glycan moiety is assembled by glycosyltransferases (GTs), and beyond the initial Gal the complex glycan structures are elaborated by a suite of GTs, the majority of which are biochemically uncharacterized. The GT31 family is one of the families involved in AGP glycan backbone biosynthesis: numerous members of the GT31 family have been identified with Hyp-O-GalT activity, and the core β-(1,3)-galactan backbone is also likely to be synthesized by the GT31 family. Members of the GT14 family are implicated in adding β-(1,6)- and β-(1,3)-galactans to AGPs. In Arabidopsis, terminal sugars such as fucose are proposed to be added by the fucosyltransferases AtFUT4 and AtFUT6 in the GT37 family, and terminal GlcA incorporation can be catalysed by the GT14 family. A number of GTs remain to be identified, for example those responsible for adding terminal Rha.

Bioinformatic analysis predicts the addition of a GPI-anchor on many AGPs. The early synthesis of the GPI moiety occurs on the cytoplasmic surface of the ER and subsequent assembly takes place in the ER lumen. These steps include the assembly of tri-mannose (Man), galactose, non-N-acetylated glucosamine (GlcN) and ethanolamine phosphate to form the mature GPI moiety. AGPs undergo GPI-anchor addition while co-translationally migrating into the ER, where the two processes converge. A transamidase complex recognizes the ω cleavage site, cleaves the core protein at the C-terminus and simultaneously transfers the fully assembled GPI-anchor onto the amino acid residue at the C-terminus of the protein. These events occur prior to prolyl hydroxylation and glycosylation. The core glycan structure of GPI anchors is Man-α-1,2-Man-α-1,6-Man-α-1,4-GlcN-inositol, which is conserved in many eukaryotes. The only plant GPI anchor structure characterized to date is that of a GPI-anchored AGP from Pyrus communis suspension-cultured cells. It showed a partially modified glycan moiety compared with previously characterized GPI anchors, as it contained β-1,4-Gal. The GPI anchor synthesis and protein assembly pathway is proposed to be conserved between mammals and plants. The GPI-anchor attaches the protein to the ER membrane; the protein then transits through the GA and is secreted to the outer leaflet of the plasma membrane, facing the cell wall. As proposed by Oxley and Bacic, GPI-anchored AGPs are likely released through cleavage by phospholipases (PLs) C or D and secreted into the extracellular compartment.

AGPs functional roles
Human uses of AGPs include gum arabic, which is used in the food and pharmaceutical industries for its natural thickening and emulsifying properties. AGPs in cereal grains have potential applications in biofortification, as sources of dietary fibre to support gut bacteria, and as protective agents against ethanol toxicity.

AGPs are found in a wide range of plant tissues, in secretions in the cell culture medium of root, leaf, endosperm and embryo tissues, and in some exudate-producing cell types such as stylar canal cells. AGPs have been shown to regulate many aspects of plant growth and development, including male-female recognition in reproductive organs, cell division and differentiation in embryo and post-embryo development, seed mucilage cell wall development, root salt tolerance and root-microbe interactions (see Table 1). These studies suggest that AGPs are multifunctional, similar to mammalian proteoglycans and glycoproteins. Conventional methods to study the functions of AGPs include the use of β-glycosyl (usually glucosyl) Yariv reagents and monoclonal antibodies (mAbs). β-Glycosyl Yariv reagents are synthetic phenylazo glycoside probes that bind AGPs specifically, but not covalently, and can be used to precipitate AGPs from solution. They are also commonly used as histochemical stains to probe the location and distribution of AGPs. A number of studies have shown that adding β-Yariv reagents to plant growth medium can inhibit seedling growth and cell elongation, and block somatic embryogenesis and the accumulation of fresh cell wall mass. mAbs that specifically bind carbohydrate epitopes of AGPs have also been used to infer functions from the location and pattern of the AGP epitopes. Commonly used mAbs against AGPs include CCRC-M7, LM2, JIM8, JIM13 and JIM14.

The function of individual AGPs has largely been inferred through studies of mutants. For example, the Arabidopsis root-specific AtAGP30 was shown to be required for in vitro root regeneration, suggesting a function in regenerating the root by modulating phytohormone activity. Studies of agp6 and agp11 mutants in Arabidopsis have demonstrated the importance of these AGPs in preventing uncontrolled generation of the pollen grain and in normal growth of the pollen tube. The functional mechanisms of AGPs in cell signalling are not well understood. One proposed model suggests that AGPs can interact with calcium and control its release from the AG glycan (via GlcA residues) to trigger downstream calcium-mediated signalling pathways. Another possible mechanism, largely based on the study of FLAs, suggests that the combination of a fasciclin domain and AG glycans can mediate cell-cell adhesion. Functions attributed to AGPs are outlined in Table 1.

Acknowledgements
The authors would like to acknowledge the support of the La Trobe Institute for Agriculture and Food and a La Trobe Research Focus Area grant 2000004372.

Competing interests
The authors have no competing interests to declare.

Ethics statement
No animal or human research was performed.