Research Article

The pigment-protein network of a diatom photosystem II–light-harvesting antenna supercomplex

Science  02 Aug 2019:
Vol. 365, Issue 6452, eaax4406
DOI: 10.1126/science.aax4406

A light-harvesting array in diatoms

Photosynthetic organisms use huge arrays of pigments to draw light energy into the core of photosystem II. The arrangement of these pigments influences how much energy reaches the reaction center. Pi et al. determined the structure of photosystem II from a diatom in complex with an antenna of fucoxanthin–chlorophyll a/c binding proteins (FCPs) (see the Perspective by Büchel). The specialized pigments in this complex allow microalgae to harvest light within a wide range of the visible spectrum. The FCPs are arranged in a pattern analogous to light-harvesting complexes in plants.

Science, this issue p. eaax4406; see also p. 447

Structured Abstract


Photosystem II (PSII) is a pigment-protein complex and catalyzes light-induced water splitting in photosynthesis, converting light energy from the Sun into chemical energy and providing molecular oxygen to the atmosphere. To make full use of light energy, photosynthetic organisms have developed light-harvesting complexes (LHCs) to gather light energy and transfer it to photosynthetic reaction centers. Many LHCII subunits are associated with a core PSII, forming PSII-LHCII supercomplexes. LHC proteins vary across lineages of photosynthetic organisms and enable groups of organisms to cope with different light environments. In addition to light-harvesting, LHCs also have a role in dissipation of excess energy under strong light illumination so as to avoid damage to photosystems caused by intense light.


LHCIIs of green-lineage photosynthetic organisms bind chlorophyll (Chl) a/b as their main pigments, whereas some organisms of the red lineage bind Chl a/c as their main pigments. Diatoms are one of the main groups in the red lineage and contribute ~20% of all primary productivity on Earth. The light-harvesting antennas of diatoms are known as Chl a/c and fucoxanthin (Fx) binding proteins, or FCPs, and enable diatoms to efficiently use blue-green light available under water. The distinct pigment composition and organization of the PSII-FCPII supercomplex confers on diatoms the capacity to efficiently dissipate excess energy when necessary. A structure of PSII-FCPII of diatoms greatly expands our understanding of the energy harvesting and dissipation mechanisms in a dominant photosynthetic organism.


We used single-particle cryo–electron microscopy analysis to elucidate a structure of the PSII-FCPII supercomplex from the diatom Chaetoceros gracilis at a resolution of 3.0 Å. The supercomplex contains two protomers with 24 subunits in the PSII core and 11 subunits in FCPII, giving rise to a total of 70 subunits with an overall molecular weight of 1.4 MDa. The PSII core is largely similar to that of cyanobacteria and higher plants, but we found five extrinsic proteins that play a role in the oxygen-evolving reaction. Two additional transmembrane subunits located at the periphery of the PSII core help to connect the PSII core with the FCPII subunits. The major FCPII is organized into two tetramers: one is tightly associated whereas the other is moderately associated with the PSII core. In addition, three FCP monomers are associated with each PSII core; among them, two connect the moderately associated FCPII tetramer with the PSII core whereas one is associated at the periphery of the moderately associated FCPII tetramer. These arrangements differ from those found in the PSII-LHCII supercomplexes of the green-lineage organisms, and the locations of the tightly and moderately associated FCP tetramers are opposite to those of the strongly and moderately associated trimers found in PSII-LHCII. On the other hand, locations of the two monomeric FCPs (FCP-D and FCP-E) resemble those of CP24 and CP29 in higher-plant PSII-LHCIIs in that both of them connect the moderately associated FCP tetramer or LHCII trimer with the PSII core.


Our PSII-FCPII structure reveals the arrangement of a huge number of pigments (Chls a/c and Fxs) that contribute to energy transfer and dissipation in this supercomplex. Theoretical and time-resolved spectroscopic studies can be designed on the basis of this structure and, in combination with reexamination of existing results, will reveal more details of these reactions. The diatom PSII core also contains transmembrane and extrinsic subunits that may provide clues to changes occurring in the PSII core during evolution.

Model of a diatom PSII-FCPII supercomplex embedded in the thylakoid membrane.

The PSII-FCPII supercomplex contains 35 protein subunits and a number of pigments and cofactors and catalyzes light-induced electron transfer and water-splitting reactions. The latter occur in the lumen of the thylakoid membrane, leading to the generation of protons and molecular oxygen. Light is absorbed by the light-harvesting antenna pigments, among which Chl c and Fx enable absorption of blue-green light available under water. In addition, diadinoxanthin associated with FCPII plays important roles in photoprotection.


Diatoms play important roles in global primary productivity and biogeochemical cycling of carbon, in part owing to the ability of their photosynthetic apparatus to adapt to rapidly changing light intensity. We report a cryo–electron microscopy structure of the photosystem II (PSII)–fucoxanthin (Fx) chlorophyll (Chl) a/c binding protein (FCPII) supercomplex from the centric diatom Chaetoceros gracilis. The supercomplex comprises two protomers, each with two tetrameric and three monomeric FCPIIs around a PSII core that contains five extrinsic oxygen-evolving proteins at the lumenal surface. The structure reveals the arrangement of a huge pigment network that contributes to efficient light energy harvesting, transfer, and dissipation processes in the diatoms.

Oxygenic photosynthetic organisms use solar energy to convert CO2 and water into carbohydrates and molecular oxygen. Photosystem II (PSII) catalyzes a water-splitting reaction by using light energy to generate oxygen and reduced electron carriers. The structure of the PSII core and the catalytic center for water splitting has been determined at a high resolution from cyanobacteria (1, 2), showing that the PSII core is composed of 17 intrinsic and 3 extrinsic subunits, as well as a number of cofactors functioning in energy transfer, electron transfer, and water-splitting reactions.

As an adaptation to variable light energy density available on Earth’s surface, photosynthetic organisms have developed light-harvesting pigment-protein complexes (LHCs) capable of using different spectral regions, depending on the environment that the organisms inhabit (3–11). The LHC proteins associated with the PSII core are designated LHCII. The structure of a PSII-LHCII supercomplex from higher plants that use chlorophyll (Chl) a and b as their main pigments has been elucidated (10, 11). The plant PSII-LHCII exists as a dimer, in which the PSII core (C) is surrounded by two LHCII trimers: a strongly associated (S) trimer and a moderately associated (M) trimer, forming a C2S2M2-type structure. In addition, three LHCII monomers surround the PSII core. Two monomers mediate the association of the M trimer with the core, and the third monomer associates at the periphery of the core and the S trimer (10, 11).

Diatoms stem from red algae via secondary endosymbiosis and account for up to 40% of the net primary production in the ocean (12–14). The oxygen-evolving PSII core has been isolated from a centric diatom Chaetoceros gracilis (15, 16). Five extrinsic subunits are associated with the oxygen-evolving complex (OEC); among these subunits, Psb31 is found only in the diatom PSII (17, 18). The diatom PSII core is associated with fucoxanthin (Fx)–Chl a/c binding proteins (FCPs). These pigments have prominent absorptions in the blue-green region, which are important for the light harvesting of diatoms and other related algae, because more blue-green photons are available under water than in the other wavelength regions (19). The exact composition and organization of FCPs associated with PSII, however, are unknown (20–23). On the basis of genomic analyses, both centric and pennate diatoms contain Lhcf, Lhcr, Lhcx, and even Lhca-like subunits as their FCP constituents (24, 25), and FCPs associated with PSII from C. gracilis were classified into three types (FCP-A, FCP-B, and FCP-C) with different protein and pigment compositions (26). More than 10 lhcf genes have been found in a centric diatom (Thalassiosira pseudonana) whose complete genome has been sequenced (24).

The crystal structure of an isolated FCP subunit from the pennate diatom Phaeodactylum tricornutum showed a homodimeric assembly and the bindings of Chl c and Fx to be important for the blue-green light-harvesting and strong nonphotochemical quenching (NPQ) capabilities of diatoms (27). These properties have been extensively studied with biochemical and spectroscopic approaches (28–31). Because no structures of the PSII-FCPII supercomplexes have been reported, the organization of FCPII around the PSII core and the mechanisms of energy transfer from FCPII to the PSII core and NPQ within the PSII-FCPII supercomplex are unclear. To address these questions, we determined the structure of PSII-FCPII from C. gracilis by cryo–electron microscopy (cryo-EM) at 3.0-Å resolution. This structure reveals the location and configuration of all five extrinsic proteins within the diatom PSII core. As in the plant PSII-LHCII complex, FCP antennas surround the PSII core, forming a sophisticated pigment-protein network. Our results provide a structural rationale for the efficient excitation energy transfer and quenching mechanisms in the PSII-FCPII supercomplex and the evolutionary changes of the extrinsic proteins involved in the OEC.

Overall structure of PSII-FCPII and the PSII core

We purified the PSII-FCPII supercomplex with FCP-A as its dominant antennas from C. gracilis (fig. S1, A and B). The subunit composition, spectroscopic properties, and pigment composition are shown in fig. S1, C to E. We determined the structure by cryo-EM at 3.0-Å resolution (Materials and methods, figs. S2 and S3, and tables S1 and S2). Two units of PSII-FCPII are present in the structure with twofold rotational symmetry (Fig. 1A), similar to that of the C2S2M2 supercomplex of higher-plant PSII-LHCII (10, 11). In contrast to the trimeric LHCII found in higher plants (3, 10, 11), the major FCPs in this supercomplex are organized in a tetramer. Each PSII-FCPII contains two FCP tetramers as the main antennas: one tetramer is attached to the CP47 side by PsbG, which we designate the strongly associated tetramer (ST) (monomers in ST are STm1 to STm4); the other tetramer is located next to CP43 via two FCP monomers (FCP-D and FCP-E) and is called MT (moderately associated tetramer; MTm1 to MTm4) (Fig. 1, A and B). The tetramer ST is named based on its rather close and direct contact with the PSII core, in contrast with the indirect connection of MT with the PSII core through two FCP monomers. An additional FCP monomer was associated at the outside of MT, which is designated FCP-F. Thus, 11 FCP subunits are associated with each PSII core monomer, and 22 FCP subunits constitute a PSII dimer, with a total molecular mass of ~1.4 MDa (including pigments and other cofactors).
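The subunit bookkeeping in this paragraph can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the variable names are hypothetical, and the figure of 24 PSII core subunits per protomer is taken from the structured abstract rather than from this paragraph.

```python
# Stoichiometry of the PSII-FCPII supercomplex, using counts stated in the text.
# All names are illustrative and not part of any published analysis code.

FCP_TETRAMERS_PER_CORE = 2        # ST and MT
MONOMERS_PER_TETRAMER = 4         # STm1-STm4 and MTm1-MTm4
MONOMERIC_FCPS_PER_CORE = 3       # FCP-D, FCP-E, FCP-F
CORE_SUBUNITS_PER_PROTOMER = 24   # assumption: value quoted in the structured abstract
PROTOMERS = 2                     # twofold-symmetric dimer

fcp_per_core = FCP_TETRAMERS_PER_CORE * MONOMERS_PER_TETRAMER + MONOMERIC_FCPS_PER_CORE
fcp_per_dimer = PROTOMERS * fcp_per_core
subunits_per_protomer = CORE_SUBUNITS_PER_PROTOMER + fcp_per_core
subunits_total = PROTOMERS * subunits_per_protomer

print(fcp_per_core, fcp_per_dimer, subunits_per_protomer, subunits_total)
# prints: 11 22 35 70
```

These totals agree with the 11 and 22 FCP subunits given above and with the 35 subunits per protomer and 70 subunits per dimer quoted elsewhere in the article.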

Fig. 1 Overall structure of the PSII-FCPII supercomplex.

(A) View normal to the membrane plane from the stromal side. The dashed line divides the two PSII-FCPII monomer units. (B) One of the two PSII-FCPII monomers shown in (A) (upper side), with the PSII core and FCP subunits labeled. (C and D) View parallel and normal (from the lumenal side) to the membrane plane, respectively. Positions of the five extrinsic subunits are shown.

The diatom PSII core is composed of 19 intrinsic subunits (Fig. 1 and table S2). The root mean square deviations of the PSII core (without PsbG and Psb34) Cα between C. gracilis and a cyanobacterium (1), red alga (32), and higher plant (10) are 0.65, 0.62, and 0.99 Å, respectively, implying high conservation of the PSII core structure in both green and red lineages. Two previously unobserved subunits, designated PsbG and Psb34, are present in the diatom PSII core (Fig. 1B and fig. S3). The sequences of these two subunits could not be identified in the structure, so they are modeled as polyalanine (table S2). PsbG has one transmembrane helix and one long N-terminal loop (33). An unidentified chain W is found in an analogous position to Psb34 in the red algal PSII structure (32).

The PSII core contained 38 chlorophylls (Chls) a, 2 pheophytins, 10 β-carotenes, 2 hemes, 2 plastoquinones, a nonheme iron, a Mn4CaO5 cluster, and several lipids (table S3). The number of Chl a pigments found is three more than that of both cyanobacterial (1, 2) and higher-plant PSII core (10, 11), and these three additional Chls a are associated with each of the peripheral membrane-spanning subunits PsbG, PsbW, and PsbZ (figs. S3 and S4, A and B). Because these three subunits are located in the interface between the PSII core and FCPs (Figs. 1, A and B, and 2, A to C), they may function to mediate energy transfer and/or interactions between the FCPII and PSII core (fig. S4, C to E).

Fig. 2 Subunit interactions.

The middle circled image shows the overall structure of the dimeric PSII-FCPII. The labeled areas correspond to the enlarged figures in the following panels: (A) Location of PsbG and its interactions with ST and CP47. (B) Interactions between helix D and the C-terminal loop of FCP-D with the lumenal surface region of CP43, PsbW, D1, PsbQ′, and PsbO, which is identified by a rectangle in (E). (C) Interactions between FCP-E and MTm4 or PsbZ. (D) Interactions between MTm1 and MTm2 mediated by Fxs and Chls. (E) Interactions between FCP-D and MTm4 or MTm1 and between FCP-D and CP43. (F) Interactions between the peripheral FCP-F and MTm1 or MTm2. The letters A, B, and C inside each panel indicate the transmembrane helixes A, B, and C, respectively.

Five extrinsic proteins are associated at the lumenal side of the PSII core (Fig. 1, C and D). PsbO, PsbU, and PsbV bind to the same positions as the corresponding subunits in the cyanobacterial PSII (fig. S5A). PsbQ′ binds to the same position as that in the red algal and higher-plant PSII, with its N-terminal loop structure more similar to the PsbQ of higher-plant PSII (fig. S5, B to D). The PsbQ′ loop domain extends into a crevice formed by CP43, PsbE, PsbF, PsbJ, PsbK, PsbV, and PsbY (Fig. 3A), enabling it to form hydrogen bonds with these subunits. These bonds may strengthen its binding to the PSII core and increase the structural stability of OEC in this region.

Fig. 3 Interactions among extrinsic subunits and the PSII core.

(A) Location of the N-terminal loop of PsbQ′ within a crevice created by CP43, PsbK, PsbY, PsbJ, PsbE, and PsbF. (B) Location of the N-terminal region of Psb31 (the boxed region) in a pocket created by PsbE, PsbH, D2, and CP47, which are shown as surface models. (C) Interactions of the C-terminal loop of Psb31 with adjacent subunits. The boxed area is enlarged in (D) and shows the loop region interacting with the YZ proton channel. (D) Interactions of the amino acid residues in the C-terminal loop of Psb31 with the YZ proton channel (hydrogen bond network). The channel is connected by hydrogen bonds between adjacent amino acid residues and water molecules (red dots) based on the high-resolution structure of the cyanobacterial PSII core (1, 37). Psb31 is connected to this network at the lumenal surface of the complex.

The fifth extrinsic protein, Psb31 (fig. S3), is found only in diatoms (16–18, 34) and has an overall structure similar to that of PsbQ (fig. S5D) (17). It is located at the same site as PsbTn in higher-plant PSII-LHCII (10, 11) (fig. S5C). PsbQ mitigates the chloride requirement for oxygen evolution and maintains the stability of the Mn4CaO5 cluster (35–37), functions that may be shared by Psb31. The N-terminal region of Psb31 is anchored by a pocket formed by the D2, CP47, PsbH, and PsbE subunits (Fig. 3B), consistent with cross-linking experiments (18, 38). The lumenal loop domains of CP47 and D2 also have close interactions with Psb31 (Fig. 3, B and C). No direct interactions are observed between Psb31 and other extrinsic subunits, consistent with an indirect role of Psb31 in stabilizing the binding of PsbU and PsbV to PSII (38). However, the C-terminal residue Asp164 contacts a hydrogen bond network that begins at the YZ proton channel of OEC (Fig. 3D) (1, 37). This network may function in transferring protons from the Mn4CaO5 cluster to the lumenal surface (39, 40), and Psb31 may thus protect the Mn4CaO5 cluster and optimize the proton-excretion process.

Structures of FCPs

Both FCP tetramers ST and MT are formed from a single FCP monomer designated as FCP-A. FCP-A is encoded by lhcf8 (fig. S6A) on the basis of cDNA sequencing, and this sequence is modeled into the cryo-EM map. Helix D and the extended C-terminal region are present in the map, but the corresponding amino acid residues were not found in the cDNA sequence we obtained for FCP-A, so the sequence of this region from lhcf8 of T. pseudonana was used in the model (Materials and methods, fig. S6, and table S2). Among the three FCP monomers, FCP-D corresponds to lhca2 (fig. S3) and is an Lhca-like subunit. The sequences of FCP-E and FCP-F could not be identified from the map, and therefore the sequence and structure of Lhcf4 from P. tricornutum (27) was used to model their structures. In the FCP-A, FCP-D, and FCP-E apoproteins, the hydrophilic helix D and the subsequently extended C-terminal region located at the lumenal side (Fig. 4, A to C, and figs. S6A and S7, A to C) are similar to those of Lhca (8, 9) and Lhcb (3) but are absent in Lhcf4 of P. tricornutum (27). The C-terminal loop of FCP-D is longer and oriented toward the lumenal surface (Figs. 2B and 4C and fig. S7B). These helices D and extended C-terminal loops participate in the interactions between FCP-A monomers within the tetramers and between FCP-D and the PSII core (Fig. 2, C to F). Therefore, structural differences between helix D and the extended C-terminal loop region in different monomers reflect the different interactions of the antenna subunits involved.

Fig. 4 Structures and pigments in the tetrameric and monomeric FCPs.

(A to C) Structures and arrangement of pigments in FCP-A, FCP-E, and FCP-D. (D) Clustering of Chls in the stromal layer of FCP-E. (E) Bridging of two stromal Chl clusters by an additional Chl601 in FCP-D. (F and H) Chl-clustering patterns in the stromal (F) and lumenal (H) layers of the tetramer ST, with top view from the stromal side. (G) Distribution of Chls into two layers in the ST tetramer. Chl a, Chl c, and Fx are shown in a stick mode and colored as green, orange, and purple, respectively. The structure of MT is similar to that of ST, so it is not shown here.

The average total pigments within FCPs of each PSII monomer include 77 Chls a, 29 Chls c, 62 Fxs, and 1 diadinoxanthin (Ddx) (table S3). Unlike the isolated FCP dimer from P. tricornutum (27), the horizontal Fx304 and Ddx308 located close to the interface of the two monomers are absent in all four forms of the FCP monomers (fig. S7, D to H). Chl 405 near the lumenal side was assigned as Chl c in all of the FCP-A subunits, according to the high Chl c ratio in FCP-A (41) (fig. S7I), in contrast to the Chl a found at this position in the isolated FCP dimer from P. tricornutum (27). These differences in composition may result in functional differences in energy transfer and quenching, which could reflect interspecies variation or adaptations.

Each FCP-A in the two tetramers contains six Chls a, three Chls c, and six Fxs in common; however, one additional Chl a410 is found at the lumenal layer above helix E in each monomer of the tetramer ST (fig. S7J). One Ddx was assigned as Ddx616 in FCP-D on the basis of its clear cryo-EM map (fig. S3). Helix D and loop A-C differ between FCP-A and the dimeric FCP (Lhcf4) from P. tricornutum (fig. S7A), possibly reflecting differences in energy harvesting and quenching or structural requirements of the tetramer. Variations between FCP-A subunits in the tetramer may also suggest that different FCP monomers are incorporated into the tetramers, although we could not identify them in the present structure.

FCP-E bridges MT and PsbZ (Fig. 1, A and B) and contains two extra Chls (Chl a400 and a411) located at the interfaces between FCP-E and PsbZ or MT, which likely mediate energy transfer between MT and the PSII core (fig. S7K). FCP-D is the largest FCP subunit in this supercomplex, owing to long loops at both the N and C termini and a longer helix C (fig. S7B). FCP-D is located at the center of a triangle formed by CP43 and two FCP tetramers (the MT from the same PSII unit and ST from the adjacent unit), and therefore may function as a linker to stabilize the dimeric PSII-FCPII configuration (Figs. 1, A and B, and 2, B and E; and fig. S9A). FCP-D is an Lhca-like subunit (fig. S6) and exhibits structural similarity to Lhca proteins, including an additional Chl 601 similar to that found in Lhca2 of PSI-LHCI (8) (fig. S7L) and fewer carotenoid-binding sites (Fx615, Ddx616, and Fx617) (fig. S7M). The different pigment compositions and structures of the C-terminal helix D found in FCP-A, FCP-D, and FCP-E suggest that each has distinct roles in assembly of and energy transfer in the PSII-FCPII supercomplex.

Organization of FCP tetramers and their interactions with the core

Monomers within the ST and MT tetramers are assembled in a “head-to-tail” manner similar to that in the LHCII trimer (Fig. 4, F to H). The stromal A-C loops of each monomer interact to yield a constraint ring, and the overall shape of the tetramer resembles a bowl with its bottom oriented toward the stromal side, implying that the stromal regions play a dominant role in the assembly of the tetramer (Fig. 4, F to H). Interactions between two adjacent FCP-A monomers occur in two layers (Figs. 2D and 4, F to H). At the stromal side, Chl 401 and a head of Fx307 from one monomer interact with the A-C loop of the adjacent monomer. At the lumenal side, hydrogen bonds and hydrophobic interactions are found between Fx307, helix D of one monomer, and Fx302, loop B-C from the adjacent monomer. These interactions are not found in LHC trimers (3) or in the free, dimeric FCP (27). The three pigments (Chl 401, Fx307, and Fx302) located in the interface region of the two monomers may facilitate energy transfer and/or quenching between the adjacent monomers.

The ST tetramer is associated with the PSII core via the PsbG subunit (Figs. 1B and 2A). The transmembrane helix of PsbG has intensive hydrophobic and van der Waals interactions with the helix C of one monomer of ST (STm4) as well as with a helix from CP47. Multiple hydrogen bonds are found between the hairpin loop of PsbG exposed to the stromal side and the corresponding loop regions of CP47 and STm4, which resemble a rope to fasten the two subunits together.

The MT tetramer is associated with the PSII core through interactions between CP43 and FCP-D (Figs. 1, A and B, and 2E) and between PsbZ and FCP-E (Figs. 1, A and B, and 2C). FCP-D also interacts with ST from the adjacent PSII-FCPII monomer, which helps stabilize the PSII-FCPII dimer. The C-terminal loop and helix D of FCP-D interact with PsbO, PsbQ′, PsbW, and CP43 (Fig. 2B), which resembles the interactions of the N-terminal region of CP29 with the PSII core in the PSII-LHCII supercomplex and helps bind the extrinsic PsbO and PsbQ′ subunits to the PSII core. FCP-E is located at a position resembling that of CP26 in the PSII-LHCII supercomplex to connect MT to the PSII core. Helix C of FCP-E has hydrophobic interactions and hydrogen bonds with Fx306 and Chl 403 of MTm4, whereas its A-C loop is hydrogen bonded with the helix of PsbZ (Fig. 2C and fig. S4E).

Distribution of chlorophylls and energy transfer pathways

Due to the different structure and organization of FCPII and LHCII, different strategies are adopted for the energy transfer from the peripheral antennas to the PSII core between PSII-FCPII and PSII-LHCII. In PSII-FCPII, direct energy transfer pathways are found between ST and the PSII core, whereas energy transfer from MT to the core is mediated by FCP-E and FCP-D (Fig. 5A). In analogy to the plant C2S2M2 PSII-LHCII supercomplex, Chls in FCPs can be categorized into stromal and lumenal layer groups (Fig. 4G). Coupled Chls have generally lower energy levels and may be important for collecting the photons within FCPs and transferring energy to the PSII core. The extent of coupling between Chls depends not only on their distances but also on the relative orientations of their chlorin rings. In the following text, we discuss the possible energy transfer pathways among Chls mainly based on their distances, as the effects of relative orientations of the chlorin rings of Chls require more extensive and quantitative treatment.
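To illustrate why both distance and orientation matter, the sketch below uses the standard Förster point-dipole expressions, in which the transfer rate scales as κ²/R⁶. This is an assumption introduced here for illustration and is not the treatment used in this work; the function names are hypothetical, and spectral overlap and donor lifetime are taken to be equal for the pairs being compared. R denotes a center-to-center separation, which is not the same quantity as the π-π distance (Dπ) used in the text, and the point-dipole picture is only a rough guide at short separations.

```python
import math

def _unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def kappa_squared(mu_donor, mu_acceptor, separation):
    """Orientation factor (muD.muA - 3 (muD.R)(muA.R))^2, computed from unit vectors."""
    d, a, r = _unit(mu_donor), _unit(mu_acceptor), _unit(separation)
    return (_dot(d, a) - 3.0 * _dot(d, r) * _dot(a, r)) ** 2

def relative_rate(r_a, r_b, kappa2_a=1.0, kappa2_b=1.0):
    """Ratio k_a / k_b of Foerster rates for two donor-acceptor pairs.

    Assumes identical spectral overlap and donor lifetime for both pairs.
    """
    return (kappa2_a / kappa2_b) * (r_b / r_a) ** 6

# Collinear transition dipoles give the maximum factor of 4.
print(kappa_squared((1, 0, 0), (1, 0, 0), (1, 0, 0)))  # 4.0
# Mutually perpendicular dipoles, both perpendicular to the separation, give 0.
print(kappa_squared((1, 0, 0), (0, 1, 0), (0, 0, 1)))  # 0.0
# At fixed orientation, halving the separation speeds transfer 64-fold.
print(relative_rate(10.0, 20.0))  # 64.0
```

The steep R⁻⁶ dependence is why distance alone already gives a useful first ranking of candidate pathways, while the range of κ² from 0 to 4 shows why a quantitative treatment also needs the ring orientations.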

Fig. 5 Overall distributions of pigments and energy transfer pathways in PSII-FCPII.

(A) Distribution of Chls and possible energy transfer pathways in PSII-FCP for the stromal side layer (green, left panel) and lumenal side layer (blue, right panel), with top views normal to the membrane plane. Dπ values (π-π distances) for the adjacent Chl pairs are labeled in red (angstroms) for the critical Chl pairs. (B and C) Distributions of pigments (Chls and carotenoids) in a monomeric PSII-FCPII, with the top view from the stromal side (B) and the side view (C). Chl a, Chl c, Fx, β-carotene, and Ddx are colored as green, yellow, purple, cyan, and blue, respectively.

Among the coupled Chls, the Chl a401/c408 cluster in the stromal layer is located at the interface of FCP monomers within each tetramer (Fig. 4F and fig. S8A), and thus may serve as a linker for energy transfer among the monomers. Chl a406 and c403 appear at the interface between MT and FCP-E, FCP-D, and PSII (fig. S8, E and F), indicating their roles for the inward energy transfer from the MT tetramer to the PSII core.

In both ST and MT tetramers, two stromal Chl clusters with maximal Mg-to-Mg distance (DMg) of ~12 Å (Fig. 4F)—namely, Chl a401/a407/c408 and Chl a402/c403/a406—are conserved in each monomer (Fig. 4, D and F), which are similar to those seen in the FCP dimer (27). Among them, Chl a401 is connected to Chl a406 and a407 from the neighboring monomer with a minimal π-to-π distance (Dπ) of ~14.9 and 12.4 Å (fig. S8A and table S4), suggesting that Chl a401 is the exit and entrance of energy exchange between the adjacent monomers. As a result, an efficient energy equilibrium is formed among the eight Chl clusters in the stromal layer within the tetramer. At the lumenal layer, a different cluster consisting of Chl a404/c405/a410 or Chl a404/c405/a410 plus Chl a409 from the adjacent monomer is found, with a maximal DMg of 15.5 Å due to the distinct head-to-tail association pattern (Fig. 4H and table S4). Among them, Chl a404 and a409 are connected to Chl a406 and a401 with Dπ less than 10 Å, providing an efficient energy channel between the lumenal and stromal layers (fig. S8B). The lumenal Chl a410 is specific to ST and absent in MT, and this Chl is linked to the stromal Chl a406 with a shorter Dπ, providing a better alternative for energy transfer from the lumenal layer to the stromal layer. This suggests that Chl a410 is an ideal collection point for the energy harvested in the lumenal layer (fig. S8C) in ST. In summary, tetramerization of FCP-A brings Chls in each monomer closer together than those observed in the FCP dimer, likely enabling efficient energy transfer. The four FCP monomers in the tetramer are not perfectly symmetric, probably owing to different interactions with other FCP monomers and with the PSII core. The energy captured by the extra FCP-F is delivered through Chl a405 to MTm2 a406 (Dπ of 8.2 Å; table S4) and further transferred to the PSII core (fig. S8D).

In contrast to the Chl a401/c408 pair, the Chl a406/c403 pair is responsible for energy transfer from MT to the outward FCP-E and FCP-D, mostly along with subunit interactions mediated by helix C (fig. S8, E and F), which is similar to the “tail-to-tail” interaction pattern observed in the FCP dimer (27). Slightly different from MT, the additional Chls a400/a411 of FCP-E give rise to two larger Chl clusters consisting of four Chls (a400/a402/c403/a406 and a401/a407/c408/a411, respectively) at the stromal layer, facilitating the efficient energy transfer via FCP-E (Fig. 4D and fig. S4E) to the PSII core. The interfacial Chl a406MT4/a406FCP-E pair with Dπ of 7.5 Å (fig. S8E and table S4) connects their corresponding clusters and channels the energy out of MT to the Chl a400/a402/c403/a406 cluster of FCP-E, which is further delivered to its Chl a401/a407/c408/a411 cluster. The Chl a102 of PsbZ serves as a pivot to accept energy from Chls a411FCP-E and a404FCP-E (with Dπ of 6.6 and 8.0 Å) and subsequently transfer it to the lumenal Chl a503 of CP43 (with a Dπ of ~10.9 Å) (fig. S4E).

Compared with that of FCP-E, energy transfer via FCP-D is likely more efficient. At the interface between MT and FCP-D, the Chl a406MTm1/a609FCP-D pair with a Dπ of 4.3 Å connects the Chl a402/c403/a406MTm4 and a602/a603/a609FCP-D clusters (Fig. 4E and fig. S8F), providing a main energy transfer pathway in addition to a secondary pathway mediated by Chl a401MTm4/a603FCP-D (Fig. 5A and fig. S8F). Not only are the strongly coupled Chls a603/a609FCP-D with a Dπ of 4.0 Å the energy inlet of FCP-D (fig. S8F), they may also serve as red forms of Chl, in agreement with the uphill energy transfer to the PSII core predicted by spectral experiments (42). Energy received by FCP-D is subsequently directed to the stromal layer of CP43 via Chl a601FCP-D/a512CP43 with a Dπ of 9.0 Å (fig. S8F and table S4). The additional Chl a601 bridges the two stromal Chl clusters inside FCP-D, improves the internal energy equilibrium efficiency (Fig. 4E), and additionally serves as one primary energy transfer pathway to CP43 via the Chl a611FCP-D/a506CP43 pair (Fig. 5A). In general, the energy captured by the moderately associated tetramer MT is transferred to the stromal and lumenal Chl layer of CP43 via FCP-D and FCP-E (and PsbZ), respectively, which is analogous to the energy transfer of S-LHCII trimer to PSII core via CP29/CP26 in C2S2M2 of higher plants.

At the interface between ST and CP47, Chl a406STm4 is found to be strongly coupled with Chl a101 of PsbG with a Dπ of 3.8 Å (Fig. 5A and fig. S4C). This Chl a406STm4/a101PsbG pair may thus provide a main energy transfer pathway from ST to CP47. Chl a101PsbG is connected to the stromal Chls a607 and a610 of CP47 with a Dπ of 14.1 and 14.3 Å, whereas the extra Chl a410 nearby helix E of ST has a Dπ of 8.9 Å with Chl a602CP47 in the lumenal layer, implying the equivalent importance of both pathways toward the PSII core (Fig. 5A and fig. S4C). Energy exchange pathways between ST and FCP-D from the adjacent PSII-FCPII monomer are also observed, thus providing the possibility of energy equilibration between the two PSII-FCPII monomers (Fig. 5A and fig. S8, G and H). Another weak pathway from ST to CP43 of the adjacent PSII core is mediated by Chl a410STm3/a201PsbW and Chl a201PsbW/a507CP43 with Dπ of 20.8 and 7.6 Å, respectively (Fig. 5A and fig. S4D).
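The inter-subunit Dπ values quoted in this section can be collected into a small table to compare the routes to the core. The Python sketch below is a distance-only illustration: the route names, the selection of hops, and the "longest hop" heuristic are assumptions made here, not an analysis from this work. Intra-subunit steps and chlorin-ring orientations are ignored, and where two acceptors are listed for one donor only the first is used.

```python
# Inter-subunit Chl pairs and D_pi values (angstroms) quoted in the text.
# Route names and the longest-hop heuristic are illustrative assumptions;
# intra-subunit steps and chlorin-ring orientations are ignored.
ROUTES = {
    "ST -> PsbG -> CP47 (stromal)": [
        ("a406 STm4", "a101 PsbG", 3.8),
        ("a101 PsbG", "a607 CP47", 14.1),
    ],
    "ST -> CP47 (lumenal)": [
        ("a410 ST", "a602 CP47", 8.9),
    ],
    "MT -> FCP-E -> PsbZ -> CP43": [
        ("a406 MTm4", "a406 FCP-E", 7.5),
        ("a411 FCP-E", "a102 PsbZ", 6.6),
        ("a102 PsbZ", "a503 CP43", 10.9),
    ],
    "MT -> FCP-D -> CP43": [
        ("a406 MTm1", "a609 FCP-D", 4.3),
        ("a601 FCP-D", "a512 CP43", 9.0),
    ],
    "ST -> PsbW -> CP43 (adjacent core)": [
        ("a410 STm3", "a201 PsbW", 20.8),
        ("a201 PsbW", "a507 CP43", 7.6),
    ],
}

def longest_hop(hops):
    """Return the (donor, acceptor, d_pi) tuple with the largest D_pi."""
    return max(hops, key=lambda hop: hop[2])

# Rank routes by their longest single hop, shortest first.
ranked = sorted(ROUTES, key=lambda name: longest_hop(ROUTES[name])[2])
for name in ranked:
    donor, acceptor, d_pi = longest_hop(ROUTES[name])
    print(f"{name}: longest hop {donor} -> {acceptor}, {d_pi} A")
```

On this crude measure the FCP-D route has a shorter limiting hop (9.0 Å) than the FCP-E route (10.9 Å), and the route through PsbW is limited by its 20.8 Å step, in line with the qualitative descriptions in the text.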

The pathways via Chls of PsbG, PsbW, and PsbZ discussed above resemble those of PsaJ and PsaK in PSI-LHCI (fig. S10) (8, 9, 43), which offer additional energy transfer channels when compared with those of PSII-LHCII, conferring on diatoms more diversified energy-delivery pathways from the antenna to the core in the PSII-FCPII supercomplex.

Carotenoid arrangement and possible adaptation mechanisms

A number of Fxs were found in the peripheral FCPs (Fig. 5B). These pigments enhance the light-harvesting capacity in the blue-green region and enable efficient NPQ in diatoms as an adaptation to high-light conditions (44, 45). Fxs in FCPs are arranged in close proximity to Chls, forming a complementary network for photon capturing and energy transfer (27, 46–49). Some Fxs exist at the interfaces between FCP subunits or between FCP and the PSII core (Figs. 2D and 5B), suggesting that they may function to either link the different subunits or mediate energy transfer among them. These Fxs are found to have larger conformational differences, such as the end groups of Fx305 and Fx307 located in the FCP tetramer at the lumenal side (figs. S7, E, G and H, and S9F). A large change was found for the cyclohexane end group of Fx306MTm4, which is rotated by ~37° relative to the corresponding pigment in other monomers (fig. S7D), so that its polyene is inserted between two helices C connecting the tetramer and FCP-E. This Fx306 is located close to a large Chl cluster consisting of Chl a406/c403/a402MTm4 and a406/c403/a402/a400FCP-E (fig. S9B), where Chl a406FCP-E is coupled with Chl a400FCP-E and a406MTm4 with Dπ values of 4.8 and 7.6 Å, respectively. This suggests that Chl a406FCP-E may be at a lower energy level suitable for energy trapping. Therefore, the small Dπ between Fx306MTm4 and Chl a406FCP-E (3.6 Å) suggests that this Fx may function as a possible quencher at the bottleneck of the energy transfer pathway through FCP-E (fig. S9B) (41, 44).

Only one Ddx can be assigned in the 616 site of FCP-D (fig. S3), whose position resembles that of lutein 621 in LHCII (fig. S9C). The Ddx616 is close to the Chl a602/a603/a609 cluster on the stromal side and Chl 604 on the lumenal side (fig. S9D). Chl a609 in FCP-D is strongly coupled with Chl a603 with a parallel Qy orientation, giving rise to a lower energy level. This Chl cluster is thus ideal for energy trapping and possible energy dissipation (fig. S9E). In view of the close proximity of Ddx616 with Chl a603/a609, this Ddx may serve as an efficient energy quenching site in FCP-D (50–52). No additional Ddx was identified in other subunits of FCP. This may be due to the limited resolution of the cryo-EM map obtained here.

In conclusion, the cryo-EM structure of diatom PSII-FCPII revealed the positions of five extrinsic proteins in the PSII core, as well as a previously unidentified oligomeric form of FCP and its distinctive association patterns with the PSII core. In a PSII-FCPII unit, a strongly associated and a moderately associated FCP tetramer were found, but their locations are opposite to those of the S and M trimers of LHCII in green-lineage organisms. In addition, three FCP monomers are associated with each PSII core to help energy harvesting, transfer, and quenching. The whole complex exhibits a sophisticated pigment-protein network that ensures highly efficient energy transfer and regulation of light harvesting in diatoms, and that forms part of the molecular basis for the biodiversity and adaptation of photosynthetic organisms during evolution.

Materials and methods

Purification and characterization of PSII-FCPII

Cells of C. gracilis (CCMA-116) were obtained from the Center for Collections of Marine Algae at Xiamen, China, and were cultured in an artificial seawater medium at 25°C under continuous light illumination of 40 μmol photon m−2 s−1 with bubbling of air containing 3% CO2. After growth for 7 days, the cells were harvested by centrifugation at 5000 × g for 10 min under dim light at 4°C, and all of the subsequent steps were performed on ice or at 4°C under green light, unless otherwise stated.

Cells collected were suspended in an MMB buffer containing 30 mM 2-morpholinoethanesulfonic acid (MES) at pH 6.5, 5 mM MgCl2, and 1 M betaine, and rapidly disrupted by one freeze-thaw cycle with a combination of liquid nitrogen and water (room temperature) in the presence of 3 mM phenylmethylsulfonyl fluoride and 1 μg/ml DNase I. After incubation on ice for 20 min, the disrupted cells were centrifuged at 2000 × g for 5 min to remove unbroken cells, and the supernatant was then centrifuged at 100,000 × g for 30 min to collect the thylakoid membranes. The pellet obtained was suspended and homogenized in an MBNC buffer containing 30 mM MES at pH 6.5, 1 M betaine, 10 mM NaCl, and 5 mM CaCl2 to a chlorophyll (Chl) concentration of at least 1.2 mg Chl/ml. The thylakoids obtained were solubilized at 1 mg Chl/ml with 0.7% dodecyl-α-d-maltopyranoside (α-DDM) for 10 min with stirring at 300 rpm. The sample was centrifuged at 40,000 × g for 10 min to remove unsolubilized debris and then loaded onto sucrose density gradient ultracentrifuge tubes containing 0 to 1.1 M sucrose in the MBNC buffer supplemented with 0.01% α-DDM. After centrifugation at 200,000 × g for 17 hours, a band containing the PSII-FCPII fraction (fig. S1A) was collected and concentrated with a Millipore ultrafiltration tube (100-kDa cutoff). For cryo-EM analysis, the sucrose in the sample was removed by gel filtration (Superose 6 Increase 10/300 GL, GE Healthcare) with the MBNC buffer in which the NaCl concentration was increased to 100 mM and 0.01% α-DDM was added. Finally, the sample was concentrated to 4 mg Chl/ml and stored in liquid nitrogen before being used for the cryo-EM analysis.
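The solubilization step amounts to a simple dilution calculation (C1·V1 = C2·V2). The Python sketch below is a worked example under stated assumptions: a thylakoid stock at exactly 1.2 mg Chl/ml (the protocol only says "at least" this value) and a hypothetical 10% α-DDM stock, which is not specified in the protocol. The function name `solubilization_mix` is illustrative.

```python
def solubilization_mix(final_volume_ml,
                       thylakoid_stock_mg_per_ml=1.2,   # assumed: lower bound from the text
                       ddm_stock_percent=10.0,          # assumed: hypothetical detergent stock
                       final_chl_mg_per_ml=1.0,         # from the text
                       final_ddm_percent=0.7):          # from the text
    """Return (thylakoid, detergent stock, buffer) volumes in ml using C1*V1 = C2*V2."""
    thylakoid_ml = final_volume_ml * final_chl_mg_per_ml / thylakoid_stock_mg_per_ml
    ddm_ml = final_volume_ml * final_ddm_percent / ddm_stock_percent
    buffer_ml = final_volume_ml - thylakoid_ml - ddm_ml
    if buffer_ml < 0:
        raise ValueError("stocks are too dilute to reach the requested final concentrations")
    return thylakoid_ml, ddm_ml, buffer_ml


thylakoid_ml, ddm_ml, buffer_ml = solubilization_mix(10.0)
print(f"thylakoids {thylakoid_ml:.2f} ml, α-DDM stock {ddm_ml:.2f} ml, buffer {buffer_ml:.2f} ml")
```

Under these assumptions, a stock more concentrated than the 1 mg Chl/ml target leaves volume for the detergent addition, which is one plausible reason for the "at least 1.2 mg Chl/ml" requirement.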

Transcriptome sequencing

The cells of C. gracilis were grown for 7 days as described above, and harvested by centrifugation at 5000 × g for 10 min. The cells harvested were frozen immediately in liquid nitrogen, and then disrupted by grinding in liquid nitrogen in a mortar. Total RNA was extracted with TRIzol Reagent according to the manufacturer’s instructions (Invitrogen, CA, USA), and its purity and integrity were examined by gel electrophoresis, a Qubit 3.0 Fluorometer (Life Technologies, CA, USA), and an Agilent 2100 RNA Nano 6000 Assay Kit (Agilent Technologies, CA, USA). Sequencing libraries were generated using the NEBNext Ultra RNA Library Prep Kit for Illumina (NEB, MA, USA) following the manufacturer’s recommendations. Briefly, mRNA was purified from total RNA using poly-T oligo-attached magnetic beads. Fragmentation was carried out using divalent cations under elevated temperature in NEBNext First Strand Synthesis Reaction Buffer. The first strand cDNA was synthesized using random hexamer primers and RNase H, and the second strand cDNA was subsequently synthesized using DNA polymerase I and RNase H. Further purification with QiaQuick PCR kits (QIAGEN, Germany) and modifications including terminal repair, A-tailing, and adapter addition were performed. The target products were retrieved by agarose gel electrophoresis, and PCR was performed to obtain the final cDNA library. Sequencing was performed using an Illumina HiSeq 2000 according to the manufacturer’s instructions (Illumina, CA, USA). The sequences were assembled de novo with Trinity (53). Briefly, reference-free overlapping and connections were examined to combine the original reads into contigs, which were further assembled into unigenes by means of paired-end assembly and gap filling. The unigenes obtained were mapped to the whole genome of T. pseudonana (24) from the National Center for Biotechnology Information (NCBI) to facilitate the translation into amino acid sequences.
Finally, functional annotations were carried out according to homology searches of translated unigenes against the selected amino acid sequences from the T. pseudonana genome by the Basic Local Alignment Search Tool (BLASTP) (54). Sequences of most of the PSII core subunits were obtained (table S2), whereas only several fragmentary sequences of the FCP proteins were obtained. A BLAST search of the FCP genes obtained from the transcriptome against the T. pseudonana sequences yielded 2 fragments of lhcf genes and 3 fragments of lhca genes, as well as some fragments of 9 lhcr genes and 3 lhcx genes, together with 14 PSII core subunit genes (D1, D2, CP43, CP47, PsbE, PsbK, PsbM, PsbO, PsbU, PsbV, PsbQ′, Psb31, PsbT, and PsbW). Among the FCP sequences obtained, the two longest ones, which were used for building the FCP-A and FCP-D structures, are aligned in fig. S6A.

Biochemical characterization

Pigment composition of the isolated PSII-FCPII was analyzed by high-performance liquid chromatography (HPLC) as described in (55) with slight modifications. Briefly, pigments in the supercomplex were extracted with 90% (v/v) acetone and injected into a C18 reversed-phase column (Alltima C18 5u, 250 mm by 4.6 mm, GRACE) in a separation module (Waters 2690/5, USA) equipped with a photodiode array detector (Waters 2998, USA). The pigments were identified based on the absorption spectrum and elution time of each peak. The column was pre-equilibrated with solvent A (acetonitrile:H2O = 9:1) before injection, and the elution was conducted with a linear gradient of 0 to 100% solvent B (ethyl acetate) at 1 ml/min for 20 min. The 100% solvent B was continued for a further 2 min before the column was re-equilibrated for 5 min with solvent A for the next injection. The results showed that the supercomplex contains Chl a, Chl c, Fx, Ddx, and β-carotene (fig. S1E). Identification of each pigment was performed according to (56).
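The elution program described above can be written as a piecewise function of time. The Python sketch below is a minimal illustration, not the instrument method file: the constant and function names are hypothetical, and it assumes that the switch back to solvent A at the start of re-equilibration is instantaneous, which the text does not specify.

```python
GRADIENT_MIN = 20.0        # linear ramp from 0 to 100% solvent B (from the text)
HOLD_MIN = 2.0             # hold at 100% solvent B (from the text)
REEQUILIBRATION_MIN = 5.0  # re-equilibration with solvent A (from the text)


def percent_solvent_b(t_min):
    """Return the programmed % solvent B at t_min minutes after injection."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= GRADIENT_MIN:
        return 100.0 * t_min / GRADIENT_MIN
    if t_min <= GRADIENT_MIN + HOLD_MIN:
        return 100.0
    if t_min <= GRADIENT_MIN + HOLD_MIN + REEQUILIBRATION_MIN:
        return 0.0  # assumed immediate return to 100% solvent A
    raise ValueError("time is beyond the end of the 27-min program")


for t in (0, 5, 10, 20, 21, 25):
    print(f"t = {t:>2} min: {percent_solvent_b(t):5.1f}% B")
```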

Protein composition was analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis with a 16 to 22% gradient gel (57). The bands in the gel were identified by mass spectrometry as described previously (5). Absorption spectra were measured at room temperature with a UV-Vis spectrophotometer (UV-2700, Shimadzu, Japan). Fluorescence spectra were recorded with a fluorescence spectrophotometer (F-4500, Hitachi, Japan) at 77 K with a slit width of 5 nm and an exciting wavelength of 436 nm, at a Chl concentration of 10 μg/ml.

Oxygen-evolving activity was measured with a Clark-type oxygen electrode under saturating light at 25°C in a buffer containing 30 mM MES-NaOH (pH 6.5), 0.4 M sucrose, and 5 mM CaCl2 at 10 μg Chl/ml. Phenyl-p-benzoquinone (0.5 mM) was used as the electron acceptor. The oxygen-evolving activity of the purified PSII-FCPII was determined to be ~1100 μmol O2 (mg Chl)−1 h−1.

Cryo-EM data collection

An aliquot of 4 μl of homogeneous PSII-FCPII sample at a Chl concentration of 3.5 mg Chl/ml was applied to a glow-discharged Quantifoil holey carbon grid (R 1.2/1.3, 400 mesh), blotted for 5 s with a blotting force of level 0, and plunged into liquid ethane at ~100 K using an FEI Vitrobot at 100% humidity and 8°C. A total of 1151 super-resolution images were captured with an FEI 300 kV Titan Krios electron microscope equipped with a K2 Summit direct electron detector (Gatan), at a nominal magnification of 22,500×, yielding a final pixel size of 1.30654 Å. The defocus range was between −1.5 and −2.5 μm. Each exposure of 8 s was fractionated into 32 movie frames, leading to a total dose of ~50 e−/Å2. All movie frames were corrected by the program MotionCor2 (58).
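A few derived acquisition parameters follow directly from the numbers in this paragraph. The Python sketch below is a simple arithmetic check; the variable names are illustrative, and the total dose is assumed to be about 50 electrons per Å2.

```python
EXPOSURE_S = 8.0        # exposure time per movie (from the text)
N_FRAMES = 32           # frames per movie (from the text)
PIXEL_SIZE_A = 1.30654  # final pixel size in Å (from the text)
TOTAL_DOSE = 50.0       # assumed to be in electrons per Å^2

frame_time_s = EXPOSURE_S / N_FRAMES   # exposure time per frame
dose_per_frame = TOTAL_DOSE / N_FRAMES  # electrons per Å^2 per frame
nyquist_limit_a = 2 * PIXEL_SIZE_A      # finest resolution the sampling can represent

print(f"frame time:     {frame_time_s:.2f} s")
print(f"dose per frame: {dose_per_frame:.2f} e-/Å^2")
print(f"Nyquist limit:  {nyquist_limit_a:.2f} Å")
```

The Nyquist limit of about 2.6 Å is finer than the 3.02 Å reported for the final map, so the pixel sampling does not limit that resolution.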

Data processing

The defocus value and the parameters of astigmatism were determined by CTFFIND4 (59) using the corrected micrographs without dose-weighting. Images with poor Thon rings were removed prior to subsequent data processing. A subset of 3015 particles of the PSII-FCP supercomplex was manually selected with RELION (60) from nearly 200 micrographs and processed by reference-free 2D classification in RELION. Six 2D class averages were selected as references for automatic particle-picking from all 1151 micrographs in RELION, which yielded a total of 196,802 automatically picked particles. All these particles were manually screened to remove those from micrographs with overlapping and mistakenly picked particles. Reference-free 2D classification in RELION was performed for the screened particles, and 154,696 particles from the selected 2D class averages were subjected to 3D classification without imposing symmetry. An atomic model of the PSII-LHCII complex (PDB code 3JCU) (11) was low-pass filtered to a resolution of 60 Å as an initial model for 3D classification. In total, 42,033 particles were selected from the 3D classification for further 3D autorefinement. Further improvements of the density map were achieved with a set of soft masks for the dose-weighted micrographs and by applying twofold symmetry. The final resolution of the density map is 3.02 Å based on the gold-standard FSC = 0.143 criterion (fig. S2B) (61), although the map densities in the regions between the MT tetramers and FCP-F monomer as well as some other FCP regions have a lower resolution based on the local resolution estimation (fig. S2C) (62). The lower resolution of the regions of some FCP-A monomers and the FCP-F monomer may be due to the flexible interactions between the FCP-A and FCP-F monomers. To improve the resolution of FCP tetramers and the FCP-F monomer, the same particles were first auto-refined without imposing symmetry.
A focused refinement was performed with a soft mask for the regions of MT tetramers, ST tetramers, FCP-D, and the FCP-F monomer following the asymmetric reconstruction, which improved the resolution of this local map to 3.97 Å.
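The particle counts reported for each processing stage can be summarized as retention fractions. The Python sketch below uses only the counts quoted above; the stage labels and the names `STAGES` and `retention` are illustrative.

```python
MICROGRAPHS = 1151  # micrographs used for automatic picking (from the text)

# (stage label, particle count) in processing order; counts are from the text.
STAGES = [
    ("auto-picked", 196_802),
    ("after screening and 2D classification", 154_696),
    ("after 3D classification", 42_033),
]


def retention(stages):
    """Return (label, count, fraction of the first stage retained) for each stage."""
    first_count = stages[0][1]
    return [(label, count, count / first_count) for label, count in stages]


print(f"auto-picked particles per micrograph: {STAGES[0][1] / MICROGRAPHS:.1f}")
for label, count, fraction in retention(STAGES):
    print(f"{label:<40} {count:>8,d}  {fraction:6.1%}")
```

Roughly one in five of the automatically picked particles contributes to the final reconstruction.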

Model building and refinement

For model building of the PSII-FCP supercomplex, the structures of spinach PSII (PDB code 3JCU) (11), Thermosynechococcus vulcanus PSII (TvPSII, PDB ID 3WU2) (1), C. gracilis Psb31 (PDB ID 4K7B) (17) and FCP (PDB ID 6A2W) (27) of P. tricornutum were first manually placed and then rigid-body fitted into the 3.0-Å resolution cryo-EM map with UCSF Chimera (63). The amino acid sequences were then mutated to the corresponding sequences of C. gracilis obtained from the transcriptome sequencing and mass spectrometry performed in the present study, which include D1, D2, CP43, CP47, PsbO, PsbU, PsbV, PsbQ′, Psb31, PsbK, PsbT, and PsbW (table S2). Most of the subunits in the PSII core were identified and show high sequence homology with those of other species, and only Psb34 and PsbG were assigned as polyalanines (table S2). The sequences of FCP-A comprising the tetramers were identified to be encoded by an lhcf8 gene, based on high-quality cryo-EM maps of subunits located close to the PSII core (MTm1, MTm4, and STm3 monomers). The structures of both ST and MT were built with the FCP-A sequence; however, we cannot exclude the possibility that the sequences of the monomers comprising the two tetramers may be slightly different, owing to the limited resolution as well as the very similar sequences between different FCP genes.

The monomeric FCP subunit with the largest molecular weight was determined to be an Lhca-like protein, named FCP-D here. However, the sequences of the other two monomeric FCP subunits (FCP-E and FCP-F) cannot be determined from their cryo-EM maps, and therefore the sequence of Lhcf4 from P. tricornutum, whose structure has been determined recently (27), was used to build their structures. Based on the Lhcf4 structure and the cryo-EM density of the FCP-F monomer, we can assign the location and orientation of the transmembrane helices and pigments of FCP-F unambiguously. In addition, the cryo-EM maps clearly showed the presence of a helix D and an extended C-terminal region in FCP-A and FCP-E, although both the lhcf8 gene from C. gracilis and the lhcf4 gene from P. tricornutum, which were used to build the structures of these FCP subunits, lack the corresponding sequences. Thus, the sequences of lhcf8 in these regions from T. pseudonana were used to model the structures of helix D and the extended C-terminal region of FCP-A (fig. S6 and table S2).

Additional adjustments of the backbone and side chain structures were performed manually with COOT (64). The model of the whole PSII-FCP supercomplex was refined in real space against the cryo-EM map by Phenix (65). The Phenix-refined models were edited in COOT to resolve a few atomic clashes and geometry problems. The edited models were then refined again using Phenix. Several iterations of these two steps were performed to reach the final atomic model. During real-space refinement, the distances between each atom of the OEC and between the central magnesium ion of chlorophyll molecules and the coordinating ligands were restrained according to the values obtained from the high-resolution crystal structures.

The proteins, Chls and carotenoids assigned in the present structure are summarized in tables S1 and S2. The pigments in FCP-A/E/F were numbered following the scheme used in the crystal structure of an FCP dimer (27), which was based on the number of coordinating residues. The pigments in FCP-D were numbered as those in plant LHCI (8) [and LHCII (3)], since it is an Lhca-like protein. The present results showed that one Chl c is found in FCP-A and assigned as Chl c403, and one additional Chl site is assigned close to helix E as Chl a410 in the FCP-A monomers of the ST tetramer. This Chl a410 site was clearly identified in the map of STm3 and other ST monomers, but absent in the MTm1 and MTm4 monomers, because of their close interactions with the helix E region of the FCP-D monomer. In the MTm2 and MTm3 monomers, it is also hard to find Chl a410 because of the poor local density maps. One or two extra Chls were determined in FCP-D and FCP-E as possible energy transfer mediators. Ddx616 in FCP-D and its orientation were determined (fig. S3); however, the possible Ddx site in FCP-A was still unclear. Because various subunits are connected to the FCP-A tetramers, the monomeric FCP-As show larger heterogeneities, especially at the Fx306 binding site.

Supplementary Materials

Figs. S1 to S10

Tables S1 to S4


References and Notes

Acknowledgments: We thank J. Lei and the staff at the Tsinghua University Branch of the National Center for Protein Sciences Beijing for providing facility support, the “Explorer 100” cluster system of the Tsinghua National Laboratory for Information Science and Technology for providing computation resources, and J. Qin and Y. Wang from the National Center for Protein Sciences (The PHOENIX Center, Beijing) for the mass spectrometry analysis. Funding: This work was supported by the National Key R&D Program of China (2017YFA0503700 to J.-R.S., 2016YFA0501101 and 2017YFA0504600 to S.-F.S.), the National Natural Science Foundation of China (31861143048 and 31670745 to S.-F.S.), a Strategic Priority Research Program of CAS (XDB17000000), a CAS Key Research program for Frontier Science (QYZDY-SSW-SMC003), a National Basic Research Program of China (2015CB150100 to T.K.), and JSPS KAKENHI no. JP17H06433 (to J.-R.S.). Author contributions: T.K., S.-F.S., and J.-R.S. conceived the project; S.Z. and W.W. performed the sample isolation and characterization; X.P. took the cryo-EM images, processed the cryo-EM data, and built the structure model; X.P., W.W., and S.Z. analyzed the transcriptome sequencing results and refined the structure model; S.Z., W.W., X.P., T.K., S.-F.S., and J.-R.S. jointly wrote the manuscript; and all authors joined the discussion of the results. Competing interests: The authors declare that they have no competing interests. Data and materials availability: Atomic coordinates and cryo-EM map have been deposited in the Protein Data Bank and the Electron Microscopy Data Bank under IDs 6JLU and EMD-9839, respectively. All other data are presented in the main text or supplementary materials.