Computational Analysis of Chromophore Tripeptides Following Fusion of Enhanced Green Fluorescent Protein and Cell-penetrating Peptides

Cell-penetrating peptides (CPPs) are small peptides that can transfer other materials into a cellular compartment. In this research, we studied the effect of fusing new CPPs to the N-terminal of enhanced green fluorescent protein (eGFP) on the ability of the latter to fluoresce. Results showed that the recombinant protein CPPs-eGFP could be successfully expressed in Escherichia coli. In contrast to E. coli expressing wild-type eGFP, which could fluoresce under ultraviolet (UV) or visible light, E. coli expressing CPPs-eGFP lost their ability to fluoresce. PyMOL, a molecular visualization system, revealed that fusion of the new CPPs to the N-terminal of eGFP alters interactions between chromophore-forming tripeptides and the adjacent amino acids of other tripeptides. Disruption of these interactions induced structural changes in eGFP that caused it to lose its ability to fluoresce. We suggest performing computational analyses to predict the biological function of new fusion proteins prior to starting laboratory work.


Introduction
The successful delivery of a material into the intracellular compartment is an important endeavor in gene therapy, DNA/mRNA vaccination, genome editing, and many other biological applications [1]. Viral vectors are the most well-developed vehicles used to deliver extracellular materials. The use of viral vectors ensures that the extracellular material is effectively distributed into the intracellular compartment. However, these vectors may also induce an immune response that could affect their transport efficiency. Some viral vectors may even cause severe side effects. To overcome those obstacles, researchers over the last 20 years have sought to develop vehicles based on small peptides. The first peptide shown to deliver a material larger than itself was derived from the transactivator of transcription (Tat) protein, a human immunodeficiency virus accessory protein [2]. Small peptides that can transfer other materials into a cellular compartment are called cell-penetrating peptides (CPPs) [3].
In vitro and in vivo studies have demonstrated the obstacles that must be overcome by CPPs. The presence of proteases in the plasma, cell membrane, endosomal environment, and nuclear membrane could hinder the effectiveness of CPPs in delivering their cargo to the intracellular compartment [4]. Newer CPPs have been developed to avoid such issues [5,6]. These CPPs, such as ALMR and SIMR, are designed to deliver nucleic acids into the nucleus of non-dividing cells [5,6]. In a previous study, these CPPs protected DNA from plasma-nuclease degradation, delivered DNA across the cell membrane, escaped from the endosomal compartment, and crossed the nuclear membrane [6]. The ability of these new CPPs to deliver protein cargos into the intracellular compartment must be investigated further. The discovery of CPPs that can deliver proteins or molecules into an intracellular environment provides new opportunities for the development of medical treatments using proteins or molecules previously considered incompatible with therapy [7,8].
Proteins may be incorporated into CPPs via their fusion and expression in a suitable system, such as prokaryotes [9]. A previous study reported the ability of prokaryotic expression systems to express CPPs fused to many reporter proteins, such as GFP [10]. Fusion of CPPs to the C- or N-terminal of a protein could alter the structure and biological function of the latter [11]. In this research, we studied the effect of fusing ALMR and SIMR to the N-terminal of eGFP on the ability of the latter to fluoresce. Analyses of the Escherichia coli expression, biological properties, and structures of the resulting proteins were also performed.
December 2020, Vol. 24, No. 4
Protein expression. ALMR-eGFP and eGFP were expressed in E. coli DH5α, and SIMR-eGFP was expressed in E. coli BL21 [Novagen]. Protein expression was conducted using the method described in QIAexpressionist [12]. One bacterial colony was grown in LB broth [HiMedia] containing 100 μg/ml ampicillin. After overnight incubation at 37 °C, the starter culture was used to inoculate a larger volume of Terrific broth containing 100 μg/ml ampicillin at a 1:10 ratio. After 2 hours of incubation at 37 °C, IPTG was added to the bacterial cultures to a final concentration of 1 mM. The cultures were incubated for another 4 hours, and the GFP fluorescence of the bacterial pellets was observed directly with the naked eye and under shortwave UV light. ALMR-eGFP, SIMR-eGFP, and eGFP were analyzed using SDS-PAGE.
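The working concentrations given above (100 μg/ml ampicillin, 1 mM IPTG, a 1:10 inoculum) follow from the usual dilution relation C1·V1 = C2·V2. The sketch below illustrates that arithmetic only. The culture volume and the stock concentrations are hypothetical values chosen for the example and are not reported in this study, and the 1:10 ratio is assumed here to mean one part starter culture per ten parts broth.

```python
def stock_volume_ml(final_conc, culture_volume_ml, stock_conc):
    """Volume of stock to add so that the culture reaches final_conc.

    Uses C1*V1 = C2*V2 and ignores the small volume added by the stock itself.
    final_conc and stock_conc must share the same unit.
    """
    return final_conc * culture_volume_ml / stock_conc


def starter_volume_ml(broth_volume_ml, ratio=10.0):
    """Starter culture volume for a 1:ratio inoculum (assumed 1 part per `ratio` parts broth)."""
    return broth_volume_ml / ratio


if __name__ == "__main__":
    broth_ml = 100.0            # hypothetical culture volume
    iptg_stock_mM = 1000.0      # hypothetical 1 M IPTG stock
    amp_stock_ug_ml = 100000.0  # hypothetical 100 mg/ml ampicillin stock

    print("starter culture (ml):", starter_volume_ml(broth_ml))
    print("IPTG stock (ml):", stock_volume_ml(1.0, broth_ml, iptg_stock_mM))
    print("ampicillin stock (ml):", stock_volume_ml(100.0, broth_ml, amp_stock_ug_ml))
```

The same function serves both additives because only the ratio of final to stock concentration matters.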
Bacterial lysis. Bacteria expressing eGFP proteins were lysed under native conditions following the methods described in QIAexpressionist [12]. The bacterial pellet was resuspended in native buffer (50 mM NaH2PO4 [Applichem], 300 mM NaCl, 10 mM imidazole, pH 8), and the bacterial suspension was sonicated for six bursts; each burst lasted 20 seconds, and the interval between bursts was 10 seconds. After sonication, the bacterial suspension was centrifuged at 8000 rpm for 30 minutes at 4 °C. The supernatant was stored at −30 °C. Bacteria expressing ALMR-eGFP and SIMR-eGFP were lysed under denaturing conditions by using denaturant buffer (100 mM NaH2PO4 [Applichem], 10 mM Tris-Cl [Thermo Scientific], 6 M guanidine hydrochloride [Bio Basic Inc.], pH 8) [13]. After incubation in a rotary shaker for 1 hour at room temperature, the bacteria were centrifuged at 8000 rpm for 30 minutes at 4 °C to separate proteins and cell debris. The supernatant was stored at −30 °C.
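Centrifugation speeds given in rpm can only be reproduced on another instrument once they are converted to relative centrifugal force (RCF), which depends on the rotor radius. The sketch below applies the standard conversion RCF = 1.118 × 10^-5 × r (cm) × rpm^2 and also totals the sonication schedule described above. The rotor radius is not stated in this study, so the 10 cm used here is purely a hypothetical value.

```python
def rcf_from_rpm(rpm, rotor_radius_cm):
    """Relative centrifugal force (x g) from rotor speed and radius in cm."""
    return 1.118e-5 * rotor_radius_cm * rpm ** 2


def sonication_times(bursts, burst_s, rest_s):
    """Return (total sonication time, total elapsed time) in seconds.

    Rest intervals are counted only between bursts, so there are bursts - 1 of them.
    """
    on_time = bursts * burst_s
    elapsed = on_time + (bursts - 1) * rest_s
    return on_time, elapsed


if __name__ == "__main__":
    # 8000 rpm as in the text; 10 cm rotor radius is an assumption for illustration.
    print("RCF (x g):", round(rcf_from_rpm(8000, 10.0), 1))
    # Six 20-second bursts separated by 10-second intervals, as in the text.
    print("sonication (on, elapsed) in s:", sonication_times(6, 20, 10))
```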
Protein purification. Recombinant proteins were purified by immobilized metal affinity chromatography (IMAC) according to the principles of histidine-Ni-NTA binding [14] by using a commercial kit from Qiagen. Purification was conducted as described by the manufacturer. Recombinant proteins were desalted using PD10 columns (GE Healthcare) following the manufacturer's recommendation.
Western blot analysis. Western blot analysis was conducted following the methods described by Ni et al. [15]. The proteins separated by SDS-PAGE were transferred to a nitrocellulose membrane, which was subsequently blocked with 1% skim milk (BioRad) and incubated in PBS-diluted primary antibody (rabbit polyclonal antibody against GFP; VPRVC FKUI) at a 1:10 ratio (v/v) at room temperature. The membrane was washed thrice with PBS-Tween and then incubated with the secondary antibody (biotinylated anti-rabbit IgG). Following the washing steps described above, the membrane was incubated with streptavidin HRP for 1 hour at room temperature and washed thrice with PBS. Protein bands were visualized by adding Immunostar chemiluminescent substrate (Invitrogen) to the membrane. Western blot bands were captured using an LA 4000 instrument (Thermo Scientific).
Protein structure analysis. RaptorX software was used to obtain the tertiary structure and 3D model of the proteins [16]. PyMOL Molecular Graphics System version 1.7.x was used to visualize the predicted structures of the proteins [17].
[...] conditions, but neither ALMR-eGFP nor SIMR-eGFP could be purified (unpublished data). This finding may be attributed to the burial of the 6×histidine tag in these proteins. Purification of ALMR-eGFP and SIMR-eGFP was performed under denaturing conditions (Figure 3). However, nonspecific bands could be observed in the purified ALMR-eGFP and SIMR-eGFP (Figure 3A). Purified eGFP (Figures 3B and 3C) did not show nonspecific bands. Western blot analysis was used to verify the recombinant proteins on the basis of their reactivity to a specific antibody. The results showed that ALMR-eGFP, SIMR-eGFP, and eGFP react with the rabbit polyclonal antibody against eGFP. For each of these proteins, the polyclonal antibody reacted with only a single protein band, which indicates that the nonspecific proteins copurified by Ni-NTA are not reactive to antibodies against GFP (Figure 4).

Results and Discussion
PyMOL revealed that ALMR-eGFP and SIMR-eGFP have structures resembling that of eGFP (Figure 5). eGFP has a unique barrel shape formed by 11 β-sheets and a coaxial α-helix traversing the center of the β-barrel. Differences in the diameters of the β-barrels of ALMR-eGFP, SIMR-eGFP, and eGFP were observed.
The diameters of the β-barrels of ALMR-eGFP, SIMR-eGFP, and eGFP were 19.7, 19.3, and 19.4 Å, respectively. The structure of the tripeptide in ALMR-eGFP is different from those in SIMR-eGFP and eGFP. Specifically, the tripeptide in ALMR-eGFP forms a loop structure, whereas the tripeptides in SIMR-eGFP and wild-type eGFP form an α-helical structure (Figures 5a and 5b).
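How the barrel diameters were measured is not described here. As one possible illustration, the sketch below estimates a diameter from Cα coordinates by taking the longest principal axis of the point cloud as the barrel axis and doubling the mean radial distance of the atoms from that axis. This is an assumed procedure for illustration, not the method used in this study; the function name and the synthetic test cylinder are hypothetical, and the result depends on which atoms are included.

```python
import math


def barrel_diameter(ca_coords, iterations=200):
    """Estimate the diameter of a roughly cylindrical set of 3D points.

    The barrel axis is taken as the dominant eigenvector of the coordinate
    covariance matrix (found by power iteration), which assumes the barrel
    is longer than it is wide. The diameter is twice the mean distance of
    the points from that axis.
    """
    n = len(ca_coords)
    centroid = [sum(p[i] for p in ca_coords) / n for i in range(3)]
    centered = [[p[i] - centroid[i] for i in range(3)] for p in ca_coords]

    # 3x3 covariance matrix of the centered coordinates
    cov = [[sum(p[i] * p[j] for p in centered) / n for j in range(3)]
           for i in range(3)]

    # Power iteration converges to the direction of largest variance
    axis = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * axis[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(v * v for v in nxt))
        axis = [v / norm for v in nxt]

    radii = []
    for p in centered:
        along = sum(p[i] * axis[i] for i in range(3))
        perp_sq = sum(v * v for v in p) - along * along
        radii.append(math.sqrt(max(perp_sq, 0.0)))
    return 2.0 * sum(radii) / n


def synthetic_cylinder(radius=10.0, length=40.0, per_ring=12, rings=11):
    """Hypothetical test data: points on an ideal cylinder aligned with z."""
    points = []
    for k in range(rings):
        z = length * k / (rings - 1)
        for m in range(per_ring):
            angle = 2.0 * math.pi * m / per_ring
            points.append((radius * math.cos(angle), radius * math.sin(angle), z))
    return points


if __name__ == "__main__":
    print("estimated diameter (A):", round(barrel_diameter(synthetic_cylinder()), 2))
```

Differences of a few tenths of an ångström, such as those reported above, are small relative to what the choice of atoms and measurement procedure can introduce, so the same procedure should be applied to every model being compared.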
The interactions of tripeptides with adjacent amino acids and the orientation of some amino acids in ALMR-eGFP and SIMR-eGFP differed from those in eGFP. In eGFP, Ser65 and Tyr66 interact with His148 and Glu222, which are located on β-sheets, while Gly67 interacts with Gln94 and Arg96, which are also located on β-sheets.
ALMR and SIMR are new CPPs that bind and deliver DNA into the nucleus of dividing and non-dividing cells [5,6]. The ability of these CPPs to deliver extracellular proteins to intracellular compartments remains debated.
In this study, we fused ALMR and SIMR to the N-terminal of eGFP. GFP and its variants are reporter proteins widely used to study biological processes in many species [18,19]. We found that fusion with ALMR and SIMR alters the GFP structure and causes it to lose its ability to fluoresce.
All of the proteins used in this study were fused to 6×histidine to assist in their purification. Addition of 6×histidine alone to the N-terminal of eGFP does not alter GFP fluorescence (Figure 1). This finding is consistent with the results of Deng and Boxer in 2020 [20]. Purification of ALMR-eGFP and SIMR-eGFP was performed under denaturing conditions in which eGFP may be unable to fluoresce. Thus, the proteins were desalted using a PD10 column to reduce the effects of the denaturant. Reducing the denaturant concentration did not restore the fluorescence of ALMR-eGFP and SIMR-eGFP. This finding indicates that the structures of ALMR-eGFP and SIMR-eGFP had already changed during their expression in E. coli.
PyMOL computational analysis allowed the intensive study of the structures of ALMR and SIMR upon fusion with eGFP. Fusion of ALMR and SIMR to the N-terminal of eGFP did not affect the formation of 11 β-sheets and a central coaxial helix to build a cylindrical β-barrel structure resembling that of eGFP [21]. However, this fusion induced changes in the structure of the latter that caused it to lose its fluorescence.
The chromophore tripeptide, which comprises amino acid numbers 65-67, of Aequorea victoria's GFP plays an important role in its fluorescence [24]. Many proteins in nature contain the tripeptide sequence, but most of them cannot fluoresce. This finding highlights the crucial role of other amino acids in the generation of chromophores [20,21]. Some studies have demonstrated the role of the interaction of tripeptides with adjacent and remote amino acids from other tripeptides in the formation of chromophores [24,25]. A limitation of our study is that our computational analysis focuses on interactions between the amino acids of a tripeptide and those of another adjacent tripeptide. Alterations in these interactions affect GFP fluorescence [24].
The tripeptide Thr65-Tyr66-Gly67 is located on the α-helix at the center of the β-barrel structure [20]. This rigid β-barrel structure makes up a protein matrix that surrounds the tripeptide [24,26,27], protects it from nonradiative deactivation by oxygen and light in the environment, and ensures its flexibility [22,26,27]. In ALMR-eGFP, the structure of the tripeptide changes from α-helical to β-sheets. This change affects the interaction between a chromophore-forming tripeptide and its adjacent amino acids. Glycine has a single H atom as its side chain, which confers flexibility [29]. The interaction of Gly67 with Thr65 forms a kinked internal α-helix that places Gly67 close to Thr65 for nucleophilic attack during chromophore synthesis [25]. In ALMR-eGFP, Gly67 loses its interaction with Thr65. The interaction of Glu222 and Thr65 determines the ability of GFP to absorb light at 400 nm [20]. This crucial interaction is found in ALMR-eGFP; thus, ALMR-eGFP can absorb light at 400 nm but fails to synthesize the chromophore or emit light at 509 nm. The proximity of backbone atoms in Thr65 and Tyr66 determines the cyclization of the imidazole ring, which is a critical step in eGFP fluorescence [23]. The change in the orientation of the imidazole ring of His148 in ALMR-eGFP abolishes the His148-Tyr66 interaction. The anionic interaction between His148 and Tyr66 stabilizes the interactions of the tripeptide with crucial amino acids, namely, Gln94, Arg96, and Glu222, in adjacent tripeptides [23]. Loss of this interaction in ALMR-eGFP destabilizes the tripeptide orientation and structure.
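Statements such as "the His148-Tyr66 interaction is abolished" can be checked against a predicted model by measuring the shortest atom-to-atom distance between the two residues. The sketch below does this directly on PDB-format ATOM records using the fixed column layout of that format. It is an illustrative stand-in and not the PyMOL procedure used in this study. The coordinates in DEMO_LINES are invented for the example, and the 3.5 Å cutoff is a commonly used hydrogen-bond heuristic assumed here. In a real fusion model, residue numbers would also need to be offset by the length of the N-terminal tag and CPP.

```python
import math


def atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a minimal PDB ATOM record (columns 1-54) for the demo data."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atoms(pdb_lines):
    """Extract residue number, atom name, and coordinates from ATOM/HETATM records."""
    atoms = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atoms.append({
            "name": line[12:16].strip(),
            "res_name": line[17:20].strip(),
            "res_seq": int(line[22:26]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms


def min_distance(atoms, res_a, res_b):
    """Shortest distance (in angstroms) between any atom of res_a and any atom of res_b."""
    coords_a = [a["xyz"] for a in atoms if a["res_seq"] == res_a]
    coords_b = [a["xyz"] for a in atoms if a["res_seq"] == res_b]
    if not coords_a or not coords_b:
        raise ValueError("one of the residues is missing from the model")
    return min(math.dist(p, q) for p in coords_a for q in coords_b)


# Invented coordinates, for illustration only (not taken from any model in this study)
DEMO_LINES = [
    atom_line(1, "CZ", "TYR", "A", 66, -1.4, 0.0, 0.0),
    atom_line(2, "OH", "TYR", "A", 66, 0.0, 0.0, 0.0),
    atom_line(3, "ND1", "HIS", "A", 148, 2.8, 0.0, 0.0),
    atom_line(4, "CE1", "HIS", "A", 148, 3.5, 1.0, 0.0),
]

if __name__ == "__main__":
    CUTOFF = 3.5  # assumed hydrogen-bond distance heuristic, in angstroms
    d = min_distance(parse_atoms(DEMO_LINES), 66, 148)
    print(f"Tyr66-His148 minimum distance: {d:.2f} A, within cutoff: {d <= CUTOFF}")
```

Running the same measurement on the eGFP, ALMR-eGFP, and SIMR-eGFP models would give a numerical basis for comparing the interactions discussed above, although a distance cutoff alone does not capture side-chain orientation.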
Fusion of SIMR to the N-terminal of eGFP triggers the formation of a cavity that leaves the tripeptide directly exposed to oxygen and light in the environment. The fluorescence of GFP begins with the folding of the protein, which promotes the cyclization of Thr65 and Gly67. This process induces the formation of an imidazoline-5-one intermediate structure followed by low oxygenation of the Tyr66 side chain [25,30]. However, excess oxygen causes photobleaching of the protein [25]. SIMR-eGFP may absorb light at 400 nm because the Glu222-Thr65 interaction is present. In SIMR-eGFP, the chromophore is formed, but excessive exposure to light and oxygen causes GFP photobleaching. SIMR-eGFP also shows a loss of the His148-Tyr66 interaction, which stabilizes the interaction of the tripeptide with the adjacent amino acids of other tripeptides.

Conclusion
Using PyMOL, we found that fusion of ALMR and SIMR to the N-terminal of eGFP induces structural changes in the latter and renders it unable to fluoresce. We recommend using computational analysis to predict the biological function of a new fusion protein before starting laboratory work to produce recombinants.