The Proximal Origin of SARS-CoV-2 (by Google Translate)
Kristian G. Andersen1,2*, Andrew Rambaut3, W. Ian Lipkin4, Edward C. Holmes5 and Robert F. Garry6,7
1Department of Immunology and Microbiology, The Scripps Research Institute, CA 92037, USA.
2Scripps Research Translational Institute, La Jolla, CA 92037, USA.
7Zalgen Labs, LLC, Germantown, MD, USA.
Kristian G. Andersen, Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA.
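SARS-CoV-2 is the seventh member of the Coronaviridae known to infect humans. Three of these viruses, SARS-CoV-1, MERS and SARS-CoV-2, can cause severe disease; four, HKU1, NL63, OC43 and 229E, are associated with mild respiratory symptoms. Here we review what can be deduced about the origin and early evolution of SARS-CoV-2 from the comparative analysis of available genome sequence data. In particular, we offer a perspective on the notable features of the SARS-CoV-2 genome and discuss scenarios by which these features could have arisen. Importantly, this analysis provides evidence that SARS-CoV-2 is neither a laboratory construct nor a purposefully manipulated virus.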
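The genomic comparison of alpha- and betacoronaviruses (family Coronaviridae) described below identifies two notable features of the SARS-CoV-2 genome: (i) based on structural modelling and early biochemical experiments, SARS-CoV-2 appears to be optimized for binding to the human ACE2 receptor; (ii) the highly variable spike (S) protein of SARS-CoV-2 has a polybasic (furin) cleavage site at the S1 and S2 boundary via the insertion of twelve nucleotides. Additionally, this event led to the acquisition of three predicted O-linked glycans around the polybasic cleavage site.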
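The receptor binding domain (RBD) in the spike protein of SARS-CoV and SARS-related coronaviruses is the most variable part of the virus genome. Six residues in the RBD appear to be critical for binding to the human ACE2 receptor and determining host range. Using coordinates based on the Urbani strain of SARS-CoV, they are Y442, L472, N479, D480, T487 and Y491 [1]. The corresponding residues in SARS-CoV-2 are L455, F486, Q493, S494, N501 and Y505. Five of these six residues are mutated in SARS-CoV-2 compared to its most closely related virus, RaTG13, sampled from a Rhinolophus affinis bat, to which it is ~96% identical (Figure 1a). Based on modelling [1] and biochemical experiments [3,4], SARS-CoV-2 seems to have an RBD that may bind with high affinity to ACE2 from human, non-human primate, ferret, pig and cat, as well as other species with high receptor homology [1]. In contrast, SARS-CoV-2 may bind less efficiently to ACE2 in other species associated with SARS-like viruses, including rodents and civets.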
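The phenylalanine (F) at residue 486 in the SARS-CoV-2 S protein corresponds to L472 in the SARS-CoV Urbani strain. Notably, in SARS-CoV cell culture experiments L472 mutates to phenylalanine (L472F) [5], which is predicted to be optimal for binding of the SARS-CoV RBD to the human ACE2 receptor [6]. However, a phenylalanine in this position is also present in several SARS-like CoVs from bats (Figure 1a). While these analyses suggest that SARS-CoV-2 may be capable of binding the human ACE2 receptor with high affinity, the interaction is not predicted to be optimal. Additionally, several of the key residues in the RBD of SARS-CoV-2 differ from those previously described as optimal for human ACE2 receptor binding [6]. In contrast to these computational predictions, recent binding studies indicate that SARS-CoV-2 binds with high affinity to human ACE2.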
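The second notable feature of SARS-CoV-2 is a predicted polybasic cleavage site (RRAR) in the spike protein at the junction of S1 and S2, the two subunits of the spike protein (Figure 1b) [8,9]. In addition to two basic arginines and an alanine at the cleavage site, a leading proline is also inserted; thus, the fully inserted sequence is PRRA (Figure 1b). The strong turn created by the proline insertion is predicted to result in the addition of O-linked glycans to S673, T678 and S686, which flank the polybasic cleavage site. A polybasic cleavage site has not previously been observed in related lineage B betacoronaviruses and is a unique feature of SARS-CoV-2. Some human betacoronaviruses, including HCoV-HKU1 (lineage A), have polybasic cleavage sites, as well as predicted O-linked glycans near the S1/S2 cleavage site.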
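While the functional consequence of the polybasic cleavage site in SARS-CoV-2 is unknown, experiments with SARS-CoV have shown that engineering such a site at the S1/S2 junction enhances cell–cell fusion but does not affect virus entry [10]. Polybasic cleavage sites allow effective cleavage by furin and other proteases, and can be acquired at the junction of the two subunits of the haemagglutinin (HA) protein of avian influenza viruses in conditions that select for rapid virus replication and transmission (e.g. dense chicken populations). HA serves a similar function in cell–cell fusion and viral entry as the coronavirus S protein. Acquisition of a polybasic cleavage site in HA, by either insertion or recombination, converts low pathogenicity avian influenza viruses into highly pathogenic forms [11-13]. The acquisition of polybasic cleavage sites by the influenza virus HA has also been observed after repeated passage in cell culture or through animals [14,15]. Similarly, an avirulent isolate of Newcastle Disease virus became highly pathogenic during serial passage in chickens by incremental acquisition of a polybasic cleavage site at the junction of its fusion protein subunits [16]. The potential function of the three predicted O-linked glycans is less clear, but they could create a "mucin-like domain" that would shield potential epitopes or key residues on the SARS-CoV-2 spike protein. Biochemical analyses or structural studies are required to determine whether the predicted O-linked glycan sites are utilized.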
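Figure 1. (a) Mutations in contact residues of the SARS-CoV-2 spike protein. The spike protein of SARS-CoV-2 (top) was aligned against the most closely related SARS-like CoVs and SARS-CoV-1. Key residues in the spike protein that make contact with the ACE2 receptor are marked with blue boxes in both SARS-CoV-2 and the SARS-CoV Urbani strain. (b) Acquisition of the polybasic cleavage site and O-linked glycans. The polybasic cleavage site is marked in grey and the three adjacent predicted O-linked glycans in blue. Both the polybasic cleavage site and the O-linked glycans are unique to SARS-CoV-2 and have not previously been seen in lineage B betacoronaviruses. Sequences shown are from NCBI GenBank, accession numbers MN908947, MN996532, AY278741, KY417146 and MK211376. The pangolin coronavirus sequences are a consensus generated from SRR10168377 and SRR10168378 (NCBI BioProject PRJNA573298) [18,19].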
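All SARS-CoV-2 genomes sequenced so far have the well-adapted RBD and the polybasic cleavage site, and are therefore derived from a common ancestor that had these features. The presence in pangolins of an RBD very similar to the one in SARS-CoV-2 means that this RBD was probably already present in the virus that jumped to humans, even though we do not yet have the exact non-human progenitor virus. This leaves the insertion of the polybasic cleavage site to occur during human-to-human transmission. Following the example of the influenza A virus HA gene, a specific insertion or recombination event is required for SARS-CoV-2 to emerge as an epidemic pathogen.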
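In the midst of the global COVID-19 public health emergency, it is reasonable to ask why the origin of the epidemic matters. A detailed understanding of how an animal virus crossed species boundaries to infect humans so effectively will help in the prevention of future zoonotic events. For example, if SARS-CoV-2 pre-adapted in another animal species, then we are at risk of future re-emergence events even if the current epidemic is controlled. In contrast, if the adaptive process we describe occurred in humans, then even repeated zoonotic transfers are unlikely to take off unless the same series of mutations occurs. In addition, identifying the closest animal relatives of SARS-CoV-2 will greatly assist studies of virus function. Indeed, the availability of the RaTG13 bat sequence facilitated the comparative genomic analysis performed here, helping to reveal the key mutations in the RBD as well as the insertion of the polybasic cleavage site.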
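We thank all those who have contributed SARS-CoV-2 genome sequences to the GISAID database. We thank the Wellcome Trust for support. ECH is supported by an ARC Australian Laureate Fellowship (FL170100022). KGA is supported by NIH grant 1U19AI135995-01. AR is supported by the Wellcome Trust (Collaborators Award 206298/Z/17/Z, ARTIC network) and the European Research Council (grant agreement no. 725422, ReservoirDOCS).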
Kristian G. Andersen1,2*, Andrew Rambaut3, W. Ian Lipkin4, Edward C. Holmes5 & Robert F. Garry6,7
1Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA.
2Scripps Research Translational Institute, La Jolla, CA 92037, USA.
3Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK.
4Center for Infection and Immunity, Mailman School of Public Health of Columbia University, New York, New York, USA.
5Marie Bashir Institute for Infectious Diseases and Biosecurity, School of Life and Environmental Sciences and School of Medical Sciences, The University of Sydney, Sydney, Australia.
6Tulane University, School of Medicine, Department of Microbiology and Immunology, New Orleans, LA, USA.
7Zalgen Labs, LLC, Germantown, MD, USA.
*Corresponding author:
Kristian G. Andersen
Department of Immunology and Microbiology,
The Scripps Research Institute,
La Jolla, CA 92037, USA.
Since the first reports of a novel pneumonia (COVID-19) in Wuhan city, Hubei province, China, there has been considerable discussion and uncertainty over the origin of the causative virus, SARS-CoV-2. Infections with SARS-CoV-2 are now widespread in China, with cases in every province. As of 14 February 2020, 64,473 such cases have been confirmed, with 1,384 deaths attributed to the virus. These official case numbers are likely an underestimate because of limited reporting of mild and asymptomatic cases, and the virus is clearly capable of efficient human-to-human transmission. Based on the possibility of spread to countries with weaker healthcare systems, the World Health Organization has declared the COVID-19 outbreak a Public Health Emergency of International Concern (PHEIC). There are currently neither vaccines nor specific treatments for this disease.
SARS-CoV-2 is the seventh member of the Coronaviridae known to infect humans. Three of these viruses, SARS-CoV-1, MERS-CoV, and SARS-CoV-2, can cause severe disease; four, HKU1, NL63, OC43 and 229E, are associated with mild respiratory symptoms. Herein, we review what can be deduced about the origin and early evolution of SARS-CoV-2 from the comparative analysis of available genome sequence data. In particular, we offer a perspective on the notable features in the SARS-CoV-2 genome and discuss scenarios by which these features could have arisen. Importantly, this analysis provides evidence that SARS-CoV-2 is neither a laboratory construct nor a purposefully manipulated virus.
The genomic comparison of both alpha- and betacoronaviruses (family Coronaviridae) described below identifies two notable features of the SARS-CoV-2 genome: (i) based on structural modelling and early biochemical experiments, SARS-CoV-2 appears to be optimized for binding to the human ACE2 receptor; (ii) the highly variable spike (S) protein of SARS-CoV-2 has a polybasic (furin) cleavage site at the S1 and S2 boundary via the insertion of twelve nucleotides. Additionally, this event led to the acquisition of three predicted O-linked glycans around the polybasic cleavage site.
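As a quick check on the arithmetic in feature (ii), an in-frame insertion of twelve nucleotides spans four codons and therefore adds four amino acids. The Python sketch below illustrates this with a deliberately tiny codon table. The nucleotide string `INSERT_NT` is an assumption used for illustration only (the text here does not quote the inserted nucleotides); it was chosen so that it encodes the four-residue insert PRRA discussed later in this paper.

```python
# Minimal subset of the standard genetic code: only the codons used below.
CODON_TABLE = {"CCT": "P", "CGG": "R", "GCA": "A"}


def translate(nucleotides: str) -> str:
    """Translate an in-frame nucleotide string into one-letter amino acids."""
    if len(nucleotides) % 3 != 0:
        raise ValueError("an in-frame insertion must be a multiple of three nucleotides")
    codons = [nucleotides[i:i + 3] for i in range(0, len(nucleotides), 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


# ASSUMPTION: illustrative insert sequence, not quoted from the text above.
INSERT_NT = "CCTCGGCGGGCA"
INSERT_AA = translate(INSERT_NT)

print(f"{len(INSERT_NT)} nucleotides -> {len(INSERT_AA)} residues: {INSERT_AA}")
```

A length that is not a multiple of three would shift the reading frame rather than add whole residues, which is why the helper rejects such input.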
The receptor binding domain (RBD) in the spike protein of SARS-CoV and SARS-related coronaviruses is the most variable part of the virus genome. Six residues in the RBD appear to be critical for binding to the human ACE2 receptor and determining host range [1]. Using coordinates based on the Urbani strain of SARS-CoV, they are Y442, L472, N479, D480, T487, and Y491 [1]. The corresponding residues in SARS-CoV-2 are L455, F486, Q493, S494, N501, and Y505. Five of these six residues are mutated in SARS-CoV-2 compared to its most closely related virus, RaTG13, sampled from a Rhinolophus affinis bat, to which it is ~96% identical [2] (Figure 1a). Based on modeling [1] and biochemical experiments [3,4], SARS-CoV-2 seems to have an RBD that may bind with high affinity to ACE2 from human, non-human primate, ferret, pig, and cat, as well as other species with high receptor homology [1]. In contrast, SARS-CoV-2 may bind less efficiently to ACE2 in other species associated with SARS-like viruses, including rodents and civets [1].
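The correspondence between the two coordinate systems can be tabulated directly from the residue labels listed above. The following Python sketch pairs each SARS-CoV Urbani residue with its SARS-CoV-2 counterpart and reports the position offset and whether the amino acid is the same. The helper names are hypothetical. Note that it compares SARS-CoV with SARS-CoV-2 only; the RaTG13 residues are not listed in the text, so the "five of six" comparison against RaTG13 is not reproduced here.

```python
# (SARS-CoV Urbani residue, corresponding SARS-CoV-2 residue), as listed in the text.
CONTACT_PAIRS = [
    ("Y442", "L455"),
    ("L472", "F486"),
    ("N479", "Q493"),
    ("D480", "S494"),
    ("T487", "N501"),
    ("Y491", "Y505"),
]


def parse_residue(label: str) -> tuple[str, int]:
    """Split a label such as 'L472' into ('L', 472)."""
    return label[0], int(label[1:])


def compare_pairs(pairs):
    """Return one summary row per residue pair."""
    rows = []
    for urbani, cov2 in pairs:
        aa_urbani, pos_urbani = parse_residue(urbani)
        aa_cov2, pos_cov2 = parse_residue(cov2)
        rows.append({
            "urbani": urbani,
            "sars_cov_2": cov2,
            "offset": pos_cov2 - pos_urbani,   # shift between numbering schemes
            "conserved": aa_urbani == aa_cov2,  # same amino acid in both viruses
        })
    return rows


ROWS = compare_pairs(CONTACT_PAIRS)
N_CHANGED = sum(not row["conserved"] for row in ROWS)

for row in ROWS:
    status = "conserved" if row["conserved"] else "changed"
    print(f'{row["urbani"]} -> {row["sars_cov_2"]} (offset {row["offset"]:+d}, {status})')
print(f"{N_CHANGED} of {len(ROWS)} contact residues differ between SARS-CoV and SARS-CoV-2")
```

Only the last pair (Y491 and Y505) carries the same amino acid in both viruses, which is consistent with the RBD being described as the most variable part of the genome.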
The phenylalanine (F) at residue 486 in the SARS-CoV-2 S protein corresponds to L472 in the SARS-CoV Urbani strain. Notably, in SARS-CoV cell culture experiments L472 mutates to phenylalanine (L472F) [5], which is predicted to be optimal for binding of the SARS-CoV RBD to the human ACE2 receptor [6]. However, a phenylalanine in this position is also present in several SARS-like CoVs from bats (Figure 1a). While these analyses suggest that SARS-CoV-2 may be capable of binding the human ACE2 receptor with high affinity, the interaction is not predicted to be optimal [1]. Additionally, several of the key residues in the RBD of SARS-CoV-2 are different from those previously described as optimal for human ACE2 receptor binding [6]. In contrast to these computational predictions, recent binding studies indicate that SARS-CoV-2 binds with high affinity to human ACE2 [7]. Thus the SARS-CoV-2 spike appears to be the result of selection on human or human-like ACE2 permitting another optimal binding solution to arise. This is strong evidence that SARS-CoV-2 is not the product of genetic engineering.
The second notable feature of SARS-CoV-2 is a predicted polybasic cleavage site (RRAR) in the spike protein at the junction of S1 and S2, the two subunits of the spike protein (Figure 1b) [8,9]. In addition to two basic arginines and an alanine at the cleavage site, a leading proline is also inserted; thus, the fully inserted sequence is PRRA (Figure 1b). The strong turn created by the proline insertion is predicted to result in the addition of O-linked glycans to S673, T678, and S686 that flank the polybasic cleavage site. A polybasic cleavage site has not previously been observed in related lineage B betacoronaviruses and is a unique feature of SARS-CoV-2. Some human betacoronaviruses, including HCoV-HKU1 (lineage A), have polybasic cleavage sites, as well as predicted O-linked glycans near the S1/S2 cleavage site.
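To make the geometry of this region concrete, the Python sketch below locates the PRRA insert and the RRAR motif in a short stretch of spike sequence and checks that positions 673, 678 and 686 carry serine or threonine, the residues that can accept O-linked glycans. The peptide `FRAGMENT` and its starting coordinate are assumptions introduced for illustration; they are not quoted from this paper and should be verified against the reference sequence (GenBank MN908947) before being relied on.

```python
# ASSUMPTION: illustrative spike fragment around the S1/S2 junction,
# taken to begin at residue 673 of the SARS-CoV-2 S protein.
FRAGMENT = "SYQTQTNSPRRARSVASQS"
FRAGMENT_START = 673


def position_of(motif: str, seq: str = FRAGMENT, start: int = FRAGMENT_START):
    """Return the 1-based protein position where `motif` begins, or None."""
    index = seq.find(motif)
    return None if index < 0 else start + index


def residue_at(position: int, seq: str = FRAGMENT, start: int = FRAGMENT_START) -> str:
    """Return the amino acid found at a 1-based protein position."""
    return seq[position - start]


INSERT_POSITION = position_of("PRRA")   # first residue of the four-residue insert
SITE_POSITION = position_of("RRAR")     # first residue of the polybasic motif

# Predicted O-linked glycan sites named in the text; each should be S or T.
GLYCAN_SITES = {pos: residue_at(pos) for pos in (673, 678, 686)}

print("PRRA begins at", INSERT_POSITION)
print("RRAR begins at", SITE_POSITION)
for pos, aa in GLYCAN_SITES.items():
    print(f"{aa}{pos} can accept an O-linked glycan: {aa in 'ST'}")
```

In this fragment the three candidate glycan sites sit on both sides of the motif, which matches the description of them as flanking the polybasic cleavage site.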
While the functional consequence of the polybasic cleavage site in SARS-CoV-2 is unknown, experiments with SARS-CoV have shown that engineering such a site at the S1/S2 junction enhances cell–cell fusion but does not affect virus entry [10]. Polybasic cleavage sites allow effective cleavage by furin and other proteases, and can be acquired at the junction of the two subunits of the haemagglutinin (HA) protein of avian influenza viruses in conditions that select for rapid virus replication and transmission (e.g. highly dense chicken populations). HA serves a similar function in cell–cell fusion and viral entry as the coronavirus S protein. Acquisition of a polybasic cleavage site in HA, by either insertion or recombination, converts low pathogenicity avian influenza viruses into highly pathogenic forms [11-13]. The acquisition of polybasic cleavage sites by the influenza virus HA has also been observed after repeated forced passage in cell culture or through animals [14,15]. Similarly, an avirulent isolate of Newcastle Disease virus became highly pathogenic during serial passage in chickens by incremental acquisition of a polybasic cleavage site at the junction of its fusion protein subunits [16]. The potential function of the three predicted O-linked glycans is less clear, but they could create a "mucin-like domain" that would shield potential epitopes or key residues on the SARS-CoV-2 spike protein. Biochemical analyses or structural studies are required to determine whether or not the predicted O-linked glycan sites are utilized.