An Epidemiological Analysis of Severe Acute Respiratory Syndrome Coronavirus 2 Genome Sequencing: A Hospital-based Retrospective Study
Correspondence Address :
Dr. Pratiti Datta,
Scientist B-Medical, Department of Microbiology, VRDL, Diamond Harbour Government MCH, Diamond Harbour, West Bengal, India.
E-mail: pratitidatta.000@gmail.com
Introduction: Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) is a positive-sense enveloped single-stranded Ribonucleic Acid (RNA) virus. Structural proteins help the virus package its RNA, while other proteins enable viral replication by facilitating host entry. Through constant mutation, the virus changes its emergence profile, and combinations of mutations can lead to increased transmissibility and receptor binding capacity, altering its surface structure. Whole genome sequencing is an important tool for studying these changes. In this study, the authors report on the genome sequencing of patients who tested positive for SARS-CoV-2 using real-time Reverse Transcriptase-Polymerase Chain Reaction (RT-PCR) tests.
Aim: To identify the different lineages circulating in specific districts of West Bengal, India, and perform an epidemiological analysis of the patients to control disease severity during the second wave of Coronavirus Disease-2019 (COVID-19).
Materials and Methods: This retrospective hospital-based study was conducted at the Virus Research and Diagnostic Laboratory (VRDL) at RG Kar Medical College and Hospital, West Bengal, India from January 2021 to October 2021. Data collection started in January 2021, and the data were analysed in October 2021. Nasopharyngeal and oropharyngeal swabs were taken from patients with SARS-CoV-2, and the samples were tested using RT-PCR. Positive samples were sent to the Regional Virus Research and Diagnostic Laboratory-National Institute of Cholera and Enteric Diseases (NICED) for sequencing. Samples were collected from patients in different districts of West Bengal who were reported to the VRDL of RG Kar Medical College and Hospital. A total of 172,550 samples were tested for SARS-CoV-2, and out of 13,764 positive samples, 230 were sent for genome sequencing. The primary inclusion criteria were SARS-CoV-2 positive patients with Cycle Threshold (CT) values between 25 and 30 who were vaccinated. Patient information, address, gene variant, and gene mutation of the samples were analysed. Statistical analysis was performed using t-test with Statistical Package for Social Sciences (SPSS) software.
Results: Whole-genome sequencing helped identify new trends and their prevalence in specific areas, aiding in prevention efforts. The most common type of mutation observed after double vaccination was the delta variant (B.1.617.2), followed by the kappa variant (B.1.617.1) and the alpha variant (B.1.1.7).
Conclusion: Epidemiological genome sequencing studies help to identify emerging and changing viral trends, contributing to the mitigation of the spread of new variants. The delta, kappa, and alpha variants were the three primary sequences discovered in this study. The identification of these lineages facilitates the design of novel vaccines and diagnostic medications. Continuous monitoring and analysis of sequences from new cases in India and other affected countries are crucial for understanding the genetic evolution and substitution rates of SARS-CoV-2.
Epidemiology, Gene sequence, Lineage
The SARS-CoV-2 virus is a positive-sense, enveloped, single-stranded RNA virus that encodes 29 proteins and has a host-derived membrane (1). Structural proteins assist in packaging the virus’s RNA, while other proteins enable viral replication by facilitating host entry (1). The structural proteins include the Spike protein (S), which resembles a crown; the hydrophobic Membrane protein (M); the Envelope protein (E); and the Nucleocapsid protein (N), which functions as an RNA binding protein (2),(3). Through constant mutation, the virus undergoes changes in its emergence profile. While a single mutation may not always benefit the virus, the combination of mutations can lead to increased transmissibility and receptor binding capacity, altering the surface structure (1).
Whole-genome sequencing plays a crucial role in assessing these mutations and understanding the changing trends of the virus. Limited studies have evaluated demographic parameters in relation to various gene mutations of SARS-CoV-2. Particularly for countries like India with a large population base, the variation in illness presentations and genetic alterations has posed challenges for healthcare administration and resource allocation throughout the pandemic. In this study, the authors analysed the diagnostic and genetic diversity to extract relevant data from the confusion surrounding COVID-19 in the Indian context. The objective of the study was to investigate the changing trends of SARS-CoV-2 and the mutational status of different lineages in various districts of West Bengal, India. Analysing the results will aid in identifying the circulating lineage in specific areas of West Bengal and preventing the spread of mutant viruses.
The aim of this study was to identify the different lineages circulating in specific districts of West Bengal and conduct an epidemiological analysis of the patients to control disease severity. The various stages of the COVID-19 pandemic raise different key public health issues, some of which require specific genomic profiling techniques. The primary objective of this study was to enhance diagnosis and develop protective measures. The secondary objective was to evaluate disease epidemiology.
This was a retrospective hospital-based study conducted at the VRDL of RG Kar Medical College and Hospital in West Bengal, India, from January 2021 to October 2021, spanning a period of 10 months. A total of 172,550 samples were tested for SARS-CoV-2. Data collection started on January 1, 2021, and data analysis began at the end of October 2021. Out of the 13,764 positive samples, 230 samples were selected for genome sequencing.
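The sample counts above imply a test positivity rate and a sequenced fraction that the text does not state explicitly. The following is a minimal sketch of that arithmetic, using only the three counts reported in this section; the variable and function names are illustrative and not taken from the study.

```python
# Counts as reported in the study text.
total_tested = 172_550
total_positive = 13_764
sequenced = 230


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Share of all tested samples that were RT-PCR positive (about 7.98%).
positivity_rate = percent(total_positive, total_tested)
# Share of positive samples that went on to genome sequencing (about 1.67%).
sequenced_share = percent(sequenced, total_positive)

print(f"Test positivity: {positivity_rate:.2f}%")
print(f"Positives sequenced: {sequenced_share:.2f}%")
```

Roughly one in sixty positive samples was sequenced, which is worth keeping in mind when reading the lineage proportions reported later.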
Inclusion criteria: The primary inclusion criteria were SARS-CoV-2 positive patients with a CT value between 25 and 30 who had been vaccinated. Patient information, including age, address, gene variant, and gene mutation, was analysed.
Exclusion criteria: Positive samples of COVID-19 co-infected with other COVID-19 symptoms were excluded from the study. Samples with CT values lower than 25 or higher than 30 were also excluded.
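The inclusion and exclusion rules above can be read as a simple filter over sample records. The sketch below assumes a hypothetical `Sample` record with illustrative field names and treats the CT bounds as inclusive, which matches the wording that values lower than 25 or higher than 30 were excluded. The example records are made up and are not study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical record for one swab sample (field names are illustrative)."""
    sample_id: str
    rt_pcr_positive: bool
    ct_value: float
    vaccinated: bool


def is_eligible(sample: Sample, ct_min: float = 25.0, ct_max: float = 30.0) -> bool:
    """Apply the stated criteria: RT-PCR positive, vaccinated, CT within [ct_min, ct_max]."""
    return (
        sample.rt_pcr_positive
        and sample.vaccinated
        and ct_min <= sample.ct_value <= ct_max
    )


# Made-up records for illustration only; these are not study data.
samples = [
    Sample("S1", True, 27.4, True),   # meets every criterion
    Sample("S2", True, 22.0, True),   # CT below 25
    Sample("S3", True, 31.5, True),   # CT above 30
    Sample("S4", True, 26.0, False),  # not vaccinated
    Sample("S5", False, 28.0, True),  # RT-PCR negative
]

eligible_ids = [s.sample_id for s in samples if is_eligible(s)]
print(eligible_ids)
```

Only the first record passes the filter; each of the others fails exactly one criterion.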
Both nasopharyngeal and oropharyngeal swabs were taken from SARS-CoV-2 positive patients, and the samples were tested using RT-PCR. Positive samples that met the inclusion criteria were sent to the regional VRDL laboratory for sequencing. Samples were collected from patients in different districts of West Bengal who were reported to RG Kar Medical College and Hospital. The samples were stored at -80°C and packed using a triple-layer packaging method before being sent to the regional laboratory for sequencing, maintaining the cold chain.
RNA extraction and sequencing: The types of mutations, age, gender, and distribution area were analysed. Viral nucleic acid was extracted from nasopharyngeal and oropharyngeal swabs using the MagMAX Viral extraction kit (Thermo Fisher Scientific) and eluted with 50 μL of elution buffer. Real-time RT-PCR was performed using the CoviPath real-time PCR kit, targeting the ORF and N regions. Each 96-well plate included one positive control and one negative control. Demographic information was collected from all individuals who tested positive for SARS-CoV-2, including details on vaccination status, previous infection history, co-morbidity status, onset of illness, and hospitalisation status.
Sequence analysis: The results of whole-genome sequencing were analysed, including the patient’s age, address, gene variant, and gene mutation information.
Statistical Analysis
The data analysis was done by t-test with SPSS software version 9.4.
Clinical variables of genome sequencing samples: (Table/Fig 1) shows that almost half of the reported patients were asymptomatic. Among the symptomatic patients, the most common symptoms were loss of smell and taste and low-grade fever. Patients with co-morbidities included 14 (6.08%) with diabetes, 28 (12%) with asthma, 18 (7.8%) with hypertension, and 2 (0.86%) with renal disease (Table/Fig 2). Sixty patients (26.08%) developed COVID-19 after vaccination.
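The percentages above can be re-derived from the reported counts and the 230 sequenced samples. The sketch below is only an arithmetic check with illustrative names. Rounding to two decimals gives values that differ in the last digit from some of the figures quoted in the text (for example 6.09% rather than 6.08%), which suggests the published values were truncated rather than rounded.

```python
TOTAL_SEQUENCED = 230

# Counts as reported in the text for the sequenced cohort.
counts = {
    "diabetes": 14,
    "asthma": 28,
    "hypertension": 18,
    "renal disease": 2,
    "infected after vaccination": 60,
}

# Percentage of the sequenced cohort, rounded to two decimal places.
percentages = {
    name: round(100 * n / TOTAL_SEQUENCED, 2) for name, n in counts.items()
}

for name, pct in percentages.items():
    print(f"{name}: {counts[name]}/{TOTAL_SEQUENCED} = {pct}%")
```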
Observed Mutations
The observed mutations were as follows: 1) 501Y.V1; 2) B.1.617.2; 3) B.1.617.1; 4) B.1.1.7; 5) AY.4; 6) AY.3; 7) AY.39; 8) AY.20; 9) AY.114; 10) AY.89; 11) AY.122; 12) AY.102; 13) AY.45; 14) AY.107; 15) Double Mutant (E484Q, L452R). The most common variant was the Delta (B.1.617.2), followed by Kappa (B.1.617.1) and Alpha (B.1.1.7). The variant of concern was the Double Mutant (E484Q, L452R), which was observed in two cases. Some mutations in the spike protein may make the virus more transmissible, cause more severe illness, or evade immunisations. (Table/Fig 3) shows the distribution of the different lineages, with the B.1.617.2 variant being the most prominent.
(Table/Fig 3) shows that the most prevalent lineages are B.1.617.2 and B.1.617.1. (Table/Fig 4) shows that the B.1.617.2 variant has a slightly higher prevalence in males, as well as the 501Y.V1, AY.122, AY.102, and E484Q, L452R variants. The delta variant appears to be more prevalent in males, while the alpha and kappa variants are almost equally distributed among males and females.
Area distribution: The study found that the Delta (B.1.617.2) and Kappa (B.1.617.1) variants were the most common variants in the Paschim Burdwan and Purba Medinipur districts of West Bengal. The delta sub-variant group, including 501Y.V1, AY.4, AY.3, AY.39, AY.20, AY.114, AY.89, AY.122, AY.102, AY.45, AY.107, and the Double Mutant (E484Q, L452R), was also notable in this study and was mostly found only in the districts of Howrah, Kolkata, and North 24 Parganas. However, one case of the double mutant subvariant of the alpha (B.1.1.7) was found in the Paschim Burdwan district (Table/Fig 5).
Vaccination against SARS-CoV-2 in West Bengal was initiated in January 2021. The present study showed that 60 (26.08%) of vaccinated individuals were infected with COVID-19. The most notable finding of this study was that the vaccinated individuals were infected with the Delta variant and its subvariant.
(Table/Fig 6) shows that 501Y.V1, AY.3, AY.89, AY.122, and AY.102 are most prevalent in the middle age group (41-60 years). AY.45 mutations are most prevalent in the paediatric group, and AY.107 and E484Q, L452R are more prevalent in the 19-40 years age group.
Whole-genome sequencing helps in identifying new trends, their prevalence in a specific area, and prevention (2),(3). During the second wave of COVID-19, there were mainly three strains: alpha, kappa, and delta. One variant of alpha was noticed during the early wave, and various subgroups of delta strains were noticed at the end of the second wave. There are two major lineages, A and B (4),(5),(6). The previous lineage was A and shared two nucleotides (positions 8,782 in ORF1ab and 28,144 in ORF) (7). The observed mutations were: 1) 501Y.V1; 2) B.1.617.2; 3) B.1.617.1; 4) B.1.1.7; 5) AY.4; 6) AY.3; 7) AY.39; 8) AY.20; 9) AY.114; 10) AY.89; 11) AY.122; 12) AY.102; 13) AY.45; 14) AY.107; 15) Double Mutant (E484Q, L452R).
The variations were increasing at the late stage of the second wave. The mutation of concern was the Double Mutant (E484Q, L452R). The most common type of mutation noticed after double vaccination was Delta (B.1.617.2), followed by Kappa (B.1.617.1) and Alpha (B.1.1.7). There are various subfamilies of delta variants.
Supporting this study, another study (8) states that decreases in serum antibody titers after vaccination against delta were higher than those against Alpha but lower than those against Beta. The number of doses and the amount of time since vaccination are associated with the decline in the protective efficacy of existing vaccines against the Delta variant. A study (9) suggests that vaccine effectiveness after one dose was notably lower among people with the delta variant than among those with the alpha variant (48.7%; 95% CI, 45.5 to 51.7). The outcomes were comparable for the two vaccines. Two doses of the vaccine were effective for people with the alpha variant, with an effectiveness of 93.7% (95% CI, 91.6 to 95.3). For people with the delta variant, the effectiveness was 88.0% (95% CI, 85.3 to 90.1). For individuals who had received only one dose of the vaccine, the effectiveness against the alpha variant was 74.5% (95% CI, 68.4 to 79.4), and for the delta variant, it was 67.0% (95% CI, 61.3 to 71.8). A study suggests that individuals who were not vaccinated were significantly more likely to become infected, require hospitalisation, and die from COVID-19 compared to those who had received at least one dose of the vaccine (10). Cox models show that full vaccination provides 88% protection against infection, 94% against hospitalisation, and 95% against death.
The B.1.617.2 lineage was observed in the maximum number of cases, with 156 out of the 230 samples belonging to this lineage. B.1.617.2 may induce cell-cell fusion in the respiratory tract and possibly have higher pathogenicity, even in vaccinated individuals with neutralising antibodies (4). This variant was first identified in India in December 2020. It has a transmissibility that is 40% to 60% higher than other variants (11). Full vaccination reduces the hospitalisation and mortality rate, but the vaccine’s neutralisation capacity was higher for the alpha and beta strains (12). Other subfamilies of the delta variant include AY.4, AY.3, AY.39, AY.20, AY.114, AY.89, AY.122, AY.102, AY.45, AY.107. The study stated that the delta variant spread rapidly across different continents, with higher distribution observed compared to the alpha variant (12).
The delta plus variant (AY.4): The Delta plus variant was first found in India in April 2021. It has also been detected in nine other countries, including the USA, UK, Portugal, Switzerland, Japan, Poland, Nepal, Russia, and China. This variant includes an additional mutation called K417N in the spike protein. It has been suggested that this change slightly reduces the binding affinity for Angiotensin-Converting Enzyme 2 (ACE2). Currently, it is unclear whether this additional change is causing increased severity, transmissibility, or immune evasion compared to the Delta variant. Eight cases were reported positive for this variant (12),(13).
Subvariant of delta (AY.39): The frequency of this subvariant is currently low. Vaccine effectiveness does not seem to be different when compared to other delta variants. Around six cases were reported out of the 230 samples.
Subvariant of delta (AY.3): This is a subvariant of Delta (B.1.617.2). This particular version shares the same spike mutations as the regular Delta version. There are three mutations in the subvariant family: K417N, L452R, and P681R. AY.3 lacks the K417N mutation. Only one case tested positive for this variant (13).
Subvariant of delta (AY.20): It possesses two mutations, namely spike L452R and P681R, which make this variant more contagious and enhance its immune escape capacity. Only two cases tested positive for this variant.
Among the sub-Delta family group, one case of AY.114, one case of AY.89, two cases of AY.122, two cases of AY.102, and one case of AY.45 were reported.
The second most common variant was B.1.617.1 (Kappa). Among the 230 samples, 30 samples were B.1.617.1.
In India, the prevalence of B.1.617.1 among the sequenced viruses uploaded to the Global Initiative on Sharing All Influenza Data (GISAID) had increased to approximately 50% in late March 2021 but began to decline in April 2021. The proportion of B.1.617.2 among the sequenced viruses uploaded to GISAID increased from early March 2021, and it became the dominant variant reported in mid-April 2021.
B.1.617.1 is defined by the spike protein amino acid changes L452R, E484Q, D614G, P681R, and Q1071H (some viruses also carry V382L) (14). This lineage has been classified as a Variant of Interest (VOI) by the European Centre for Disease Prevention and Control (ECDC) and the World Health Organisation (WHO), and as a Variant Under Investigation (VUI) by the UK. B.1.617.2 is characterised by spike protein changes T19R, Δ157-158, L452R, T478K, D614G, P681R, and D950N. B.1.617.2 is rapidly expanding within the United Kingdom and has been identified in several other countries worldwide (15). This variant has been classified as a VOI by ECDC and WHO due to its assessed transmissibility being at least as high as that of VOC B.1.1.7. The next most common variant was B.1.1.7, with 15 samples testing positive for it. Alpha variants are associated with epidemics that are rapidly spreading in the UK and other regions (16).
501Y.V1 is a variant of the Alpha lineage (B.1.1.7). SARS-CoV-2, or Severe Acute Respiratory Syndrome Coronavirus 2, is causing widespread illness worldwide (12). A variant of SARS-CoV-2 (20I/501Y.V1) discovered in the United Kingdom has a single change from N501 to Y501 in the receptor-binding domain (Y501-RBD) of the Spike protein of the virus (13).
This variant is much more contagious than the original version (N501-RBD) (13). The mutated version of the RBD (Y501-RBD) binds to human ACE2 approximately 10 times more tightly than the native version (N501-RBD) (12). Modelling analysis showed that the N501Y mutation could potentially allow an aromatic ring-ring interaction and an additional hydrogen bond between the RBD and ACE2 (12). Sera from people immunised with the Pfizer-BioNTech vaccine still effectively block the binding of Y501-RBD to ACE2, although with slightly compromised efficacy compared to their ability to inhibit the binding of N501-RBD to ACE2. This raises concerns about the effectiveness of therapeutic anti-RBD antibodies used to treat COVID-19 patients. However, a therapeutic antibody, Bamlanivimab, still binds to Y501-RBD as efficiently as it binds to N501-RBD. The variant may be associated with an increased risk of death compared to other variants. The viral load of this variant is 3-10 times higher than that of other variants (6).
E484Q: This amino acid change has not been associated with any alteration in receptor binding, unlike the E484K mutation found in VOCs B.1.351 and P.1, which have been linked to immune evasion and potentially reduced antibody effectiveness (17). According to research published in a preprint by Chen Z et al., the E484Q mutation may reduce clinical efficiency (5). Another preprint by Ranjan P et al., found lower binding potency against the antibody (CR3022) and higher affinity for the ACE2 receptor with the E484Q and L452R mutations compared to the wild-type (not characterised), suggesting decreased antibody effectiveness (18).
Jin JM et al., concluded that there is emerging evidence of increased B.1.1.7 transmissibility and they discovered increased viral load as a proxy for B.1.1.7 in the data. In this hospitalised population, they found no link between the variation and severe illness (19). Studies in hamsters have shown high viral shedding of alpha variant viruses, and increased viral load may partially explain the increased infection rate among humans (20).
Youk J et al., compared the two variants and found that B.1.617.2 had fitness benefits across physiologically related systems, including HAE and 3D airway organoids, compared to B.1.1.7 (17),(21). Yadav PD et al., hypothesised that B.1.617.2 raw virus particles contained spikes that were cleaved at a higher rate than B.1.1.7, which was assumed to be involved in the mechanism of increased infectivity. They also observed that B.1.617.2 has more replication and spike intrusion than B.1.617.1. This may explain the advantages of B.1.617.2 (6).
Gender distribution among the various lineages: Among the 230 cases, 90 cases of B.1.617.2 (delta variant) were detected in male patients and 66 in female patients. Among the 30 positive samples of the kappa variant (B.1.617.1), 15 cases were detected in males and 15 cases were detected in females. Fifteen cases of the Alpha variant (B.1.1.7) were positive, with eight cases in males and seven cases in females. The subfamily of the Alpha variant (501Y.V1) was reported in two cases, predominantly in male patients. Among all the subfamilies of the delta variant, AY.4 was predominant in females. AY.3 variants were found in one male sample. AY.39 variants were predominant in males, AY.114 was found in only one female sample, AY.102 variants were found in two male cases, and AY.122 was reported in two male cases. This study report shows that males are more affected by delta variants compared to females.
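The male and female counts above can be summarised as a male share per lineage. As an optional check on whether a male excess could arise by chance, the sketch below also computes an exact two-sided binomial p-value under the assumption of an equal sex ratio among sequenced cases. This test is an illustrative assumption and is not the analysis the authors report; the function and variable names are hypothetical.

```python
from math import comb

# (male, female) counts per lineage, as reported in the text.
sex_counts = {
    "B.1.617.2 (Delta)": (90, 66),
    "B.1.617.1 (Kappa)": (15, 15),
    "B.1.1.7 (Alpha)": (8, 7),
}


def male_share(males: int, females: int) -> float:
    """Percentage of cases that were male."""
    return 100.0 * males / (males + females)


def two_sided_binomial_p(k: int, n: int) -> float:
    """Exact two-sided binomial p-value for k successes in n trials under p = 0.5."""
    # With p = 0.5 the distribution is symmetric around n/2, so sum the
    # probability of every count at least as far from n/2 as the observed k.
    distance = abs(k - n / 2)
    favourable = sum(comb(n, i) for i in range(n + 1) if abs(i - n / 2) >= distance)
    return favourable / 2 ** n


for lineage, (males, females) in sex_counts.items():
    share = male_share(males, females)
    p_value = two_sided_binomial_p(males, males + females)
    print(f"{lineage}: {share:.2f}% male, p = {p_value:.3f}")
```

A p-value near 1 (as for the evenly split Kappa cases) indicates no evidence of a sex difference under this assumption. The result for Delta should be read alongside the small sequenced fraction and the non-random sample selection.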
Jin JM et al., concluded that, according to the clinical severity classification, men were more likely to develop more severe cases than women, and among the deceased patients, the number of men was 2.4 times that of women (19). Women and men were equally susceptible, but males died at a higher rate (22).
Demographic variation of various lineages: The maximum number of delta variants was reported in Purba Medinipur, Paschim Burdwan, and North 24 Parganas. Out of 156 samples, 15 samples were from Howrah and 16 from Kolkata, two samples were positive from Hooghly, and three cases were from South 24 Parganas. The B.1.1.7 lineage was reported only in Purba Medinipur and Paschim Burdwan. All delta subvariants were reported in Kolkata and Howrah. Of the total delta variant cases, 31% affected individuals aged 19-40 years, 18% affected individuals aged 41 to 60 years, 9.1% affected individuals older than 60 years, and 9.1% affected paediatric patients less than 18 years. Among the total delta variant cases, 78% were in the age group of 19-40 years. The B.1.617.1 lineage affected 6% in the 19-40 years age group, 5% in the 41-60 years age group, 0.8% in the older age group, and 0.4% in the younger paediatric and adolescent age group. Among the total cases of B.1.617.1, 50% were in the younger age group of 19-40 years, and paediatric cases were the least. B.1.1.7 equally affected the middle age group. Only two cases were reported in the older age group, and no cases were reported in the paediatric and adolescent age group. The delta subvariants were observed in those aged 20 years and older, but only one AY.45 variant was reported in the paediatric age group. WHO has marked the delta variant as a concerning variant among other COVID-19 variants during the late stage of the second wave, as there was a rapid increase in this variant. More than 77% of delta subvariant infections occurred after completing the vaccination schedule. Out of 230 cases, only 16 cases required hospitalisation, and two deaths were reported. These statistics signify that although the delta variant affects vaccinated individuals, vaccination significantly reduces the mortality and morbidity rate.
The present study suggested that the B.1.617.1 lineages were highest at the end of February 2021 and periodically declined by the end of April 2021.
Patel SK et al., stated that circulatory levels of ACE2 have been shown to be higher in men than in women and in patients with diabetes or cardiovascular disease (21).
This study provided a clear view of the various mutations of SARS-CoV-2 during the second wave. A future study with phylogenetic analysis is needed. The progress being made in genomics and the lessons learned from the fight against SARS-CoV-2 have the potential to significantly reduce risks to human health in the future and improve preparedness for epidemics.
Limitation(s)
The need for extensive research and meta-analysis was subsequently recognised, and the important results from those studies were made available to the general public in the form of illustrious articles in order to justify the cause of viral spread, potential preventive measures, and future approaches to be adopted. Changes are required in order to examine increasingly massive datasets quickly during public health emergencies and, where possible, to increase the amount of automation.
In this work, the entire SARS-CoV-2 genome from 230 samples was analysed between January 2021 and October 2021. Numerous mutations in the alpha, beta, kappa, and delta lineages have been shown to be in circulation. Among all these lineages, Delta was the most prevalent. The epidemiological genome sequencing study helps to highlight the emerging changes in viral trends and aids in the prevention of newly arrived variants. So far, managing the pandemic has faced significant difficulties that have been partially overcome by studying the virus through genomic sequencing.
DOI: 10.7860/JCDR/2023/64648.18782
Date of Submission: Apr 11, 2023
Date of Peer Review: Jun 22, 2023
Date of Acceptance: Oct 06, 2023
Date of Publishing: Dec 01, 2023
AUTHOR DECLARATION:
• Financial or Other Competing Interests: None
• Was Ethics Committee Approval obtained for this study? No
• Was informed consent obtained from the subjects involved in the study? Yes
• For any images presented appropriate consent has been obtained from the subjects. Yes
PLAGIARISM CHECKING METHODS:
• Plagiarism X-checker: Apr 14, 2023
• Manual Googling: Jul 06, 2023
• iThenticate Software: Oct 03, 2023 (12%)
ETYMOLOGY: Author Origin
EMENDATIONS: 8