COVID-19: How scientists found the fingerprint behind South Africa’s SARS-CoV-2 virus

Cross-section of a cell. PHOTO: NHGRI

A team of local scientists has successfully put together the first genetic fingerprint – or genome sequence – of SARS-CoV-2 (the virus that causes COVID-19) found in South Africa. Together with the National Institute for Communicable Diseases (NICD), the University of the Western Cape’s South African National Bioinformatics Institute (SANBI) cracked the code that could unlock the origins of the country’s outbreak, and help healthcare workers and government better track and trace the spread of the virus.

Spotlight spoke to Peter van Heusden, SANBI researcher and co-author of the new report on the sequence, and Dr Mushal Allam, a medical scientist with the NICD’s Centre for Respiratory Diseases and Meningitis, who worked on the sequencing, about what this accomplishment means for the future of South Africa’s fight against the SARS-CoV-2 virus.

“It’s been very fulfilling to be able to use these skills and to try and proactively assist with the outbreak,” van Heusden said. “This is a whole society problem that we have to face with every set of skills that we’ve got. It did feel really fulfilling that this training I’ve put myself through for several years was able to be used for something more than making a research paper.”

“I’ve been very actively networking with my colleagues around the continent and around the world to make sure information flows as quickly as possible. In fact, as soon as this hit the radar screens in January, I was already talking to [Allam] asking when we are going to be able to sequence this thing,” he said.

First the background

If you took biology in high school, you might remember learning about DNA and RNA. DNA is our genetic material, or basically the book containing all of our traits (like eye and hair colour), and RNA is the messenger between our DNA and the protein factories in our cells, called ribosomes.

DNA is stored in the cell’s nucleus, which is kind of like the brain of the cell. The RNA, or messenger RNA (mRNA), takes information from our DNA in the nucleus straight to the ribosomes, which are in the cytoplasm (that yellow-looking solution in our cells). Think of the RNA as someone delivering specific instructions to the ribosome’s factory workers on how to make proteins that our bodies need.
Normally, our cells are constantly producing new, good proteins; after all, that is the ribosome’s job. However, when we become infected with a virus like SARS-CoV-2, something different happens in our cells. SARS-CoV-2 is a single-stranded RNA virus, and when it reaches our cells, it sneaks into the cytoplasm and behaves just like mRNA. But it delivers bad instructions to our ribosome protein factories, resulting in the production of proteins that make us sick.
SARS-CoV-2 may act like messenger RNA, but it delivers the wrong message to our bodies. This is what makes the virus so dangerous.

How to sequence a genome

For this genome, the NICD used a sample from one of South Africa’s first confirmed cases infected with the SARS-CoV-2 virus, a patient from KwaZulu-Natal who had returned from Italy. In the sample, scientists had both human DNA and RNA as well as the SARS-CoV-2 RNA. In order to sequence the virus, they first had to separate it from everything else.

“Currently we’re using a big, difficult method to sequence the virus,” said Allam.
Using what are called sequencing kits, Allam and his colleagues had to separate the human DNA from the virus, and deplete the human RNA, leaving just SARS-CoV-2 to work with.

Afterwards, the virus RNA is checked for quality, and then the sequencing process begins. With the NICD’s current technology, an Illumina sequencer (Illumina is a brand name), the sequence is delivered in small parts of about 150 base letters at a time. After the sequencing is finished, scientists have to assemble these parts to reconstruct the virus’s RNA strand of about 29 000 base letters.
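
To give a feel for what “assembling” short parts into one long sequence involves, here is a minimal sketch in Python. It is a toy illustration only, not the NICD’s or SANBI’s actual pipeline: the 20-letter sequence, the 8-letter reads and the function names are all invented for this example, and real assembly software has to cope with sequencing errors, repeats and millions of reads.

```python
import math


def minimum_reads(genome_length: int, read_length: int) -> int:
    """Fewest reads that could cover the genome if they lined up end to end."""
    return math.ceil(genome_length / read_length)


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest end of `a` that matches the start of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly join the two pieces that overlap the most.

    Returns the pieces that are left; a single piece means everything joined up.
    """
    pieces = list(reads)
    while len(pieces) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more, so stop joining
        merged = pieces[best_i] + pieces[best_j][best_size:]
        pieces = [p for k, p in enumerate(pieces) if k not in (best_i, best_j)]
        pieces.append(merged)
    return pieces


# Invented toy data: four overlapping 8-letter reads from a 20-letter sequence.
toy_reads = ["AUGGCUAC", "CUACGUUA", "GUUAGCCG", "GCCGAUAC"]
print(greedy_assemble(toy_reads))   # ['AUGGCUACGUUAGCCGAUAC']
print(minimum_reads(29_000, 150))   # 194
```

Even in the ideal case, about 194 reads of 150 letters would be needed to cover 29 000 letters once. In practice far more are needed, because reads overlap one another and many of them come from human material rather than the virus.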

“This bug is not easy to sequence,” Allam told Spotlight, adding that most of the reads are contaminated by human DNA and RNA. Allam said that since the outbreak, the NICD has prioritised two sequencing machines just for SARS-CoV-2. With this technology it can take 5 to 6 days to sequence one sample at a price of R7 000 per sample.

The future of SARS-CoV-2 genome sequencing in SA

According to Allam, South Africa was well-equipped to sequence during the pandemic in terms of facilities, but cost, timeliness and the procurement of SARS-CoV-2 sequencing kits were big challenges.
However, he said that work is underway to cut the cost of sequencing for one sample down to between R400 and R500, and that new, more time-efficient sequencing machines were in South Africa’s very near future.

“The WHO and the Africa CDC labs have asked [the NICD] to sequence some genomes from African countries which don’t have these technologies,” said Allam.
Other countries including the Democratic Republic of Congo, Nigeria and Senegal have also sequenced SARS-CoV-2 genomes.
Allam said that, according to the Global Initiative on Sharing All Influenza Data (GISAID) website, over 5 000 SARS-CoV-2 genomes have been sequenced to date, and about 62 of them are from Africa.

Using genomics to map a pandemic

The sequencing data compiled by the NICD was shared with a project called Nextstrain, which then created what’s called a phylogenetic tree, or a family tree of viruses. This tree can help scientists better understand where each of these genomes comes from and how they are related.
For example, South Africa’s SARS-CoV-2 genome sequence has six unique differences compared to the original genome from Wuhan, and scientists can now see, using phylogenetics, that South Africa’s SARS-CoV-2 likely came from Europe or North America, where very similar sequences have been found.
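
The basic idea of comparing genomes letter by letter can be sketched in a few lines of Python. This is a simplification and not how Nextstrain builds its trees: real phylogenetics aligns full genomes and uses statistical models. The short sequences, the lineage names and the function names below are invented purely for illustration.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count the positions where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def closest_relative(query: str, known: dict[str, str]) -> str:
    """Name of the known sequence with the fewest differences from the query."""
    return min(known, key=lambda name: count_differences(query, known[name]))


# Invented toy sequences (not real SARS-CoV-2 data).
known_sequences = {
    "reference": "AUGGCUACGUUAGCCGAUAC",
    "lineage_x": "AUGGCUACGAUAGCCGAUAC",
    "lineage_y": "AUGGCCACGUUAGCCGAUAC",
}
new_sample = "AUGGCUACGAUAGCCGAUGC"

print(count_differences(new_sample, known_sequences["reference"]))  # 2
print(closest_relative(new_sample, known_sequences))                # lineage_x
```

In this toy case the new sample shares a change with the made-up "lineage_x", so it sits next to that lineage on the family tree. That is the same kind of reasoning, on a much larger scale, that points to a likely origin for a real sample.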

By constructing this so-called family tree, not only can scientists track the origins of an outbreak, but they can develop a better sense of what the pandemic looks like overall.

Mutations don’t mean different strains

When talking about differences in the sequence, otherwise called mutations, it’s important to note that this does not mean there are different strains of the virus.
“Differences seen in the collection on GISAID are minor, less than 100 bases across the whole 29 kilobase genome. There have been some more dramatic differences noted, such as deletions in parts of the genome, but these mostly seem to be related to culturing the virus in cells,” said van Heusden.

Remember, SARS-CoV-2 is a single RNA-strand made up of 29 000 base letters.
Three of these letters at a time make up a single amino acid, and a group (or chain) of amino acids makes a protein. But a different combination of these three letters doesn’t always mean a different amino acid will be produced, because several combinations can code for the same amino acid. For example, if you have two parts of red paint and one part yellow, by mixing them together you will still get the colour orange, no matter what order you mix them together.

For South Africa’s SARS-CoV-2 genome, there were only two instances where different amino acids (or paint colours) were produced (this is called a non-synonymous change), but they made no significant difference to protein structure, in other words, to how the virus affects our bodies.
Taking all of this into consideration, with the data that’s available, van Heusden said scientists are confident that currently there is no other strain of SARS-CoV-2.
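
The difference between a change that keeps the same amino acid (synonymous) and one that swaps it (non-synonymous) can be shown with a small Python sketch. The table below is only a small excerpt of the standard genetic code, and the function name and example changes are invented for illustration; they are not the actual mutations found in the South African genome.

```python
# Small excerpt of the standard genetic code (RNA letters): codon -> amino acid.
CODON_TABLE = {
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "GAU": "Asp", "GAC": "Asp",
    "GAA": "Glu", "GAG": "Glu",
    "UUU": "Phe", "UUC": "Phe",
}


def classify_change(codon_before: str, codon_after: str) -> str:
    """Say whether a three-letter change alters the amino acid that is made."""
    amino_before = CODON_TABLE[codon_before]
    amino_after = CODON_TABLE[codon_after]
    if amino_before == amino_after:
        return "synonymous"
    return "non-synonymous"


# Hypothetical single-letter changes, chosen only to show both outcomes.
print(classify_change("GCU", "GCC"))  # synonymous: both codons give Ala
print(classify_change("GAU", "GAA"))  # non-synonymous: Asp becomes Glu
```

In both examples only the third letter changes, yet only the second one produces a different amino acid.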

Can sequencing be used to help create a vaccine?

“The genomics let us understand [the virus’] proteins,” said van Heusden.
“We can understand the life cycle, though it’s not strictly alive, and we can understand the way that its proteins work and interact with our proteins, and that’s important for [treatment] therapies. We can also understand from the diversity of genomics, which parts of the virus are stable and which parts are not. When you’re making a vaccine that’s very important information.”

Van Heusden said that Moderna, the first company to run a human clinical trial of a vaccine against SARS-CoV-2, was using a new approach to making a vaccine.
“Normally what you do is use a weakened or killed version of the virus, or just bits, so actual virus. What Moderna is doing is actually making their vaccine [so that it] would actually inject RNA into a cell which will be made into [a new kind of] protein, which will then [cause] an immune-response. It won’t be like an active virus, it’s just part of the virus that will be made by the vaccine.”
This is called an mRNA vaccine.

Van Heusden noted that while sequencing provides invaluable information for vaccine research and production, its primary use is to assist in tracing the virus and mapping the pandemic as a whole. He added that the sequencing of South Africa’s SARS-CoV-2 virus could help the government to improve contact tracing and testing.
