At the heart of each coronavirus is its genome, a twisted strand of nearly 30,000 “letters” of RNA. These genetic instructions force infected human cells to assemble up to 29 kinds of proteins that help the coronavirus multiply and spread.
Diagram of the coronavirus genome: the RNA genome inside the coronavirus, from the start of the genome to 30,000 RNA letters.
As viruses replicate, small copying errors known as mutations naturally arise in their genomes. A lineage of coronaviruses will typically accumulate one or two random mutations each month.
Some mutations have no effect on the coronavirus proteins made by the infected cell. Other mutations might alter a protein’s shape by changing or deleting one of its amino acids, the building blocks that link together to form the protein.
Through the process of natural selection, neutral or slightly beneficial mutations may be passed down from generation to generation, while harmful mutations are more likely to die out.
Mutations in the B.1.1.7 Lineage
A coronavirus variant first reported in Britain has 17 recent mutations that change or delete amino acids in viral proteins.
The variant was named Variant of Concern 202012/01 by Public Health England, and is part of the B.1.1.7 lineage of coronaviruses.
Diagram of the B.1.1.7 coronavirus genome. Outer ring: the coronavirus genome, from the start of the genome to 30,000 RNA letters. Connecting rays: mutations. Red letters: amino acid substitutions. Amino acid deletions are also marked.
Notable mutations in the B.1.1.7 lineage are listed below. Six other mutations, not shown in the diagram above, do not change an amino acid.
Eight Spike Mutations
Researchers are most concerned about the eight B.1.1.7 mutations that change the shape of the coronavirus spike, which the virus uses to attach to cells and slip inside.
Each spike is a group of three intertwined proteins.
Building one of these spike proteins typically takes 1,273 amino acids, which can be written as letters:
MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
Spike proteins in the B.1.1.7 lineage have two deletions and six substitutions in this sequence of amino acids.
Diagram of the coronavirus spike gene in the B.1.1.7 lineage, marking the H69–V70 deletion and the N501Y mutation.
Written as letters, a B.1.1.7 spike protein looks like this:
MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAI[Deletion]SGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGV[Deletion]YHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPT[Y]GVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDI[D]DTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNS[H]RRARSVASQSIIAYTMSLGAENSVAYSNNSIAIP[I]NFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDIL[A]RLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITT[H]NTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
These mutations alter the shape of the spike protein by changing how the amino acids fold together into a complex shape.
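To see how the two deletions are read off the sequence, the short Python sketch below applies them to the first 150 amino acids of the spike protein printed above and marks the gaps the way this article does. The function name, the 150-letter cutoff and the choice to remove position 144 rather than 145 are illustrative assumptions, not the researchers' own tools.

```python
# First 150 amino acids of the reference spike protein, copied from the
# sequence printed earlier in this article.
SPIKE_PREFIX = (
    "MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHV"
    "SGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPF"
    "LGVYYHKNNK"
)


def mark_deletions(sequence, deleted_positions, marker="[Deletion]"):
    """Replace each run of deleted residues with a single marker.

    Positions are 1-based, matching names such as H69 and V70.
    """
    deleted = set(deleted_positions)
    pieces = []
    in_gap = False
    for position, residue in enumerate(sequence, start=1):
        if position in deleted:
            if not in_gap:
                pieces.append(marker)
                in_gap = True
        else:
            pieces.append(residue)
            in_gap = False
    return "".join(pieces)


# H69 and V70 are removed together; one of the tyrosines at 144/145 is removed.
marked = mark_deletions(SPIKE_PREFIX, [69, 70, 144])
plain = marked.replace("[Deletion]", "")

print(marked)
print(len(SPIKE_PREFIX), "->", len(plain), "amino acids")
```

The printed string matches the opening of the B.1.1.7 sequence shown above, and the 150-letter stretch shrinks to 147 amino acids.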
The Spike N501Y Mutation
Scientists suspect that one mutation, called N501Y, is very important in making B.1.1.7 coronaviruses more contagious. The mutation’s name refers to the nature of its change: the 501st amino acid in the spike protein switched from N (asparagine) to Y (tyrosine).
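The naming rule is regular enough to be read by a few lines of code. The Python sketch below splits a substitution name into its three parts and spells it out in words. The function names and the small lookup table are illustrative, and the pattern handles only single-letter substitutions and the "stop" form used in this article, not deletion names such as H69–V70.

```python
import re

# One-letter amino acid codes for the substitution names in this article.
AMINO_ACIDS = {
    "N": "asparagine", "Y": "tyrosine", "P": "proline", "H": "histidine",
    "D": "aspartic acid", "G": "glycine", "L": "leucine",
    "F": "phenylalanine", "R": "arginine", "I": "isoleucine",
    "Q": "glutamine",
}

# Original letter, position number, then a replacement letter or "stop".
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z]|stop)$")


def parse_mutation(name):
    """Split a name like 'N501Y' into (original, position, replacement)."""
    match = MUTATION_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a substitution name: {name!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def describe(name):
    """Spell out a substitution name in plain words."""
    original, position, replacement = parse_mutation(name)
    old = AMINO_ACIDS.get(original, original)
    if replacement == "stop":
        new = "a stop signal"
    else:
        new = AMINO_ACIDS.get(replacement, replacement)
    return f"{name}: amino acid {position} changes from {old} to {new}"


for name in ["N501Y", "P681H", "D614G", "Q27stop"]:
    print(describe(name))
```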
The N501Y mutation changes an amino acid near the top of each spike protein, where it makes contact with a special receptor on human cells.
Location of the N501Y mutation on one of the three spike proteins
Because spike proteins form sets of three, the mutation appears in three places on the spike tip:
Top view of the coronavirus spike, showing the N501Y mutations
In a typical coronavirus, the tip of the spike protein is like an ill-fitting puzzle piece. It can latch onto human cells, but the fit is so loose that the virus often falls away and fails to infect the cell.
The N501Y mutation seems to refine the shape of the puzzle piece, allowing a tighter fit and increasing the chance of a successful infection.
Diagram of a spike tip and a receptor on a human cell, with N501Y shown both not attached to the receptor and attached.
Researchers think the N501Y mutation has evolved independently in many different coronavirus lineages. In addition to the B.1.1.7 lineage, it has been identified in variants from Australia, Brazil, Denmark, Japan, the Netherlands, South Africa, Wales, Illinois, Louisiana, Ohio and Texas.
In addition to N501Y, the B.1.1.7 lineage has 16 other mutations that might benefit the virus in other ways. It’s also possible that they are neutral mutations, which have no effect one way or the other. They may simply be passed down from generation to generation like old baggage. Scientists are running experiments to find out which is the case for each mutation.
The Spike H69–V70 Deletion
One mysterious mutation in the B.1.1.7 lineage deletes the 69th and 70th amino acids in the spike protein. Experiments have shown that this deletion enables the coronavirus to infect cells more successfully. It’s possible that it changes the shape of the spike protein in a way that makes it harder for antibodies to attach.
Location of the H69–V70 deletion
Researchers call this a recurrent deletion region because the same part of the genome has been repeatedly deleted in different lineages of coronaviruses. The H69–V70 deletion also occurred in a variant that infected millions of mink in Denmark and other countries. Scientists are beginning to identify a number of these regions, which may play an important role in the virus’s future evolution.
The Spike Y144/145 Deletion
Diagram of the coronavirus spike gene in the B.1.1.7 lineage, marking the Y144/145 deletion and the N501Y mutation.
In another recurrent deletion region, a number of coronavirus lineages are missing either the 144th or 145th amino acid in the spike protein. The name of the mutation comes from the two tyrosines (Y) that are normally in those positions in the protein.
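One way to read the slash in the name: because the two tyrosines are identical, removing either one leaves exactly the same string of amino acids. The Python sketch below checks this on amino acids 140 through 150 of the spike sequence printed earlier. The variable and function names are illustrative.

```python
# Amino acids 140 through 150 of the reference spike protein, copied from
# the sequence printed earlier in this article.
WINDOW_START = 140
WINDOW = "FLGVYYHKNNK"


def delete_position(window, window_start, position):
    """Remove the residue at a 1-based protein position from the window."""
    index = position - window_start
    return window[:index] + window[index + 1:]


without_144 = delete_position(WINDOW, WINDOW_START, 144)
without_145 = delete_position(WINDOW, WINDOW_START, 145)

print(without_144)
print(without_145)
print("same result:", without_144 == without_145)
```

Both deletions print FLGVYHKNNK, so the sequence alone cannot show which of the two tyrosines was lost.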
Like the H69–V70 deletion, Y144/145 occurs on the edge of the spike tip. It may also make it harder for antibodies to stick to the coronavirus.
Location of the Y144/145 deletion
The Spike P681H Mutation
Diagram of the coronavirus spike gene in the B.1.1.7 lineage, marking the H69–V70 deletion and the P681H mutation.
This mutation changes an amino acid from P to H on the stem of the coronavirus spike:
Location of the P681H mutation
When spike proteins are assembled on the surface of a coronavirus, they’re not yet ready to attach to a cell. A human enzyme must first cut apart a section of the spike stem. The P681H mutation may make it easier for the enzyme to reach the site where it needs to make its cut.
Like N501Y, the P681H mutation has arisen in other coronavirus lineages besides B.1.1.7. But it’s rare for one lineage to carry both mutations.
The ORF8 Q27stop Mutation
Diagram marking the Q27stop and R52I mutations.
ORF8 is a small protein whose function remains mysterious. In one experiment, scientists deleted the protein and found that the coronavirus could still spread. That suggests that ORF8 is not essential to replication, but it might still give viruses that carry it some competitive edge over mutants that have lost the protein.
ORF8 is typically only 121 amino acids long:
MKFLVFLGIITTVAAFHQECSLQSCTQHQPYVVDDPCPIHFYSKWYIRVGARKSAPLIELCVDEAGSKSPIQYIDIGNYTVSCLPFTINCQEPKLGSLVVRCSFYEDFLEYHDVRVVLDFI
The ORF8 protein
But a B.1.1.7 mutation changes the 27th amino acid from Q to a genetic Stop sign:
MKFLVFLGIITTVAAFHQECSLQSCT[Stop]
When the infected cell builds the ORF8 protein, it stops at this mutation and leaves a stump only 26 amino acids long:
Areas removed by the Q27stop mutation
Researchers assume that this ORF8 stump cannot function. But if losing the protein leaves B.1.1.7 at a disadvantage, it’s possible that the advantages of another mutation like N501Y might make up for the loss.
Two other B.1.1.7 mutations appear in ORF8 after the stop point, changing R to I and Y to C:
HQPYVVDDPCPIHFYSKWYIRVGA[I]KSAPLIELCVDEAGSKSPIQ[C]IDIGNYTVSCLPFTINCQEPKLGSLVVRCSFYEDFLEYHDVRVVLDFI
Because the ORF8 protein is cut short, these two mutations may do nothing.
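The effect of the stop signal, and the reason the two later changes may not matter, can be sketched in Python using the ORF8 sequence printed above. The function names and the asterisk used for the stop signal are illustrative choices. Position 73 for the Y-to-C change is read off the sequence shown above rather than named in the text.

```python
# Reference ORF8 protein, copied from the sequence printed in this article.
ORF8 = (
    "MKFLVFLGIITTVAAFHQECSLQSCTQHQPYVVDDPCPIHFYSKWYIRVGARKSAPLIELC"
    "VDEAGSKSPIQYIDIGNYTVSCLPFTINCQEPKLGSLVVRCSFYEDFLEYHDVRVVLDFI"
)

STOP = "*"  # stand-in symbol for the stop signal


def substitute(sequence, original, position, replacement):
    """Apply a substitution such as R52I, checking the original letter."""
    index = position - 1  # names count from 1, Python counts from 0
    if sequence[index] != original:
        raise ValueError(
            f"expected {original} at {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


def build_protein(sequence):
    """Mimic the cell: keep adding amino acids until the first stop."""
    return sequence.split(STOP, 1)[0]


with_stop = substitute(ORF8, "Q", 27, STOP)
with_all_three = substitute(
    substitute(with_stop, "R", 52, "I"), "Y", 73, "C"
)

stump = build_protein(with_stop)
print(stump, len(stump))
print("later changes alter the stump:", build_protein(with_all_three) != stump)
```

The stump comes out 26 amino acids long, and adding the two later substitutions leaves it unchanged because the cell never reaches them.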
Detection and Spread
B.1.1.7 first came to light in the United Kingdom in late November 2020. Researchers looked back at earlier samples and found that the first evidence dates back to Sept. 20, in a sample taken from a patient near London.
The B.1.1.7 lineage has now been detected in over 50 countries, including the United States. Britain has responded to the surge of B.1.1.7 with stringent lockdowns, and other countries have tried to prevent its spread with travel restrictions.
Map of the B.1.1.7 coronavirus lineage, showing where B.1.1.7 has been detected, typically in a traveler, and where local transmission has occurred.
B.1.1.7 is estimated to be roughly 50 percent more transmissible than other variants. Federal health officials warn that it may become the dominant variant in the United States by March. It is no more deadly than other forms of the coronavirus. But because it can cause so many more infections, it may lead to many more deaths.
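The arithmetic behind that warning can be sketched with a toy calculation in Python. The inputs are illustrative assumptions, not estimates from this article: each infected person passes the virus to 1.1 others on average, a 50 percent more transmissible variant raises that to 1.65, and the spread is followed for 10 rounds of transmission starting from 1,000 cases.

```python
def infections_after(generations, reproduction_number, initial_cases=1000):
    """New infections in the final round, assuming constant spread."""
    return initial_cases * reproduction_number ** generations


BASELINE_R = 1.1               # hypothetical value, for illustration only
VARIANT_R = BASELINE_R * 1.5   # "roughly 50 percent more transmissible"
GENERATIONS = 10               # hypothetical number of transmission rounds

baseline = infections_after(GENERATIONS, BASELINE_R)
variant = infections_after(GENERATIONS, VARIANT_R)

print(f"baseline: {baseline:,.0f} new infections")
print(f"variant:  {variant:,.0f} new infections")
print(f"ratio:    {variant / baseline:.1f}x")
```

Under these assumptions the gap compounds to roughly 58 times as many infections in the tenth round. If the share of infections that end in death stays the same, deaths rise in the same proportion.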
Map showing where the B.1.1.7 lineage has been detected.
B.1.1.7 has been detected in at least 14 states, but the United States has no national surveillance program for determining the full extent of its spread.
How Did the Variant Evolve?
A number of researchers suspect that B.1.1.7 gained many of its mutations within a single person. People with weakened immune systems can remain infected with replicating coronaviruses for several months, allowing the virus to accumulate many extra mutations.
When these patients are treated with convalescent plasma, which contains coronavirus antibodies, natural selection may favor viruses with mutations that let them escape the attack. Once the B.1.1.7 lineage evolved its battery of mutations, it may have been able to spread faster from person to person.
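A back-of-the-envelope calculation in Python suggests why the mutation count stands out. It uses only figures given earlier in this article: 17 mutations that change or delete amino acids, six that do not, and a typical pace of one or two mutations a month. As a simplifying assumption, it treats all 23 as separate mutations arising at that typical pace.

```python
AMINO_ACID_CHANGING = 17     # mutations that change or delete amino acids
SILENT = 6                   # mutations that do not change an amino acid
TYPICAL_PER_MONTH = (1, 2)   # "one or two random mutations each month"

total = AMINO_ACID_CHANGING + SILENT
slow_pace, fast_pace = TYPICAL_PER_MONTH

months_at_fast_pace = total / fast_pace
months_at_slow_pace = total / slow_pace

print(
    f"{total} mutations would typically take "
    f"{months_at_fast_pace:.1f} to {months_at_slow_pace:.0f} months"
)
```

The result, about 11.5 to 23 months, is a long stretch for mutations described as recent. That is consistent with the idea that they built up unusually quickly, as could happen during a months-long infection in one person.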
Other Mutations in Circulation
One of the first mutations that raised concerns among scientists is known as D614G. It emerged in China early in the pandemic and may have helped the virus spread more easily. In many countries, the D614G lineage came to dominate the population of coronaviruses. B.1.1.7 descends from the D614G lineage.
Diagram of the coronavirus spike gene in the D614G lineage, marking the D614G mutation.
A more recent variant detected in South Africa quickly spread to several other countries. It is known as 501Y.V2 and is part of the B.1.351 lineage. This variant has eight mutations that change amino acids in the spike protein. Among these mutations is N501Y, which helps the spike latch on more tightly to human cells.
Diagram of the coronavirus spike gene in the 501Y.V2 variant, marking the L18F and N501Y mutations.
None of these variants are expected to help the coronavirus evade the many coronavirus vaccines in clinical trials around the world. Antibodies generated by the Pfizer-BioNTech vaccine were able to lock on to coronavirus spikes that have the N501Y spike mutation, preventing the virus from infecting cells in the lab.
Experts stress that it would likely take many years, and many more mutations, for the virus to evolve enough to avoid current vaccines.
Sources: Andrew Rambaut et al., Virological; Andrew Ward, Scripps Research; Trevor Bedford, nextstrain.org; Paul Duprex, University of Pittsburgh School of Medicine; Houriiyah Tegally et al., medRxiv; Nature; Centers for Disease Control and Prevention; Global Report Investigating Novel Coronavirus Haplotypes. Spike models from Ward Lab, Scripps Research. Spike-receptor model by Cong Lab, Chinese Academy of Sciences. ORF8 model by the Yang Zhang Research Group, University of Michigan. Cahill-Keyes map projection by Gene Keyes.