The human genome project, which published its results 20 years ago last month, was a landmark in biology.
人類基因組計劃于上個月公布了20年前的成果, 這是生物學上的一個里程碑。
It was also somewhat misleadingly named.
它的名字也有些誤導人。
After all, there is no such thing as “the” human genome.
畢竟,不存在所謂的“人類基因組”。
Instead, there are 8bn individual humans, each of whom share the vast majority of their DNA—but not all of it.
相反,人類有80億人,每個人的絕大部分DNA都是相同的,但不是全部。
The genome published by the Human Genome Project in 2003 was put together from a dozen anonymous blood donors in and around Buffalo, in New York state.
2003年的人類基因組計劃公布的基因組是通過紐約州布法羅及其周邊地區的十幾個匿名獻血者收集而成的。
But there is more to life than Buffalo.
但除了布法羅的基因組,其他生命還有更多基因組。
That, in essence, is the motive behind the publication this week, in Nature, of a set of 47 new “reference” genomes taken from individuals on four continents (Africa, both of the Americas, and Asia).
從本質上講,這就是本周《自然》雜志發表的47個新“參考”基因組的動機,這些基因組來自四大洲(非洲、兩個美洲和亞洲)的個體。
The idea of the Human Pangenome Project, the organisation behind the publications, is that rather than relying on a single “reference” genome, it would be better to have several, and to ensure that between them they capture as much of the genetic diversity of Homo sapiens as possible.
這些出版物背后的組織“人類泛基因組計劃”的想法是,與其依賴單一的“參考”基因組,不如多找幾個,并確保在它們之間盡可能多地捕捉到智人的遺傳多樣性。
Compared with the total size of the genome, the amount of diversity in question is small.
與基因組的總大小相比,需要討論的多樣性數量很小。
Two people picked at random will share around 99.6% of their DNA.
隨機挑選的兩個人的DNA大約有99.6%是相同的。
That similarity is why the original genome produced by the Human Genome Project has proved so useful.
正是這種相似性,證明了人類基因組計劃產生的原始基因組如此有用。
Its annotated strings of genetic code serve as a baseline.
它將帶有注釋的遺傳密碼串作為基線。
Other genomes can be compared with it to look for variations, whether harmful or beneficial.
其他基因組可以與它進行比較,以尋找有害或有益的變異。
Yet although humans are mostly alike, their differences do matter.
然而,盡管人類在很大程度上是相似的,他們的差異仍然很重要。
A relatively recent mutation, for instance, means adults with ancestors from northern Europe, or some parts of India and the Middle East, are more likely to be able to digest lactose (a sugar found in milk) than those from elsewhere.
例如,一個相對較新的突變意味著,祖先來自北歐、印度和中東部分地區的成年人比來自其他地方的人更有可能消化乳糖(牛奶中發現的一種糖)。
Which variation deserves to be treated as the standard?
哪一種變體應該被視為標準?
Sometimes, the limits of using a single reference have direct medical consequences.
有時,使用單一參考的限制會產生直接的醫療后果。
A set of genes called HLA, for instance, are involved in running the immune system.
例如,一組被稱為HLA的基因參與了免疫系統的運行。
They are highly variable, and mutations in them have been associated with autoimmune diseases such as type-1 diabetes.
它們是高度可變的,它們的突變與自身免疫性疾病(如1型糖尿病)有關。
One study, published in 2015, found that, because many gene-sequencing technologies are not perfectly accurate, comparing readouts from the region with the single reference genome led to mistakes around 20% of the time.
2015年發表的一項研究發現,由于許多基因測序技術并不完全準確,將一個區域的讀數與單個參考基因組的讀數進行比較會導致20%左右的誤差。
Another paper, published in 2022, found that relying on the reference genome meant that the details of some gene variants found in people with African ancestry, and seemingly associated with cancer, are poorly understood.
2022年發表的另一篇論文發現,依賴該參考基因組意味著,人們對非洲血統人群中發現的一些基因變異的細節知之甚少,這些基因變異似乎與癌癥有關。