Aggregated News

More than one million people have now had their genome, or its protein-coding regions (the exome), sequenced. The hope is that this information can be shared and linked to phenotype (specifically, disease) and so improve medical care. An obstacle is that only a small fraction of these data are publicly available.

In an important step, we report this week the first publication from the Exome Aggregation Consortium (ExAC), which has generated the largest catalogue so far of variation in human protein-coding regions. It aggregates sequence data from some 60,000 people. Most importantly, it puts the information in a publicly accessible database that is already a crucial resource (http://exac.broadinstitute.org).

There are challenges in sharing such data sets — the project scientists deserve credit for making this one open access. Its scale offers insight into rare genetic variation across populations. It identifies more than 7.4 million (mostly new) variants at high confidence, and documents rare mutations that independently emerged, providing the first estimate of the frequency of their recurrence. And it finds 3,230 genes that show nearly no cases of...