Whole Genome Sequencing Used in Human Trait Prediction

Scientists from Human Longevity Inc. and the J. Craig Venter Institute report that they have developed methods for predicting human traits from whole-genome sequencing data. The traits they examined are 3D facial structure, skin color, eye color, height, weight, biological age, and body mass index (BMI). For the study, they collected samples from 1,061 ethnically diverse individuals in San Diego, California. The ethnic groups included African, Latino, East Asian, and South Asian. Participants ranged from 18 to 82 years of age, with an average age of 36 years. The genomes of all 1,061 individuals were sequenced, and phenotypic data were collected from each of them. The researchers used a two-step approach to match genomic data to phenotypic data: they first mapped the phenotypes and genomes, and then applied statistical methods, namely learned embeddings and learned similarity functions, to match each phenotype to a genome.
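To make the matching step more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the study's method: a fixed cosine similarity stands in for the learned similarity function, the embeddings are made-up standardized trait scores, and the names `cosine_similarity`, `match_genomes_to_phenotypes`, and the sample IDs are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_genomes_to_phenotypes(predicted, observed):
    """For each genome ID, pick the phenotype ID with the most similar embedding.

    predicted: genome ID -> trait embedding predicted from the genome
    observed:  phenotype ID -> trait embedding measured on a person
    """
    matches = {}
    for genome_id, predicted_vec in predicted.items():
        matches[genome_id] = max(
            observed,
            key=lambda pid: cosine_similarity(predicted_vec, observed[pid]),
        )
    return matches


# Hypothetical toy data: standardized scores for (height, weight, age).
predicted = {
    "genome_A": [1.0, 0.5, -0.2],
    "genome_B": [-0.8, -0.3, 0.9],
}
observed = {
    "person_1": [-0.7, -0.4, 1.0],
    "person_2": [0.9, 0.6, -0.1],
}

matches = match_genomes_to_phenotypes(predicted, observed)
print(matches)
```

In this toy setup each genome is paired with the person whose measured traits point in the same direction as the traits predicted from that genome. A real system would learn both the embeddings and the similarity function from data rather than fixing them by hand.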

The scientists reported predictive models for each of these traits. Although the statistical power of the analysis was limited by the small sample size, they reported a strong match between predictions and observations. They also reported that combining multiple prediction models matched phenotypic traits to genomic data more accurately than any single model. The researchers are optimistic that future studies with larger sample sizes will predict phenotypic traits from genomic data with greater precision. One major implication of this type of study is its potential use in forensic science to identify unknown criminals and victims. However, the authors also cautioned that if the data are used unethically, an individual's privacy could be compromised.