The Y chromosome’s contributions to human health are largely unknown.
A team of scientists, including researchers from the Clemson University Center for Human Genetics and the Department of Genetics and Biochemistry, has fully sequenced multiple Y chromosomes from populations around the world. The research provides an important foundation for future studies on how the Y chromosome may contribute to certain disorders and diseases.
“The ability to effectively assemble the complete human Y chromosome has been a long-awaited, yet crucial milestone toward understanding the full extent of human genetic variation,” said Miriam Konkel, a member of the Clemson Center for Human Genetics and an assistant professor in the Department of Genetics and Biochemistry.
“It also provides the starting point to associate Y-chromosomal sequences to specific human traits, evolution and disease,” she said.
The human genome was sequenced for the first time 20 years ago.
However, the reference sequence of the Y chromosome remained largely incomplete because it contains a large proportion of complex repetitive sequences and highly similar segmental duplications, which make it exceptionally difficult to fully sequence and assemble.
In April 2022, the Telomere-to-Telomere (T2T) Consortium shared the first assembly of a Y chromosome from telomere to telomere (end to end), from a single individual of European descent. Telomeres are DNA-protein structures that cap and protect the ends of chromosomes, much like the aglets on the ends of shoelaces.
Building upon that historic achievement, members of the Human Genome Structural Variation Consortium, including Konkel and Clemson postdoctoral fellow Mark Loftus, assembled and characterized 43 additional human Y chromosomes. The additional chromosomes came from diverse genetic backgrounds with individuals representing the five continental populations. About half came from Africa, including some of the oldest human lineages.
The work revealed the highly variable nature of Y chromosomes across individuals.
Standard short-read genomic sequencing technologies require breaking genomic DNA into short fragments of no more than 250 base pairs. These fragments are then reassembled into the full genome of more than 3 billion base pairs across 46 chromosomes in humans. The method is very accurate and works well for most, but not all, of the genome.
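A toy sketch can illustrate why short fragments struggle with repetitive regions. The sequences, the read lengths and the helper name `error_free_reads` below are hypothetical and chosen only for illustration; real assemblers also deal with sequencing errors and use read depth, which this sketch ignores. It shows that two sequences differing only in repeat copy number can yield exactly the same set of short reads, while reads long enough to span the repeat tell them apart.

```python
def error_free_reads(sequence, read_length):
    """Return the set of every substring of the given length (idealized reads)."""
    return {
        sequence[i:i + read_length]
        for i in range(len(sequence) - read_length + 1)
    }


# Hypothetical toy sequences: identical flanks around 5 or 8 copies of a 4-base repeat.
REPEAT_UNIT = "ACGT"
five_copies = "TTGCA" + REPEAT_UNIT * 5 + "GGATC"
eight_copies = "TTGCA" + REPEAT_UNIT * 8 + "GGATC"

# Reads shorter than the repeat array: both sequences give the same set of reads,
# so the copy number cannot be recovered from the read set alone.
short_reads_match = error_free_reads(five_copies, 6) == error_free_reads(eight_copies, 6)

# Reads long enough to span the 20-base array plus flanking bases: the sets differ.
long_reads_match = error_free_reads(five_copies, 26) == error_free_reads(eight_copies, 26)

print(short_reads_match)  # True
print(long_reads_match)   # False
```

In this simplified picture, the short reads carry no information about how many repeat copies sit between the flanks, which is the kind of ambiguity that longer reads help resolve.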
Almost all “complete” human chromosome sequences, including those found in the current human reference genome sequence (known as GRCh38), are only about 90% complete, because it is difficult to assemble the highly repetitive and other complex sections accurately. GRCh38 falls particularly short for the Y chromosome, nearly half of which remains unresolved or unassembled.
As a result, while the X chromosome, the much larger and gene-rich other sex chromosome, has been extensively studied, the Y chromosome has often been overlooked outside of male fertility studies.
The work by Konkel, Loftus and others revealed a full picture of the Y chromosome’s key characteristics and differences between individuals for the first time.
“The combination of more accurate assembly algorithms and longer reads was the key,” Loftus said.
Konkel and Loftus studied one of the most difficult regions of the Y chromosome to assemble, the large Yq12 heterochromatin region.
Of note is the striking variation in size and structure across the 43 sequenced Y chromosomes, which span 180,000 years of human evolution and range from 45.2 million to 84.9 million base pairs in length.
The diversity of human Y lineages allowed the researchers to redefine inter-chromosomal region boundaries and identify large-scale variations at an unprecedented resolution and clarity.
The study also revealed an unexpected degree of structural variation across the Y chromosomes. For example, half of the euchromatin (gene-rich region) of the sequenced chromosomes carries large recurrent inversions—segments that contain the same nucleotide sequences but oriented in the opposite direction—at a rate much higher than anywhere else in the genome.
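As a rough illustration of what an inversion looks like at the sequence level, the sketch below flips a segment of a toy sequence. When the segment is read along the same strand, an inverted segment appears as its reverse complement. The function names and the example sequence are hypothetical and are not taken from the study.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the sequence as read from the opposite strand."""
    return sequence.translate(COMPLEMENT)[::-1]


def invert_segment(sequence, start, end):
    """Return the sequence with the bases in [start, end) inverted."""
    return sequence[:start] + reverse_complement(sequence[start:end]) + sequence[end:]


original = "AAACCGTTT"                    # hypothetical toy sequence
inverted = invert_segment(original, 3, 6)
print(inverted)                           # AAACGGTTT

# Inverting the same segment again restores the original sequence.
print(invert_segment(inverted, 3, 6) == original)  # True
```

The same bases are present before and after; only the orientation of the segment changes.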
The study further identified regions of the Y chromosome that demonstrate little single nucleotide variation but show high gene copy number variation for specific gene families. Other gene families tended to maintain their copy numbers, however, consistent with their roles in fertility and normal development.
“Having fully resolved Y chromosome sequences from multiple individuals is essential in order for us to begin to understand how this variation can affect function,” said Pille Hallast, an associate research scientist at the Jackson Laboratory for Genomic Medicine, who co-authored the study with Loftus and Peter Ebert, a researcher at Heinrich Heine University.
Hallast continued, “The degree of structural variation between individuals came as a big surprise to me, even though the nucleotide sequences within the Y chromosome genes are comparatively conserved. The variable gene copy numbers in certain gene families and extremely high inversion rates are almost certain to hold significant biological and evolutionary roles.”
The Y chromosome’s contributions to male health are poorly understood.
“Now that we have these high quality assemblies and appreciate the diversity, we can ask the question of what role do certain genes on the Y chromosome play in diseases that are going beyond male fertility,” Konkel said.
Two recent studies collectively implicate the Y chromosome in aggressive features of colorectal and bladder cancers in men. One of the studies showed that tumor cells that had lost their Y chromosome can more effectively evade the body’s immune system. The same study found that tumors exhibiting loss of the Y chromosome were more responsive to anti-PD1 treatments than similar tumors retaining the Y chromosome.
“People may think that because the Y is assembled, the work is done, but it’s more like now that it’s assembled the work can fully start to begin. The story is just unfolding,” Loftus said. “There is so much more to learn.”
Detailed findings were published in the journal Nature in an article titled “Assembly of 43 human Y chromosomes reveals extensive complexity and variation.”
In addition to the scientists from Clemson, Heinrich Heine University and the Jackson Laboratory, researchers from the University of Washington, the German Cancer Research Center, the University of Michigan Medical Center, the European Molecular Biology Laboratory, Temple University, the University of Connecticut and the Wellcome Sanger Institute were involved in the project.
The Clemson research was funded by the National Institute of General Medical Sciences under the Center of Biomedical Research Excellence (COBRE) in Human Genetics grant 1P20GM139769. Konkel is a junior investigator on the grant.