31 May 2024

Unravelling the Y chromosome



In 2003, scientists announced the completion of one of the most remarkable achievements in history: the first sequence of the human genome [1]. This monumental feat was the result of years of collaborative effort and groundbreaking technological advancements, and marked a significant moment in our understanding of human genetics.

However, the genome wasn’t finished – there were still gaps left in the sequences of all 24 chromosomes. While most of these gaps were small and scattered throughout the genome, one chromosome stood out with over half of its sequence remaining elusive – the Y chromosome.

In a remarkable stride forward, advances in long-read sequencing (LRS) technology have revolutionized genomic research, offering scientists the ability to produce a complete, gapless sequence of the human genome for the first time [2]. This breakthrough technology has opened new doors for exploring the intricacies of our genetic blueprint with unprecedented precision and accuracy.

In 2023, the Telomere-to-Telomere (T2T) consortium produced a complete sequence of the Y chromosome [3]. This achievement not only fills a critical gap in our understanding of human genetics but also paves the way for new discoveries in areas ranging from male infertility and reproductive health to the evolution of sex chromosomes.

Small but important

The Y chromosome is one of two sex chromosomes, with its counterpart being the X chromosome. While the X chromosome is present in both males and females, the Y chromosome is predominantly found in males and determines male sex characteristics. This chromosome carries the sex-determining region Y (SRY) gene, which plays an essential role in initiating male development during embryogenesis [4]. The presence of the SRY gene triggers a cascade of molecular events that lead to the differentiation of the gonads into testes, the production of testosterone, and the development of male reproductive structures [5].

The Y chromosome is one of the smallest human chromosomes and contains the fewest genes. It is also considerably shorter than the X chromosome (~57 Mbp compared with ~156 Mbp) (Figure 1) [6]. Despite its diminutive stature, the Y chromosome still holds significant importance in human biology.

Figure 1. A comparison of the sex chromosomes shows the substantial size difference: the Y chromosome is about one-third the length of the X chromosome and carries only about 10% as many genes.

One of the most intriguing aspects of the Y chromosome is its unique pattern of inheritance. Unlike other chromosomes, the Y chromosome is passed exclusively from fathers to their sons and, outside the small pseudoautosomal regions, does not recombine with its counterpart during meiosis. This lack of recombination has resulted in the gradual loss of 97% of ancestral genes over evolutionary time, a phenomenon known as Y chromosome degeneration [7]. This has led some researchers to suggest that the Y chromosome may eventually disappear entirely. However, recent studies have challenged this notion, revealing that the Y chromosome has maintained a stable assortment of genes for the past 25 million years.

While the Y chromosome may be dwindling in size, its loss can still have a profound impact. Research has revealed compelling links between the loss of the Y chromosome and an increased risk of various health conditions, including Alzheimer’s disease, heart disease, and certain types of cancer [8-10]. The mechanisms underlying these associations are still being elucidated, but they underscore the importance of understanding the role of the Y chromosome in maintaining overall health and well-being.

The need for long reads

Despite being one of the smallest chromosomes in the human genome, the Y chromosome was the last to be fully sequenced, owing to its unique characteristics and complex composition. While all human chromosomes contain repeats, nearly 85% of the Y chromosome is composed of complex repeats, including long palindromes, tandem repeats, and segmental duplications [3].
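For readers unfamiliar with the term, a DNA palindrome is a stretch whose two arms are reverse complements of each other, so the arms read the same on opposite strands. The short Python sketch below illustrates the idea on a made-up sequence. The arm, the spacer, and the function names are hypothetical, and real Y chromosome palindromes are vastly longer and only near-identical between arms.

```python
# Toy illustration of a DNA palindrome (inverted repeat).
# All sequences here are invented for the example, not taken from the Y chromosome.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindrome(left_arm: str, right_arm: str) -> bool:
    """True when the right arm is the exact reverse complement of the left arm."""
    return right_arm == reverse_complement(left_arm)


# Build a toy palindrome: left arm + short spacer + mirrored right arm.
left_arm = "ACGTTGCAAG"
spacer = "TTT"
right_arm = reverse_complement(left_arm)
toy_palindrome = left_arm + spacer + right_arm

print(toy_palindrome)
print(is_palindrome(left_arm, right_arm))
```

Because the two arms carry the same information on opposite strands, a read that falls entirely inside one arm cannot easily be told apart from a read from the other arm, which is part of what makes these regions hard to assemble.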

This complexity poses a significant obstacle to traditional short-read sequencing (SRS) technologies, which struggle to accurately resolve and assemble these repetitive elements. As a result, despite advancements in genomic sequencing, the majority of the Y chromosome has remained a mystery until now.

However, the emergence of LRS technologies has heralded a new era in genomic research, providing scientists with the tools needed to tackle the challenges posed by the Y chromosome. By generating longer sequence reads, LRS enables researchers to traverse the repetitive regions of the Y chromosome with greater accuracy, facilitating the assembly of its complete sequence.
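The read-length argument can be made concrete with a small Python sketch. This is a toy illustration under simplifying assumptions (error-free reads, exact matching, an invented sequence), not the method used by any sequencing platform or assembler. It builds a hypothetical "chromosome" with a tandem repeat between two unique flanks and measures what fraction of reads of a given length could have come from more than one position.

```python
# Toy model: how often is a read's origin ambiguous, as a function of read length?
# The sequence, repeat unit, and function names are hypothetical.


def count_placements(genome: str, read: str) -> int:
    """Count every position where `read` matches `genome` exactly (overlaps allowed)."""
    count = 0
    start = genome.find(read)
    while start != -1:
        count += 1
        start = genome.find(read, start + 1)
    return count


def ambiguous_fraction(genome: str, read_length: int) -> float:
    """Fraction of all possible error-free reads of this length that match more than one position."""
    n_reads = len(genome) - read_length + 1
    ambiguous = sum(
        1
        for i in range(n_reads)
        if count_placements(genome, genome[i:i + read_length]) > 1
    )
    return ambiguous / n_reads


# Unique flank + 20 copies of a 5-base unit (100 bases of repeat) + unique flank.
LEFT_FLANK = "ACGTTGCAAGGCTCAT"
REPEAT_UNIT = "GATTC"
RIGHT_FLANK = "TTAGCCGTACGGATAC"
TOY_GENOME = LEFT_FLANK + REPEAT_UNIT * 20 + RIGHT_FLANK

# "Short" reads are much shorter than the repeat array; "long" reads span it.
short_read_ambiguity = ambiguous_fraction(TOY_GENOME, 10)
long_read_ambiguity = ambiguous_fraction(TOY_GENOME, 110)

print(f"10-base reads with ambiguous origin:  {short_read_ambiguity:.2f}")
print(f"110-base reads with ambiguous origin: {long_read_ambiguity:.2f}")
```

In this toy case most 10-base reads fit in several places, while every 110-base read is anchored by unique flanking sequence and fits in exactly one. The same principle, at a scale of thousands to millions of bases, is why reads longer than the repeats are needed to assemble the Y chromosome.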

The revolution in LRS has enabled not only the full sequencing of the Y chromosome but also a more comprehensive understanding of its diversity and evolution. In a recent study, researchers combined Pacific Biosciences high-fidelity (HiFi) reads and Oxford Nanopore Technologies (ONT) ultra-long reads to assemble the Y chromosomes of 43 males, spanning 182,900 years of evolution [11]. This analysis revealed striking variation in both size and structure across the 43 Y chromosomes, with lengths ranging from 45.2 million to 84.9 million base pairs. Notably, half of the male-specific euchromatic region was subject to large inversions, which recurred at a rate more than twice as high as that observed in all other chromosomes [12].

This newfound ability to explore the intricacies of Y chromosome diversity opens avenues for investigating its role in human evolution, population genetics, and disease susceptibility. By unraveling the complex tapestry of Y chromosome variation, researchers are poised to gain deeper insights into the mechanisms driving genetic diversity and its implications for human health and evolution.

Implications for health and disease

While the X chromosome has been extensively studied, the Y chromosome has often been overlooked outside of male fertility studies. As a result, the contributions of the Y chromosome to male health remain poorly understood. The completion of the Y chromosome sequence is a significant moment in genomic research, one that could provide profound insights into various aspects of human biology, including fertility, cancer risk, and sex-specific genetic effects.

The T2T Y chromosome assembly added over 30 million base pairs of sequence to the original reference. This revealed the complete ampliconic structures of important gene families, 41 additional protein-coding genes, and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region (Figure 2) [3].

Figure 2. Schematic representation of the human Y chromosome. Key features are highlighted, including the pseudoautosomal regions (PAR1, PAR2), Yp (short arm), Yq (long arm), and the AZF loci.

One of the notable regions uncovered by the complete Y chromosome sequence is the azoospermia factor region (AZF), a crucial stretch of DNA harboring several genes essential for sperm production. Microdeletions within the AZF region are one of the most common structural chromosomal abnormalities and represent a major cause of male infertility [13].

With the newfound ability to examine this region in its entirety, researchers can now investigate the impact of deletions within palindromic repeats, shedding light on potential genetic factors that may influence fertility [14]. By elucidating the genetic underpinnings of infertility, this research holds promise for the development of novel diagnostic and therapeutic strategies to address reproductive challenges.

Another noteworthy discovery within the complete Y chromosome sequence is the presence of the TSPY gene array, which is the third largest gene array in the human genome [3]. TSPY contains the largest number of protein-coding copies on the Y chromosome and is only expressed in the testis. This gene array is postulated to serve important functions in spermatogonial renewal and meiotic division in male germ cells. However, it is also associated with various cancers including testicular germ cell tumors, melanoma, hepatocellular carcinoma, and prostate cancer [15, 16].

Intriguingly, individuals can exhibit a wide range of copy numbers of the TSPY gene, with variations ranging from 10 to 40 copies. It’s possible that this genetic variability could impact cancer susceptibility, thereby contributing to the male predominance in certain cancers.

The identification of these key features within the Y chromosome sequence highlights the intricate connections between genetics and human health. By unraveling the complexities encoded within our genetic blueprint, researchers are paving the way for a deeper understanding of the factors influencing fertility and cancer risk. Armed with this knowledge, clinicians and researchers are better equipped to develop targeted interventions and personalized treatments tailored to individual genetic profiles.



As we continue to unravel the mysteries of the Y chromosome and its implications for human health, we gain invaluable insights that extend far beyond the realm of reproductive biology. These insights pave the way for personalized therapies tailored to individual genetic profiles. By harnessing the power of precision medicine, clinicians can potentially develop targeted interventions for reproductive health issues, cancer prevention, and treatment, ultimately improving outcomes for patients worldwide.

Projects like the Human Pangenome Reference Consortium are currently underway, aiming to generate high-coverage HiFi and ONT sequencing data for hundreds of additional human samples [17]. This wealth of genomic information holds the potential to revolutionize our understanding of human health, disease susceptibility, and population genetics.

Through ongoing research and collaboration, companies like Eremid® are at the forefront of this movement. We can provide the necessary infrastructure and expertise for large-scale LRS projects, underpinned by our PacBio Certified Service Provider status and extensive experience with ONT platforms.

By leveraging cutting-edge technologies like HiFi sequencing and ONT ultra-long sequencing, Eremid can deliver top-tier clinical genomics services and bioinformatics for a wide range of human health applications.


  1. International Human Genome Sequencing Consortium. (2004). Finishing the euchromatic sequence of the human genome. Nature, 431(7011), 931–945. https://doi.org/10.1038/nature03001
  2. Nurk, S., Koren, S., Rhie, A., Rautiainen, M., Bzikadze, A. V., et al. (2022). The complete sequence of a human genome. Science, 376(6588), 44–53. https://doi.org/10.1126/science.abj6987
  3. Rhie, A., Nurk, S., Cechova, M., Hoyt, S. J., Taylor, D. J., et al. (2023). The complete sequence of a human Y chromosome. Nature, 621(7978), 344–354. https://doi.org/10.1038/s41586-023-06457-y
  4. Tyler-Smith, C. (2013). Y Chromosome (Human). In S. Maloy & K. Hughes (Eds.), Brenner’s Encyclopedia of Genetics (Second Edition) (pp. 376–379). Academic Press. https://doi.org/10.1016/B978-0-12-374984-0.01658-2
  5. Clifton, D. K., & Steiner, R. A. (2009). CHAPTER 1—Neuroendocrinology of Reproduction. In J. F. Strauss & R. L. Barbieri (Eds.), Yen & Jaffe’s Reproductive Endocrinology (Sixth Edition) (pp. 3–33). W.B. Saunders. https://doi.org/10.1016/B978-1-4160-4907-4.00001-2
  6. Maan, A. A., Eales, J., Akbarov, A., Rowland, J., Xu, X., Jobling, M. A., Charchar, F. J., & Tomaszewski, M. (2017). The Y chromosome: A blueprint for men’s health? European Journal of Human Genetics, 25(11), 1181–1188. https://doi.org/10.1038/ejhg.2017.128
  7. Wilson, J., Staley, J. M., & Wyckoff, G. J. (2020). Extinction of chromosomes due to specialization is a universal occurrence. Scientific Reports, 10(1), 2170. https://doi.org/10.1038/s41598-020-58997-2
  8. Abdel-Hafiz, H. A., Schafer, J. M., Chen, X., Xiao, T., Gauntner, T. D., Li, Z., & Theodorescu, D. (2023). Y chromosome loss in cancer drives growth by evasion of adaptive immunity. Nature, 619(7970), 624–631. https://doi.org/10.1038/s41586-023-06234-x
  9. Vermeulen, M. C., Pearse, R., Young-Pearse, T., & Mostafavi, S. (2022). Mosaic loss of Chromosome Y in aged human microglia. Genome Research, 32(10), 1795–1807. https://doi.org/10.1101/gr.276409.121
  10. Sano, S., Horitani, K., Ogawa, H., Halvardson, J., Chavkin, N. W., et al. (2022). Hematopoietic loss of Y chromosome leads to cardiac fibrosis and heart failure mortality. Science, 377(6603), 292–297. https://doi.org/10.1126/science.abn3100
  11. Hallast, P., Ebert, P., Loftus, M., Yilmaz, F., Audano, P. A., et al. (2023). Assembly of 43 human Y chromosomes reveals extensive complexity and variation. Nature, 621(7978), 355–364. https://doi.org/10.1038/s41586-023-06425-6
  12. Porubsky, D., Höps, W., Ashraf, H., Hsieh, P., Rodriguez-Martin, B., et al. (2022). Recurrent inversion polymorphisms in humans associate with genetic instability and genomic disorders. Cell, 185(11), 1986–2005. https://doi.org/10.1016/j.cell.2022.04.017
  13. Yu, X.-W., Wei, Z.-T., Jiang, Y.-T., & Zhang, S.-L. (2015). Y chromosome azoospermia factor region microdeletions and transmission characteristics in azoospermic and severe oligozoospermic patients. International Journal of Clinical and Experimental Medicine, 8(9), 14634–14646.
  14. Elsaid, H. O. A., Gadkareim, T., Abobakr, T., Mubarak, E., Abdelrhem, M. A., et al. (2021). Detection of AZF microdeletions and reproductive hormonal profile analysis of infertile Sudanese men pursuing assisted reproductive approaches. BMC Urology, 21(1), 69. https://doi.org/10.1186/s12894-021-00834-3
  15. Kido, T., & Lau, Y.-F. C. (2014). The Y-located gonadoblastoma gene TSPY amplifies its own expression through a positive feedback loop in prostate cancer cells. Biochemical and Biophysical Research Communications, 446(1), 206–211. https://doi.org/10.1016/j.bbrc.2014.02.083
  16. Kido, T., & Lau, Y.-F. C. (2019). The Y-linked proto-oncogene TSPY contributes to poor prognosis of the male hepatocellular carcinoma patients by promoting the pro-oncogenic and suppressing the anti-oncogenic gene expression. Cell & Bioscience, 9, 22. https://doi.org/10.1186/s13578-019-0287-x
  17. Liao, W.-W., Asri, M., Ebler, J., Doerr, D., Haukness, M., et al. (2023). A draft human pangenome reference. Nature, 617(7960), 312–324. https://doi.org/10.1038/s41586-023-05896-x