Here’s how pangenomics can help fast-track crop improvement

February 28, 2023

Photo credit: Rajeev Varshney


Recent advances in genomics technologies are providing plant breeders with new tools to improve crop yield, quality, and disease resistance. Two approaches that have been gaining attention are pangenomics and haplotype catalogue (HapCat) analyses.

These analytic methods are the primary focus of DivSeek’s Chickpea PanGenome & HapCat Hub, which is led by Professor Rajeev Varshney, Director of the Centre of Crop & Food Innovation at Murdoch University. The aim of the hub is to develop and refine tools and pipelines for pangenomic and HapCat analyses, using chickpea as a case species.

So, what is pangenomics, and how could it help plant breeders to create stronger, healthier, and more nutritious crop varieties?

Well, the ‘pangenome’ of a crop refers to the complete set of genetic variants present across the entire species. To construct a pangenome map, scientists must sequence the whole genomes of a very large number of individuals.

This allows them to identify the large ‘core genome’ (i.e. DNA regions shared by every individual), as well as the smaller ‘accessory genome’ (i.e. DNA regions that are present in only some individuals, and give rise to genetic diversity).
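One way to picture the split between the core and accessory genome is as simple set arithmetic on a table of which genes were found in which plants. The sketch below is purely illustrative: the plant and gene names are invented, and real pangenome pipelines work from assembled sequences rather than ready-made gene lists.

```python
# Toy presence/absence data (hypothetical names): genes found in each plant.
genes_by_individual = {
    "plant_A": {"gene1", "gene2", "gene3", "gene4", "gene5"},
    "plant_B": {"gene1", "gene2", "gene3", "gene4", "gene6"},
    "plant_C": {"gene1", "gene2", "gene3", "gene4", "gene5"},
}


def split_pangenome(genes_by_individual):
    """Return (pangenome, core, accessory) as sets of gene names."""
    gene_sets = list(genes_by_individual.values())
    pangenome = set().union(*gene_sets)   # every gene seen in any individual
    core = set.intersection(*gene_sets)   # genes seen in every individual
    accessory = pangenome - core          # genes missing from at least one
    return pangenome, core, accessory


pangenome, core, accessory = split_pangenome(genes_by_individual)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
```

Here the four shared genes form the core, while `gene5` and `gene6`, each absent from at least one plant, make up the accessory set.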

Pangenomics is relatively new, but offers huge potential to accelerate crop improvement. When breeders or researchers want to identify candidate DNA sequences (e.g. for gene-editing, or selecting individuals for breeding programs), they often do a genome-wide association study (GWAS).

This involves sequencing the whole genomes of many individuals and comparing them to a single ‘reference’ genome, in order to identify genomic regions that vary across individuals. Homing in on those regions, they search for correlations between particular genetic variants (genotypes) and particular plant traits (phenotypes), such as growth rate, yield, or drought tolerance.
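The core idea of searching for genotype-phenotype correlations can be sketched in a few lines. This is a deliberately simplified illustration with invented data: it scores each variant with a plain Pearson correlation, whereas real GWAS software uses statistical models that also correct for population structure and for testing many variants at once.

```python
from math import sqrt

# Hypothetical data for six plants. Genotypes are coded as the number of
# copies of the alternate allele (0, 1 or 2) at each variant.
genotypes = {
    "variant_1": [0, 0, 1, 1, 2, 2],
    "variant_2": [0, 2, 1, 0, 2, 1],
}
seed_weight = [20.0, 21.0, 25.0, 24.0, 30.0, 29.0]  # phenotype, arbitrary units


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Score every variant against the trait, then pick the strongest association.
associations = {name: pearson(calls, seed_weight) for name, calls in genotypes.items()}
best_variant = max(associations, key=lambda name: abs(associations[name]))
print(best_variant, round(associations[best_variant], 2))
```

In this toy data set, seed weight rises steadily with the number of alternate alleles at `variant_1`, so that variant is flagged as the stronger candidate.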

However, a reference genome can only tell you so much. “A single reference genome only tells you about genes of only one particular individual,” explains Varshney. “Pangenomics can tell you about all possible genes for a given crop species.”

The first crop pangenome was developed for maize in 2014. Since then, researchers have created many more, including for major crop species such as rice, soybean, and rapeseed.

In 2021, Varshney and his team built a pangenomic map for chickpea using sequencing data from 3,366 genomes. Along the way, they identified superior haplotypes for desirable crop traits, and 56 germplasm lines that could be used for bringing novel haplotypes into the elite germplasm through haplotype-based breeding.

“Now that we have identified superior haplotypes for agronomic traits such as seed size, we can develop assays for those haplotypes,” says Varshney. “[This means] chickpea breeding programmes like Chickpea Breeding Australia can start using this information to introgress superior haplotypes in commercial chickpea varieties.” This, he says, will provide higher yields and revenues to chickpea growers in Australia.

So, pangenomics lets us take stock of the diversity across an entire species, enabling us to approach crop improvement from a truly global perspective.

HapCat analysis, on the other hand, usually focuses on a smaller population of individuals. Instead of trying to describe all possible genetic variants, HapCat analysis creates a “catalogue of haplotypes”. This is basically a database of all individuals in a population, and the genetic variants they carry on each chromosome. This fine-scale information helps plant breeders select parents to cross together in order to produce superior offspring.
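As a rough picture of what such a catalogue might look like, the sketch below stores one haplotype label per genomic region for each breeding line, then looks for pairs of parents that together carry every target haplotype. All names are hypothetical, and it makes the simplifying assumption that each line is inbred and so carries a single haplotype per region; it is not the data model used by the hub.

```python
from itertools import combinations

# Hypothetical catalogue: breeding line -> genomic region -> haplotype label.
catalogue = {
    "line_1": {"chr1_block": "H1", "chr4_block": "H2"},
    "line_2": {"chr1_block": "H3", "chr4_block": "H5"},
    "line_3": {"chr1_block": "H3", "chr4_block": "H2"},
}

# Haplotypes a breeder would like to combine in one offspring (invented).
targets = {"chr1_block": "H1", "chr4_block": "H5"}


def carriers(catalogue, region, haplotype):
    """Lines that carry the given haplotype at the given region."""
    return sorted(
        line for line, haps in catalogue.items() if haps.get(region) == haplotype
    )


def complementary_pairs(catalogue, targets):
    """Pairs of lines in which every target haplotype is carried by a parent."""
    pairs = []
    for a, b in combinations(sorted(catalogue), 2):
        if all(
            haplotype in (catalogue[a].get(region), catalogue[b].get(region))
            for region, haplotype in targets.items()
        ):
            pairs.append((a, b))
    return pairs


print(carriers(catalogue, "chr1_block", "H1"))
print(complementary_pairs(catalogue, targets))
```

Only `line_1` and `line_2` together supply both target haplotypes, so that pair is the single candidate cross in this toy catalogue.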

Varshney’s aim for the Chickpea PanGenome & HapCat Hub is to develop and refine data tools and pipelines for using HapCat and pangenomics for crop improvement.

“At present, different research groups have specialized data analytical tools and pipelines that need specialized bioinformatics skills to run pangenome and HapCat analysis,” he says. “We would like to refine those tools and pipelines so that they can be used by researchers with minimum bioinformatics skill.”

Developing user-friendly analytical tools is important, he says, because it helps to “democratize” the genomic sequencing era, and to bring its benefits to developing countries as well.

Beyond refining the toolkit and making it more accessible, Varshney also hopes to raise awareness of these methods and to educate the wider plant genetic resources community through targeted capacity-building workshops and webinars.

Written by: Kiri Marker

Corresponding author: Rajeev Varshney