Sequencing the USDA core soybean collection reveals gene loss during domestication and breeding

Bayer, P E and Valliyodan, B and Hu, H and Marsh, J I and Yuan, Y and Vuong, T D and Patil, G and Song, Q and Batley, J and Varshney, R K and Lam, H M and Edwards, D and Nguyen, H T (2021) Sequencing the USDA core soybean collection reveals gene loss during domestication and breeding. The Plant Genome (TSI). pp. 1-12. ISSN 1940-3372

The gene content of plants varies between individuals of the same species due to gene presence/absence variation, and selection can alter the frequency of specific genes in a population. Selection during domestication and breeding will modify the genomic landscape, though the nature of these modifications is only understood for specific genes or on a more general level (e.g., by a loss of genetic diversity). Here we have assembled and analyzed a soybean (Glycine spp.) pangenome representing more than 1,000 soybean accessions derived from the USDA Soybean Germplasm Collection, including both wild and cultivated lineages, to assess genomewide changes in gene and allele frequency during domestication and breeding. We identified 3,765 genes that are absent from the Lee reference genome assembly and assessed the presence/absence of all genes across this population. In addition to a loss of genetic diversity, we found a significant reduction in the average number of protein-coding genes per individual during domestication and subsequent breeding, though with some genes and allelic variants increasing in frequency associated with selection for agronomic traits. This analysis provides a genomic perspective of domestication and breeding in this important oilseed crop.

Item Type: Article
Divisions: Global Research Program - Accelerated Crop Improvement
Uncontrolled Keywords: Soybean, Breeding
Subjects: Others > Plant Breeding
Others > Genetics and Genomics
Depositing User: Mr Arun S
Date Deposited: 25 Aug 2021 08:10
Last Modified: 25 Aug 2021 08:10
Acknowledgement: H. N. and B. V. thank the United Soybean Board (St. Louis, MO) and former Bayer Crop Science, Dow AgroSciences, and BASF for funding support (project #1320-532-5615) and commitment to making this data publicly available. This work was supported by the Australian Research Council Grants awarded to D. E. and J. B. (DP160104497, LP160100030, and LP140100537). This work was supported by resources provided by the Pawsey Supercomputing Centre with funding from the Australian Government and the Government of Western Australia. P. E. B. acknowledges support of the Forrest Research Foundation. B. V. thanks the United States Department of Agriculture–National Institute of Food and Agriculture (USDA-NIFA), Evans Allen funding support (project #1020002). H.-M. L. acknowledges the support from Hong Kong Research Grants Council Area of Excellence Scheme (AoE/M403/16). H. H. thanks the China Scholarship Council for supporting his studies at the University of Western Australia. R. K. V. thanks Science & Engineering Research Board (SERB) of Department of Science & Technology (DST), Government of India for providing the J C Bose National Fellowship (SB/S9/Z-13/2019). We thank Steven Cannon and Rex Nelson (USDA-ARS in Ames, IA) for their help in making this genomic resource available at Soybase.