Glycine max: Williams 82 Genome Assembly v4.0

Citation
Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA. Genome sequence of the palaeopolyploid soybean. Nature. 2010 Jan 14; 463(7278):178-83.
Overview
Authors

Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA

Assembly Genome Size
978.4 Mb
Number Of Scaffolds
282
Scaffold N50 value
10
Scaffold N50 length
49.9 Mb
Number Of Genes
52,872
Description
The Williams 82 version 4 assembly (Wm82v4) builds on the widely used version 2 assembly (Wm82v2) and on an incremental version 3 that, in 2016, incorporated BAC sequence to fill contig gaps. The Wm82v2 assembly was primarily Sanger-based; the new gap-filling in v3 and v4 used PacBio-based BAC assemblies targeted to gap regions. Relative to Wm82v2, the Wm82v4 assembly closed 3,626 gaps and added 5,138,978 bp of sequence, increasing the contig N50 from 233.1 kbp to 419.3 kbp.
Methodology
Genus
Glycine
Scientific Name
Glycine max
Data Source Version
Sequencing 2009
Data Source Name
Williams 82
Data Source URI
Abbreviation
glyma.Wm82
Program, Pipeline, Workflow or Method Name
Schmutz J, et al., Nature 463:178 (2010)
Program Version
2010
Analysis Method

This release was constructed starting from the version 3 release. First, an optical map and synteny with the two other Glycine max assemblies (Lee and Soja) were used to correct the order and orientation of the original scaffolds. Next, a total of 16.5x raw PacBio read coverage on targeted clone regions was used to patch gaps, and ~50x Illumina read coverage was used to polish the new sequence. For full details of the new assembly, see the original source for the genome.
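
The scaffold and contig N50 figures reported in this record can be read with the conventional definitions in mind. The sketch below assumes that the "Scaffold N50 value" (10) is the number of largest scaffolds that together cover at least half of the assembly (often called L50), and that the "Scaffold N50 length" (49.9 Mb) is the length of the shortest scaffold in that set. The function name `n50_stats` and the toy lengths are illustrative only; they are not taken from the Wm82v4 release.

```python
def n50_stats(lengths):
    """Return (n50_length, l50_count) for a list of sequence lengths.

    n50_length: length of the shortest sequence in the smallest set of
                longest sequences covering at least half the total length.
    l50_count:  number of sequences in that set.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy lengths (hypothetical values, not the Wm82v4 scaffolds).
toy_lengths = [50, 40, 30, 20, 10]
print(n50_stats(toy_lengths))  # (40, 2): 50 + 40 = 90 covers half of 150
```

Applied to the actual scaffold lengths of an assembly (for example, read from a FASTA index), the same calculation would be expected to yield a pair comparable to the 49.9 Mb and 10 listed in this record, provided the portal uses these definitions.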

Download
Data Release
LegFed DataStore
Access at CyVerse Data Commons
NCBI Bioproject
PRJNA19861
Original Source
Phytozome 13

This data is publicly available at the above locations. If you use this data in your research, please cite:

Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA. Genome sequence of the palaeopolyploid soybean. Nature. 2010 Jan 14; 463(7278):178-83.