Glycine max: Williams 82 Genome Assembly v4.0

Citation
Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA. Genome sequence of the palaeopolyploid soybean. Nature. 2010 Jan 14; 463(7278):178-83.
Overview
Date Released: January 2010
Data Source
Source Name: Williams 82
Source URI: https://www.ncbi.nlm.nih.gov/biosample/SAMN00002965
Statistics
Assembly Genome Size: 978.4 Mb
Number Of Scaffolds: 282
Scaffold N50 value: 10
Scaffold N50 length: 49.9 Mb
Number Of Genes: 52,872
The Williams 82 version 4 assembly (Wm82v4) builds on the widely used assembly version 2, as well as an incremental version 3 from 2016 that incorporated BAC sequence to fill contig gaps. The Wm82v2 assembly was primarily Sanger-based, and the new gap-filling in v3 and v4 used PacBio-based BAC assemblies targeted to gap regions. Relative to Wm82v2, the Wm82v4 assembly closed 3,626 gaps and added 5,138,978 bp of sequence, increasing the contig N50 from 233.1 kbp to 419.3 kbp.
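The N50 figures quoted above can be reproduced from a list of sequence lengths. The following is a minimal sketch, assuming the usual definitions: the N50 length is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly, and the N50 value is the number of sequences in that set. The function name `n50_stats` and the toy lengths are illustrative only and are not taken from the Wm82v4 data.

```python
def n50_stats(lengths):
    """Return (N50 length, N50 count) for a list of sequence lengths.

    Assumed definitions: sort sequences longest first and accumulate
    lengths until at least half of the total is covered. The length of
    the last sequence added is the N50 length; the number of sequences
    added is the N50 count (the "N50 value" in the statistics above).
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy example (hypothetical lengths, not real scaffold sizes).
toy_lengths = [50, 40, 30, 20, 10]
n50_length, n50_count = n50_stats(toy_lengths)
print(n50_length, n50_count)  # prints: 40 2
```

Read this way, a scaffold N50 value of 10 with an N50 length of 49.9 Mb would mean that the 10 longest scaffolds cover at least half of the assembled sequence, and the shortest of those 10 is 49.9 Mb long.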
Assembly Details
Workflow/Pipeline Name: Schmutz J, et al., Nature 463:178 (2010)
Algorithm: Arachne v.20071016
Program Version: 2010
This release was constructed starting from the version 3 release. First, an optical map and synteny with the two other Glycine max assemblies (Lee and Soja) were used to correct the order and orientation of the original scaffolds. Next, a total of 16.5x raw PacBio read coverage on targeted clone regions was used to patch gaps, and ~50x Illumina read coverage was used to polish the new sequence. For full details of the new assembly, see the original source for the genome.
Download
Sources
LegFed DataStore
NCBI Bioproject
Original Source

This data is publicly available at the above locations. If you use this data in your research, please cite:

Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA. Genome sequence of the palaeopolyploid soybean. Nature. 2010 Jan 14; 463(7278):178-83.