EVOLVES Activity 4: Breeder-friendly informatics

Description

Enhancing the functionality of our KnowPulse web portal to facilitate decision making in pulse crop breeding and genetics

Data management, analysis, and communication are just as critical as data generation, especially in the world of big data. KnowPulse (http://knowpulse.usask.ca) is a publicly searchable legume web portal housing germplasm information (such as passport data and pedigrees) and associated genomic, genotypic and phenotypic data for a number of species, including cultivated and wild lentil. It is a relational database that allows users to query genotypic and phenotypic data in many different ways. This web portal currently contains data from previous lentil genomics projects, including SNP data across a range of genotypes and phenotypic data from lentil genetic and genomic research conducted at USask, including under AGILE. We started to develop KnowPulse in 2009 using Tripal, an open-source, biology-focused platform based on the popular Drupal content management system. The USask team has co-developed Tripal from its infancy to the thriving open-source software project it is today, with over 125 biological databases and an active community worldwide. As such, Tripal allows us to leverage many existing tools and visualizations developed by the community, while simultaneously providing a platform to share the tools we develop. KnowPulse is currently a leader within the Tripal community in support of large-scale variation datasets (e.g. phenotypic data support for AGILE) and provides multiple tools for data upload, download, searching, and simple visualization for both breeding and research. We work closely with other legume data support projects such as the Legume Federation and the Legume Information System.

With a greater emphasis on data generation for wild species, there is a need to extend our tools to provide increased cross-species comparison and tracking capabilities, as well as to further explore the structural variation landscape in the genus Lens. SynVisio is a powerful open-source tool for exploring structural variation and comparing genomes among species. Through collaboration with C Gutwin and the SynVisio development team (Computer Science, USask), we will integrate SynVisio with Tripal to aid researchers in investigating large-scale chromosomal rearrangements (e.g. translocations and inversions, Activity 1a) and in comparative mapping with other legumes (Activity 1b). Full integration of such tools with KnowPulse will encourage user uptake by removing the hurdles of software installation, configuration and data input. Furthermore, we will use the Tripal API to make this tool available to the wider Tripal community, including a number of other legume databases. In addition to integration, we will also collaborate with C Gutwin to extend SynVisio to support gene and haplotype tracks for each genome displayed. These tracks will aid us in exploring copy number variants and provide context while investigating structural variation (Activity 1).

KnowPulse is committed to ensuring FAIR data principles are met. For example, data are interoperable through shared identifiers and relationships within a common GMOD Chado schema. To maximize findability, we focus on tool integration to provide a consistent and intuitive user environment (e.g. SynVisio and JBrowse). Furthermore, exposing our data through standards-compliant web services ensures data findability for power users and for additional tool integration. Through Tripal Content web services, KnowPulse is already exposing data to other Tripal databases, including those in the legume community. As part of EVOLVES, we will also implement the BrAPI specification using Tripal BrAPI. Tripal BrAPI currently exposes germplasm and study information, and we will extend it to support both genotypic and phenotypic data in the BrAPI specification. Having these data available through BrAPI-compliant web services opens opportunities for communication with a wide array of tools, including, but not limited to, the James Hutton Suite of tools (e.g. Flapjack, Helium), BrAPPs (e.g. BrAPI-R, Comparative map viewer), GOBii tools and the Integrated Breeding Platform Breeding Management System (IBP-BMS). Due to the rapidly changing visualization tool landscape, we will further evaluate software options throughout the research process to ensure we are using the best tools to visualize variation in lentil.
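As a rough illustration of how an external tool might consume BrAPI-compliant germplasm data, the following minimal Python sketch builds a BrAPI v2 germplasm query and reads the standard response envelope (metadata.pagination plus result.data). The base URL, the helper names build_germplasm_url and parse_brapi_page, and the sample records are assumptions made for illustration only; they do not describe the actual KnowPulse or Tripal BrAPI deployment.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL: the real KnowPulse BrAPI endpoint is not given here.
BASE_URL = "https://example.org/brapi/v2"


def build_germplasm_url(base_url, name=None, page=0, page_size=100):
    """Build a BrAPI v2 germplasm query URL with paging parameters."""
    params = {"page": page, "pageSize": page_size}
    if name:
        params["germplasmName"] = name
    return f"{base_url.rstrip('/')}/germplasm?{urlencode(params)}"


def parse_brapi_page(payload):
    """Split a BrAPI response envelope into (records, has_more_pages)."""
    pagination = payload["metadata"]["pagination"]
    records = payload["result"]["data"]
    # BrAPI pages are zero-indexed, so the last page is totalPages - 1.
    has_more = pagination["currentPage"] + 1 < pagination["totalPages"]
    return records, has_more


# Made-up sample response in the BrAPI envelope shape (not real KnowPulse data).
SAMPLE = json.loads("""
{
  "metadata": {
    "pagination": {"currentPage": 0, "pageSize": 2, "totalCount": 3, "totalPages": 2},
    "status": [],
    "datafiles": []
  },
  "result": {
    "data": [
      {"germplasmDbId": "1", "germplasmName": "Example line A"},
      {"germplasmDbId": "2", "germplasmName": "Example line B"}
    ]
  }
}
""")

if __name__ == "__main__":
    print(build_germplasm_url(BASE_URL, name="Example line A", page_size=2))
    records, has_more = parse_brapi_page(SAMPLE)
    for record in records:
        print(record["germplasmDbId"], record["germplasmName"])
    print("more pages:", has_more)
```

A client would fetch the URL returned by build_germplasm_url, pass the decoded JSON to parse_brapi_page, and keep requesting the next page while has_more is true. Because every BrAPI server returns the same envelope, the same parsing code can in principle be pointed at any compliant database.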

Deliverables
  • Integration of external tools (e.g. SynVisio, CViTjs, Flapjack) with KnowPulse through Tripal

  • Development of haplotype display through KnowPulse (i.e. Genotype Matrix, JBrowse and SynVisio)

  • BrAPI-compliant web-services for data sharing with both users and tools to promote FAIR data principles

Funding Grant
Title
EVOLVES: Enhancing the Value of Lentil Variation for Ecosystem Survival
Data Custodian
  • Kirstin E Bett
  • Albert Vandenberg
Research Organization
Funding Range

2019-2023