As plant breeding – and agriculture in general – becomes digital and data-driven, a common problem emerges: How can breeders and biologists access and, more importantly, use this data with agility and efficiency to inform their decision making? The wealth of information currently being generated requires solutions that enable interoperability of data sources, connectivity and data reuse among the participating members of the community. The Breeding API (BrAPI) effort, whose main mission is to “enable interoperability among plant breeding databases,” is already making interoperability possible. But does it go far enough?
As the high level of interest in the most recent BrAPI-Hackathon shows, a growing number of developers are embracing the open standard for current and new breeding information management systems. An open standard alleviates the database interoperability problem and lowers the time to market for applications, analytics engines, field data collection tools, and more. Once database interconnectivity is accomplished, the next evolutionary step is to make these interconnected data sources easier to use for community members and stakeholders who might not have programming or informatics expertise. We feel that BrAPI can be used here as well.
BrAPI already allows us to build apps that can be used by the wider community without the need for system-specific connections, but we have an idea to take it one step further.
What if we could combine the BrAPI standard with code generation tools to create limited-scope applications for both single use and rapid prototyping?
What kind of value could that bring to labs and researchers who don’t have access to large development teams? How much time could we cut between idea and execution?
As the breeding informatics team at the Innovation Lab for Crop Improvement, we pose a few critical questions.
What would it look like?
Imagine a webpage where a researcher could quickly select a set of BrAPI endpoints, with the appropriate parameters, to retrieve data. The user would then be given a text box in which to enter R code that analyzes the retrieved data, followed by another prompt to select the BrAPI endpoints to which the results should be uploaded. Once they complete the process, they download a file containing software that does exactly that. After adjusting a few settings, the researcher could run it directly from their computer or deploy it as a Docker image. The file could easily be shared and distributed for others to build upon and enhance.
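To make the workflow described above more concrete, here is a minimal sketch of what the retrieval and analysis halves of such a downloaded file might contain. It is written in Python purely for illustration (the idea above has the researcher writing the calculation in R), and everything named here is an assumption: the server address, the helper names `fetch_all`, `mean_by_germplasm` and `run_pipeline`, and the choice of the observations endpoint. It relies only on the general BrAPI response layout, in which records sit under `result.data` and paging details under `metadata.pagination`. The upload step is left out.

```python
import json
from typing import Callable, Dict, List
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical server root; replace with the address of a real BrAPI server.
BASE_URL = "https://example.org/brapi/v2"


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode its JSON body."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request) as response:
        return json.load(response)


def fetch_all(endpoint: str, params: Dict[str, str],
              get_json: Callable[[str], dict] = http_get_json,
              page_size: int = 1000) -> List[dict]:
    """Collect every record from a paginated BrAPI list endpoint."""
    records: List[dict] = []
    page = 0
    while True:
        query = urlencode({**params, "page": page, "pageSize": page_size})
        body = get_json(f"{BASE_URL}/{endpoint}?{query}")
        records.extend(body["result"]["data"])
        page += 1
        if page >= body["metadata"]["pagination"]["totalPages"]:
            return records


def mean_by_germplasm(observations: List[dict]) -> Dict[str, float]:
    """Stand-in for the researcher's calculation: mean value per germplasm."""
    totals: Dict[str, List[float]] = {}
    for obs in observations:
        totals.setdefault(obs["germplasmDbId"], []).append(float(obs["value"]))
    return {key: sum(values) / len(values) for key, values in totals.items()}


def run_pipeline(endpoint: str, params: Dict[str, str],
                 analyze: Callable[[List[dict]], object],
                 get_json: Callable[[str], dict] = http_get_json) -> object:
    """Retrieve the data, then hand it to the user-supplied analysis."""
    return analyze(fetch_all(endpoint, params, get_json))
```

A researcher would only ever edit the equivalent of `mean_by_germplasm`; the retrieval code around it is the part a generator could produce. Passing `get_json` as an argument also lets the pipeline be exercised against canned responses before it is pointed at a live server.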
How would this tool reduce development time?
For smaller research groups, the largest barrier to developing needed software applications can be supporting a full development team. In many cases, one of the largest demands on development time is laying the groundwork. Though far more limited in scope than a fully functional framework, these single-purpose systems can be created much more quickly, because most of their data transaction and formatting layers are algorithmically generated and entirely reusable. For a researcher, this means the only component left to code is the calculation itself. Even better, that calculation segment would be written in R, which flattens the learning curve for using the tool.
Can it be done?
From a technical standpoint, building such a system seems very doable. The BrAPI standard is stored on GitHub as YAML files and already leverages code generation tools like Swagger Codegen and OpenAPI Generator to rapidly stand up test clients and servers. Both code generation tools are open source and extensible, though they would require some modification to meet our use case. If we can encapsulate this code generation in a set of parameters, it would then be easy to create a web front end with which a researcher can interact. The last step would be to wrap the resulting application in a deployable package and deliver it to the user.
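As a rough illustration of what "encapsulating code generation in a set of parameters" could mean, the sketch below renders a small standalone script from a dictionary of choices, much as a web form might collect them. It uses Python's standard `string.Template` rather than Swagger Codegen or OpenAPI Generator, and the template, the `generate_app` function and the keys of `spec` are all hypothetical; a real service would more likely drive one of those generators from the BrAPI specification files.

```python
from string import Template

# Hypothetical template for a generated single-purpose script.
# For brevity the generated script reads only the first page of results.
APP_TEMPLATE = Template('''\
"""Generated app: $name"""
import json
from urllib.parse import urlencode
from urllib.request import urlopen

URL = "$base_url/$endpoint?" + urlencode($params)


def analyze(records):
$analysis_body


if __name__ == "__main__":
    with urlopen(URL) as response:
        records = json.load(response)["result"]["data"]
    print(analyze(records))
''')


def generate_app(spec: dict) -> str:
    """Render the source code of a small app from a parameter set."""
    # Indent the user's analysis code so it becomes the body of analyze().
    body = "\n".join("    " + line for line in spec["analysis"].splitlines())
    return APP_TEMPLATE.substitute(
        name=spec["name"],
        base_url=spec["base_url"].rstrip("/"),
        endpoint=spec["endpoint"],
        params=repr(spec["params"]),
        analysis_body=body,
    )
```

For example, calling `generate_app` with `{"name": "count-obs", "base_url": "https://example.org/brapi/v2", "endpoint": "observations", "params": {"studyDbId": "s1"}, "analysis": "return len(records)"}` returns the text of a script that could be saved to a file, run locally, or copied into a container image. The server address in that example is a placeholder.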
Has anyone tried to build this system before? The answer is yes; well, sort of. The idea of making the process of creating software applications more approachable is not new; in fact, software that solves a similar type of problem already exists. Tools like Galaxy and BioContainers are part of a constellation of applications designed to make writing and sharing reproducible analysis workflows easier for non-developers across different domains. Our idea, at its core, shares these principles, with the main distinction that we aim to leverage the domain knowledge already captured in the BrAPI specification. Taking advantage of this domain knowledge should, at least in principle, make these breeding-specific applications easier to write.
Concluding thoughts
The “Brapid Apps” produced by our Brapp prototype-as-a-service toolkit could empower breeding programs to customize applications for existing breeding management systems to better suit their needs.
Once the service is operational, there are some novel features that could be built on top of it. A Kubernetes integration could allow a set of these Brapps to be deployed at scale, meeting high-throughput needs. With a diverse set of “Brapid Apps,” an app market could even be created to make collaboration quick and easy.
The Brapid App service is only a dream for now, but one day it could revolutionize the breeding informatics space. Please let us know what you think about it!
Learn more about their work with the Innovation Lab for Crop Improvement breeding informatics team.