Deep learning framework Apache MXNet and Open Neural Network Exchange (ONNX) today launched the Consortium for Python Data API Standards, a group that wants to make it easier for machine learning practitioners and data scientists to work with data, no matter which framework, library, or tool from the Python ecosystem it came from. ONNX is a group initially formed by Facebook and Microsoft in 2017 to power interoperability between frameworks and tools. Today the group includes nearly 40 organizations influential in AI and data science, like AWS, Baidu, and IBM, as well as hardware makers like Arm, Intel, and Qualcomm.
The group, which will develop standards for dataframes and arrays or tensors, said the consortium is necessary because the data ecosystem has fragmented across many frameworks in recent years.
Other major frameworks and libraries include TensorFlow, PyTorch, and NumPy; Python is also home to dataframe libraries like Pandas, PySpark, and Apache Arrow. PyTorch, one of the most popular machine learning frameworks in use today, is not a part of the consortium, a Facebook company spokesperson told VentureBeat in an interview.
“Currently, array and dataframe libraries all have similar APIs, but with enough differences that using them interchangeably isn’t really possible,” group members said in a blog post today. “We aim to grow this Consortium into an organization where cross-project and cross-ecosystem alignment on APIs, data exchange mechanisms and other such topics happens. These topics require coordination and communication to a much larger extent than they require technical innovation. We aim to facilitate the former, while leaving the innovating to current and future individual libraries.”
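To illustrate the kind of mismatch the group describes, here is a minimal sketch in plain Python. `LibA` and `LibB` are hypothetical stand-ins, not real libraries; they mimic a well-known naming split, in which NumPy joins arrays with `concatenate(..., axis=...)` while PyTorch uses `cat(..., dim=...)`. The `concat` wrapper is likewise an assumption, showing the sort of glue code users write today when they want to switch libraries.

```python
# Toy stand-ins for two array libraries. Both classes are hypothetical and
# exist only to illustrate "similar APIs with enough differences".


class LibA:
    """Hypothetical library using NumPy-style naming."""

    @staticmethod
    def concatenate(arrays, axis=0):
        if axis != 0:
            raise NotImplementedError("toy example supports axis=0 only")
        # Flatten the sequence of 1-D lists into one list.
        return [item for array in arrays for item in array]


class LibB:
    """Hypothetical library using PyTorch-style naming."""

    @staticmethod
    def cat(tensors, dim=0):
        if dim != 0:
            raise NotImplementedError("toy example supports dim=0 only")
        return [item for tensor in tensors for item in tensor]


def concat(lib, parts):
    """Glue code: pick the right function name and keyword for each library."""
    if hasattr(lib, "concatenate"):
        return lib.concatenate(parts, axis=0)
    if hasattr(lib, "cat"):
        return lib.cat(parts, dim=0)
    raise TypeError(f"{lib!r} offers no known concatenation function")


if __name__ == "__main__":
    data = [[1, 2], [3]]
    print(concat(LibA, data))
    print(concat(LibB, data))
```

Both calls do the same job, yet neither library's code runs unchanged against the other. A shared standard would aim to make wrappers like `concat` unnecessary.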
Initial efforts will start with a working group, which will then request feedback from array and dataframe library maintainers and iterate before the first version of the standard is made available for use. The first feedback session begins next month. As part of the launch, the group is releasing tools for comparing array or tensor libraries and for tracking some of the primary functions of a dataframe library.
While AI research dates back to the 1950s, the need to create standards and build infrastructure for benchmark testing, interoperability, and other practical developer needs led to the formation of groups like ONNX. Beyond machine learning, other examples of tech groups formed to create standards include the C++ standards committee and the Open Geospatial Consortium.