Ray Use Cases

Graphs at scale with Ray, for AI in Manufacturing

Ray Summit 2022

Graph models provide the best representation for data in many use cases in manufacturing, both continuous and discrete, plus closely related business verticals such as pharma. This drives a growing demand for graph technologies applied in this domain. The core concept of a bill of materials (BOM), linked to product data, supplier networks, production planning, and inventory data, and in turn to customer and sales data, translates almost naturally into graphs. Meanwhile, much of the relevant input for AI in manufacturing is stored in documents, spreadsheets, or legacy data silos. NLP and deep learning applications help prepare such data for integration into graph models.

The graph space has transformed over the past few years: graph neural networks provide exciting new capabilities, GPU-accelerated graph visualizations break through previous barriers, and algorithms from mathematical graph theory can be applied at scales that were hardly feasible before. There are also popular graph query languages, W3C standards for ontologies, axiomatic inference, and validation, and probabilistic graphs. Unfortunately, these camps within the graph space remain largely disjoint. An open source project, "kglab", provides integration paths for different kinds of graph work while aligning with PyData tools. Manufacturing firms including Siemens, Bosch, and BASF have begun using this library. A follow-up project leverages Ray to provide graphs at billion-node scale, for horizontal scale-out of graph compute in large industrial use cases.

This talk explores use cases for AI in Manufacturing, discussing where Ray can address critical bottlenecks at scale and help augment work with NLP and deep learning. We'll consider the roles that Ray could play in making graph technologies more effective. Some points are counter-intuitive: for example, GNNs are quite useful; however, the hard problems requiring graph algorithms at scale are often in data preparation (resolving ambiguity, detecting unwanted cycles, handling gaps and errors), which must occur long before training any GNN models. While there are numerous graph database vendors, few can handle scale, and their enterprise licensing costs often exceed the cloud computing costs for large clusters. Ray on K8s can be used to break through crucial limitations, allowing for open source integrations for managing large graphs within secure enterprise environments.
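
To make the data preparation point concrete, here is a minimal sketch of one of the checks mentioned above: detecting an unwanted cycle in a bill of materials represented as a directed graph. It is an illustration only and is not taken from the talk or from kglab. The part names and the `find_cycle` helper are hypothetical, and the code runs on a single machine using only the Python standard library. At the scale the talk describes, the graph would need to be partitioned and the work distributed, for example with Ray, which is not shown here.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one cycle as a list of nodes (start node repeated at the end),
    or None if the directed graph is acyclic.

    `edges` is an iterable of (parent, child) pairs, e.g. "assembly contains part".
    """
    graph = defaultdict(list)
    for parent, child in edges:
        graph[parent].append(child)

    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = defaultdict(int)       # every node starts as WHITE (0)

    for root in list(graph):
        if color[root] != WHITE:
            continue
        # Iterative depth-first search, so deep BOMs do not hit the recursion limit.
        color[root] = GRAY
        path = [root]
        stack = [(root, iter(graph[root]))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if color[child] == GRAY:
                    # Back edge to a node on the current path: a cycle.
                    return path[path.index(child):] + [child]
                if color[child] == WHITE:
                    color[child] = GRAY
                    path.append(child)
                    stack.append((child, iter(graph.get(child, []))))
                    break
            else:
                # All children explored: this node is finished.
                color[node] = BLACK
                path.pop()
                stack.pop()
    return None


# Hypothetical BOM: a bike contains a frame and a wheel; a wheel contains a hub and spokes.
clean_bom = [("bike", "frame"), ("bike", "wheel"), ("wheel", "hub"), ("wheel", "spoke")]

# The same BOM with a data entry error: the hub is recorded as containing the wheel.
bad_bom = clean_bom + [("hub", "wheel")]

clean_result = find_cycle(clean_bom)
bad_result = find_cycle(bad_bom)
```

In this sketch `clean_result` is `None`, while `bad_result` names the offending loop so that the source records can be corrected before any model training.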

About Paco

Known as a "player/coach," with core expertise in graph technologies, natural language, data science, cloud computing; ~40 years tech industry experience, ranging from Bell Labs to early-stage start-ups. Board member for Recognai; Advisor for Amplify Partners, Data Spartan, KUNGFU.AI. Lead committer on PyTextRank, kglab. Formerly: Director, Community Evangelism for Apache Spark at Databricks.

Paco Nathan

Managing Partner, Derwen, Inc.
