Data and AI at Scale By the Bay. Description. Examples.

Embarking on a fresh chapter of our series "Meanwhile, We Continue to Cover our Tracks"

Up next: the powerful tandem of Data & AI. Guided by our theme, "Code and Data in the Age of AI", we're sharing insights & best practices from past editions, offering a tantalizing preview of this year's journey.

Submit a talk! ONLY 1 DAY TO GO!

Always dreamt of AI that gorges on all the Data it can get its hands on? Welcome home!

We've always looked at AI fed by all the data you could give it, and consider it a part of distributed systems and data pipelines. We're continuing to embed the Python stack in distributed systems and data pipelines, but taking it a step further.

Get ready for new ways to achieve scale. Thanks to LLM programming, all programming paradigms and frameworks for big and small data are back in action.

Consider this track as the grand finale of software engineering, diving into all possible ways to program with data and AI.

Chris Fregly: "Continuous ML Applications in Production"

Chris introduces continuous "VOTE" techniques (Validation, Optimizing, Training, Explainability) to improve machine learning pipelines using the open-source tool PipelineAI.

The talk matters to us because it addresses a significant gap in real-world enterprises: the ability to adapt and optimize AI models in real time.

Anima Anandkumar: "Next-generation frameworks for Large-scale Machine Learning"

Anima showed how simple gradient compression (SignSGD) leads to communication savings while preserving accuracy, and explained that with better algorithmic design, “free lunches” and better efficiency in ML are possible.

The talk addresses current challenges in the field of deep learning, such as the need for larger datasets and models, and the demand for more computing infrastructure.
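To make the gradient compression idea concrete, here is a minimal sketch of sign-based compression with a majority vote across workers, in plain Python. It is not code from the talk: the function names, the toy objective f(x) = 0.5 * ||x||^2, the noise level, and the learning rate are all assumptions chosen for illustration. Each worker sends only the sign of every gradient coordinate (roughly 1 bit instead of a 32-bit float), and the server takes a per-coordinate vote.

```python
import random


def sign(x):
    """Return 1, -1, or 0 depending on the sign of x."""
    return (x > 0) - (x < 0)


def compress(grad):
    """Keep only the sign of each gradient coordinate (about 1 bit each)."""
    return [sign(g) for g in grad]


def majority_vote(sign_vectors):
    """Aggregate the workers' sign vectors coordinate by coordinate."""
    return [sign(sum(column)) for column in zip(*sign_vectors)]


def signsgd_step(params, worker_grads, lr):
    """One update: every coordinate moves by exactly lr against the voted sign."""
    votes = majority_vote([compress(g) for g in worker_grads])
    return [p - lr * v for p, v in zip(params, votes)]


def noisy_grad(params, rng, noise=0.5):
    """Hypothetical worker gradient of f(x) = 0.5 * ||x||^2 (which is x) plus noise."""
    return [p + rng.gauss(0.0, noise) for p in params]


def run(steps=200, workers=5, lr=0.05, seed=0):
    """Toy training loop: several noisy workers, sign-compressed updates."""
    rng = random.Random(seed)
    params = [3.0, -2.0, 1.5]
    for _ in range(steps):
        grads = [noisy_grad(params, rng) for _ in range(workers)]
        params = signsgd_step(params, grads, lr)
    return params


if __name__ == "__main__":
    print(run())
```

In this toy setup the parameters drift toward the minimum at zero even though the server never sees a gradient magnitude, which is the intuition behind the communication savings. Whether accuracy is preserved on a real model depends on the model, the data, and the tuning.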

Antje Barth: "Put Your Machine Learning on Autopilot"

Antje Barth focuses on the concept of Automated Machine Learning (AutoML), live-demoing how it can automate tasks like data analysis, feature engineering, model training, and tuning in the ML workflow.

The significance of the talk lies in AutoML's potential to save time and effort. It can accelerate AI adoption and make the technology more accessible to a broader audience.

Adam Gibson: "Deploying and serving hardware-optimized ML pipelines using GraalVM"

The talk combines GraalVM with the Eclipse Deeplearning4j framework to create and deploy ML pipelines, packaging Python scripts and hardware-optimized ML pipelines into one binary.

Hardcore optimizations down to hardware - can't get enough of it!

Adi Polak: "Rethinking scalable machine learning with Spark ecosystem"

Adi guides us through a typical ML workflow powered by Spark: data ingestion, data cleanup, feature extraction, model training, model serving, and scoring.

Apache Spark is a widely used tool for large-scale data processing and machine learning, making this talk highly relevant for many attendees.

Alexander O'Connor: "Transformers End-to-End: Experience training and deploying large language models in production"

Alexander reports his team's experience of leveraging LLMs for the kinds of problems they, and many other data science teams, face in customer support, ecommerce, and more. Transformers and large language models are a significant trend in the field of AI and ML, one we could not miss.

Alon Gubkin: "Building an ML Platform from Scratch"