Evan Chan is both a data architect and a very hands-on engineer: he can build a full data stack from scratch, but he also loves to hand-tune (and has hand-tuned) super fast algorithms, custom data structures, and columnar compression engines. He can build the next database, the next ML system, or the next Kafka. Currently, Evan works at Conviva, Inc. as a Principal Software Engineer.
Evan has been a distributed systems / data / software engineer for twenty years. He led a team developing FiloDB, an open-source (github.com/filodb/FiloDB) distributed time series database that can process a million records per second per node and simultaneously answer a large number of concurrent queries per second. He has architected, developed, and productionized large-scale data and telemetry systems at companies including Apple, and loves solving the most challenging technical problems at both large and small scales, from advanced custom data structures to distributed coordination. He is an expert in bleeding-edge JVM, Java, Scala, and Rust performance. Current interests include Rust and columnar compression. He has led the design and implementation of multiple big data platforms based on Apache Storm, Spark, Kafka, Cassandra, and Scala/Akka. He has been an active contributor to the Apache Spark project, and is a two-time DataStax Cassandra MVP.
Lessons Learned: Porting a Streaming Data Pipeline from Scala to Rust.
Conviva runs one of the world’s busiest real-time streaming data analytics pipelines on a platform of Scala and Akka. We have been porting this platform to Rust, and would like to share our experience and lessons learned during this port for a mission-critical streaming data system. Why Rust? What are the unexpected surprises, joys, and pain points?
- Why the data world is moving to native languages
- Top lessons for Scala and C++ practitioners as they learn Rust
- Async programming in Rust: pain points, lessons
- Translating data structures and patterns to Rust from Scala
- Graphs, complex structures, ownership, and Arenas
- Boxing, dynamic vs static dispatch, and performance