Join Michael Armbrust, head of the Delta Lake engineering team, to learn how his team built upon Apache Spark to bring ACID transactions and other data reliability technologies from the data warehouse world to cloud data lakes.
Apache Spark is the dominant processing framework for big data. Delta Lake adds reliability to Spark so your analytics and machine learning initiatives have ready access to quality, reliable data. This webinar covers the use of Delta Lake to enhance data reliability for Spark environments.
Topic areas include:
– The role of Apache Spark in big data processing
– Use of data lakes as an important part of the data architecture
– Data lake reliability challenges
– How Delta Lake helps provide reliable data for Spark processing
– Specific improvements that Delta Lake adds
– The ease of adopting Delta Lake for powering your data lake
See the full Getting Started with Delta Lake tutorial series here:
https://databricks.com/getting-started-with-delta-lake-tutorial-series/