Data lakes are becoming a central part of modern data management, and the shift toward them is already underway. In the data lake workshop, you’ll learn the fundamentals of data lakes and why they have been so widely adopted. A focal point will be understanding the architecture of a data lake and how it compares to traditional databases and other big data solutions.
During the workshop, you’ll build your own data lake from scratch out of its essential components: an object store, a metastore, and a query engine. Using the query engine, you’ll manipulate data in your data lake and gain insight into how the lake is structured and how data flows through it.
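To make the three components concrete, here is a minimal sketch of how they might fit together. It is a toy, not the workshop's actual stack: a temporary local directory stands in for the object store, a plain dictionary stands in for the metastore, and a small scan-and-filter function stands in for the query engine. All names (`put_object`, `register_table`, `query`, the `orders` table) are hypothetical and chosen only for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Object store stand-in: a local directory holding data files.
# Assumption: a real lake would use a storage service rather than local disk.
object_store = Path(tempfile.mkdtemp(prefix="toy_lake_"))


def put_object(key, rows, columns):
    """Write rows as a CSV 'object' under the given key and return its path."""
    path = object_store / key
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(columns)
        writer.writerows(rows)
    return path


# Metastore stand-in: maps a table name to its columns and the prefix of its files.
metastore = {}


def register_table(name, columns, prefix):
    """Record where a table's files live and what columns they contain."""
    metastore[name] = {"columns": columns, "prefix": prefix}


def query(table, where=lambda row: True):
    """Toy query engine: resolve the table via the metastore, scan its files, filter rows."""
    meta = metastore[table]
    results = []
    for path in sorted((object_store / meta["prefix"]).glob("*.csv")):
        with path.open(newline="") as f:
            for row in csv.DictReader(f):
                if where(row):
                    results.append(row)
    return results


# Data flow: files land in the object store, the table is registered in the
# metastore, and only then can the query engine find and read the data.
columns = ["order_id", "country", "amount"]
put_object("orders/part-0001.csv", [[1, "DE", 30], [2, "US", 45]], columns)
put_object("orders/part-0002.csv", [[3, "DE", 12]], columns)
register_table("orders", columns, "orders")

german_orders = query("orders", where=lambda r: r["country"] == "DE")
print([r["order_id"] for r in german_orders])
```

The point of the split is that the data files know nothing about tables: the metastore is what turns a set of files into something a query engine can address by name. Note that values come back as strings here because CSV carries no type information, which is one reason real lakes tend to favor typed file formats.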
The workshop also covers more advanced topics, including Apache Iceberg tables, which support CRUD operations, as well as crucial aspects of data lake management such as security measures and cost controls.
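Updates and deletes are not obvious on top of files that are written once and never modified, so a small illustration may help. The sketch below is loosely modeled on the snapshot idea used by table formats such as Iceberg, but it is not Iceberg's API or on-disk format: `ToyTable` and all of its methods are hypothetical, and everything lives in memory. Each change writes new immutable "files" and commits a new snapshot listing which files are live.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical snapshot-based table: files are immutable, snapshots list the live files."""
    files: dict = field(default_factory=dict)               # file name -> tuple of rows
    snapshots: list = field(default_factory=lambda: [()])   # each snapshot: tuple of file names
    _counter: int = 0

    def _write_file(self, rows):
        """Store rows as a new immutable file and return its name."""
        self._counter += 1
        name = f"data-{self._counter:04d}"
        self.files[name] = tuple(rows)
        return name

    def _commit(self, file_names):
        """Append a new snapshot; earlier snapshots remain readable."""
        self.snapshots.append(tuple(file_names))

    def _rewrite(self, transform):
        """Copy-on-write: replace only the files whose rows actually change."""
        live = []
        for name in self.snapshots[-1]:
            old_rows = self.files[name]
            new_rows = [r for r in (transform(row) for row in old_rows) if r is not None]
            if tuple(new_rows) == old_rows:
                live.append(name)                        # untouched file is reused
            elif new_rows:
                live.append(self._write_file(new_rows))  # changed file is rewritten
            # a file left with no rows is simply dropped from the new snapshot
        self._commit(live)

    def insert(self, rows):                              # Create
        self._commit(self.snapshots[-1] + (self._write_file(rows),))

    def read(self, snapshot_id=-1):                      # Read (optionally an older snapshot)
        return [row for name in self.snapshots[snapshot_id] for row in self.files[name]]

    def update(self, predicate, change):                 # Update
        self._rewrite(lambda row: change(row) if predicate(row) else row)

    def delete(self, predicate):                         # Delete
        self._rewrite(lambda row: None if predicate(row) else row)


table = ToyTable()
table.insert([{"id": 1, "amount": 30}, {"id": 2, "amount": 45}])
table.insert([{"id": 3, "amount": 12}])
table.update(lambda r: r["id"] == 2, lambda r: {**r, "amount": 50})
table.delete(lambda r: r["id"] == 3)

current = table.read()
before_delete = table.read(snapshot_id=-2)
print(current)
print(before_delete)
```

Because old files are never changed in place, reading an earlier snapshot still returns the data as it was at that point, which is the property that makes row-level changes safe alongside concurrent readers in this simplified model.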