This repo contains information on the Delta Lakehouse design pattern. The Delta Lakehouse is a pattern for creating repositories for raw data in a variety of formats while providing data reliability and fast analytics.

The general structure of a Delta Lakehouse consists of the Data Storage plane, the Data Ingestion plane, and the Data Processing & Presentation plane. Below is a diagram of a typical Delta Lakehouse architecture within the Azure cloud.

[Diagram: typical Delta Lakehouse architecture within the Azure cloud]

In this architecture, the three planes are represented by the following resources: Azure Data Lake Gen 2 Storage Accounts, one for each data quality zone (Data Storage); Azure Synapse Pipelines (Data Ingestion); and Azure Databricks (Data Processing & Presentation).

The general flow of data is as follows. First, data is ingested from source systems into the Bronze Storage Account using Azure Synapse Pipelines Data Copy Activities. The data is then refined from Bronze to Silver, and from Silver to Gold, using Azure Databricks Jobs, which automate the execution of one or more Databricks Spark notebooks. The notebooks run as an Azure Active Directory Service Principal that has access to the Storage Accounts through membership in an Azure AD Group, which in turn has been granted access to the Storage Account Containers via ACL binding. Finally, the data is exposed from the Gold layer to Reporting and Analytic consumers via the Databricks Spark SQL APIs, which process requests through a dedicated data presentation Cluster in Databricks.

What are the origins of the Lakehouse concept?

To better understand what a Delta Lakehouse is, we should first review the two enterprise data storage patterns that came before it: the Data Warehouse and the Data Lake.

The Data Warehouse

The Data Warehouse was the first enterprise data storage pattern to gain dominance in the data architecture world. It was familiar, normally using a SQL type of database governed by a managed and structured data model. The data model was normally implemented using the tenets of the Star Schema methodology. A key characteristic was disciplined usage, which enabled using the Data Warehouse for Reporting and Auditing.

The Data Lake

As organizations began to realize the value of data in the internet age, the need to capture and store all types of data arose. Data has been called 'The Oil' of the 21st century, and organizations knew they needed to collect as much raw data as possible, expecting to figure out a way to refine it later. The Data Lake pattern emerged as an alternative to the structure and discipline required by the Data Warehouse pattern and allowed organizations to collect terabytes of raw data in many different formats.

The Benefits and Drawbacks of Each Pattern
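As a supplementary illustration, the Bronze to Silver to Gold refinement flow described in the architecture overview can be sketched in plain Python. This is only a minimal stand-in: in the architecture described here the work would run in Databricks Spark notebooks against the zone Storage Accounts, not in local Python. The sample records, the function names `bronze_to_silver` and `silver_to_gold`, and the specific quality rules (drop malformed rows, normalize text, deduplicate, then aggregate) are all assumptions chosen for illustration and are not taken from the repo.

```python
from collections import defaultdict

# Hypothetical raw records as they might land in the Bronze zone:
# everything is a string, and duplicates or malformed rows are allowed.
bronze_rows = [
    {"order_id": "1", "region": "east", "amount": "10.50"},
    {"order_id": "2", "region": "WEST ", "amount": "4.00"},
    {"order_id": "2", "region": "WEST ", "amount": "4.00"},         # duplicate
    {"order_id": "3", "region": "east", "amount": "not-a-number"},  # malformed
]


def bronze_to_silver(rows):
    """Assumed Silver rules: validate, normalize, and deduplicate raw rows."""
    seen_ids = set()
    silver = []
    for row in rows:
        order_id = row.get("order_id")
        region = row.get("region")
        try:
            amount = float(row["amount"])
        except (KeyError, ValueError):
            continue  # skip rows whose amount is missing or unparseable
        if order_id is None or region is None or order_id in seen_ids:
            continue  # skip incomplete rows and repeated order ids
        seen_ids.add(order_id)
        silver.append(
            {
                "order_id": order_id,
                "region": region.strip().lower(),
                "amount": amount,
            }
        )
    return silver


def silver_to_gold(rows):
    """Assumed Gold rule: aggregate cleaned rows into totals per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver_rows = bronze_to_silver(bronze_rows)
gold_totals = silver_to_gold(silver_rows)
print(gold_totals)
```

Keeping each refinement step as a separate function mirrors the idea of one notebook per zone transition, so that each step could be scheduled and rerun independently by a job.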