Building (data) lake houses

22 September 2021

Most people can only dream of one day owning a lake house: a cozy cottage by a vast lake, a place to fish or simply to sail your boat. However, this article is not about that type of lake house. It is about digital lake houses, which once seemed like a dream but have now come into existence.

The term 'lake house' is a portmanteau of 'data warehouse' and 'data lake'. Both are forms of data storage (Databricks, 2021). A data warehouse holds well-structured historical corporate data, such as business sales over the past few years. This data can be used to analyze past performance and to estimate and evaluate future changes, through so-called business intelligence, or BI. A data lake holds, in essence, the rest of the data: unstructured or semi-structured raw data that is cheaper to store.

This data management architecture is still in its infancy but is fast gaining ground in the software industry. Databricks, the company behind the design of data lake houses, is one of the fastest-growing software companies worldwide. It is expected to launch the largest IPO of a software firm ever, at a valuation of around 36 billion US dollars according to Forbes (2021).

Databricks offers its customers an open-source AI platform to which BI is added, increasing efficiency and simplifying business analytics. You could see it as a data lake with a warehouse layer built on top. This warehouse layer enforces quality control and gives BI tools a reliable basis to work from, so the data can be used to report on future business changes. Because AI and BI are combined, the data can be unstructured, semi-structured, or fully structured and still be useful for business decision-making.
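To make the idea of a warehouse layer on top of a data lake more concrete, here is a minimal sketch in plain Python. It is not how Databricks implements its platform; the record fields and the names `RAW_LAKE`, `warehouse_layer`, and `sales_by_region` are hypothetical and chosen only for illustration. The lake keeps every record as it arrived, while the warehouse layer passes on only the records that fit an assumed schema, and a BI-style report is built from that curated view.

```python
from collections import defaultdict

# Hypothetical raw "lake" records: loosely structured and of mixed quality.
RAW_LAKE = [
    {"region": "EU", "year": 2020, "sales": 120.0},
    {"region": "EU", "year": 2021, "sales": "150"},  # number stored as text
    {"region": "US", "year": 2021, "sales": 200.0},
    {"region": "US", "year": 2021},                   # missing sales figure
    {"note": "free-text log line, no schema"},       # unstructured leftover
]


def warehouse_layer(records):
    """Quality control: return only records that fit the assumed schema."""
    clean = []
    for rec in records:
        try:
            clean.append({
                "region": str(rec["region"]),
                "year": int(rec["year"]),
                "sales": float(rec["sales"]),
            })
        except (KeyError, TypeError, ValueError):
            # The record stays in the lake; it is just not exposed to BI.
            continue
    return clean


def sales_by_region(clean_records):
    """A BI-style report: total sales per region."""
    totals = defaultdict(float)
    for rec in clean_records:
        totals[rec["region"]] += rec["sales"]
    return dict(totals)


curated = warehouse_layer(RAW_LAKE)
report = sales_by_region(curated)
print(report)
```

In this sketch the raw list is never modified, which mirrors the point that a single store can serve both purposes: the rejected records remain available for other uses, while reporting only sees data that passed the quality check.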

Advantages include lower costs, as only one data platform has to be maintained, and reduced complexity, as employees can draw from a single data platform (Xplenty, 2021).

A disadvantage, however, is that data lake houses are still at an early stage of development. The technology is therefore not yet mature and still lacks the capacity to meet all the needs of large corporate firms.

It will be an interesting development to follow in the years to come, as data lake houses mature and are adopted by large multinationals. If that happens, it should improve BI and its use in the business decision-making process.
