Although some organisations have successfully implemented data lakes that mix old and new data, others have encountered problems. And some will undoubtedly run into trouble in the future. The truth is that, while it may be intuitively appealing to store all data in one place, in practice data is as heterogeneous as its uses.
Many data lakes are now causing the very problems they were designed to solve. All too often, stored data is unfit for purpose, lacks standardisation or is simply duplicated. But the problem is worse than that: the rate at which new sources are introduced is outstripping the ability to store and manage the data.
Consequently, costs are increasing and many organisations are struggling to extract real business value from the vast amounts of data they store.
While some data is structured and stored for a specific purpose – such as regulatory compliance – much more is unstructured and simply hoarded in the belief that it will prove valuable some day. Against this backdrop, it is all too easy for a data lake to turn into a data swamp that has significant overheads but serves little purpose.