In this analyst perspective, Dave Menninger takes a look at data lakes. He explains the term “data lake,” describes common use cases and shares his views on some of the latest market trends. He explores the relationship between data warehouses and data lakes and shares some of Ventana Research’s findings on the subject. He also provides an assessment of the risks organizations face in working with data lakes and offers recommendations for maximizing the potential of data.
I was recently asked to identify key modern data architecture trends. Data architectures have changed significantly to accommodate larger volumes of data as well as new types of data such as streaming and unstructured data. Here are some of the trends I see continuing to impact data architectures.
Ventana Research recently announced its 2020 research agenda for data, continuing the guidance we’ve offered for nearly two decades to help organizations derive optimal value and improve business outcomes. Data volumes continue to grow while data latency requirements continue to shrink. Meanwhile, virtually every organization is confronting a need for good data governance.
This year, I attended Informatica World 2019, Informatica's annual user conference. The main focus this year was on the cloud with a heavy dose of AI. Under that focus, Informatica's conference emphasized capabilities across six areas (all strong areas for Informatica): data integration, data management, data quality & governance, Master Data Management (MDM), data cataloging, and data security.
The emerging internet of things (IoT) is an extension of digital connectivity to devices and sensors in homes, businesses, vehicles and potentially almost anywhere. This innovation means that virtually any appropriately designed device can generate and transmit data about its operations, which can facilitate monitoring and a range of automatic functions. To do this, IoT requires a set of event-centered information and analytic processes that enable people to use that event information to make optimal decisions and act effectively.
Organizations now must store, process and use data of significantly greater volume and variety than in the past. These factors plus the velocity of data today — the unrelentingly rapid rate at which it is generated, both in enterprise systems and on the internet — add to the challenge of getting the data into a form that can be used for business tasks.
Once again I attended Tableau's Users Conference, along with 17,000 other attendees, affectionately self-referred to as "data nerds". Pushing the envelope in data capabilities and access, Tableau introduced the "Ask Data" feature, allowing users to pose natural language queries and receive a response, along with new data preparation capabilities and other enhancements to help data analysts. Further, Tableau announced new developer enhancements including a new developer program to better align tools built for Tableau with Tableau's interface. For the full breakdown of Tableau User Conference 2018, and my analysis of all the largest announcements, watch my hot take video.
There’s been some speculation in the market that Hadoop may be disappearing. Some of this speculation has been driven by vendors that have recently downplayed Hadoop in their marketing efforts. For example, the Strata+Hadoop World conference is now known as the Strata Data Conference. The Hadoop Summit is now known as the Dataworks Summit. In Cloudera’s S-1 filing with the SEC for its initial public offering, the term “Hadoop” appears only 14 times, while the term “machine learning” appears 83 times. So, if some of the vendors that created the market appear to be pivoting away from Hadoop, does your organization need to do something similar, or is there a role for Hadoop in your IT architecture?