Looking back at Big Data management in 2016

SteveSarsfield

I’ve been an interested follower of the data management space for some years, and 2016 brought some amazing changes.  I know many bloggers in this space are looking ahead and trying to prognosticate about the future of big data, but before we do that, let’s reminisce about some of the noteworthy things that happened this year.

Hadoop’s hype waned

In the early days, Hadoop was marketed as the future of data management. Although it still may be, I think many companies were over-promised on Hadoop’s capabilities with regard to performing advanced, concurrent big data analytics.  In a recent benchmark, we learned that you can’t yet replace your data warehouse with Hadoop. Even so, some companies have been in proof-of-concept for two or more years, sustained by the hope that whatever capability they wanted would come in a future version.

There’s no doubt that some value does derive from the data lake, but it can’t provide everything to all people. Companies began to realize this year that the hope peddled by some vendors was not a strategy. More than ever, companies want a more complete architecture, one that may require more than one solution to keep up with the demands of their business intelligence and data science needs.

Hope turned to Spark

In recent news, both users and investors have placed much hope in Spark. However, it is still a fairly immature technology.  Spark requires large amounts of memory, and users still run into memory issues when the data and analytics get big. In our testing, we found that although it was powerful and fast, Spark SQL doesn’t yet necessarily have all the analytic functions you may need to perform advanced analytics.

What’s cool about Spark is the renewed hope that it brings to open source.  Time will tell whether this new technology will be a big player in enterprise architecture.

Real value was derived from Big Data

Looking back at 2016, data scientists made very good use of big data and derived real value from it.  It came out of the sandbox and into production. Researchers started to tap into the human genome, companies began to understand themselves and their customers better, application developers developed more compelling games, infrastructure was made more secure with detailed monitoring, users set up dashboards with a trillion rows of data and even traditional companies transformed their business with deeper analytics.  The case studies are both far-reaching and varied in their extraction of value from big data.

Data warehouse became cool again

Over the years, it has been interesting to see the perceived rise and fall in the importance of the data warehouse as a strategic initiative. In my book, it has always been strong, but we were swayed by promises that other technologies would replace the data warehouse.  Today, I see a resurgence. I regularly attend TDWI events and I’ve seen both attendance and interest increase.  If you’re young and looking for a long-term position, get ready to step in for the generation of data warehouse professionals who are getting ready to retire and need an apprentice to learn the ropes.

The cloud soared high

You can’t beat the convenience of going to a web site, clicking on a button and spinning up a database.  More business users than ever circumvented their IT departments this year to spin up their own analytics clouds. Companies were also thinking about capacity this year. Rather than holding extra capacity for on-premises databases, the end-of-quarter calculation could be spun up in the cloud. For quick, convenient analytics, the cloud soared this year.

The rise of Kafka

This year, Kafka became better known as middleware than as a literary talent. Kafka is an increasingly popular platform that can provide a unified, scalable data backbone capable of distributing data among supported applications. It’s a popular way to share data amongst the many producers and consumers of data that are probably in your organization. Its rise signals that a new approach to data management is taking hold.  Companies are using multiple platforms, including Hadoop, Spark, Vertica and others, to achieve what they require while also keeping costs low.
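
If the producer and consumer idea feels abstract, here is a toy sketch in plain Python. It is not the Kafka client API, and it assumes nothing about how Kafka is implemented; the class TopicLog, its methods and the consumer names are all hypothetical, made up purely for illustration. It only shows the general pattern: producers append records to a shared log, and each consumer reads at its own pace by tracking its own position.

```python
from collections import defaultdict


class TopicLog:
    """Toy stand-in for a single topic: an append-only list of records.

    Hypothetical illustration only; this is not how a real Kafka client is used.
    """

    def __init__(self):
        self._records = []
        # Each consumer name maps to the offset of the next record it will read.
        self._offsets = defaultdict(int)

    def produce(self, record):
        """Append a record and return the offset it was stored at."""
        self._records.append(record)
        return len(self._records) - 1

    def consume(self, consumer, max_records=10):
        """Return this consumer's next unread records and advance its offset."""
        start = self._offsets[consumer]
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


# One producer writes click events; two independent consumers read them.
clicks = TopicLog()
clicks.produce({"user": "a", "page": "/home"})
clicks.produce({"user": "b", "page": "/pricing"})

warehouse_batch = clicks.consume("warehouse_loader")
dashboard_batch = clicks.consume("dashboard")

# A later event is seen by each consumer the next time it asks.
clicks.produce({"user": "a", "page": "/signup"})
warehouse_next = clicks.consume("warehouse_loader")
```

Both consumers receive the same first two events, and neither one affects what the other sees, which is roughly why a shared log works as a backbone between many platforms.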

What do you think?

Of course, we barely scratched the surface of our data management evolution in 2016. There’s so much more. However, I’d love to hear what you think was most significant in data management this year.  Comment below.

About the Author

SteveSarsfield

Steve Sarsfield is a product evangelist and spokesperson in HPE’s Big Data Software Business unit. He is also a big data enthusiast and author of the book, The Data Governance Imperative. Steve has many years of experience in big analytics, information quality, big data and data governance.