Summary by Deloitte with the support of Pablo Gonzalez.
What is the technology about? Big Data and Analytics is about using massive volumes of data of all types, generated by digitization at very high speed, together with advanced machine learning algorithms, flexible data visualization, and automated decision tools.
Evolution of the technology: From data warehouses, to data lakes (e.g. Hadoop), to data pipelines (e.g. Spark), to automated learning
Key metrics: size of data used, number of data sources, time from data-gathering to decision
- From information held in discrete silos to integrated information across the enterprise
- From structured-only information (mostly numbers) to all kinds of information (including natural language)
- From batch data processing with days or weeks of delay to real-time data processing
- From static and clunky reporting to advanced data visualization tools
- From statistical analysis of data to machine learning on data
- From people reading data and then deciding to data driving automated decisions or decision support tools
- Applications of Big Data: Netflix and Amazon product recommenders, demand estimation at Walmart, Google Translate, churn reduction at American Express, network optimization at Sprint.
- Technology of Big Data: Hadoop, Spark, Google Bigtable, Amazon Machine Learning, Google TensorFlow
- SaaS for Big Data: Looker and Tableau for data visualization, Alteryx as a data pipeline, Mixpanel for web analytics
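To make the shift from batch to real-time processing in the list above more concrete, here is a minimal Python sketch. It is an illustration only, not something taken from the original summary: the names RunningMean and batch_mean and the order values are hypothetical. The batch version waits for all the data before computing an average, while the streaming version refreshes its estimate after every event without re-reading history.

```python
from dataclasses import dataclass


@dataclass
class RunningMean:
    """Keeps an average up to date one event at a time (streaming style)."""
    count: int = 0
    mean: float = 0.0

    def update(self, value: float) -> float:
        # Incremental update: past events do not need to be stored or re-read.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


def batch_mean(values: list[float]) -> float:
    """Batch style: collect everything first, then compute once."""
    return sum(values) / len(values)


# Hypothetical order amounts arriving over time (made-up numbers).
orders = [20.0, 35.0, 50.0, 15.0]

stream = RunningMean()
for amount in orders:
    stream.update(amount)  # a fresh estimate is available after every event

# Both approaches agree once all events have been seen.
assert abs(stream.mean - batch_mean(orders)) < 1e-9
```

The design point is that the streaming version can support a decision at any moment, whereas the batch version only produces an answer after the collection period ends. Production systems such as Spark apply the same idea at much larger scale.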
Implications for companies
- Company processes can generate very substantial amounts of data. Is this data captured? Is this data stored? Is this data used?
- Revenues can be improved by using data to sell better, cross-sell and upsell more, and retain customers better.
- Costs and investments can be lowered by using data to spend less and more efficiently.
- New business models can be created based on data monetization.
- The talent and skills needed to work with data become critical.
Industries for which the trend is critical
- All industries generate very large amounts of data, much of which is currently “dark data” that is not captured or used. As industries are digitized, the resulting data exhaust will allow them to optimize those digital processes and monetize the data directly.
Some interesting resources for digging deeper
- Andreessen Horowitz’s Primer on Machine Learning
- Winning with Data, a book on becoming data-driven by Tomasz Tunguz (Redpoint Ventures, ex-Google) and Frank Bien (Looker CEO)
For more information reach out to me at Deloitte