Rapid aggregation of data—and efficient data analysis—are key to next-generation innovation across a range of companies leveraging Analytics and Big Data.
While attending the Gigaom Structure Data conference in New York (March 19-20, 2014) along with the SanDisk® team, I was amazed to see company after company take the stage and describe their “aha” moments: the point at which they realized that the data they had already collected could be the source of new business ideas, transforming their business models and forging new business plans.
Data scientists – those who focus on Big Data, and apply Analytics to find the underlying trends in the data – are key employees these days. That came across clearly during the conference’s presentations. They’re the ones who bridge the gap between the numbers-heavy analysis and the business units that will leverage that data to evolve the business itself.
In this blog post, I’m listing a few examples of analytics initiatives that are taking center stage in brand-name companies that shared their key learnings at the conference. You can watch an archived video of the livestream of each example, as described in the main-tent presentations, at the links below.
- Amazon Web Services: Ingesting lots of small datasets proved to be a challenge for Amazon Web Services, explained Ryan Waite, General Manager of Data Services at AWS, during his presentation. To address this, AWS built Kinesis, a new data service it launched four months ago that helps AWS customers (such as those in the online gaming and online advertising industries) gain near-real-time access to AWS’ metering and monitoring services. This gives them faster feedback on their own use of AWS services. Kinesis captures many small bits of data that, combined, add up to multiple terabytes per hour from hundreds of thousands of sources. A “partition key” distributes the data across “shards” that build up the database for Kinesis. The data is stored in Kinesis for 24 hours. Importantly, it’s an elastic service that scales up, as needed, by adding more data resources within AWS’ data center infrastructure.
Watch the presentation on Gigaom.com
- Ford Motor Co.: Ford leverages deep data analysis to inform all aspects of the company’s business, from sales to marketing to customer service and product design. At Gigaom Structure Data, Michael Cavaretta, Data Scientist and Manager of Ford’s Predictive Analytics Group, described the group’s work in Machine Learning, Artificial Intelligence, Data Mining, Text Mining and Information Retrieval. All of this is being done to make business processes data-driven. He spoke about the ways that analytics and data guide business decisions. When distilling insights from very complex mathematics and algorithms, it’s really important to have “good visualization and storytelling” to convey the Big Data/Analytics learnings to Ford’s business units, Cavaretta said. For example, an inventory management system allowed dealerships to optimize the supply of cars on their lot, based on information about customer requests for specific vehicles and car features. Analytics is playing a role in manufacturing, too. The ability of Big Data to reduce costs is a key point of focus at Ford, he said, adding that the ability to search data from multiple sources, including the Internet of Things, could improve manufacturing processes, helping to refine them through intelligent data-based learning.
Watch the presentation on Gigaom.com
- A panel of data scientists from LinkedIn, Airbnb and Uber discussed the way that data analytics is shaping the contours of their evolving business models. “We think of data as the customers’ voice – what works, and what doesn’t work, what we should be doing better,” said Riley Newman, Head of Data Science at Airbnb, which provides data about lodging rentals in neighborhoods around the world. “So, we always begin with data to understand where the opportunities for the business are.” For LinkedIn, the measurement of metrics is a key priority, said Yael Garten, Manager of Mobile Data Science at LinkedIn. “If you can’t measure it, you can’t fix it – so measure everything,” she said. From a product point of view, the data is key to LinkedIn’s recommendations, and its listing of job openings and influencers, combining “to give every [LinkedIn] member the right experience at the right time – for wherever they are in their professional careers.” For Uber, timely data collection and monitoring are vital to timely customer service for the company’s car services in major cities, matching available cars to customer requests and routes. Henry Lin, senior data scientist at Uber, says the car services company is data-driven. It looks at many data sources, from traffic flow to scheduled pick-up times and locations (geospatial data): “We measure everything – and it’s integrated into some kind of [information] flow that drives back into our growth.”
Watch the presentation on Gigaom.com
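For readers curious about how a “partition key” can spread records across “shards,” as described in the Amazon Web Services example above, here is a minimal Python sketch. It is an illustration only, not AWS code: it assumes the key is hashed with MD5 into a 128-bit number and that the shards split that number range evenly. The function names shard_for_key and distribute, and the sample “player-N” keys, are hypothetical.

```python
import hashlib

# Assumption: keys are hashed with MD5, which yields a 128-bit integer.
HASH_SPACE = 2 ** 128


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard whose hash range contains this key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    # Assumption: shards own equal, contiguous slices of the hash space.
    return hash_value * shard_count // HASH_SPACE


def distribute(records, shard_count):
    """Group (partition_key, payload) pairs into one list per shard."""
    shards = [[] for _ in range(shard_count)]
    for key, payload in records:
        shards[shard_for_key(key, shard_count)].append((key, payload))
    return shards


if __name__ == "__main__":
    # Hypothetical small records, such as events from an online game.
    events = [(f"player-{i}", {"score": i}) for i in range(1000)]
    for index, shard in enumerate(distribute(events, 4)):
        print(f"shard {index}: {len(shard)} records")
```

Because the same key always hashes to the same value, all records from one source land in the same shard, while many different sources spread out roughly evenly across the shards.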
A simple summary is this: These well-known companies are finding that simply aggregating data into a big data warehouse is a good start to data analysis. But “ingestion” of data, as it’s called, is only the beginning of a data-driven strategy. It’s how the data is prepared for analysis, and how the outputs are surfaced to the business units, that spells the success of any data-driven transformation project.
Flash’s Role in Big Data and Analytics
Flash has an important role to play in Big Data and Analytics, as SanDisk Enterprise Vice President of Marketing Steve Fingerhut described in his breakout session at the Gigaom conference: “Flash SSDs: High-Performing, Cost-Effective Solutions for Big Data Analytics.”
Fingerhut shared the results of SanDisk benchmarking studies that demonstrated how Flash solid state drives (SSDs) and SanDisk software can accelerate analytics workloads and reduce the time required to access and analyze valuable data. The rising use of Analytics shows there’s “a lot of hunger for insight into the data that companies have been collecting [in their business],” Fingerhut said.
“As the data center infrastructure changes – both for IT and cloud data centers,” he said, “flash SSD storage will play a key role in transforming those data centers, enabling faster performance and supporting faster time-to-results for businesses.”
What I’ll remember from this Gigaom Structure Data conference is the amazing diversity of Big Data Analytics workloads, as they run in many different types of businesses. Each of the speakers showed how Big Data is impacting their business, what they are doing to cope with the data tsunami that threatens to swamp traditional data centers, and their approach to building data-driven business opportunities that simply didn’t exist 10 years ago.
At the Gigaom conference, which took place at New York’s Chelsea Piers, the river traffic on the Hudson River sailed by, as the attendees contemplated data moving at hyperspeed throughout their networks and their data center infrastructure.
The contrast between the river traffic and the data traffic was striking, because in Big Data Analytics, customers want to measure time in milliseconds, not in seconds, minutes or even hours. And those who don’t learn how to tap into their data reserves, and to identify key trends in the data, will lose their competitive edge to other businesses that are climbing the data-driven learning curve right now. It is, indeed, an interesting time to be a data scientist.