Data Center Tech Blog
Steve Fingerhut

Vice President of Marketing, Enterprise Storage Solutions

Research shows that companies with massive investments in Big Data projects are not only generating excess returns but also gaining competitive advantages. With today’s wide availability of high-performance storage, processing power, and cost-effective virtualization and cloud architectures, Big Data is no longer just for the giants of the web; it is a critical tool for businesses to remain competitive. Organizations not yet using analytics soon will be, or they risk being left behind.

What Big Data means for your business will vary with your particular business and workloads. Unlocking which data is most valuable for making strategic and operational decisions requires strategy, leadership, expertise and the right tools, as the rise of analytics is quickly diversifying applications and solutions alike:

  • Batch analytics with Hadoop: With tools such as Hadoop, organizations can process massive volumes of data (up to exabytes) to answer complex, ad hoc questions such as “which products should I suggest to a buyer searching for product x” based on historical data.
  • Interactive analytics with NoSQL: NoSQL databases provide a non-relational mechanism for multi-structured data that can handle the scale and agility of modern applications, and can enable quick lookups like “trending” reports, e.g. the top 10 searches in the last hour.
  • Real-time analytics: In-memory computing platforms such as GigaSpaces and SAP HANA eliminate disk seek time and deliver faster, more predictable performance for finance and stock ticker feed analysis, or for solving complex business problems, for example with SAP’s BPC (Business Planning and Consolidation) analytics tool.
  • E-discovery analytics: Since 2006, electronic information has been usable as evidence in civil litigation or government investigations. These processes carry complex legal obligations that require fast indexing, real-time search of dynamic data volumes, and advanced metadata analysis.
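
To make the “trending” report above concrete, here is a minimal sketch of a top-10-searches-in-the-last-hour lookup. It is an illustration only: the function name top_searches, the (timestamp, term) event format, and the sample data are assumptions made for this example, and the events are held in memory rather than read from any particular NoSQL database.

```python
from collections import Counter
from datetime import datetime, timedelta


def top_searches(events, now, window=timedelta(hours=1), k=10):
    """Return the k most frequent search terms seen within `window` of `now`.

    `events` is an iterable of (timestamp, term) pairs. This format is a
    hypothetical one chosen for the example, not a specific product's schema.
    """
    cutoff = now - window
    # Count only the searches that fall inside the time window.
    counts = Counter(term for ts, term in events if cutoff <= ts <= now)
    return counts.most_common(k)


# Made-up sample data; the reference time is arbitrary.
now = datetime(2015, 1, 1, 12, 0)
events = [
    (now - timedelta(minutes=5), "ssd"),
    (now - timedelta(minutes=20), "hadoop"),
    (now - timedelta(minutes=30), "ssd"),
    (now - timedelta(minutes=59), "cassandra"),
    (now - timedelta(hours=2), "hadoop"),  # outside the one-hour window
]
trending = top_searches(events, now)
```

In a production system the filtering and counting would typically be pushed down to the data store instead of being done in application memory, which is where low-latency storage matters most.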

Scaling massive data efficiently – reduce complexity and costs
All of these workloads can be massively accelerated by SSD-based solutions that reduce latency and deliver much more efficient server and storage deployments.

In the NoSQL database management system Cassandra, all-SSD options are becoming more prevalent because they reduce latency and improve application performance. They also offer an opportunity to reduce the complexity of data movement when Cassandra is used in conjunction with Hadoop.

Figure: Cassandra acceleration, latency and throughput. DataStax Cassandra running with 1TB of data; YCSB Zipfian benchmarks comparing the solution with SSDs vs. HDDs.

When using HDDs, Cassandra underutilizes the CPU, with only a small percentage of the cores consumed. With SSDs, performance increases, allowing more efficient utilization of system resources, including full utilization of the CPU cores, which enables more efficient Cassandra and NoSQL capacity and performance scaling. For analytics, this allows budgets to be allocated more effectively: the savings can be used to expand big data coverage faster, compounding the business benefits.

I’ll be discussing SSD-based solutions for analytics this Wednesday in a Gigaom Research Webinar and panel discussion on “Innovations in flash storage and their impact on analytics” and at Gigaom Structure Data in March. Join us in the discussion and learn how to ensure a competitive edge in the Big Data era, with cost-effective solutions.

