Guest blog by Johann George, Sr. Principal Strategist, Software Solutions
Today at the MongoDB World conference in New York City, SanDisk announced the availability of ZetaScale™ Software, new software that extends datasets from DRAM into flash storage to increase scale and improve performance while lowering overall costs.
ZetaScale is all about helping you save money, and it does this in one of two ways. If you are running a large installation with applications such as Cassandra, MongoDB and other NoSQL databases that have been optimized with ZetaScale, you can achieve far better performance. For some use cases this actually means reducing the number of servers needed, sometimes by an order of magnitude.
If you are using an in-memory compute application that has been optimized by ZetaScale, such as Redis, you can extend your dataset from DRAM to flash. As flash costs far less than DRAM, this means you can not only save on cost and power utilization, but also extend capacity. This is particularly helpful as new applications are written that will run on multiple servers and could leverage a flash-optimized interface for faster access to storage. At the end of this blog post you can find out how to get the ZetaScale SDK at no charge for use with your applications.
Understanding a Key/Value Store
I wanted to take this opportunity to go a bit deeper into what ZetaScale is and how our software works to achieve better performance, scale and cost efficiencies.
At its simplest level, ZetaScale Software is a key/value store. The concept behind a key/value store is simple. Normally, when you access storage on a drive, it looks like a series of contiguous fixed-size blocks numbered sequentially starting at zero. When you want to access a particular block, you specify its handle, a logical block number, which allows you to read or write that block. The size of the block is fixed for a particular drive, usually 512 or 4,096 bytes.
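To make the block model concrete, here is a minimal sketch in Python. `ToyBlockDevice`, `write_block` and `read_block` are hypothetical names invented for illustration; this is an in-memory stand-in, not a real drive interface and not part of ZetaScale.

```python
BLOCK_SIZE = 512  # bytes; 512 and 4,096 are the typical sizes mentioned above


class ToyBlockDevice:
    """Hypothetical in-memory stand-in for a drive: fixed-size blocks numbered from zero."""

    def __init__(self, num_blocks):
        # Every block starts out as BLOCK_SIZE zero bytes.
        self._blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, lbn, data):
        # The handle is a logical block number, and the data must fill the block exactly.
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block write must be exactly BLOCK_SIZE bytes")
        self._blocks[lbn] = data

    def read_block(self, lbn):
        return self._blocks[lbn]


device = ToyBlockDevice(num_blocks=8)
payload = b"hello".ljust(BLOCK_SIZE, b"\0")  # short data has to be padded to a full block
device.write_block(3, payload)
```

Note that the caller has to pick block 3 itself and pad five bytes of data out to a full block, which is exactly the bookkeeping a key/value store takes off your hands.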
With a key/value store, when you write data, rather than finding a free block and specifying that logical block number as the handle where the data will be written, the handle is simply an arbitrary unique name you choose. So you could name a piece of data Bilbo or cat1972. In a key/value store, this name or handle is referred to as the key. When you want to retrieve the data, you ask for it by name and the data is returned to you. Again, in the parlance of a key/value store, this data is referred to as a value. Unlike a block on a drive, this data is not a fixed size and can be arbitrarily large or small as needed.
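The same idea as a minimal Python sketch. `ToyKeyValueStore`, `put` and `get` are hypothetical names chosen for illustration; they are not the ZetaScale API, and this toy keeps everything in memory.

```python
class ToyKeyValueStore:
    """Hypothetical in-memory key/value store, for illustration only."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # The handle is a name the caller chooses, not a block number.
        self._data[key] = value

    def get(self, key):
        return self._data[key]


store = ToyKeyValueStore()
store.put("Bilbo", b"a short value")
store.put("cat1972", b"x" * 10_000)  # values can be any size, with no padding needed
```

Asking for `store.get("Bilbo")` returns the value stored under that name, whatever its length.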
Key/Value Store for Applications
As it turns out, this is a more natural way for applications to deal with data. Data rarely comes in nice fixed-size chunks, and most applications would rather give their data a name of their choosing than try to associate it with an unused logical block number.
You have probably seen the key/value paradigm before. Perl provides a key/value interface with its hash datatype. Python calls its key/value datatype a dictionary. C++, Java and Scala have map types that provide the same functionality. However, unlike a key/value store, these language datatypes are not persistent. Once the program that is using them terminates, all key/value data is lost.
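The difference is easy to see in a small Python sketch. It uses the standard library `shelve` module as a stand-in for a persistent key/value store; that choice, the temporary directory and the file name `toy_store` are assumptions made for illustration and have nothing to do with how ZetaScale stores data.

```python
import os
import shelve
import tempfile

# Hypothetical location for the on-disk store.
path = os.path.join(tempfile.mkdtemp(), "toy_store")

# First "run" of the program: write a value, then close the store.
with shelve.open(path) as db:
    db["Bilbo"] = b"there and back again"

# Second "run": a fresh in-memory dictionary starts out empty,
# but the persistent store still holds what was written earlier.
in_memory = {}
with shelve.open(path) as db:
    recovered = db["Bilbo"]
```

Here `in_memory` knows nothing about `Bilbo`, while `recovered` holds the bytes written before the store was closed.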
Another place you might have encountered the key/value paradigm is in the NoSQL world. Most NoSQL databases present a key/value interface at some level. The best known are probably Cassandra, MongoDB, Riak and Redis, but there are countless others and the list continues to grow. This is no surprise since, as we mentioned earlier, this is a very natural interface for applications.
What About File Systems?
By now you might be wondering about file systems. After all, why not just store your data as files in a directory? Files can be arbitrarily named and can also store data of different sizes. You can certainly do that. The main problem is that file systems are much more heavyweight than key/value stores. This is particularly noticeable when using small objects. File systems tend to be slower and waste space. Most applications tend to go through much pain to work around these issues when using file systems.
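A rough back-of-the-envelope calculation shows where the wasted space comes from. The sketch below assumes a file system that allocates space in whole 4,096-byte blocks with at least one block per file, and it ignores per-file metadata; both are simplifying assumptions, since some file systems store very small files more compactly. The object size and count are made-up example figures.

```python
import math

FS_BLOCK_SIZE = 4096  # assumed allocation unit, in bytes


def allocated_bytes(object_size, block_size=FS_BLOCK_SIZE):
    """Space consumed if storage is handed out in whole blocks (at least one)."""
    return max(1, math.ceil(object_size / block_size)) * block_size


object_size = 100        # a small object, in bytes (example figure)
num_objects = 1_000_000  # example figure

payload = object_size * num_objects
on_disk = allocated_bytes(object_size) * num_objects
print(f"payload: {payload / 1e6:.0f} MB, allocated: {on_disk / 1e6:.0f} MB")
# prints "payload: 100 MB, allocated: 4096 MB"
```

Under these assumptions, a million 100-byte objects stored as individual files occupy roughly forty times the space of the data itself.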
Where ZetaScale Fits In
Is ZetaScale only a key/value store? Actually, ZetaScale also contains a host of useful features such as hashed and indexed containers, configurable durability, transactions, snapshots and a configurable caching layer. I’ll expand and explain these features in more detail in future blog posts. The role of these features is to help remove many of the headaches when writing applications, and perhaps just as importantly, ZetaScale has highly optimized these features for Flash.
How Well Does It Work?
We took the open source NoSQL database Cassandra and modified it to use ZetaScale to access storage. On certain workloads, we saw a fivefold increase in performance. We also modified MongoDB to use ZetaScale and saw almost a tenfold increase in performance on some common workloads. We have adapted a variety of other well-known applications to use ZetaScale and have seen similar gains.
I should mention here that while we mostly test ZetaScale using SanDisk SSDs, ZetaScale is not limited to SanDisk Flash and has been designed to work with any standard SSD. Of course, some drives perform better than others, and that will be reflected in ZetaScale's performance as well.
Helping You Achieve Cost Efficiencies
ZetaScale is all about helping you save money. It does this in one of two ways. If you are running a large installation with applications that have been optimized with ZetaScale, such as Cassandra, MongoDB and the like, the ZetaScale-optimized application delivers more performance, so you can often reduce the number of servers needed, sometimes by an order of magnitude.
If you are using an in-memory compute application that has been optimized by ZetaScale, such as Redis, you can often replace DRAM with far less expensive Flash, saving cost and power utilization and also extending capacity. If you are writing a new application that will run on many servers and would like to use a flash-optimized interface for faster access to storage, you might want to build it using ZetaScale. If any of these describes you, send us an email at email@example.com. We would love to hear from you and get you set up with the ZetaScale SDK.
Here are some helpful PDF resources to help you learn how to take advantage of flash with ZetaScale:
Follow us for more ZetaScale news and join the conversation with @SanDiskDataCtr and the #ZetaScale hashtag.