Basho Launches Riak 2.0 – Disruptive Distributed Databases

Basho Highlights

  • Riak is an Open Source NoSQL distributed database; Riak CS is S3-compatible storage software
  • Basho makes money from Enterprise software versions and services
  • Customers include Comcast, BestBuy, AT&T and the UK NHS
  • Seagate has built Riak compatibility into its Kinetic Open Storage platform
  • Version 2.0 includes security, structured data and search functions
  • NoSQL, Column Store and MPP database techniques have different pros and cons
  • The rise of disruptive scale-out databases coincides with a new focus on scale-up in-memory databases

I met with Sean Cribbs of Basho last week to talk about its Riak database. Sean is a developer himself – one of the 110 employees of this privately owned company. Basho has 2 current products, Riak and Riak CS. The former is a highly available NoSQL distributed database; the latter is Cloud-based storage software compatible with S3 (Amazon’s Simple Storage Service). Both are available as Open Source software, with Basho making money from Enterprise versions and services. You’ll be interested to learn more about the features it is adding with Version 2.0, launched today.

Comcast, AT&T And The UK National Health Service Use Riak

The founders of Basho came from Akamai and aim to provide flexible, reliable storage for Internet applications, focusing on large-scale business applications rather than transaction processing. Riak uses a simple key/value model for object storage, where objects consist of a unique key and a value, stored in a flat namespace called a ‘bucket’. We last came across the ‘bucket’ concept when talking to SpectraLogic, where it is used to identify a group of files as defined in S3.
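
To make the key/value model concrete, here is a minimal in-memory sketch of objects addressed by a bucket and a key. It is an illustration only: the class `BucketStore` and its method names are hypothetical and are not Riak's client API, and nothing here is distributed or persistent.

```python
from typing import Dict, List, Optional, Tuple


class BucketStore:
    """Toy key/value store with flat bucket namespaces (hypothetical, not Riak's API)."""

    def __init__(self) -> None:
        # Every object is addressed by a (bucket, key) pair; buckets do not nest.
        self._objects: Dict[Tuple[str, str], bytes] = {}

    def put(self, bucket: str, key: str, value: bytes) -> None:
        """Store or overwrite the value held under (bucket, key)."""
        self._objects[(bucket, key)] = value

    def get(self, bucket: str, key: str) -> Optional[bytes]:
        """Return the stored value, or None if the key is absent from the bucket."""
        return self._objects.get((bucket, key))

    def delete(self, bucket: str, key: str) -> None:
        """Remove the object if it exists; a missing key is not an error."""
        self._objects.pop((bucket, key), None)

    def keys(self, bucket: str) -> List[str]:
        """List the keys held in one bucket, in sorted order."""
        return sorted(k for (b, k) in self._objects if b == bucket)
```

The same key can exist in two buckets without clashing, because the bucket is part of the address: `put("users", "alice", ...)` and `put("orders", "alice", ...)` store two separate objects.
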
Despite its small size, Basho has some impressive Riak users and partners. In particular:

  • Comcast Universal uses Riak for highly-available object storage in xfinity.comcast.net and xfinitytv.comcast.net
  • BestBuy uses Riak for its eCommerce applications
  • AT&T uses Riak in a number of applications, especially in the development of mHealth, for the secure storage of bio data from wearable personal health monitors
  • The UK National Health Service is using Riak for its Spine2 database
  • Seagate’s Kinetic Open Storage platform is interoperable with Riak

As a clustered NoSQL database, Riak is tolerant of hardware failures. It is one of a number of new databases we have come across recently, including MongoDB at Dell and Toad at HP.

Riak Version 2.0 New Features

Riak 2.0, launched today, has a number of new technical and business features. Specifically, it includes:

  • Riak Data Types – it includes a range of flexible, distributed data types to simplify application development while retaining Riak’s availability and partition tolerance characteristics; data types include distributed counters, sets, maps, registers, and flags
  • Strong Consistency – developers can now choose between ‘eventually consistent’ (the default in the current 1.4 version, providing high availability) and the new ‘strongly consistent’ option, based on data requirements
  • Apache Solr full-text search integration – the search function has been completely redesigned leveraging the Apache Solr engine, fully supporting the Solr client query APIs, which enables integration with many existing software solutions
  • Security – it includes the ability to administer access rights and utilize plug-in authentication models, with Authentication and Authorization provided via client APIs
  • Simplified Configuration Management – it changes where and how configuration information is stored, using an easy-to-parse and transparent format
  • Reduced Replicas for Secondary Sites – only in the Riak Enterprise 2.0 version, users can now choose to store fewer copies of replicated data across multiple data centres, allowing a better balance between storage overhead and availability
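
As an illustration of how a distributed counter of the kind listed under Riak Data Types can stay available while still converging, here is a sketch of a PN-counter, a well-known convergent data type. This write-up does not describe how Riak implements its data types, so treat this as a generic technique under that assumption; the class `PNCounter` and its methods are hypothetical names, not Riak's API.

```python
from typing import Dict


class PNCounter:
    """Convergent counter: each replica tracks its own increments and decrements."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.incs: Dict[str, int] = {}  # per-node totals of increments
        self.decs: Dict[str, int] = {}  # per-node totals of decrements

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("amount must be non-negative")
        self.incs[self.node_id] = self.incs.get(self.node_id, 0) + amount

    def decrement(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("amount must be non-negative")
        self.decs[self.node_id] = self.decs.get(self.node_id, 0) + amount

    def value(self) -> int:
        """Current value as seen by this replica."""
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other: "PNCounter") -> None:
        """Fold in another replica's state by taking the per-node maximum.

        Taking the maximum makes merging commutative, associative and
        idempotent, so replicas converge whatever order updates arrive in.
        """
        for node, count in other.incs.items():
            self.incs[node] = max(self.incs.get(node, 0), count)
        for node, count in other.decs.items():
            self.decs[node] = max(self.decs.get(node, 0), count)
```

Two replicas can accept writes independently during a network partition and exchange state later; after merging in both directions they report the same value, and merging the same state twice changes nothing.
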

We’re impressed with the new security features, notably the mandatory transport layer, which includes SSL and authorisation features, as well as the addition of structured data types beyond key/value objects and the Solr search function. The addition of ‘strong consistency’ with compare-and-swap operations allows users to choose which data is highly available and which data has stronger guarantees.
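
A compare-and-swap operation only applies a write if the stored value is still the version the writer last read. The sketch below shows the idea on a single process using a version number per key. It is not Riak's API and ignores the cluster-wide agreement a real distributed database needs; `VersionedStore`, `read` and `compare_and_swap` are hypothetical names chosen for illustration.

```python
import threading
from typing import Dict, Optional, Tuple


class VersionedStore:
    """Single-process store where every write must name the version it replaces."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        # key -> (version, value); a key that was never written is at version 0.
        self._data: Dict[str, Tuple[int, str]] = {}

    def read(self, key: str) -> Tuple[int, Optional[str]]:
        """Return (version, value); value is None for a key never written."""
        with self._lock:
            return self._data.get(key, (0, None))

    def compare_and_swap(self, key: str, expected_version: int, new_value: str) -> bool:
        """Write new_value only if the key is still at expected_version."""
        with self._lock:
            current_version, _ = self._data.get(key, (0, None))
            if current_version != expected_version:
                return False  # someone else wrote first; caller must re-read
            self._data[key] = (current_version + 1, new_value)
            return True
```

A writer that loses the race gets `False` back and has to re-read before retrying, so a stale update can never silently overwrite a newer one. That is the stronger guarantee being traded against availability.
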

Some Conclusions – Scale Out And Scale Up

Basho is very much involved in helping the Internet generation build distributed databases. Although RESTful APIs are not something Basho does, they are associated with AT&T’s mHealth application, and both Riak and Riak CS help these kinds of application to scale out. Other distributed storage concepts include Red Hat Gluster, of course. One criticism of NoSQL database approaches is that they tend to depend heavily on technically proficient developers, while MPP systems are expensive and Column Store techniques are problematic because they change the data stored on disk. Nevertheless, it is impressive that Riak is being used to build applications where security is a necessity – patient-record access applications, for instance.
For us it is interesting that the expansion of disruptive scale-out software such as Riak is coinciding with a new focus on scale-up in-memory databases (specifically SAP HANA). We’ll do our best to try to position these 2 schools against each other in coming months.
