Monday, April 02, 2012

Surus: HBase ORM

Surus [1] was mentioned previously, but never really explained.
As soon as you try to do anything non-trivial with HBase, you have to deal with the transformation from POJO to HBase and from HBase back to POJO. That's where an ORM comes in, and that's where we start with Surus.

Surus is a simple yet powerful HBase ORM. It features:
  • Mapping is defined by annotations
    (sigh with relief: no code generation and no setters/getters)
  • Supports both column and column family levels of mapping granularity
  • Uses JSON for complex data types and serializes data in a compact binary format

Given the laconic format of blog posts, let's review the typical use cases for Surus:
  • Mapping definition
  • Writing to HBase 
  • Reading from HBase 
  • Integration with the Hadoop MapReduce framework

For our example, we will need an HBase table tbl_example with the column families stat, family_mapping and nested_maps.
Let's assume that we want to:
  • Store Integer value in column stat:number_of_users
  • Store Map<String, Integer> in column stat:months
  • Store Map<Integer, Integer> in column family family_mapping
  • Store Map<Long, Integer> in every column of the family nested_maps
Our Java class will look like:
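
As a rough sketch of the idea only: the annotations `@RowKey`, `@Column` and `@Family` below are hypothetical stand-ins defined inline, not the actual Surus annotations (see the wiki [2] for those), and the `String` qualifier type of `nestedMaps` is an assumption. The small `MappingInspector` helper is also made up; it just shows that the mapping can be discovered by reflection, with no generated code and no getters/setters.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mapping annotations, defined here for illustration only.
// They are NOT the real Surus annotations.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface RowKey {}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Column {
    String family();
    String qualifier();
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Family {
    String name();
}

// POJO for table tbl_example: public fields, no getters/setters.
class ExampleStats {
    @RowKey
    public byte[] key;

    @Column(family = "stat", qualifier = "number_of_users")
    public int numberOfUsers;

    @Column(family = "stat", qualifier = "months")
    public Map<String, Integer> months = new HashMap<>();

    // Whole family: every map entry becomes one column.
    @Family(name = "family_mapping")
    public Map<Integer, Integer> familyMapping = new HashMap<>();

    // Every column of the family holds its own Map<Long, Integer>.
    // The String qualifier type is an assumption.
    @Family(name = "nested_maps")
    public Map<String, Map<Long, Integer>> nestedMaps = new HashMap<>();
}

class MappingInspector {
    // Lists "field -> target" pairs discovered through reflection.
    static List<String> describe(Class<?> type) {
        List<String> out = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            Column col = f.getAnnotation(Column.class);
            Family fam = f.getAnnotation(Family.class);
            if (f.isAnnotationPresent(RowKey.class)) {
                out.add(f.getName() + " -> <row key>");
            } else if (col != null) {
                out.add(f.getName() + " -> " + col.family() + ":" + col.qualifier());
            } else if (fam != null) {
                out.add(f.getName() + " -> " + fam.name() + ":*");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        describe(ExampleStats.class).forEach(System.out::println);
    }
}
```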

To perform an HBase insert, we need a Put object and some magic from EntityService:
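
The gist of that kind of magic can be sketched in plain Java, without HBase on the classpath. Everything here is an assumption made for illustration: `@Col`, `@Fam`, `UserStats` and `ToyWriter` are hypothetical names rather than Surus classes, a sorted map of "family:qualifier" cells stands in for the Put, and only `int` columns and `Map<Integer, Integer>` families are handled.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical annotations for this sketch (not the Surus ones).
@Retention(RetentionPolicy.RUNTIME)
@interface Col { String family(); String qualifier(); }

@Retention(RetentionPolicy.RUNTIME)
@interface Fam { String name(); }

class UserStats {
    @Col(family = "stat", qualifier = "number_of_users")
    public int numberOfUsers;

    @Fam(name = "family_mapping")
    public Map<Integer, Integer> familyMapping = new TreeMap<>();
}

class ToyWriter {
    // Encodes an int as 4 big-endian bytes.
    static byte[] intBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Flattens annotated fields into "family:qualifier" -> value cells.
    // The map is a stand-in for the cells one would add to an HBase Put;
    // qualifiers are kept as strings here purely for readability.
    static Map<String, byte[]> toCells(Object entity) {
        Map<String, byte[]> cells = new TreeMap<>();
        try {
            for (Field f : entity.getClass().getDeclaredFields()) {
                f.setAccessible(true);
                Col col = f.getAnnotation(Col.class);
                Fam fam = f.getAnnotation(Fam.class);
                if (col != null) {
                    // Single column: one cell per field.
                    cells.put(col.family() + ":" + col.qualifier(), intBytes(f.getInt(entity)));
                } else if (fam != null) {
                    // Column family: one cell per map entry.
                    for (Map.Entry<?, ?> entry : ((Map<?, ?>) f.get(entity)).entrySet()) {
                        cells.put(fam.name() + ":" + entry.getKey(), intBytes((Integer) entry.getValue()));
                    }
                }
            }
        } catch (IllegalAccessException ex) {
            throw new IllegalStateException(ex);
        }
        return cells;
    }

    public static void main(String[] args) {
        UserStats stats = new UserStats();
        stats.numberOfUsers = 42;
        stats.familyMapping.put(7, 100);
        toCells(stats).forEach((name, value) -> System.out.println(name + " = " + value.length + " bytes"));
    }
}
```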

To parse a Result object from HBase, we reverse the steps:
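
The reverse direction can be sketched the same way, again in plain Java and under stated assumptions: `@CellColumn`, `@CellFamily`, `VisitStats` and `ToyReader` are hypothetical names and not part of Surus, and a map of "family:qualifier" cells stands in for the HBase Result.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical annotations for this sketch (not the Surus ones).
@Retention(RetentionPolicy.RUNTIME)
@interface CellColumn { String family(); String qualifier(); }

@Retention(RetentionPolicy.RUNTIME)
@interface CellFamily { String name(); }

class VisitStats {
    @CellColumn(family = "stat", qualifier = "number_of_users")
    public int numberOfUsers;

    @CellFamily(name = "family_mapping")
    public Map<Integer, Integer> familyMapping = new TreeMap<>();
}

class ToyReader {
    // Decodes 4 big-endian bytes back into an int.
    static int toInt(byte[] raw) {
        return ByteBuffer.wrap(raw).getInt();
    }

    // Builds an entity from "family:qualifier" -> value cells,
    // a stand-in for the content of an HBase Result.
    static <T> T fromCells(Map<String, byte[]> cells, Class<T> type) {
        try {
            Constructor<T> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            T entity = ctor.newInstance();
            for (Field f : type.getDeclaredFields()) {
                f.setAccessible(true);
                CellColumn col = f.getAnnotation(CellColumn.class);
                CellFamily fam = f.getAnnotation(CellFamily.class);
                if (col != null) {
                    byte[] raw = cells.get(col.family() + ":" + col.qualifier());
                    if (raw != null) {
                        f.setInt(entity, toInt(raw));
                    }
                } else if (fam != null) {
                    // Collect every cell of the family into one map.
                    String prefix = fam.name() + ":";
                    Map<Integer, Integer> map = new TreeMap<>();
                    for (Map.Entry<String, byte[]> cell : cells.entrySet()) {
                        if (cell.getKey().startsWith(prefix)) {
                            int qualifier = Integer.parseInt(cell.getKey().substring(prefix.length()));
                            map.put(qualifier, toInt(cell.getValue()));
                        }
                    }
                    f.set(entity, map);
                }
            }
            return entity;
        } catch (ReflectiveOperationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Map<String, byte[]> cells = new TreeMap<>();
        cells.put("stat:number_of_users", ByteBuffer.allocate(4).putInt(42).array());
        cells.put("family_mapping:7", ByteBuffer.allocate(4).putInt(100).array());
        VisitStats stats = fromCells(cells, VisitStats.class);
        System.out.println(stats.numberOfUsers + " " + stats.familyMapping); // prints "42 {7=100}"
    }
}
```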

And finally, let's review how to integrate with Hadoop MapReduce.
First, the mapper:
Next, the reducer:

Surus is a mature framework, but despite its abilities it is still simple and fast.
Feel free to navigate to the Surus wiki [2] for more details and examples.

[1] Surus on GitHub

[2] Surus wiki

2 comments:

Unknown said...

Hello, please setup public permissions on your wiki in github, we are unable to read most of the pages.

benslin kard said...

Hbase is the Hadoop database. Think of it as a distributed, scalable, big data store.