As soon as you try to do anything non-trivial with HBase, you have to deal with transformation from POJO to HBase and from HBase to POJO. That's where an ORM jumps in, and that's where we start with Surus [1].
Surus is a simple yet powerful HBase ORM. Its main features:
- Mapping is defined by annotations (sigh with relief: no code generation and no setters/getters)
- Supports both column and column family levels of mapping granularity
- Uses JSON for complex data types, and serializes data in a compact binary format
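To make the annotation-driven idea concrete, here is a minimal sketch of how a mapper could discover annotated public fields through reflection. It does not use Surus: the Column annotation, the Visit class and AnnotationMappingSketch.toCells are hypothetical names invented for this illustration, and values are written as UTF-8 strings rather than in Surus's actual format.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical annotation for this sketch; it is NOT one of the Surus annotations.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Column {
    String family();
    String qualifier();
}

// Plain object with public fields and no getters or setters.
class Visit {
    @Column(family = "stat", qualifier = "page")
    public String page = "index";

    @Column(family = "stat", qualifier = "referrer")
    public String referrer = "search";

    public String notMapped = "ignored"; // no annotation, so it is skipped
}

final class AnnotationMappingSketch {
    // Walks the annotated public fields and returns "family:qualifier" -> UTF-8 bytes.
    static Map<String, byte[]> toCells(Object entity) {
        Map<String, byte[]> cells = new LinkedHashMap<>();
        try {
            for (Field field : entity.getClass().getFields()) {
                Column column = field.getAnnotation(Column.class);
                if (column == null) {
                    continue; // unmapped field
                }
                Object value = field.get(entity);
                if (value == null) {
                    continue; // nothing to store
                }
                cells.put(column.family() + ":" + column.qualifier(),
                        String.valueOf(value).getBytes(StandardCharsets.UTF_8));
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("cannot read mapped field", e);
        }
        return cells;
    }
}
```

Because the mapping lives on the fields themselves, adding a new column only means adding one annotated field; there is no generated code to refresh.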
Given the laconic format of blog posts, let's review the typical use cases for Surus:
- Mapping definition
- Writing to HBase
- Reading from HBase
- Integration with the Hadoop MapReduce framework
For our example we will need an HBase table tbl_example with the following structure:
<TableSchema name="tbl_example">
  <ColumnSchema name="family_mapping" BLOCKCACHE="false" VERSIONS="1"/>
  <ColumnSchema name="stat" BLOCKCACHE="false" VERSIONS="1"/>
  <ColumnSchema name="nested_maps" BLOCKCACHE="false" VERSIONS="1"/>
</TableSchema>
Let's assume that we want to:
- Store Integer value in column stat:number_of_users
- Store Map<String, Integer> in column stat:months
- Store Map<Integer, Integer> in column family family_mapping
- Store Map<Long, Integer> in every column of the family nested_maps
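The difference between column-level and family-level mapping can be sketched with plain Java, without HBase or Surus on the classpath. In the sketch below a row is modelled as a map from "family:qualifier" to bytes. The class GranularitySketch, its method names and the hand-rolled JSON encoding are assumptions made for illustration; Surus's real wire format may differ.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

final class GranularitySketch {
    // 4-byte big-endian encoding of an int.
    static byte[] intBytes(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    // Family-level mapping: every map entry becomes its own cell,
    // with the map key used as the column qualifier.
    static Map<String, byte[]> familyLevel(String family, Map<Integer, Integer> map) {
        Map<String, byte[]> cells = new TreeMap<>();
        for (Map.Entry<Integer, Integer> entry : map.entrySet()) {
            cells.put(family + ":" + entry.getKey(), intBytes(entry.getValue()));
        }
        return cells;
    }

    // Column-level mapping: the whole map is serialized into a single cell.
    // Assumes the keys contain no characters that need JSON escaping.
    static Map<String, byte[]> columnLevel(String family, String qualifier, Map<String, Integer> map) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        // Sort the keys so the output is deterministic.
        for (Map.Entry<String, Integer> entry : new TreeMap<String, Integer>(map).entrySet()) {
            json.add("\"" + entry.getKey() + "\":" + entry.getValue());
        }
        Map<String, byte[]> cells = new TreeMap<>();
        cells.put(family + ":" + qualifier, json.toString().getBytes(StandardCharsets.UTF_8));
        return cells;
    }
}
```

Family-level mapping lets HBase address each entry individually, while column-level mapping keeps the row narrow at the cost of rewriting the whole map on every update.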
Our Java class will look like this:
import java.util.HashMap;
import java.util.Map;
// imports of the Surus annotations are omitted here

public class Example {
    // row key of the record
    @HRowKey
    public byte[] key;

    // single value stored in column stat:number_of_users
    @HProperty(family = "stat", identifier = "number_of_users")
    public long numberOfUsers;

    // whole map stored in the single column stat:months
    @HMapProperty(family = "stat", identifier = "months", keyType = String.class, valueType = Integer.class)
    public Map<String, Integer> months = new HashMap<String, Integer>();

    // map spread over the column family family_mapping
    @HMapFamily(family = "family_mapping", keyType = Integer.class, valueType = Integer.class)
    public Map<Integer, Integer> familyMapping = new HashMap<Integer, Integer>();

    // every column of the family nested_maps holds its own Map<Long, Integer>
    @HMapFamily(family = "nested_maps", keyType = Long.class, valueType = Map.class)
    @HNestedMap(keyType = Long.class, valueType = Integer.class)
    public Map<Long, Map<Long, Integer>> nestedMaps = new HashMap<Long, Map<Long, Integer>>();
}
To perform an HBase insert, we need a Put object and some magic from EntityService:
// declare and initialize Example instance
Example example = new Example();
example.numberOfUsers = ...
// declare EntityService
EntityService<Example> esExample = new EntityService<Example>(Example.class);
// get Put object
Put put = esExample.insert(example);
To parse a Result object from HBase, we reverse the process:
// create Get object
HTable tExample = ...
Get get = new Get(ID_IN_BYTES);
Result result = tExample.get(get);
Example example = esExample.parseResult(result);
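Conceptually, parsing means looking up the cell for each annotated field and decoding its bytes into the field's type. The following is a rough, Surus-free sketch of that idea: a plain Map stands in for the HBase Result, and the Cell annotation, the Counter class and ParseSketch.parse are hypothetical names. Only long fields are handled, assuming an 8-byte big-endian encoding.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.util.Map;

// Hypothetical annotation for this sketch: value is "family:qualifier".
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Cell {
    String value();
}

class Counter {
    @Cell("stat:number_of_users")
    public long numberOfUsers;
}

final class ParseSketch {
    // Builds a new instance of type and fills annotated long fields from the row.
    static <T> T parse(Class<T> type, Map<String, byte[]> row) {
        try {
            T entity = type.getDeclaredConstructor().newInstance();
            for (Field field : type.getFields()) {
                Cell cell = field.getAnnotation(Cell.class);
                if (cell == null) {
                    continue; // unmapped field
                }
                byte[] raw = row.get(cell.value());
                if (raw == null) {
                    continue; // absent cell: keep the field's default value
                }
                if (field.getType() != long.class) {
                    throw new IllegalStateException("sketch only supports long fields");
                }
                field.setLong(entity, ByteBuffer.wrap(raw).getLong());
            }
            return entity;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot populate " + type.getName(), e);
        }
    }
}
```

A missing cell simply leaves the field at its default, which mirrors how sparse HBase rows behave: absent columns are not an error.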
And finally, let's review how to integrate with Hadoop MapReduce.
First, the mapper:
@Override
protected void map(ImmutableBytesWritable key, Result result, Context context) throws IOException, InterruptedException {
    int vertexA = Bytes.toInt(key.get());
    Example example = esExample.parseResult(result);
    ...
}
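The mapper above decodes the row key with Bytes.toInt, so it assumes the key is a 4-byte integer. HBase's Bytes utility encodes ints in big-endian order; the stdlib-only sketch below mimics that layout with ByteBuffer so the round trip can be tried without a cluster. RowKeySketch and its methods are hypothetical helpers, not part of HBase or Surus.

```java
import java.nio.ByteBuffer;

final class RowKeySketch {
    // Encodes an int as 4 big-endian bytes (ByteBuffer's default byte order).
    static byte[] encode(int id) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(id).array();
    }

    // Decodes a 4-byte big-endian row key, rejecting keys of any other length.
    static int decode(byte[] key) {
        if (key.length != Integer.BYTES) {
            throw new IllegalArgumentException("expected a 4-byte row key, got " + key.length);
        }
        return ByteBuffer.wrap(key).getInt();
    }
}
```

Checking the length up front fails fast when a table mixes key formats, instead of silently producing a wrong vertex id.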
Next, the reducer:
@Override
protected void reduce(ImmutableBytesWritable key, Iterable<ImmutableBytesWritable> values, Context context) throws IOException, InterruptedException {
    int vertexA = Bytes.toInt(key.get());
    for (ImmutableBytesWritable entry : values) {
        Example example = ... // create an Example instance from "entry"
        Put putA = esExample.insert(example);
        // skip the write-ahead log: faster writes, but unflushed data is lost if a region server fails
        putA.setWriteToWAL(false);
        context.write(HBASE_KEY_DUMMY, putA);
    }
}
Surus is a mature framework, yet despite its capabilities it remains simple and fast.
Feel free to navigate to the Surus wiki [2] for more details and examples.
[1] Surus on GitHub
[2] Surus wiki