k, k, k… after 2 days in Amsterdam, at the first European edition of the Hadoop Summit, I finally realised that most of the obstacles I ran into working with HDFS > HBase > Hive were due to the edge use case I was putting into practice:
1) very few people use Hive over HBase
2) even fewer people use HBase’s Thrift interface (apart from Facebook)
they are all very powerful instruments, but here I have to quote my current project manager… “u do not pay licenses, but u need the most expensive people on the market to have it running”
more details here… if u wanna try it out (pumping a MySQL dump into Hadoop): https://pypi.python.org/pypi/SqlHBase/0.1
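
To give a rough idea of what "pumping a MySQL dump into HBase" involves, here is a minimal sketch in plain Python. It is not SqlHBase's actual code: the helper names (`parse_insert`, `to_cells`), the single column family `d`, and the choice of the first column as row key are all assumptions made for illustration. It only handles the simple `INSERT INTO `table` VALUES (...),(...);` lines, and a real dump has more variety than that.

```python
import re

# Hypothetical helpers for illustration; not taken from SqlHBase.
INSERT_RE = re.compile(r"INSERT INTO `(\w+)` VALUES (.*);\s*$")
ROW_RE = re.compile(r"\(((?:'(?:[^'\\]|\\.)*'|[^()'])*)\)")
VALUE_RE = re.compile(r"'((?:[^'\\]|\\.)*)'|([^,'\s]+)")


def parse_insert(line):
    """Return (table, rows) for a simple mysqldump INSERT line, else None."""
    match = INSERT_RE.match(line.strip())
    if match is None:
        return None
    table, body = match.groups()
    rows = []
    for row in ROW_RE.finditer(body):
        values = []
        for value in VALUE_RE.finditer(row.group(1)):
            quoted, bare = value.groups()
            if quoted is not None:
                # Drop the backslash from escapes like \' and \\ (simplified:
                # sequences such as \n are not turned into real newlines).
                values.append(re.sub(r"\\(.)", r"\1", quoted))
            elif bare.upper() == "NULL":
                values.append(None)
            else:
                values.append(bare)
        rows.append(values)
    return table, rows


def to_cells(rows, columns, key_column=0, family="d"):
    """Turn SQL rows into (row_key, {"family:column": value}) pairs."""
    result = []
    for values in rows:
        cells = {
            f"{family}:{name}": value
            for name, value in zip(columns, values)
            if value is not None  # no cell is written for NULL
        }
        result.append((values[key_column], cells))
    return result
```

Each `(row_key, cells)` pair is roughly the shape of one HBase put. Actually sending it through the Thrift interface needs a running HBase Thrift server and a client library, which is left out here.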