Previous blog (HBase 1): http://bitnine.net/blog-computing/understanding-of-hbase-part-1/
Previous blog (HBase 2): http://bitnine.net/blog-computing/understanding-of-hbase-part-2/
In the previous post, HBase 2, we introduced the CRUD operations provided by the basic HBase client API. In this post, we look at the administrative features of the HBase client API.
- All data stored in HBase is ultimately grouped into one or more tables.
- Table names follow file-name rules, because the name is used as part of the storage file path.
- Because HBase uses column-oriented storage, it often does not follow normalization rules the way an RDBMS does, so a schema tends to have relatively few tables.
- An HBase table is split into regions and stored region by region.
- Table properties
- A table name cannot start with “.” (period) or “-” (hyphen).
- Column families
- When you create a table, you should define its column families.
- Maximum file size
- This is the maximum size, in bytes, that a region of the table may reach.
- When a region reaches the maximum size you set, the system splits it. (Default value: 256 MB)
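The two table rules above (the naming restriction and the size-based split) can be sketched in plain Java. This is only an illustrative model: `TableRules` and its methods are hypothetical names, not part of the HBase API, the name check covers only the rule stated here, and the split check is a simplification of what a region server actually does.

```java
// Illustrative model only; TableRules is a hypothetical class, not an HBase class.
final class TableRules {
    // Default maximum region size quoted in the text: 256 MB, expressed in bytes.
    static final long DEFAULT_MAX_FILE_SIZE = 256L * 1024 * 1024;

    // Checks only the rule stated in the text:
    // a table name must not begin with '.' or '-'.
    static boolean isValidTableName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        char first = name.charAt(0);
        return first != '.' && first != '-';
    }

    // Simplified assumption: a region becomes a split candidate
    // once its size reaches the configured maximum.
    static boolean shouldSplit(long regionSizeBytes, long maxFileSizeBytes) {
        return regionSizeBytes >= maxFileSizeBytes;
    }
}
```

For example, `TableRules.isValidTableName("users")` is accepted while `".meta"` and `"-tmp"` are rejected, and a 300 MB region is a split candidate under the 256 MB default.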
- Column families
- HColumnDescriptor constructor
- The name of a column family cannot be changed after it is created.
- By default, up to three versions are kept; you can lower this to 1 if you do not need to manage versions.
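To make the effect of the maximum-versions setting concrete, here is a minimal sketch of a single cell that keeps only its newest N versions. `VersionedCell` is a hypothetical class written for this post, not an HBase type, and it trims old versions immediately on write for simplicity, which is an assumption of the sketch rather than a description of how HBase schedules the cleanup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model of one cell's version history; not an HBase class.
final class VersionedCell {
    private final int maxVersions;
    // Timestamps sorted newest first, each mapped to the value written at that time.
    private final TreeMap<Long, String> versions =
            new TreeMap<>(Collections.reverseOrder());

    VersionedCell(int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("maxVersions must be at least 1");
        }
        this.maxVersions = maxVersions;
    }

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        // Drop the oldest entries once the limit is exceeded.
        while (versions.size() > maxVersions) {
            versions.pollLastEntry();
        }
    }

    // Returns the retained values, newest first.
    List<String> values() {
        return new ArrayList<>(versions.values());
    }
}
```

With a limit of 3, writing four values at timestamps 1 through 4 leaves only the three newest; with a limit of 1, each write simply replaces the previous value.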
- Block Size
- It is similar in concept to a page in an RDBMS.
- The default block size is 64 KB.
- The block size determines both how much data HBase reads from a storage file at a time and the unit in which data is held in the memory cache.
- Block Cache
- This option controls whether data that has been read is kept in the in-memory block cache.
- If the data is typically scanned only once, it is better to disable this option so that the data is not stored in the block cache.
- Expiration time (TTL)
- Expired data is deleted during compaction.
- The default value is Integer.MAX_VALUE (data is kept permanently).
- TTL takes effect when you set an integer lower than the default value.
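The TTL rule can be expressed as a small check. This is a sketch under stated assumptions: the TTL is given in seconds, cell timestamps are in milliseconds, and `TtlCheck` is a hypothetical helper rather than an HBase class.

```java
// Hypothetical helper; not part of the HBase API.
final class TtlCheck {
    // Default TTL from the text: Integer.MAX_VALUE, treated here as "keep forever".
    static final int FOREVER = Integer.MAX_VALUE;

    // Returns true when a cell written at cellTimestampMs is older than
    // ttlSeconds at time nowMs (assumption: TTL in seconds, timestamps in ms).
    static boolean isExpired(long cellTimestampMs, long nowMs, int ttlSeconds) {
        if (ttlSeconds == FOREVER) {
            return false; // TTL is effectively disabled
        }
        long ageMs = nowMs - cellTimestampMs;
        return ageMs > ttlSeconds * 1000L;
    }
}
```

A cell that `isExpired` reports as expired is the kind of data that compaction would then remove, as described above.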
- In-Memory
- This flag gives the blocks of a column family higher priority to be loaded into memory and to stay there; it is a priority hint rather than a strict guarantee.
- In-Memory is best used on column families that hold a small amount of data.
- Bloom Filter
- It reduces lookup time when reads follow a specific access pattern.
- It is disabled by default, because a Bloom filter adds overhead in storage space and memory.
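A Bloom filter saves lookup time because it can answer "this key is definitely not here" without reading the underlying file. The following is a minimal, generic Bloom filter sketch; `SimpleBloomFilter`, its hash functions, and its parameters are assumptions made for illustration and do not reflect HBase's own implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Minimal Bloom filter for illustration; HBase's own implementation differs.
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        if (size < 1 || hashCount < 1) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void add(String rowKey) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(rowKey, i));
        }
    }

    // false: the key was definitely never added, so the file can be skipped.
    // true: the key may be present, so the file still has to be read.
    boolean mightContain(String rowKey) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(rowKey, i))) {
                return false;
            }
        }
        return true;
    }

    // Derives the i-th bit position from two simple hashes (double hashing).
    private int index(String rowKey, int i) {
        int h1 = 17;
        int h2 = 31;
        for (byte b : rowKey.getBytes(StandardCharsets.UTF_8)) {
            h1 = h1 * 31 + b;
            h2 = h2 * 131 + b;
        }
        return Math.floorMod(h1 + i * h2, size);
    }
}
```

The bit set is the extra storage and memory cost mentioned above: the filter pays for itself only when many reads ask for keys that are absent from a given file.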
- Replication scope
- A value of 0 disables replication. (This is the default: the column family is not copied.)
- A value of 1 enables replication. (The column family is replicated to a remote cluster.)
- Basic functionality
- Functionality related to Table
- Functionality related to Schema
- Functionality related to Cluster