The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. The Hadoop 2 environment provides scalable services including HDFS, YARN, ZooKeeper, HBase, Pig, Sqoop, and Apache Drill.
With its powerful components, Hadoop fuels Smart Data, providing scalability and standards support for large deployments at limited software cost. Smart Data takes advantage of the following Hadoop components:
- ZooKeeper: enables highly reliable distributed coordination between nodes
- YARN/MapReduce: enables multi-process execution, with scheduling and monitoring of jobs
- SolrCloud: a leading indexing engine, used to index documents and search the indexed content
- HDFS: Vanilla Hub can store data (text files, XML documents) and other documents on HDFS storage
- HBase: data can be stored in and retrieved from the HBase database
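To illustrate the map/reduce programming model that YARN schedules across a cluster, here is a minimal word-count sketch in plain Python. This is a hypothetical, single-process illustration of the concept only, not the Hadoop MapReduce API; in a real deployment, YARN distributes the map and reduce tasks over many nodes and HDFS holds the input and output.

```python
from collections import defaultdict

def map_phase(lines):
    # Map step: emit a (word, 1) pair for every word in the input lines.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    # Shuffle + reduce step: group the pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# Toy input standing in for files stored on HDFS.
lines = ["Hadoop stores data", "Hadoop processes data"]
counts = reduce_phase(map_phase(lines))
print(counts)
```

The same map and reduce functions, written against the Hadoop API, would run unchanged over terabytes of input, which is the scalability the components above provide.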