|Description||Hadoop-based Cloud Data Warehouse|
|Angel, 9/12 |
Anand Babu Periasamy
|Angel, 11/12 |
Treasure Data’s vision is to provide a service-based big data solution that eliminates the cost and complexity of the big data analytics solutions currently on the market. The Treasure Data Cloud Data Warehouse service offers an affordable, quick-to-implement and easy-to-use big data option that does not require specialized IT resources, making big data analytics available to the mass market.
Here’s how we do it. We leverage Hadoop and other open source technologies to keep our costs low, and we’ve added our own innovative technology to address three critical Hadoop bottlenecks:
• Data-Load. We provide two tools to make loading data fast and easy: a bulk data-loader for the initial data load, and td-agent, a lightweight daemon based on our highly successful Fluentd product, for streaming data collection and load. Both support transformation to standard JSON format for structured, semi-structured and unstructured data types.
• Columnar Data Processing. We have replaced HDFS, which still has data management difficulties and a single point of failure (SPOF), with our own columnar database. This enables us to process massive volumes of data much more quickly, making near real-time analysis a reality for Hadoop users. We also use our MessagePack technology and various compression algorithms to achieve 5-10x data storage efficiencies.
• Faster and Easier Querying. On the back-end, we provide an SQL-like query language that distributes and runs your query in parallel. There is no need to learn a complex domain-specific language. We also provide a JDBC driver, which allows you to use your preferred BI / Visualization tools (e.g. Jaspersoft, Indicee, Metric Insights), and we will soon offer an ODBC driver, which will enable integration with Excel, Tableau, etc.
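To give a feel for the columnar idea in the second point, here is a minimal Python sketch. It is an illustration only and not Treasure Data's storage engine: the function names `rows_to_columns`, `compressed_size` and `column_sum`, and the sample log fields, are hypothetical, and `zlib` over JSON text stands in for MessagePack and whatever compression the service actually uses.

```python
import json
import zlib


def rows_to_columns(records):
    """Pivot a list of JSON-like dicts into one list of values per field."""
    fields = sorted({key for record in records for key in record})
    columns = {field: [] for field in fields}
    for record in records:
        for field in fields:
            # Missing fields are stored as None so every column keeps
            # the same length as the number of input records.
            columns[field].append(record.get(field))
    return columns


def compressed_size(obj):
    """Bytes needed for the zlib-compressed compact JSON encoding of obj."""
    payload = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return len(zlib.compress(payload))


def column_sum(columns, field):
    """Aggregate one field by reading only that column."""
    return sum(value for value in columns[field] if value is not None)


# Hypothetical semi-structured access-log events, one dict per event.
records = [
    {"time": 1350000000 + i, "host": "web01", "path": "/index.html", "bytes": 512}
    for i in range(1000)
]

columns = rows_to_columns(records)

# A query such as "total bytes served" touches a single column
# instead of scanning every field of every record.
total_bytes = column_sum(columns, "bytes")

print("row layout, compressed bytes:   ", compressed_size(records))
print("column layout, compressed bytes:", compressed_size(columns))
print("total bytes served:", total_bytes)
```

Grouping values of the same field together is what lets an aggregate read one column rather than whole records, and similar values stored side by side tend to compress well. The sizes printed here depend entirely on the sample data and say nothing about the 5-10x figure quoted above.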
The Treasure Data service is hosted on Amazon S3, so we can offer a scalable, reliable and secure production infrastructure without the need to pull IT staff from other projects. Processing, storage and network resources are completely elastic and can be scaled up or down as requirements change. Our service is managed 24x7 by an expert operations staff, which eliminates the overhead costs associated with resourcing and managing an on-premise environment.