File
Attribute | Apache Hudi | Apache ORC |
---|---|---|
Name | Apache Hudi | Apache ORC |
Description | Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. It uses data stored in either Parquet or ORC. | ORC is a self-describing, type-aware columnar file format designed for Hadoop workloads. |
Source code | https://github.com/apache/hudi | https://github.com/apache/orc |
Website | https://hudi.apache.org/ | https://orc.apache.org/ |
License | Apache License 2.0 | Apache License 2.0 |
Year created | 2016 | 2013 |
Company | Uber | Hortonworks, Facebook |
Use cases | Incremental data processing, Data upserts, Change Data Capture (CDC), ACID transactions | Write once read many, Analytics, Efficient storage, ACID transactions (via Apache Hive transactional tables) |
Language support | Java, Scala, C++, Python | |
Is human readable | no | no |
Orientation | column or row | column |
Has type system | yes | yes |
Has nested structure support | yes | yes |
Has native compression | yes | yes |
Has encoding support | yes | yes |
Has constraint support | yes | no |
Has ACID support | yes | no |
Has metadata | yes | yes |
Has encryption support | maybe | yes |
Data processing framework support | Apache Spark, Apache Flink | Apache Flink, Apache Gobblin, Apache Hadoop, Apache NiFi, Apache Pig, Apache Spark |
Analytics query support | Apache Hive, Apache Impala, AWS Athena, BigQuery, ClickHouse, Presto, Trino | Apache Impala, Apache Druid, Apache Hive, Apache Pinot, AWS Athena, BigQuery, ClickHouse, Firebolt, Presto, Trino |
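
The "Orientation" attribute in the table can be illustrated with a small sketch. It is not how Hudi or ORC lay out bytes on disk: plain Python lists and dicts stand in for the two layouts, and the sample records and the helper names `to_row_layout` and `to_column_layout` are hypothetical, chosen only for this example.

```python
# Illustrative only: in-memory structures stand in for on-disk layouts.
# The records and helper names below are made up for this sketch.
records = [
    {"id": 1, "city": "Oslo", "amount": 10.0},
    {"id": 2, "city": "Lima", "amount": 12.5},
    {"id": 3, "city": "Oslo", "amount": 7.5},
]


def to_row_layout(rows):
    """Row orientation: all fields of one record are kept together."""
    return [tuple(r.values()) for r in rows]


def to_column_layout(rows):
    """Column orientation: all values of one field are kept together."""
    return {name: [r[name] for r in rows] for name in rows[0]}


row_layout = to_row_layout(records)
column_layout = to_column_layout(records)

# An analytic query such as SUM(amount) needs only one column,
# so a column layout lets it skip "id" and "city" entirely.
total_from_columns = sum(column_layout["amount"])

# With a row layout, every record is visited and the field picked out of it.
total_from_rows = sum(row[2] for row in row_layout)
```

Both totals are the same; the difference is how much unrelated data has to be touched to compute them, which is why column orientation is usually preferred for analytics and row orientation for record-at-a-time writes.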