| Name | Apache ORC | Apache Avro |
|---|---|---|
| Description | ORC is a self-describing, type-aware columnar file format designed for Hadoop workloads. | Apache Avro is a row-based serialization format for record data, widely used in streaming data pipelines. |
| License | Apache License 2.0 | Apache License 2.0 |
| Source code | https://github.com/apache/orc | https://github.com/apache/avro |
| Website | https://orc.apache.org/ | https://avro.apache.org/ |
| Year created | 2013 | 2009 |
| Company | Hortonworks, Facebook | Apache |
| Language support | Java, Scala, C++, Python | Java, C++, C#, C, Python, JavaScript, Perl, Ruby, PHP, Rust |
| Use cases | Write once read many, Analytics, Efficient storage, ACID transactions | Stream processing, Analytics, Efficient data exchange |
| Is human-readable | no | no |
| Orientation | column | row |
| Has type system | yes | yes |
| Has nested structure support | yes | yes |
| Has native compression | yes | yes |
| Has encoding support | yes | yes |
| Has constraint support | no | no |
| Has ACID support | yes | no |
| Has metadata | yes | yes |
| Has encryption support | yes | no |
| Data processing framework support | Apache Flink, Apache Gobblin, Apache Hadoop, Apache NiFi, Apache Pig, Apache Spark | Apache Flink, Apache Gobblin, Apache NiFi, Apache Pig, Apache Spark |
| Analytics query support | Apache Impala, Apache Druid, Apache Hive, Apache Pinot, AWS Athena, BigQuery, ClickHouse, Firebolt, Presto, Trino | Apache Impala, Apache Druid, Apache Hive, Apache Pinot, AWS Athena, BigQuery, ClickHouse, Firebolt |
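
ORC's description above calls it columnar, while Avro serializes data record by record. The following is a minimal, library-free sketch of what that difference in orientation means for an analytic read. It does not use the real ORC or Avro file layouts or APIs; the sample records, field names, and helper functions are all hypothetical and exist only for illustration.

```python
# Toy records; the field names and values are made up for this sketch.
records = [
    {"id": 1, "city": "Oslo", "temp_c": 4.5},
    {"id": 2, "city": "Lima", "temp_c": 19.0},
    {"id": 3, "city": "Pune", "temp_c": 27.5},
]


def to_row_layout(rows):
    """Row orientation (Avro-like): all values of one record stay together."""
    return [tuple(r.values()) for r in rows]


def to_column_layout(rows):
    """Column orientation (ORC-like): all values of one field stay together."""
    return {name: [r[name] for r in rows] for name in rows[0]}


row_layout = to_row_layout(records)
column_layout = to_column_layout(records)

# An aggregate over a single field reads one contiguous list in the
# column layout, but has to visit every full record in the row layout.
temps = column_layout["temp_c"]
avg_from_columns = sum(temps) / len(temps)
avg_from_rows = sum(r[2] for r in row_layout) / len(row_layout)
```

Both averages are the same; what differs is how much unrelated data must be scanned to compute them. That is roughly why a columnar layout tends to suit analytics that touch few fields, and a row layout tends to suit streaming and record-at-a-time exchange.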