| Name | Apache Parquet | Delta Lake |
| --- | --- | --- |
| Description | Apache Parquet is an open-source, column-oriented data file format designed for efficient data storage and retrieval. | Delta Lake is an open-source storage framework that enables building a Lakehouse architecture. |
| License | Apache License 2.0 | Apache License 2.0 |
| Source code | https://github.com/apache/parquet-format | https://github.com/delta-io/delta |
| Website | https://parquet.apache.org/ | https://delta.io/ |
| Year created | 2013 | 2019 |
| Company | Twitter, Cloudera | Databricks |
| Language support | Java, Scala, C++, Python, R, PHP | Scala, Java, Python, Rust |
| Use cases | Write once, read many; analytics; efficient storage; column-based queries | Write once, read many; analytics; efficient storage; ACID transactions |
| Is human readable | no | no |
| Orientation | column | column |
| Has type system | yes | yes |
| Has nested structure support | yes | yes |
| Has native compression | yes | yes |
| Has encoding support | yes | yes |
| Has constraint support | no | yes |
| Has ACID support | no | yes |
| Has metadata | yes | yes |
| Has encryption support | yes | maybe |
| Data processing framework support | Apache Beam, Apache Drill, Apache Flink, Apache Spark | Apache Drill, Apache Flink, Apache Spark |
| Analytics query support | Apache Hive, Apache Impala, Apache Druid, Apache Pinot, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, DuckDB, Firebolt | Apache Hive, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, Presto, Trino |
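
The "Orientation" and "Has encoding support" rows can be illustrated with a toy sketch. The code below is not the Parquet or Delta Lake on-disk layout. It only shows, under simplified assumptions, why storing values column by column makes single-column queries cheap and why repeated values in a column encode compactly. The helper names `to_columns` and `run_length_encode` and the sample records are hypothetical, and the sketch assumes every row has the same keys.

```python
# Toy illustration of column orientation and run-length encoding.
# This is NOT the real Parquet file layout; all names here are hypothetical.
from typing import Any


def to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-oriented records into one list of values per column."""
    columns: dict[str, list[Any]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values: list[Any]) -> list[list[Any]]:
    """Collapse runs of equal adjacent values into [value, count] pairs."""
    encoded: list[list[Any]] = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1  # extend the current run
        else:
            encoded.append([value, 1])  # start a new run
    return encoded


# Row-oriented input: each record holds every field of one row.
rows = [
    {"user": "a", "country": "DE", "amount": 10},
    {"user": "b", "country": "DE", "amount": 25},
    {"user": "c", "country": "FR", "amount": 5},
]

columns = to_columns(rows)

# A column-based query reads only the column it needs.
total = sum(columns["amount"])

# Values of one column sit next to each other, so runs compress well.
country_rle = run_length_encode(columns["country"])

print(total)
print(country_rle)
```

In a row-oriented layout, the same sum would have to scan every field of every record. Real columnar formats add row groups, pages, statistics, and typed encodings on top of this basic idea.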