File
Attribute | Apache Parquet | Apache Iceberg |
---|---|---|
Name | Apache Parquet | Apache Iceberg |
Description | Apache Parquet is an open-source, column-oriented data file format designed for efficient data storage and retrieval. | Apache Iceberg is a high-performance format for huge analytic tables. It uses data files stored in Parquet, Avro, or ORC. |
License | Apache License 2.0 | Apache License 2.0 |
Source code | https://github.com/apache/parquet-format | https://github.com/apache/iceberg |
Website | https://parquet.apache.org/ | https://iceberg.apache.org/ |
Year created | 2013 | 2017 |
Company | Twitter, Cloudera | Netflix |
Language support | Java, Scala, C++, Python, R, PHP | |
Use cases | Write once read many, Analytics, Efficient storage, Column-based queries | Write once read many, Analytics, Efficient storage, ACID transactions |
Is human readable | no | no |
Orientation | column | column or row |
Has type system | yes | yes |
Has nested structure support | yes | yes |
Has native compression | yes | yes |
Has encoding support | yes | yes |
Has constraint support | no | no |
Has ACID support | no | yes |
Has metadata | yes | yes |
Has encryption support | yes | maybe |
Data processing framework support | Apache Beam, Apache Drill, Apache Flink, Apache Spark | Apache Drill, Apache Flink, Apache Gobblin, Apache Pig, Apache Spark |
Analytics query support | Apache Hive, Apache Impala, Apache Druid, Apache Pinot, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, DuckDB, Firebolt | Apache Impala, Apache Druid, Apache Hive, AWS Athena, BigQuery, ClickHouse, Dremio, DuckDB, Presto, Trino |
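
The "Orientation" and "Column based queries" entries in the table can be illustrated with a small sketch. The code below is a toy model in plain Python, not the Parquet or Iceberg on-disk format; the sample records and the helper names `to_columns` and `column_sum` are hypothetical and exist only for this illustration. It pivots row-oriented records into a column-oriented layout and then aggregates a single column without touching the others, which is roughly why columnar formats suit analytic queries.

```python
# Toy illustration only: this is NOT the Parquet or Iceberg on-disk format.
# The data and the helper names below are hypothetical.

# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Oslo", "amount": 10.0},
    {"id": 2, "city": "Lima", "amount": 20.5},
    {"id": 3, "city": "Oslo", "amount": 4.5},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict mapping column name to values."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def column_sum(columns, name):
    """Aggregate one column while reading only that column's values."""
    return sum(columns[name])


# Column-oriented layout: all values of one field are stored together.
columns = to_columns(rows)
total = column_sum(columns, "amount")

print(columns["city"])
print(total)
```

In the row-oriented list, summing `amount` means visiting every record and skipping over `id` and `city`. In the column-oriented dict, the same query reads one list. Real columnar formats add encoding, compression, and metadata on top of this basic idea.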