File
Attribute | Apache Parquet | CSV |
---|---|---|
Name | Apache Parquet | CSV |
Description | Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. | Comma-Separated Values (CSV) is a text file format that uses commas to separate values in plain text. |
License | Apache License 2.0 | N/A |
Source code | https://github.com/apache/parquet-format | |
Website | https://parquet.apache.org/ | https://www.rfc-editor.org/rfc/rfc4180.html |
Year created | 2013 | N/A (no single creation date; RFC 4180 was published in 2005) |
Company | Twitter, Cloudera | |
Language support | Java, Scala, C++, Python, R, PHP | Java, Scala, C++, Python, R, PHP, Go |
Use cases | Write-once-read-many workloads, Analytics, Efficient storage, Column-based queries | |
Is human readable | no | yes |
Orientation | column | row |
Has type system | yes | no |
Has nested structure support | yes | no |
Has native compression | yes | no |
Has encoding support | yes | no |
Has constraint support | no | no |
Has ACID support | no | no |
Has metadata | yes | no |
Has encryption support | yes | no |
Data processing framework support | Apache Beam, Apache Drill, Apache Flink, Apache Spark | Apache Beam, Apache Drill, Apache Flink, Apache Gobblin, Apache Hive, Apache NiFi, Apache Pig, Apache Spark |
Analytics query support | Apache Hive, Apache Impala, Apache Druid, Apache Pinot, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, DuckDB, Firebolt | Apache Impala, Apache Druid, Apache Pinot, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, DuckDB, Firebolt |
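
The CSV column of the table (human readable: yes, type system: no, orientation: row) can be illustrated with a minimal sketch that uses only Python's standard `csv` module. The sample rows and the field names `id`, `price`, and `active` are hypothetical and chosen purely for illustration. A Parquet counterpart would need a third-party library and is not shown here.

```python
import csv
import io

# Hypothetical sample rows, used only for illustration.
rows = [
    {"id": 1, "price": 9.5, "active": True},
    {"id": 2, "price": 12.0, "active": False},
]

# Write to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "price", "active"])
writer.writeheader()
writer.writerows(rows)

# Human readable: the serialized form is plain text.
csv_text = buffer.getvalue()
print(csv_text)

# No type system: the file stores no schema, so the reader
# returns every field as a string.
restored = list(csv.DictReader(io.StringIO(csv_text)))
print(restored[0])

# Row orientation: extracting a single column still means visiting
# every row, and types must be restored by hand.
prices = [float(record["price"]) for record in restored]
print(prices)
```

After the round trip, the integer, float, and boolean values all come back as strings, so the caller has to know and reapply the intended types. A format with a type system and embedded metadata, as the table lists for Parquet, stores that schema in the file itself.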