Name
CSV
Apache Hudi
Description
Comma-Separated Values (CSV) is a plain-text file format that uses commas to separate the values in each record.
Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. It stores data in either Parquet or ORC files.
Source code
https://github.com/apache/hudi
Website
https://www.rfc-editor.org/rfc/rfc4180.html
https://hudi.apache.org/
Language support
Java, Scala, C++, Python, R, PHP, Go
License
N/A
Apache License 2.0
Year created
N/A
2016
Company
Uber
Use cases
Incremental data processing, Data upserts, Change Data Capture (CDC), ACID transactions
Is human readable
yes
no
Orientation
row
column or row
Has type system
no
yes
Has nested structure support
no
yes
Has native compression
no
yes
Has encoding support
no
yes
Has constraint support
no
yes
Has acid support
no
yes
Has metadata
no
yes
Has encryption support
no
maybe
Data processing framework support
Apache Beam, Apache Drill, Apache Flink, Apache Gobblin, Apache Hive, Apache NiFi, Apache Pig, Apache Spark
Apache Spark, Apache Flink
Analytics query support
Apache Impala, Apache Druid, Apache Pinot, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, DuckDB, Firebolt
Apache Hive, Apache Impala, AWS Athena, BigQuery, ClickHouse, Presto, Trino
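
As a rough illustration of the CSV entries above (human readable, row oriented, no type system), the following Python sketch parses a small CSV string with the standard library `csv` module. The sample data and the helper names `read_rows` and `to_typed` are hypothetical and exist only for this example.

```python
import csv
import io

# Hypothetical sample data, used only for illustration.
# The second record quotes a field that itself contains a comma.
SAMPLE = 'id,name,score\n1,Ada,9.5\n2,"Hopper, Grace",8\n'


def read_rows(text):
    """Parse CSV text into a list of dicts, one dict per record (row)."""
    return list(csv.DictReader(io.StringIO(text)))


def to_typed(row):
    """Convert fields by hand; the CSV file itself carries no type information."""
    return {
        "id": int(row["id"]),
        "name": row["name"],
        "score": float(row["score"]),
    }


rows = read_rows(SAMPLE)              # every value is still a string here
typed = [to_typed(r) for r in rows]   # types are imposed by the reader

for record in typed:
    print(record)
```

Every value comes back from the parser as a string, so any schema has to be applied by the consumer. Formats with a type system, such as the Parquet or ORC files that Hudi builds on, store that information alongside the data instead.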