Name
Apache Hudi
CSV
Description
Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. It uses data stored in either Parquet or ORC format.
Comma-Separated Values (CSV) is a plain-text file format that uses commas to separate values.
Source code
https://github.com/apache/hudi
Website
https://hudi.apache.org/
https://www.rfc-editor.org/rfc/rfc4180.html
License
Apache License 2.0
N/A
Year created
2016
N/A
Company
Uber
Use cases
Incremental data processing, Data upserts, Change Data Capture (CDC), ACID transactions
Language support
Java, Scala, C++, Python, R, PHP, Go
Is human readable
no
yes
Orientation
column or row
row
Has type system
yes
no
Has nested structure support
yes
no
Has native compression
yes
no
Has encoding support
yes
no
Has constraint support
yes
no
Has ACID support
yes
no
Has metadata
yes
no
Has encryption support
maybe
no
Data processing framework support
Apache Spark, Apache Flink
Apache Beam, Apache Drill, Apache Flink, Apache Gobblin, Apache Hive, Apache NiFi, Apache Pig, Apache Spark
Analytics query support
Apache Hive, Apache Impala, AWS Athena, BigQuery, ClickHouse, Presto, Trino
Apache Impala,
Apache Druid,
Apache Pinot,
AWS Athena,
Azure Synapse,
BigQuery,
ClickHouse,
Dremio,
DuckDB,
Firebolt,