File

Attribute	Apache Hudi	Apache Parquet
Name	Apache Hudi	Apache Parquet
Description	Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. It uses data stored in either Parquet or ORC files.	Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval.
Source code	https://github.com/apache/hudi	https://github.com/apache/parquet-format
Website	https://hudi.apache.org/	https://parquet.apache.org/
License	Apache License 2.0	Apache License 2.0
Year created	2016	2013
Company	Uber	Twitter, Cloudera
Use cases	Incremental data processing, Data upserts, Change Data Capture (CDC), ACID transactions	Write once, read many, Analytics, Efficient storage, Column-based queries
Language support		Java, Scala, C++, Python, R, PHP
Is human readable	no	no
Orientation	column or row	column
Has type system	yes	yes
Has nested structure support	yes	yes
Has native compression	yes	yes
Has encoding support	yes	yes
Has constraint support	yes	no
Has ACID support	yes	no
Has metadata	yes	yes
Has encryption support	maybe	yes
Data processing framework support	Apache Spark, Apache Flink	Apache Beam, Apache Drill, Apache Flink, Apache Spark
Analytics query support	Apache Hive, Apache Impala, AWS Athena, BigQuery, ClickHouse, Presto, Trino	Apache Hive, Apache Impala, Apache Druid, Apache Pinot, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, DuckDB, Firebolt
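
The "Orientation" and "Column-based queries" entries in the table can be illustrated with a toy sketch in plain Python. It is not how Parquet or Hudi lay out bytes on disk (real Parquet files also use row groups, pages, encodings and compression). The sample records and the `to_columns` helper are hypothetical and exist only for this illustration.

```python
# Toy illustration only: row-oriented versus column-oriented layout.
# The records and the to_columns helper are made up for this sketch.
rows = [
    {"id": 1, "city": "Oslo", "amount": 10.0},
    {"id": 2, "city": "Lima", "amount": 20.5},
    {"id": 3, "city": "Oslo", "amount": 4.5},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists.

    Assumes every record has the same keys.
    """
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row-oriented scan: every full record is visited to sum one field.
row_total = sum(record["amount"] for record in rows)

# Column-oriented scan: only the "amount" column is read.
column_total = sum(columns["amount"])

print(columns["city"])
print(row_total, column_total)
```

Both scans give the same total. The difference is that the column layout lets an analytics query skip the `id` and `city` values entirely, which is the usual argument for column-oriented formats in read-heavy workloads.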