datafusion
Rust · Medium · apache/datafusion
8,566 stars
2,032 forks
20 open issues
Active Apr 2026
Beginner-Friendly Issues (20)
Issues tagged for new contributors
Refactor: extract sort pushdown logic from FileScanConfig into separate module
#21433 · Apr 7, 2026
invalid `TIME` typed literal causes planner panic
#21431 · Apr 7, 2026
Add BuildHasher variants for hash_utils
#21428 · Apr 7, 2026
Optimize object store accesses for the CSV scanner
#21419 · Apr 6, 2026
enhancement
ResolveGroupingFunction does not unwrap Alias nodes
#21411 · Apr 6, 2026
bug
Support parquet content-defined chunking options
#21408 · Apr 6, 2026
enhancement
Automate breaking change detection
#21406 · Apr 6, 2026
enhancement
doc: explain the parquet cdc feature from a user perspective
#21404 · Apr 6, 2026
Support compound field access after subscripts, e.g. payload[1].a
#21384 · Apr 5, 2026
Parallel merge in SortPreservingMergeExec after sort elimination
#21381 · Apr 5, 2026
Optimize character_length UDF performance
#21380 · Apr 5, 2026
enhancement
bug
A few TPCH benchmark queries are incorrect causing issues when scale factor > 1
#21368 · Apr 4, 2026
bug
Defer task spawning in SortPreservingMergeExec to first poll
#21329 · Apr 3, 2026
enhancement
PropagateEmptyRelation does not eliminate outer joins when one side is empty
#21320 · Apr 2, 2026
enhancement
Sort pushdown: reorder row groups by statistics within each file
#21317 · Apr 2, 2026
Duplicate `GROUPING SETS` rows are incorrectly collapsed during execution
#21316 · Apr 2, 2026
bug
Add configurable UNION DISTINCT support to FILTER rewrite optimization
#21310 · Apr 2, 2026
enhancement
[EPIC] first class support for struct field / Variant access in Parquet
#21308 · Apr 1, 2026
`ProjectionExec` produces unknown statistics for all `ScalarFunctionExpr` outputs
#21307 · Apr 1, 2026
enhancement