geoparquet-io
Fast I/O and transformation tools for GeoParquet files using PyArrow and DuckDB.
Features
- Fast: Built on PyArrow and DuckDB for high-performance operations
- Comprehensive: Sort, partition, enhance, and validate GeoParquet files
- Spatial Indexing: Add bbox, H3 hexagonal cells, KD-tree partitions, and admin divisions
- Best Practices: Automatic optimization following the GeoParquet 1.1 spec
- Flexible: CLI and Python API for any workflow
- Tested: Extensive test suite across Python 3.9-3.13 and all platforms
Quick Example
```bash
# Install
pip install geoparquet-io

# Inspect file structure and metadata
gpio inspect myfile.parquet

# Check file quality and best practices
gpio check all myfile.parquet

# Add bounding box column for faster queries
gpio add bbox input.parquet output.parquet

# Sort using Hilbert curve for spatial locality
gpio sort hilbert input.parquet output_sorted.parquet

# Partition into separate files by country
gpio partition admin buildings.parquet output_dir/
```
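
To give a feel for why Hilbert sorting improves spatial locality, here is a minimal pure-Python sketch of the classic Hilbert curve index on a `2^order` by `2^order` grid. It is an illustration only, not how `gpio sort hilbert` is implemented; the names `hilbert_index` and `sort_points_by_hilbert` are hypothetical, and scaling points to their own extent is an assumption made for the example.

```python
def hilbert_index(x: int, y: int, order: int) -> int:
    """Map grid cell (x, y) to its position along a Hilbert curve.

    The grid has 2**order cells per side; x and y must lie in [0, 2**order).
    """
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def sort_points_by_hilbert(points, order=16):
    """Sort (x, y) pairs by Hilbert index over their own bounding extent.

    Hypothetical helper: real tools may scale to a fixed extent instead.
    """
    if not points:
        return []
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    xmin, xmax = min(xs), max(xs)
    ymin, ymax = min(ys), max(ys)
    cells = (1 << order) - 1

    def key(p):
        # Snap each coordinate to an integer grid cell (0 if the extent is flat).
        gx = int((p[0] - xmin) / (xmax - xmin) * cells) if xmax > xmin else 0
        gy = int((p[1] - ymin) / (ymax - ymin) * cells) if ymax > ymin else 0
        return hilbert_index(gx, gy, order)

    return sorted(points, key=key)
```

Cells that are consecutive along the curve are always neighbours on the grid, so rows sorted this way tend to land in the same Parquet row groups as their spatial neighbours.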
Why geoparquet-io?
GeoParquet is a cloud-native geospatial data format that stores geometry columns and their metadata in Parquet's efficient columnar layout. This toolkit helps you:
- Optimize file layout for cloud-native access patterns
- Add spatial indices for faster queries and analysis
- Validate compliance with GeoParquet best practices
- Transform large datasets efficiently using columnar operations
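
As a rough idea of what validating compliance involves, the sketch below checks the required keys of the `geo` file metadata defined by the GeoParquet specification (`version`, `primary_column`, `columns`, plus `encoding` and `geometry_types` for each column). It is a stdlib-only illustration, not the set of checks that `gpio check` performs; `check_geo_metadata` is a hypothetical helper name.

```python
import json

# Keys the GeoParquet specification marks as required.
REQUIRED_TOP_LEVEL = ("version", "primary_column", "columns")
REQUIRED_PER_COLUMN = ("encoding", "geometry_types")


def check_geo_metadata(raw):
    """Return a list of problems found in a raw 'geo' metadata value.

    `raw` is the JSON string (or bytes) stored under the 'geo' key of the
    Parquet file metadata. An empty list means no problems were found.
    """
    try:
        meta = json.loads(raw)
    except (TypeError, ValueError) as exc:
        return [f"'geo' metadata is not valid JSON: {exc}"]
    if not isinstance(meta, dict):
        return ["'geo' metadata must be a JSON object"]

    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in meta:
            problems.append(f"missing top-level key: {key}")

    columns = meta.get("columns", {})
    if not isinstance(columns, dict):
        problems.append("'columns' must be a JSON object")
        columns = {}

    primary = meta.get("primary_column")
    if primary is not None and primary not in columns:
        problems.append(f"primary_column {primary!r} is not listed in 'columns'")

    for name, col in columns.items():
        if not isinstance(col, dict):
            problems.append(f"column {name!r} metadata must be a JSON object")
            continue
        for key in REQUIRED_PER_COLUMN:
            if key not in col:
                problems.append(f"column {name!r} is missing key: {key}")
    return problems


sample = json.dumps({
    "version": "1.1.0",
    "primary_column": "geometry",
    "columns": {"geometry": {"encoding": "WKB", "geometry_types": ["Polygon"]}},
})
print(check_geo_metadata(sample))
```

If PyArrow is available, the raw value can be read with `pyarrow.parquet.read_schema(path).metadata[b"geo"]` and passed straight to this function.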
Getting Started
New to geoparquet-io? Start here:
- Installation Guide - Get up and running quickly
- Quick Start Tutorial - Learn the basics in 5 minutes
- User Guide - Detailed documentation for all features
Command Reference
- inspect - Examine file metadata and preview data
- check - Validate files against best practices
- sort - Spatially sort using Hilbert curves
- add - Enhance files with spatial indices
- partition - Split files into optimized partitions
- format - Apply formatting best practices
Support
- Issues: GitHub Issues
- Source Code: GitHub Repository
- Contributing: See our Contributing Guide
License
Apache 2.0 - See LICENSE for details.