Binance Data Loader
A Python library for downloading and processing historical data from Binance Vision.
GitHub: https://github.com/HuakunShen/binance-data-loader
PyPI: https://pypi.org/project/binance-data/
Features
- Download historical data from Binance Vision S3 bucket
- Support for multiple asset types (spot, futures)
- Flexible prefix-based approach for any data type
- Output formats: Parquet (default) or CSV
- Pandera schema validation for data integrity
- Timestamp auto-detection (milliseconds vs nanoseconds)
- Concurrent downloads for better performance
- Optional retention of raw ZIP files
- Preserve original directory structure
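The timestamp auto-detection feature can be pictured with a small magnitude check. The sketch below is an assumption about how such detection might work, not the library's actual code: `detect_timestamp_unit` and `to_utc_datetime` are hypothetical helpers, and the thresholds simply separate the digit counts of epoch values in seconds, milliseconds, microseconds, and nanoseconds for recent dates.

```python
from datetime import datetime, timezone

# Hypothetical helpers for illustration; not part of binance_data_loader.
_DIVISORS = {"s": 1, "ms": 10**3, "us": 10**6, "ns": 10**9}


def detect_timestamp_unit(ts: int) -> str:
    """Guess the epoch unit of an integer timestamp from its magnitude.

    Assumes dates from roughly 1973 onward, where each unit has a
    distinct digit count (10, 13, 16, and 19 digits today).
    """
    if ts < 10**11:
        return "s"
    if ts < 10**14:
        return "ms"
    if ts < 10**17:
        return "us"
    return "ns"


def to_utc_datetime(ts: int) -> datetime:
    """Convert an epoch timestamp of unknown unit to an aware UTC datetime."""
    unit = detect_timestamp_unit(ts)
    return datetime.fromtimestamp(ts / _DIVISORS[unit], tz=timezone.utc)


ms_value = 1_700_000_000_000               # milliseconds
ns_value = 1_700_000_000_000_000_000       # the same instant in nanoseconds
same_instant = to_utc_datetime(ms_value) == to_utc_datetime(ns_value)
```

A magnitude check like this is cheap, but it only works for timestamps far enough from the epoch that the units cannot be confused.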
Installation
pip install binance-data
# Or with uv
uv pip install binance-data
Quick Start
from binance_data_loader import BinanceDataDownloader
# Download BTCUSDT 1h futures data as Parquet
downloader = BinanceDataDownloader(
    prefix="data/futures/um/daily/klines/BTCUSDT/1h/",
    destination_dir="./data",
    output_format="parquet",
)
downloader.download()
Data Loading & Resampling
from binance_data_loader import BinanceDataLoader
from datetime import datetime, timedelta, UTC  # datetime.UTC requires Python 3.11+
loader = BinanceDataLoader(data_dir="./data", data_type="spot")
# Load with resampling
df = loader.load(
    symbol="BTCUSDT",
    interval="1m",
    resample_to="15m",
    start_time=datetime.now(UTC) - timedelta(days=7),
)
Supported Intervals
- Seconds: 1s
- Minutes: 1m, 3m, 5m, 15m, 30m
- Hours: 1h, 2h, 4h, 6h, 8h, 12h
- Days: 1d, 3d
- Weeks: 1w
- Months: 1M
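Interval strings are case-sensitive: `1m` is one minute, `1M` is one month. As a rough sketch of how the list above could be validated and converted to durations, the hypothetical helper below (not part of the library) maps each fixed-length interval to a `timedelta` and rejects `1M`, since a calendar month has no fixed length.

```python
import re
from datetime import timedelta

# Interval names copied from the list above.
SUPPORTED_INTERVALS = {
    "1s",
    "1m", "3m", "5m", "15m", "30m",
    "1h", "2h", "4h", "6h", "8h", "12h",
    "1d", "3d",
    "1w",
    "1M",
}

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def interval_to_timedelta(interval: str) -> timedelta:
    """Convert a fixed-length interval string to a timedelta.

    Hypothetical helper for illustration only. Raises ValueError for
    unsupported strings and for "1M" (calendar month).
    """
    if interval not in SUPPORTED_INTERVALS:
        raise ValueError(f"unsupported interval: {interval!r}")
    match = re.fullmatch(r"(\d+)([smhdw])", interval)
    if match is None:
        # Only "1M" reaches this branch: months vary from 28 to 31 days.
        raise ValueError("calendar months have no fixed duration")
    count, unit = int(match.group(1)), match.group(2)
    return timedelta(seconds=count * _UNIT_SECONDS[unit])


fifteen_minutes = interval_to_timedelta("15m")
```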
Shifted Resampling
Generate multiple shifted datasets for training data augmentation:
# Default 15m intervals end at 0, 15, 30, 45 minutes
df_standard = loader.load(symbol="ETHUSDT", interval="1m", resample_to="15m")
# Shifted by 1m - intervals end at 1, 16, 31, 46 minutes
df_shifted = loader.load(
    symbol="ETHUSDT", interval="1m", resample_to="15m", shift="1m"
)
Each shift produces differently aligned bars from the same 1m source, which makes it a simple form of data augmentation for machine learning training.
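To make the bucket arithmetic behind shifted resampling concrete, here is a standard-library sketch. It is an illustration under stated assumptions, not the loader's implementation: `bucket_start` and `resample_ohlcv` are hypothetical names, rows are plain tuples, and bars are aggregated as first open, max high, min low, last close, and summed volume.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts, width, shift=timedelta(0)):
    """Return the start of the (optionally shifted) bucket containing ts."""
    index = (ts - EPOCH - shift) // width  # timedelta // timedelta gives an int
    return EPOCH + shift + index * width


def resample_ohlcv(rows, width, shift=timedelta(0)):
    """Aggregate (open_time, open, high, low, close, volume) rows into bars.

    Rows must be sorted by open_time. Hypothetical helper for illustration.
    """
    bars = {}
    for ts, o, h, l, c, v in rows:
        key = bucket_start(ts, width, shift)
        if key not in bars:
            bars[key] = [o, h, l, c, v]
        else:
            bar = bars[key]
            bar[1] = max(bar[1], h)  # highest high
            bar[2] = min(bar[2], l)  # lowest low
            bar[3] = c               # last close
            bar[4] += v              # total volume
    return [(key, *values) for key, values in bars.items()]


# Thirty synthetic 1m rows starting at midnight UTC (made-up prices).
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
rows = [
    (start + timedelta(minutes=i), float(i), float(i), float(i), float(i), 1.0)
    for i in range(30)
]

standard = resample_ohlcv(rows, timedelta(minutes=15))
shifted = resample_ohlcv(rows, timedelta(minutes=15), shift=timedelta(minutes=1))
```

With a 1m shift, the 00:00 row falls into the bucket that started at 23:46 the previous day, so the first and last shifted bars are built from fewer than fifteen rows. Partial edge bars like these may need trimming before training.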
Data Schema
Kline data columns: open_time, open, high, low, close, volume, close_time, quote_volume, count, taker_buy_volume, taker_buy_quote_volume, ignore
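As a sketch of how these column names line up with a raw kline CSV, the hypothetical parser below (standard library only, not the library's Pandera schema) names each field and casts it. It assumes the column order listed above, tolerates an optional header row, and uses a made-up sample row.

```python
import csv
import io

# Column order taken from the schema line above.
KLINE_COLUMNS = [
    "open_time", "open", "high", "low", "close", "volume",
    "close_time", "quote_volume", "count",
    "taker_buy_volume", "taker_buy_quote_volume", "ignore",
]
INT_COLUMNS = {"open_time", "close_time", "count"}


def parse_klines(text: str) -> list[dict]:
    """Parse raw kline CSV text into typed dicts (illustrative helper)."""
    parsed = []
    for record in csv.reader(io.StringIO(text)):
        if not record or record[0] == "open_time":
            continue  # skip blank lines and an optional header row
        if len(record) != len(KLINE_COLUMNS):
            raise ValueError(f"expected {len(KLINE_COLUMNS)} fields, got {len(record)}")
        row = {}
        for name, value in zip(KLINE_COLUMNS, record):
            if name in INT_COLUMNS:
                row[name] = int(value)
            elif name == "ignore":
                row[name] = value  # unused field, kept as-is
            else:
                row[name] = float(value)
        parsed.append(row)
    return parsed


# Made-up sample row for illustration only.
sample = "1700000000000,100.0,101.5,99.5,101.0,12.5,1700000059999,1256.25,42,6.0,603.0,0\n"
klines = parse_klines(sample)
```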
A Python library for quantitative trading data preparation.