Advanced ArcticDB Usage for Technical Analysis with Python
ArcticDB is a DataFrame database built for storing and analyzing time-series financial data efficiently. Combined with technical analysis techniques, it can streamline backtesting and the development of algorithmic trading strategies. Below, I demonstrate advanced features such as deduplication, metadata management, and multi-symbol analysis with technical indicators.
Why Use ArcticDB for Technical Analysis?
High Performance: Fast read/write operations on large datasets.
Versioning: Keep track of data versions for backtesting.
Metadata Support: Tag datasets with useful information (e.g., source, retrieval time).
Batch Processing: Efficiently handle multiple stock symbols.
Step 1: Fetch and Store Multiple Stocks with Metadata
import yfinance as yf
from arcticdb import Arctic
import pandas as pd
DB_PATH = "lmdb://./advanced_stock_db"
LIBRARY_NAME = "technical_analysis_data"
# Connect and create library
ac = Arctic(DB_PATH)
if LIBRARY_NAME not in ac.list_libraries():
    ac.create_library(LIBRARY_NAME)
lib = ac[LIBRARY_NAME]
symbols = ["AAPL", "MSFT", "GOOG", "TSLA"]
for symbol in symbols:
    ticker = yf.Ticker(symbol)
    hist_data = ticker.history(period="5y", auto_adjust=False)
    # Add metadata with data source and timestamp
    metadata = {"source": "Yahoo Finance", "retrieval_date": pd.Timestamp.now()}
    if not hist_data.empty:
        lib.write(symbol, hist_data, metadata=metadata)
        print(f"{symbol} data stored with metadata.")
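The loop above only checks that the download is non-empty. As a minimal sketch, a few more checks can run before each write, since the later rolling-window and deduplication steps assume a sorted, duplicate-free index. The helper name find_history_problems and the REQUIRED_COLUMNS set are assumptions made for illustration, not part of yfinance or ArcticDB.

```python
import pandas as pd

# Assumed column set for daily price history; adjust to the data actually stored.
REQUIRED_COLUMNS = ("Open", "High", "Low", "Close", "Volume")


def find_history_problems(df: pd.DataFrame) -> list:
    """Return a list of human-readable problems; an empty list means the frame looks usable."""
    problems = []
    if df.empty:
        problems.append("no rows")
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    if not df.index.is_monotonic_increasing:
        problems.append("index is not sorted")
    if df.index.has_duplicates:
        problems.append("index has duplicate timestamps")
    return problems


# Synthetic frames standing in for downloaded history.
good = pd.DataFrame(
    {col: [1.0, 2.0, 3.0] for col in REQUIRED_COLUMNS},
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)
bad = good.drop(columns="Volume").iloc[[0, 1, 1]]  # missing column, repeated date

print(find_history_problems(good))
print(find_history_problems(bad))
```

In the loop, the write could then be skipped (and the problems logged) whenever the returned list is non-empty.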
Step 2: Batch Read and Compute Technical Indicators
import numpy as np
# Batch read multiple stock data
symbols_data = lib.batch_read(symbols)
for symbol_data in symbols_data:
    df = symbol_data.data
    symbol = symbol_data.symbol
    # Compute Technical Indicators: Bollinger Bands and MACD
    rolling_std = df['Close'].rolling(window=20).std()
    df['SMA_20'] = df['Close'].rolling(window=20).mean()
    df['Upper_Band'] = df['SMA_20'] + 2 * rolling_std
    df['Lower_Band'] = df['SMA_20'] - 2 * rolling_std
    # MACD calculation
    df['EMA_12'] = df['Close'].ewm(span=12, adjust=False).mean()
    df['EMA_26'] = df['Close'].ewm(span=26, adjust=False).mean()
    df['MACD'] = df['EMA_12'] - df['EMA_26']
    df['Signal_Line'] = df['MACD'].ewm(span=9, adjust=False).mean()
    # Write back updated data, carrying the existing metadata forward
    # (a write without metadata stores the new version with none)
    lib.write(symbol, df, metadata=symbol_data.metadata)
    print(f"{symbol} updated with technical indicators.")
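The MACD and signal line are usually read through their crossovers. The sketch below flags them, using the same spans as the step above (12, 26, 9). The function name macd_crossovers, the Crossover column, and the synthetic sine-wave prices are assumptions for illustration; real data would come from the library.

```python
import numpy as np
import pandas as pd


def macd_crossovers(close: pd.Series) -> pd.DataFrame:
    """Compute MACD, its signal line, and a crossover flag for a closing-price series."""
    ema_12 = close.ewm(span=12, adjust=False).mean()
    ema_26 = close.ewm(span=26, adjust=False).mean()
    macd = ema_12 - ema_26
    signal = macd.ewm(span=9, adjust=False).mean()
    above = macd > signal
    # +1 on the bar where MACD moves above the signal line, -1 where it drops
    # back to or below it, 0 otherwise (including the first bar).
    crossover = above.astype(int).diff().fillna(0).astype(int)
    return pd.DataFrame({"MACD": macd, "Signal_Line": signal, "Crossover": crossover})


# Synthetic oscillating prices standing in for a stored symbol.
idx = pd.date_range("2024-01-01", periods=120, freq="B")
close = pd.Series(100 + 10 * np.sin(np.linspace(0, 6 * np.pi, 120)), index=idx)

signals = macd_crossovers(close)
print(signals[signals["Crossover"] != 0])
```

Because the flag is derived from a difference of consecutive states, bullish and bearish crossovers always alternate, which makes the column convenient as a simple entry/exit marker in a backtest.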
Step 3: Deduplication and Versioning
# Fetch and update AAPL data to simulate updates
result = lib.read("AAPL")
existing_data = result.data
# Simulate new overlapping data
new_data = existing_data.tail(10) # Fake overlapping update
# Combine and deduplicate
combined = pd.concat([existing_data, new_data])
deduplicated_data = combined[~combined.index.duplicated(keep='last')]
# Write as a new version, keeping earlier versions and the existing metadata
lib.write("AAPL", deduplicated_data, metadata=result.metadata, prune_previous_versions=False)
print("AAPL data updated without deleting previous version.")
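In the simulation above, the overlapping rows are identical copies, so it is not obvious why keep='last' matters. A minimal sketch with synthetic data shows the case where the update revises a value: the newer row wins. The helper name merge_update and the sample prices are assumptions for illustration.

```python
import pandas as pd


def merge_update(existing: pd.DataFrame, update: pd.DataFrame) -> pd.DataFrame:
    """Append an update and, for duplicated timestamps, keep the row from the update."""
    combined = pd.concat([existing, update])
    deduplicated = combined[~combined.index.duplicated(keep="last")]
    # Sort so the result is safe for rolling-window calculations even if the
    # update contained rows older than the end of the existing data.
    return deduplicated.sort_index()


existing = pd.DataFrame(
    {"Close": [100.0, 101.0, 102.0]},
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)
# The update overlaps on 2024-01-03 with a revised close and adds 2024-01-04.
update = pd.DataFrame(
    {"Close": [102.5, 103.0]},
    index=pd.date_range("2024-01-03", periods=2, freq="D"),
)

merged = merge_update(existing, update)
print(merged)
```

With keep='first' the stale 102.0 would survive instead, so the choice depends on whether the newer download is trusted over what is already stored.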
Step 4: Metadata and Symbol Statistics
# Fetch metadata
result = lib.read("AAPL")
print("Metadata for AAPL:", result.metadata)
# Get symbol statistics (columns, index, row count, date range)
info = lib.get_description("AAPL")
print("Symbol Info for AAPL:", info)
Step 5: Visualization and Analysis
import matplotlib.pyplot as plt
# Plot Bollinger Bands and Closing Prices
df = lib.read("AAPL").data
plt.figure(figsize=(12, 6))
plt.plot(df.index, df['Close'], label='Close Price')
plt.plot(df.index, df['SMA_20'], label='20-Day SMA')
plt.fill_between(df.index, df['Upper_Band'], df['Lower_Band'], color='lightgray', label='Bollinger Bands')
plt.title("AAPL Bollinger Bands")
plt.legend()
plt.show()
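The chart shows where the price sits relative to the bands; the same information can be extracted as data. The sketch below flags closes outside the bands, using the 20-period window and 2 standard deviations from Step 2. The function name bollinger_breaches, the Above_Upper and Below_Lower columns, and the synthetic prices are assumptions for illustration.

```python
import pandas as pd


def bollinger_breaches(close: pd.Series, window: int = 20, num_std: float = 2.0) -> pd.DataFrame:
    """Compute Bollinger Bands and flag closes above the upper or below the lower band."""
    sma = close.rolling(window=window).mean()
    std = close.rolling(window=window).std()
    upper = sma + num_std * std
    lower = sma - num_std * std
    return pd.DataFrame(
        {
            "Close": close,
            "Upper_Band": upper,
            "Lower_Band": lower,
            # Comparisons against NaN bands (the warm-up rows) evaluate to False.
            "Above_Upper": close > upper,
            "Below_Lower": close < lower,
        }
    )


# Synthetic prices: a quiet range between 100 and 101, then a jump on the last day.
prices = [100.0 + (i % 2) for i in range(39)] + [120.0]
close = pd.Series(prices, index=pd.date_range("2024-01-01", periods=40, freq="B"))

breaches = bollinger_breaches(close)
print(breaches[breaches["Above_Upper"] | breaches["Below_Lower"]])
```

Applied to stored data, the input would be the Close column of the frame read from the library.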
Conclusion:
Feature               Description
Metadata Handling     Store additional metadata per dataset
Batch Processing      Handle multiple symbols simultaneously
Deduplication         Combine and clean overlapping data
Versioning            Maintain multiple versions for backtesting
Technical Indicators  Compute SMA, Bollinger Bands, and MACD