Using ArcticDB for Financial Analysis with Python and yFinance
Efficient financial data storage and analysis are essential for traders, analysts, and data scientists. ArcticDB, a high-performance database optimized for time-series data, combined with yFinance, makes it easy to fetch, store, and analyze financial data at scale.
Why Use ArcticDB for Financial Data?
Fast read/write operations for large datasets
Works directly with Pandas DataFrames, so operations like deduplication stay in familiar code
Versioning for historical data comparisons
Optimized for time-series data
Step 1: Install Required Libraries
pip install arcticdb yfinance pandas matplotlib
Step 2: Fetch Stock Data with yFinance
import yfinance as yf
import pandas as pd
# Fetch historical data for Apple (AAPL)
ticker = yf.Ticker("AAPL")
hist_data = ticker.history(period="5y", auto_adjust=False)
print(hist_data.head())
Step 3: Store Data in ArcticDB
from arcticdb import Arctic
DB_PATH = 'lmdb://./stock_db'
LIBRARY_NAME = 'financial_data'
# Connect to ArcticDB and create a library
ac = Arctic(DB_PATH)
if LIBRARY_NAME not in ac.list_libraries():
    ac.create_library(LIBRARY_NAME)
lib = ac[LIBRARY_NAME]
# Write data to ArcticDB
lib.write("AAPL", hist_data)
print("Data written to ArcticDB.")
Step 4: Read and Analyze Data from ArcticDB
# Read data from ArcticDB
result = lib.read("AAPL")
df = result.data
print(df.tail())
# Simple moving average (SMA) example
df['SMA_50'] = df['Close'].rolling(window=50).mean()
# Plot the stock price with SMA (pandas plotting requires matplotlib)
import matplotlib.pyplot as plt
df[['Close', 'SMA_50']].plot(title="AAPL Stock Price with 50-Day SMA")
plt.show()
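To see what the rolling window does, here is a minimal sketch on a small synthetic series. The prices and the 3-day window are made up for illustration and are not AAPL data. The first `window - 1` rows have no full window yet, so pandas returns NaN for them.

```python
import pandas as pd

# Hypothetical closing prices, used only to illustrate the calculation.
prices = pd.Series(
    [10.0, 11.0, 12.0, 13.0, 14.0],
    index=pd.date_range("2023-01-02", periods=5, freq="D"),
    name="Close",
)

# A 3-day simple moving average: the mean of the current row and the two before it.
sma_3 = prices.rolling(window=3).mean()

# The first two rows are NaN because a full 3-row window is not available yet.
print(sma_3)
```

By the same logic, the 50-day SMA computed in this step is NaN for the first 49 rows of the stored history.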
Step 5: Update and Deduplicate Data
# Fetch the most recent month; it overlaps rows that are already stored
new_data = ticker.history(period="1mo", auto_adjust=False)
combined = pd.concat([result.data, new_data])
# On duplicate dates, keep the newly fetched row
filtered = combined[~combined.index.duplicated(keep='last')]
lib.write("AAPL", filtered.sort_index())
print("Data updated.")
Getting Started with ArcticDB
ArcticDB Tutorial in Python
ArcticDB is a high-performance database for time-series and large data storage, particularly useful for financial data like stock prices. It supports operations similar to Pandas DataFrames and offers fast read/write capabilities.
Step 1: Install ArcticDB
pip install arcticdb
Step 2: Setup and Connect to ArcticDB
Create a connection to ArcticDB. This tutorial uses LMDB, a local file-based backend; remote object storage such as S3 is also supported.
from arcticdb import Arctic
DB_PATH = "lmdb://./my_arctic_db" # Path for local ArcticDB
ac = Arctic(DB_PATH) # Connect to ArcticDB
print("Available Libraries:", ac.list_libraries())
Step 3: Create a Library
A library is a named collection of symbols, used to organize related datasets.
LIBRARY_NAME = "example_library"
# Create a new library if it doesn't already exist
if LIBRARY_NAME not in ac.list_libraries():
    ac.create_library(LIBRARY_NAME)
lib = ac[LIBRARY_NAME]
print(f"Library '{LIBRARY_NAME}' is ready.")
Step 4: Write Data to ArcticDB
You can store Pandas DataFrames in ArcticDB.
import pandas as pd
# Sample DataFrame with stock prices
data = pd.DataFrame({
    'date': pd.date_range('2023-01-01', periods=5),
    'open': [100, 102, 105, 110, 120],
    'close': [102, 105, 108, 115, 125]
}).set_index('date')
symbol = "AAPL"
# Write the DataFrame to ArcticDB
lib.write(symbol, data)
print(f"Data for {symbol} written to ArcticDB.")
Step 5: Read Data from ArcticDB
# Check if the symbol exists
if lib.has_symbol(symbol):
    df = lib.read(symbol).data
    print(f"Data for {symbol}:\n", df)
else:
    print(f"No data found for {symbol}.")
Step 6: Update Data with Deduplication
# New data with some overlapping dates
new_data = pd.DataFrame({
    'date': pd.date_range('2023-01-03', periods=5),
    'open': [106, 112, 118, 123, 130],
    'close': [110, 117, 121, 130, 135]
}).set_index('date')
# Append and deduplicate
existing = lib.read(symbol).data
combined = pd.concat([existing, new_data])
filtered = combined[~combined.index.duplicated(keep='last')]
lib.write(symbol, filtered.sort_index())
print("Data updated with deduplication.")
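The `keep='last'` choice is what lets the newer rows win on overlapping dates. Here is a minimal pandas-only sketch of that logic, wrapped in a hypothetical helper `merge_latest` (an illustrative name, not part of ArcticDB) and using the same sample closing prices as this step:

```python
import pandas as pd


def merge_latest(existing: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """Combine two date-indexed frames, preferring rows from `new` on duplicate dates."""
    combined = pd.concat([existing, new])
    # `new` comes second in the concat, so keep="last" keeps its row for each duplicate date.
    deduped = combined[~combined.index.duplicated(keep="last")]
    return deduped.sort_index()


existing = pd.DataFrame(
    {"close": [102, 105, 108, 115, 125]},
    index=pd.date_range("2023-01-01", periods=5),
)
new = pd.DataFrame(
    {"close": [110, 117, 121, 130, 135]},
    index=pd.date_range("2023-01-03", periods=5),
)

merged = merge_latest(existing, new)
print(merged)
```

The two frames overlap on 2023-01-03 through 2023-01-05, so the merged result has seven unique dates, with the overlapping rows taken from `new`.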
Step 7: Delete Data
lib.delete(symbol)
print(f"Data for {symbol} deleted.")
Summary of Key Functions
Function: Description
create_library() - Create a new library
write() - Store DataFrame
read() - Fetch DataFrame
has_symbol() - Check if data exists for a symbol
delete() - Delete data for a symbol
Conclusion
By integrating ArcticDB and yFinance, you can efficiently fetch, store, and analyze financial data. ArcticDB’s versioning and fast read/write performance make it well suited to large-scale financial analysis.
Want to take your financial analysis even further? Explore more with technical indicators and machine learning techniques!