Building a Real-Time Stock Price Fetching Tool with LangChain and DistilBERT
In today’s data-driven world, real-time information combined with natural language processing (NLP) can create powerful tools for various applications. In this article, we will build a tool that fetches the latest stock prices and summarizes the information using DistilBERT, a smaller variant of BERT, all while leveraging the LangChain library.
LangChain is a versatile library that enables seamless integration of language models with various tools and data sources. One of the standout features of LangChain is the ability to annotate functions with the @tool decorator, which makes them directly callable by language models. This allows for efficient and straightforward integration of real-time data fetching and processing within an NLP pipeline.
Step 1: Setting Up Your Environment
Before diving into the code, ensure you have Python installed on your machine. We’ll use the langchain, transformers, requests, and pandas libraries, plus torch, which the DistilBERT code in Step 3 imports. Install these using pip:
pip install langchain transformers requests pandas torch
Step 2: Fetching Stock Prices
We’ll use the Alpha Vantage API to fetch stock prices. To get started, you need a free API key from Alpha Vantage.
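Rather than hardcoding the key, one option is to read it from an environment variable and let the standard library build the query string. The sketch below assumes a variable named `ALPHAVANTAGE_API_KEY` and a helper called `build_intraday_url`; both names are chosen for this example and are not part of the Alpha Vantage API.

```python
import os
from urllib.parse import urlencode

BASE_URL = "https://www.alphavantage.co/query"

def build_intraday_url(symbol, api_key=None, interval="1min"):
    """Build the TIME_SERIES_INTRADAY request URL for one symbol."""
    # ALPHAVANTAGE_API_KEY is an assumed variable name, used only in this sketch.
    key = api_key or os.environ.get("ALPHAVANTAGE_API_KEY")
    if not key:
        raise ValueError("No API key given and ALPHAVANTAGE_API_KEY is not set")
    params = {
        "function": "TIME_SERIES_INTRADAY",
        "symbol": symbol,
        "interval": interval,
        "apikey": key,
    }
    # urlencode escapes characters that are not safe in a query string.
    return f"{BASE_URL}?{urlencode(params)}"

print(build_intraday_url("AAPL", api_key="demo"))
```

Keeping the key out of the source file also makes it harder to commit it to version control by accident.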
Here’s a function to fetch the latest stock price for a given symbol:
import requests
import pandas as pd

def fetch_stock_price(symbol, api_key):
    url = f"https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY&symbol={symbol}&interval=1min&apikey={api_key}"
    # A timeout keeps the call from hanging if the API does not respond
    response = requests.get(url, timeout=10)
    data = response.json()

    # Parse the JSON data
    time_series = data.get("Time Series (1min)")
    if not time_series:
        return None

    # Get the latest price: the timestamps sort chronologically as strings,
    # so max() picks the most recent one-minute bar
    latest_time = max(time_series)
    latest_price = time_series[latest_time]["1. open"]
    return latest_time, latest_price

# Example usage
api_key = "YOUR_API_KEY"
symbol = "AAPL"
result = fetch_stock_price(symbol, api_key)
if result is None:
    print(f"No price data returned for {symbol}")
else:
    latest_time, latest_price = result
    print(f"Latest price of {symbol} at {latest_time} is {latest_price}")
This function sends a request to the Alpha Vantage API and parses the JSON response to extract the opening price of the most recent one-minute bar. If the response contains no time series data, for example because the key is invalid or the rate limit has been reached, the function returns None, so callers should check the result before unpacking it.
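The parsing step can be exercised without a network call. The following sketch separates it into a helper, here called `extract_latest_price` (a name introduced for this example), and runs it on a hand-written payload that imitates the shape of an intraday response. The timestamps and prices in the sample are invented for illustration.

```python
def extract_latest_price(data, field="1. open"):
    """Return (timestamp, price) for the newest bar, or None if there is no data."""
    time_series = data.get("Time Series (1min)")
    if not time_series:
        return None
    # Timestamps look like "YYYY-MM-DD HH:MM:SS", so the lexicographic
    # maximum is also the most recent one, whatever the key order is.
    latest_time = max(time_series)
    return latest_time, float(time_series[latest_time][field])

# Hand-written sample in the shape of an intraday response (values are made up).
sample = {
    "Time Series (1min)": {
        "2024-01-02 15:58:00": {"1. open": "185.10", "4. close": "185.20"},
        "2024-01-02 15:59:00": {"1. open": "185.20", "4. close": "185.35"},
    }
}

print(extract_latest_price(sample))              # newest bar, opening price
print(extract_latest_price(sample, "4. close"))  # newest bar, closing price
print(extract_latest_price({}))                  # no time series in the payload
```

Passing `"4. close"` instead of the default returns the last traded price within the bar, which may be closer to what "latest price" suggests.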
Step 3: Setting Up DistilBERT with LangChain
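DistilBERT is a smaller and faster version of BERT. It is an encoder-only model, so it does not generate text: the code below attaches a sequence classification head and treats the predicted class index as a compact “summary” of the stock sentence. With the base distilbert-base-uncased checkpoint that head is untrained, so load a fine-tuned checkpoint if you need meaningful labels. LangChain’s @tool decorator is particularly useful here, as it lets us expose a function as a tool that a language model can call, enhancing the modularity and flexibility of our code.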
Here’s how to set up a summarizer with DistilBERT:
import torch
from langchain.tools import tool
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification

class DistilBertSummarizer:
    def __init__(self, model_name='distilbert-base-uncased'):
        self.tokenizer = DistilBertTokenizer.from_pretrained(model_name)
        self.model = DistilBertForSequenceClassification.from_pretrained(model_name)
        self.model.eval()  # inference mode: disables dropout

    def summarize(self, text):
        inputs = self.tokenizer(text, return_tensors='pt', truncation=True)
        with torch.no_grad():
            outputs = self.model(**inputs)
        logits = outputs.logits
        # Index of the highest-scoring class (an integer label, not generated text)
        summary = logits.argmax(dim=1).item()
        return summary

summarizer = DistilBertSummarizer()

# @tool wraps a plain function (it needs a docstring) so a model or agent can call it
@tool
def summarize_stock_info(text: str) -> int:
    """Classify a sentence about a stock price and return the predicted class index."""
    return summarizer.summarize(text)

# Example usage
stock_info = f"Latest price of {symbol} at {latest_time} is {latest_price}"
summary = summarizer.summarize(stock_info)
print(f"Summary: {summary}")
The @tool decorator provides several advantages:
- Direct Integration: It wraps an ordinary Python function as a tool, taking the name, description, and argument schema from the function itself, so a tool-calling model or agent can request it during a run.
- Modularity: Functions decorated with @tool can be easily reused and combined with other tools, enhancing code modularity and reusability.
- Efficiency: The model does not have to guess at live data such as prices; it delegates that work to the function and builds its response from the returned value.
Step 4: Combining Data Fetching and Summarization
Now, we’ll combine the stock price fetching function and the DistilBERT summarization into a single cohesive function:
def get_stock_summary(symbol, api_key):
    result = fetch_stock_price(symbol, api_key)
    # fetch_stock_price returns None when no data is available
    if result is None:
        return "Failed to fetch stock price"
    latest_time, latest_price = result
    stock_info = f"Latest price of {symbol} at {latest_time} is {latest_price}"
    summarizer = DistilBertSummarizer()
    summary = summarizer.summarize(stock_info)
    return summary

# Example usage
symbol = "AAPL"
summary = get_stock_summary(symbol, api_key)
print(f"Stock Summary: {summary}")
Explanation
This function combines the data fetching and summarization steps:
- It fetches the latest stock price using the fetch_stock_price function.
- It creates a summary of the stock information using the DistilBertSummarizer class.
- If the stock price is successfully fetched, it returns the summary; otherwise, it returns an error message.
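One way to make this flow easy to test is to pass the fetching and summarizing steps in as arguments, so that stand-ins can replace the network call and the model. The sketch below is a variation on get_stock_summary rather than a replacement for it; `build_stock_summary`, `fake_fetch`, and `fake_summarize` are hypothetical names, and the timestamp and price are made-up values.

```python
def build_stock_summary(symbol, fetch, summarize):
    """Fetch a price with `fetch`, then pass a sentence about it to `summarize`."""
    result = fetch(symbol)
    if result is None:
        return "Failed to fetch stock price"
    latest_time, latest_price = result
    stock_info = f"Latest price of {symbol} at {latest_time} is {latest_price}"
    return summarize(stock_info)

# Stand-ins for the real API call and the real model (values are made up).
def fake_fetch(symbol):
    return ("2024-01-02 15:59:00", "185.20") if symbol == "AAPL" else None

def fake_summarize(text):
    return text.upper()

print(build_stock_summary("AAPL", fake_fetch, fake_summarize))
print(build_stock_summary("UNKNOWN", fake_fetch, fake_summarize))
```

With the real pieces, `fetch` could be `lambda s: fetch_stock_price(s, api_key)` and `summarize` could be the `summarize` method of a single DistilBertSummarizer instance, which would also avoid reloading the model on every call.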
Conclusion
In this article, we built a simple yet powerful tool to fetch the latest stock prices and summarize the information using DistilBERT and the LangChain library. This demonstrates how combining real-time data fetching with NLP models can create useful applications.
Advantages of Using LangChain and @tool
- Ease of Integration: LangChain simplifies the integration of language models with external tools and data sources.
- Modularity and Reusability: Functions annotated with @tool can be easily reused and combined, promoting modular code design.
- Efficiency: Delegating data lookups to callable functions lets the model work from fresh values instead of relying on what it learned during training.
Feel free to extend this project by adding more features, such as historical data analysis or integrating other NLP models. The possibilities are endless. Happy coding!