market_cap_series.csv is a time series of market capitalizations for various companies.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load market capitalization data from CSV, parsing dates and setting 'Date' as the index
market_cap_series = pd.read_csv('market_cap_series.csv', parse_dates=['Date'], index_col='Date')
print(market_cap_series.head())

To visualize the change in market capitalization for each company, I took the earliest and latest values and plotted them side by side in a horizontal bar chart.
# Select the market capitalization of the first and last trading days
first_market_cap = market_cap_series.iloc[0]
last_market_cap = market_cap_series.iloc[-1]
# Concatenate and plot the market capitalizations of the first and last trading days
pd.concat([first_market_cap, last_market_cap], axis=1).plot(kind='barh')
plt.show()

To build the index, I summed the market capitalizations of all companies for each trading day and normalized the total so that the first trading day equals 100.
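As a quick sanity check on this step, here is a minimal sketch with made-up numbers. The tickers AAA and BBB and their values are hypothetical and are not taken from market_cap_series.csv; the point is only to show what summing and rebasing does to a small table.

```python
import pandas as pd

# Hypothetical market caps for two made-up tickers over three trading days
toy_caps = pd.DataFrame(
    {"AAA": [100.0, 110.0, 120.0], "BBB": [300.0, 290.0, 330.0]},
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
)

# Sum across companies for each day (row-wise), giving one total per date
toy_raw_index = toy_caps.sum(axis=1)

# Divide by the first day's total and scale so the series starts at 100
toy_index = toy_raw_index.div(toy_raw_index.iloc[0]).mul(100)
print(toy_index)
```

The daily totals here are 400, 400, and 450, so the rebased series starts at 100 and ends at 112.5.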
# Aggregate the total market capitalization for each trading day
raw_index = market_cap_series.sum(axis=1)
# Normalize the aggregated market capitalization to the first trading day and scale to 100
index = raw_index.div(raw_index.iloc[0]).mul(100)
print(index)

To evaluate the index, I first calculated its percentage change over the whole period.
# Calculate the overall return of the index from the first to the last trading day
index_return = ((index.iloc[-1] - index.iloc[0]) / index.iloc[0]) * 100
print(index_return)

Next, I took each company's market capitalization from company_info.csv, calculated each company's share of the total, and multiplied those weights by the index return to see how much each company contributed to it.
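To make the arithmetic concrete, here is a minimal sketch with made-up numbers. The tickers, market caps, and the 12.5% return are all hypothetical. Note that this approach allocates the index return in proportion to each company's weight, so the contributions add up to the index return by construction; it does not measure each company's own return.

```python
import pandas as pd

# Hypothetical market caps for three made-up tickers
toy_market_cap = pd.Series({"AAA": 120.0, "BBB": 330.0, "CCC": 150.0})

# Assumed overall index return, in percent
toy_index_return = 12.5

# Each company's share of the total market capitalization
toy_weights = toy_market_cap.div(toy_market_cap.sum())

# Allocate the index return by weight and sort from smallest to largest
toy_contribution = toy_weights.mul(toy_index_return).sort_values()
print(toy_contribution)
```

With these numbers the weights are 0.2, 0.55, and 0.25, giving contributions of 2.5, 6.875, and 3.125 percentage points.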
# Load company information data from CSV
company_info = pd.read_csv('company_info.csv', index_col='Stock Symbol')
# Extract the 'Market Capitalization' column from the company information data
market_cap = company_info['Market Capitalization']
# Calculate the total market capitalization of all companies
total_market_cap = market_cap.sum()
# Calculate the weight of each company's market capitalization relative to the total market capitalization
weights = market_cap.div(total_market_cap)
# Calculate and plot the contribution of each company to the overall index return, sorted by contribution
index_contribution = weights.mul(index_return).sort_values()
index_contribution.plot(kind='barh')
plt.show()

Another way to evaluate the index is to compare it against a benchmark.
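One thing to watch in a benchmark comparison is date alignment: if the two series do not cover exactly the same trading days, the first or last row can contain missing values and the return calculation produces NaN. The following is a minimal sketch of one way to handle that, using made-up numbers. The names toy_index and toy_benchmark are hypothetical, and dropping unmatched dates is an assumption, not necessarily the right choice for every dataset.

```python
import pandas as pd

# Hypothetical index values over three days
toy_index = pd.Series(
    [100.0, 104.0, 110.0],
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    name="Index",
)

# Hypothetical benchmark that starts one day later
toy_benchmark = pd.Series(
    [200.0, 210.0],
    index=pd.to_datetime(["2020-01-02", "2020-01-03"]),
    name="Benchmark",
)

# Align on dates, then keep only the dates present in both series
combined = pd.concat([toy_index, toy_benchmark], axis=1).dropna()

# Rebase both columns to 100 on the first common date
rebased = combined.div(combined.iloc[0]).mul(100)

# Total return in percent over the common period
total_return = (rebased.iloc[-1] / rebased.iloc[0] - 1) * 100
print(total_return)
```

Rebasing after the alignment means both series start at 100 on the same day, so their lines on a shared plot are directly comparable.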
# Convert the normalized index series to a DataFrame for further analysis
data = index.to_frame('Index')
# Load Dow Jones Industrial Average (DJIA) data from CSV, parsing dates and setting 'DATE' as the index
djia_raw = pd.read_csv('djia.csv', parse_dates=['DATE'], index_col='DATE')
# Normalize the DJIA series to the first trading day and scale to 100, then add it as a new column to the data DataFrame
djia = djia_raw.div(djia_raw.iloc[0]).mul(100)
data['DJIA'] = djia
# Calculate and print the total return for both the custom index and DJIA
print((data.iloc[-1] / data.iloc[0] - 1) * 100)
# Plot the normalized values of both the custom index and DJIA
data.plot()
plt.show()

A final way to evaluate the index is to analyze the correlations between the stocks within it. stock_prices.csv contains a time series of closing prices for each company.
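To show what the correlation matrix captures, here is a minimal sketch on made-up prices. The tickers and values are hypothetical: AAA and BBB are built to move in lockstep, while CCC moves against them. Correlations are computed on daily returns rather than on price levels.

```python
import pandas as pd

# Hypothetical closing prices: BBB is always exactly twice AAA, CCC moves the other way
toy_prices = pd.DataFrame(
    {
        "AAA": [10.0, 11.0, 12.1, 11.0],
        "BBB": [20.0, 22.0, 24.2, 22.0],
        "CCC": [5.0, 4.5, 4.0, 4.4],
    }
)

# Daily percentage returns; the first row has no previous price, so it is dropped
toy_returns = toy_prices.pct_change().dropna()

# Pairwise Pearson correlation of the return columns
toy_corr = toy_returns.corr()
print(toy_corr)
```

AAA and BBB have identical returns, so their correlation is 1, while CCC is negatively correlated with both. Dropping the first row is optional here, since corr() skips missing values pairwise.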
# Load stock price data from CSV, parsing dates and setting 'Date' as the index
stock_prices = pd.read_csv('stock_prices.csv', parse_dates=['Date'], index_col='Date')
# Calculate the daily returns of the stocks
returns = stock_prices.pct_change()
# Calculate the pairwise correlations of daily returns between stocks
correlations = returns.corr()
# Plot a heatmap of the daily return correlations between stocks
sns.heatmap(correlations, annot=True)
plt.title('Daily Return Correlations')
plt.show()