Tutorial - Correlation between Indian Bank Stocks (feat. Nifty Bank)



Hello everyone, this article is a tutorial on how to do the correlation analysis of Indian bank stocks that I presented in an earlier article, where we compared the major private and public banking stocks listed on the stock exchange. This is my first tutorial article, so please share it if you like it. Let's get right into it.

I will be using Python for data analysis for this one.

Install Dependencies

First, we need to install the required dependencies

pip install pandas
pip install numpy
pip install yfinance
pip install seaborn

Import Dependencies

Now, we will import these required dependencies into our code.

import pandas as pd
import numpy as np
import yfinance as yf
import seaborn as sns

Fetching Stock Price Data

I have picked the stocks below for the correlation analysis and stored them in a list called tickers. Each ticker symbol ends with ".NS". That is the symbol Yahoo Finance uses for a stock's NSE listing, and we need it because we will be fetching the data from Yahoo Finance.
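The list itself is not reproduced in this text. As a stand-in, here is a minimal sketch with a handful of NSE-listed bank symbols in Yahoo Finance format. These particular symbols are an illustrative assumption and not necessarily the set used in the original analysis, so swap in whichever stocks you want to compare.

```python
# Illustrative assumption: a few NSE-listed bank symbols in Yahoo Finance format.
# Replace these with the stocks you actually want to analyse.
tickers = [
    "HDFCBANK.NS",
    "ICICIBANK.NS",
    "SBIN.NS",
    "AXISBANK.NS",
    "KOTAKBANK.NS",
]

# Quick sanity check: every symbol should carry the NSE suffix.
missing_suffix = [t for t in tickers if not t.endswith(".NS")]
print(missing_suffix)
```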


Now we will fetch data for these stocks from Yahoo Finance using yfinance. I am fetching 10 years of data with auto_adjust=True so that the close prices are adjusted for corporate actions. Each resulting data frame is then appended to another list called data.

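data = []

for ticker in tickers:
    ytick = yf.Ticker(ticker)
    # 10 years of daily prices, adjusted for corporate actions
    df = ytick.history(period="10y", auto_adjust=True)
    # Keep only rows with a valid, positive close price
    df = df[df['Close'] > 0]
    # Store the frame so it can be merged later
    data.append(df)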


Manipulating Data

Now we have a list with multiple pandas data frames. To create a correlation matrix, we need to merge them into a single data frame that holds only the closing price of each stock.

To achieve this, we first use the zip function to pair each ticker with its data frame, then turn those pairs into a dictionary with the dict function. Finally, we use the concat function of pandas to concatenate the data frames from the dictionary along the columns, so that each ticker becomes a top-level column header.

mergedDf = pd.concat(dict(zip(tickers, data)), axis=1)

The new data frame should look something like this
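As a rough illustration of the resulting structure, here is a small sketch that runs the same concat call on synthetic data. The tickers AAA.NS and BBB.NS and all the prices are made up for the example, and the frames only carry Open and Close columns to keep things short.

```python
import pandas as pd

# Hypothetical stand-ins for the real tickers and the frames returned by yfinance.
tickers = ["AAA.NS", "BBB.NS"]
dates = pd.date_range("2024-01-01", periods=3, freq="D")
data = [
    pd.DataFrame({"Open": [10.0, 11.0, 12.0], "Close": [10.5, 11.5, 12.5]}, index=dates),
    pd.DataFrame({"Open": [20.0, 21.0, 22.0], "Close": [20.5, 21.5, 22.5]}, index=dates),
]

# dict(zip(...)) pairs each ticker with its frame;
# concat then uses the dictionary keys as the outer (level 0) column header.
mergedDf = pd.concat(dict(zip(tickers, data)), axis=1)

# Level 0 holds the ticker, level 1 holds the original column name.
print(mergedDf.columns.tolist())
```

Each column is now addressed by a (ticker, field) pair, which is what the next step relies on.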


The merged data frame has several columns per stock, but we only need the Close column. So we take the level 1 values of the column index and keep only the columns labelled Close.

closeDf = mergedDf.loc[:,mergedDf.columns.get_level_values(1).isin(['Close'])]

It will now look like this


We no longer need two levels of headers either, so we can drop level 1 and keep only the ticker symbols as the column headers.

closeDf.columns = closeDf.columns.droplevel(1)

It will now look like this
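To make the filtering and droplevel steps concrete, here is a sketch on synthetic data (the tickers and prices are invented). It also shows DataFrame.xs as a possible one-line alternative. That is a swapped-in technique and not what the tutorial uses, but it should give the same result.

```python
import pandas as pd

# Synthetic two-level frame shaped like the merged data (made-up tickers and prices).
dates = pd.date_range("2024-01-01", periods=3, freq="D")
frames = {
    "AAA.NS": pd.DataFrame({"Open": [10.0, 11.0, 12.0], "Close": [10.5, 11.5, 12.5]}, index=dates),
    "BBB.NS": pd.DataFrame({"Open": [20.0, 21.0, 22.0], "Close": [20.5, 21.5, 22.5]}, index=dates),
}
mergedDf = pd.concat(frames, axis=1)

# Step 1: keep only the columns whose level 1 label is "Close".
mask = mergedDf.columns.get_level_values(1).isin(["Close"])
closeDf = mergedDf.loc[:, mask].copy()

# Step 2: drop level 1 so that only the ticker symbols remain as headers.
closeDf.columns = closeDf.columns.droplevel(1)

# Alternative: xs selects the "Close" label on level 1 and drops that level in one go.
altDf = mergedDf.xs("Close", axis=1, level=1)

print(list(closeDf.columns))
```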


Correlation Analysis

Now we have the close price data for all stocks in a single data frame. Next we need daily returns, from which we can build the correlation matrix. Raw prices are not suitable for this because every stock trades at a different price level and prices trend over time, so we work with returns instead. Here I use daily log returns, which add up across days and are a common choice for this kind of analysis. We can calculate them as follows:

logretDf = np.log(closeDf.pct_change() + 1)

It will now look like this
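The line above relies on the identity log(1 + r) = log(P_t / P_{t-1}), where r is the simple daily return. The following sketch checks this on two made-up price series that move by the same percentages at very different price levels. The tickers and prices are invented for the illustration.

```python
import numpy as np
import pandas as pd

# Two synthetic stocks with identical percentage moves but different price levels.
closeDf = pd.DataFrame({
    "AAA.NS": [100.0, 110.0, 99.0],
    "BBB.NS": [1000.0, 1100.0, 990.0],
})

# Log returns as computed in the tutorial.
logretDf = np.log(closeDf.pct_change() + 1)

# The same quantity written directly as the log of the price ratio.
altDf = np.log(closeDf / closeDf.shift(1))

# Log returns add up over time: their sum equals the log of last price / first price.
total = logretDf["AAA.NS"].sum()
print(np.isclose(total, np.log(99.0 / 100.0)))
```

The first row is NaN because there is no previous day to compare with. pandas' corr leaves missing values out of each pairwise calculation, so that row does not need special handling later.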


Now we have a data frame from which we can get various insights such as correlation, standard deviation, etc. We can show the correlation matrix by calling the corr method of the data frame.
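As a minimal sketch of that step, the example below calls pandas' DataFrame.corr (Pearson correlation by default) on synthetic log returns. The tickers and the random data are assumptions made for the illustration. With the real data you would call corr on logretDf in the same way.

```python
import numpy as np
import pandas as pd

# Synthetic daily log returns: AAA and BBB share a common factor, CCC is independent.
rng = np.random.default_rng(0)
common = rng.normal(0, 0.01, 500)
logretDf = pd.DataFrame({
    "AAA.NS": common + rng.normal(0, 0.005, 500),
    "BBB.NS": common + rng.normal(0, 0.005, 500),
    "CCC.NS": rng.normal(0, 0.01, 500),
})

# Pairwise Pearson correlation between all columns.
corrDf = logretDf.corr()
print(corrDf.round(2))

# Standard deviation of daily log returns, a simple measure of volatility.
print(logretDf.std())
```

The result is a square, symmetric table with 1.0 on the diagonal, since every series is perfectly correlated with itself.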


You will see something like this


This shows us the correlation coefficient of each pair of stocks over the past 10 years of data. To visualize this better, we can use seaborn to plot it graphically, for example as a heatmap.
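One way to draw such a plot is seaborn's heatmap function. The sketch below uses synthetic returns with made-up tickers, and the colour map, the annotation format, and the non-interactive Agg backend are all choices made for this example and not settings taken from the original analysis.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic daily log returns standing in for the real logretDf.
rng = np.random.default_rng(0)
common = rng.normal(0, 0.01, 500)
logretDf = pd.DataFrame({
    "AAA.NS": common + rng.normal(0, 0.005, 500),
    "BBB.NS": common + rng.normal(0, 0.005, 500),
    "CCC.NS": rng.normal(0, 0.01, 500),
})
corrDf = logretDf.corr()

# annot=True writes the coefficient into each cell; vmin/vmax pin the colour scale to [-1, 1].
ax = sns.heatmap(corrDf, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1)
ax.set_title("Correlation of daily log returns (synthetic data)")
fig = ax.get_figure()
fig.tight_layout()
```

In a Jupyter notebook you can drop the matplotlib.use line and the figure will render inline. In a script, fig.savefig("heatmap.png") writes it to a file.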


You will see something like this


This gives us a much easier view of the correlation between the various banking stocks.

If you got this far, you can now do a correlation analysis of any number of stocks. If you like what I write, please do share it on social media. Till next time, peace!


Support Amit Wani by becoming a sponsor. Any amount is appreciated!