IPL Data Analysis | Data Science | Python | AI

IPL Data Analysis

In this article, we analyze a summary of IPL matches from 2008 to 2017 using Python packages such as pandas, matplotlib, and seaborn.

The dataset can be downloaded from here.

The data collected includes properties such as:

  1. Season
  2. City in which match held
  3. Team1
  4. Team2
  5. Winner
  6. Toss decision
  7. Win by runs
  8. Win by wickets
  9. D/L (Duckworth-Lewis method) applied
  10. Umpires
  11. Venue

 

Data importing:

import pandas

import matplotlib.pyplot as plt

import seaborn as sns

# Adjust the path to wherever the CSV was saved

data = pandas.read_csv(r'C:\Users\vinay\Downloads\ipl_matches.csv')
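Before answering any questions, it can help to check what was actually loaded. The following is a minimal sketch, not part of the original analysis: it reads a hypothetical three-row sample from memory (the team, city, and venue names are made up) and inspects its shape, column types, and missing values. The assumption that a match without a result has an empty winner field is ours and should be checked against the real file.

```python
import io
import pandas as pd

# Hypothetical sample standing in for ipl_matches.csv; only the column
# names are taken from the article, the rows are made up.
sample_csv = io.StringIO(
    "season,city,winner,win_by_runs,win_by_wickets,dl_applied,venue\n"
    "2008,City A,Team A,25,0,0,Venue X\n"
    "2008,City B,Team B,0,6,0,Venue Y\n"
    "2009,City A,,0,0,0,Venue X\n"
)
sample = pd.read_csv(sample_csv)

print(sample.shape)    # (rows, columns)
print(sample.dtypes)   # which columns are numeric and which are text

# Count missing values per column; an empty CSV field is read as NaN.
missing = sample.isna().sum()
print(missing)
```

Running the same three checks on `data` shows whether any columns need cleaning before the counts below are trusted.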

 

Which stadium is best for winning by runs?

data.venue[data.win_by_runs!=0].mode()

MA Chidambaram Stadium, Chepauk

 

Which stadium is best for winning by wickets?

data.venue[data.win_by_wickets!=0].mode()

Eden Gardens

M Chinnaswamy Stadium
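`mode()` returns every value that shares the highest count, which is why two stadiums appear above. A small sketch on made-up data (the venue names and margins are hypothetical) shows how `value_counts()` exposes the counts behind such a tie:

```python
import pandas as pd

# Hypothetical toy data: Venue X and Venue Y each host two wins by wickets.
toy = pd.DataFrame({
    "venue": ["Venue X", "Venue X", "Venue Y", "Venue Y", "Venue Z"],
    "win_by_wickets": [5, 7, 3, 4, 0],
})

# Keep only matches won by wickets, i.e. won by the chasing side.
chases = toy.venue[toy.win_by_wickets != 0]

counts = chases.value_counts()  # number of chasing wins per venue
top = chases.mode()             # all venues tied for the highest count

print(counts)
print(list(top))
```

Looking at the counts rather than only the mode also shows how close the runner-up venues are.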

 

For an IPL team (take any team of your choice) which stadium is best when they win the toss?

data.venue[(data.toss_winner == 'Mumbai Indians') & (data.winner == 'Mumbai Indians')].mode()

Wankhede Stadium for “Mumbai Indians”

 

Which is the best-chasing team?

data.winner[data.win_by_wickets!=0].mode()

Kolkata Knight Riders

 

Which is the best defending team?

data.winner[data.win_by_runs!=0].mode()

Mumbai Indians

 

Graphical analysis to describe some patterns:

Has winning the toss helped in winning the match?

Code:

to = data['toss_winner'] == data['winner']

plt.figure(figsize=(12,4))

sns.countplot(x=to)

plt.show()

 

IMAGE

From the above countplot, it looks like winning the toss helps in winning the match. To be statistically careful, we can only say that there is an association between winning the toss and winning the match, which suggests that it may help.
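The height of the two bars can also be expressed as a single number: the share of matches in which the toss winner also won. A minimal sketch on hypothetical data (the team names are made up) is:

```python
import pandas as pd

# Hypothetical toy data with the two columns used in the countplot.
toy = pd.DataFrame({
    "toss_winner": ["Team A", "Team B", "Team A", "Team C"],
    "winner":      ["Team A", "Team A", "Team A", "Team B"],
})

# True where the toss winner also won the match.
toss_helped = toy["toss_winner"] == toy["winner"]

# The mean of a boolean Series is the fraction of True values.
share = toss_helped.mean()
print(f"Toss winner won {share:.0%} of matches")
```

A share close to 50% would mean the toss makes little difference, so this number is a useful companion to the plot.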

 

Did choosing to bat or field first help in winning the match?

Code:

plt.figure(figsize=(12,4))

sns.countplot(x=data.toss_decision[data.toss_winner==data.winner])

plt.show()

 

IMAGE

From the above visualization, we can say that toss winners who chose to field first won more matches than those who chose to bat first in the IPL from 2008 to 2017.
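The countplot shows raw win counts, so a decision that is simply chosen more often will also show more wins. One way to account for that, sketched here on hypothetical data and not part of the original analysis, is to compute the win rate of the toss winner for each decision:

```python
import pandas as pd

# Hypothetical toy data; team names are made up.
toy = pd.DataFrame({
    "toss_winner":   ["A", "B", "A", "C", "B", "C"],
    "toss_decision": ["field", "field", "bat", "bat", "field", "bat"],
    "winner":        ["A", "A", "A", "B", "B", "A"],
})

# Flag matches where the toss winner also won.
toy["toss_winner_won"] = toy["toss_winner"] == toy["winner"]

# Fraction of matches won by the toss winner, per toss decision.
win_rate = toy.groupby("toss_decision")["toss_winner_won"].mean()
print(win_rate)
```

If the win rates on the real data still favor fielding first, the conclusion above holds regardless of how often each decision was made.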

 

In which cities did weather affect matches?

Code:

plt.figure(figsize=(20,9))

sns.countplot(x=data.city[data.dl_applied==1])

plt.show()

 

IMAGE

From the above countplot, we can say that weather affected the most matches in cities like Bangalore and Kolkata, followed by Vizag, Hyderabad, and Delhi.

 

Which season had the most number of matches?

Code:

sns.countplot(x='season', data=data)

plt.show()

 

IMAGE

The 2011 to 2013 seasons had more matches than the other seasons, and the 2013 season had the most matches of all.
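The same answer can be read off without a plot by counting rows per season. This is a small sketch on hypothetical data; the season values below are made up and do not reflect the real match counts:

```python
import pandas as pd

# Hypothetical toy data: one row per match.
toy = pd.DataFrame({"season": [2011, 2011, 2012, 2013, 2013, 2013]})

# Matches per season, ordered by season rather than by count.
per_season = toy["season"].value_counts().sort_index()

# Season with the most matches.
busiest = per_season.idxmax()
print(per_season)
print(busiest)
```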

 

Best chasing venues?

Code:

plt.figure(figsize=(5,12))

sns.swarmplot(x='win_by_wickets',y='venue',data=data)

plt.show()

 

IMAGE

Eden Gardens and M Chinnaswamy Stadium are the best venues for chasing, as shown by the density of points for each venue in the swarmplot.

 

Best defending venue?

Code:

plt.figure(figsize=(5,12))

sns.swarmplot(x='win_by_runs',y='venue',data=data)

plt.show()

 

IMAGE

From the above swarmplot figure between venues and win_by_runs, we can say that MA Chidambaram Stadium is the best defending stadium.

Heatmap:

Code:

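# numeric_only=True skips text columns such as team and venue (needed in pandas 2.0+)

cor=data.corr(numeric_only=True)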

plt.figure(figsize=(12,10))

sns.heatmap(cor,annot=True,cmap='coolwarm')

plt.show()

 

IMAGE

From the above heatmap, there is a negative correlation between win_by_runs and win_by_wickets. This is expected: a match is won either by runs or by wickets, so whenever one of the values is non-zero, the other is zero.
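The sign of this correlation follows from the data layout itself. Because the two margins are never both non-zero, their product is zero in every row, so the covariance reduces to minus the product of the two means, which is negative. A minimal sketch on hypothetical margins (the values are made up) illustrates this:

```python
import pandas as pd

# Hypothetical toy data: each match has exactly one non-zero margin.
toy = pd.DataFrame({
    "winner": ["Team A", "Team B", "Team A", "Team C", "Team B", "Team C"],
    "win_by_runs":    [20, 0, 35, 0, 10, 0],
    "win_by_wickets": [0, 6, 0, 8, 0, 4],
})

# numeric_only=True leaves the text column out of the correlation matrix.
cor = toy.corr(numeric_only=True)
r = cor.loc["win_by_runs", "win_by_wickets"]
print(r)  # negative, as in the heatmap
```

The correlation therefore says little about cricket; it is mostly a consequence of how the two columns are recorded.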

 

Most successful IPL team?

Code:

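# factorplot was renamed to catplot in seaborn 0.9. catplot creates its own figure,

# so its size is set with height and aspect instead of plt.figure.

sns.catplot(y='winner',kind='count',data=data,height=6,aspect=1.5)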

plt.show()

 

IMAGE

From the above count plot, if we take the team with the most wins to be the most successful, Mumbai Indians is the most successful team, followed by Chennai Super Kings and Kolkata Knight Riders.

 

Which team won by the most runs, and which is the best defending team?

Code:

plt.figure(figsize=(5,12))

sns.swarmplot(x='win_by_runs',y='winner',data=data)

plt.show()

 

IMAGE

From the above swarmplot, we can say that Mumbai Indians won by the largest margin of runs in IPL history.

From the same plot, the best defending team is also Mumbai Indians.
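These are two different questions: the largest single margin is one extreme point in the plot, while the best defending team is the one with the most wins by runs. A short sketch on hypothetical data (team names and margins are made up) computes each separately:

```python
import pandas as pd

# Hypothetical toy data; a win_by_runs of 0 means the match was not won by runs.
toy = pd.DataFrame({
    "winner":      ["Team A", "Team B", "Team A", "Team C"],
    "win_by_runs": [12, 0, 85, 40],
})

# Row with the single largest winning margin in runs.
biggest = toy.loc[toy["win_by_runs"].idxmax()]

# Number of wins by runs per team, i.e. successful defenses.
defenses = toy.winner[toy.win_by_runs != 0].value_counts()

print(biggest["winner"], biggest["win_by_runs"])
print(defenses)
```

On the real data the two answers happen to agree, but they do not have to.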

 

Which team won by the most wickets, and which is the best chasing team?

Code:

plt.figure(figsize=(5,12))

sns.swarmplot(x='win_by_wickets',y='winner',data=data)

plt.show()

 

IMAGE

From the above data, we can say that Kolkata Knight Riders are the best chasing team.

Most of the teams have won by 9 wickets, but Royal Challengers Bangalore won twice by the maximum margin of wickets.

 

Conclusion:

Different factors affect a team's chances of winning, as we have seen in the examples above, and the visualized data was consistent with real-world IPL statistics.

 

That is all for this IPL data analysis. The work was done as part of a project by a team of aspiring data professionals while pursuing an internship/training at TechTrunk Ventures – PVT LTD.

 

 
