How to get your personal Dota2 Data

This is going to be a short post, and one I find personally interesting. As a data scientist and avid Dota 2 player, what could be better than doing data analysis on Dota 2 matches? In this post, I use the API from opendota.com. The API is free to use for light workloads, and the data for a single player should stay within the free-tier limits. For data collection and cleaning, I use pandas and requests.

Import the necessary libraries

import pandas as pd
import numpy as np
import requests

Check the status code of a test call, just to make sure the API is reachable.

r = requests.get('https://api.opendota.com/api')
r.status_code

If it shows 200, the API was reached successfully. Now put in your personal Dota 2 ID. You can find it on your OpenDota profile or in your in-game profile.
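Reading `status_code` by eye works in a notebook, but a script can do the check itself. Here is a minimal sketch, assuming a hypothetical helper called `fetch_json` (my own name, not part of requests or OpenDota) that accepts any object with a requests-style `get` method, such as a `requests.Session()`. The `timeout` argument and `raise_for_status()` are standard requests features.

```python
def fetch_json(session, url, timeout=10):
    """Fetch `url` and return the decoded JSON body.

    `fetch_json` is a hypothetical helper for this post. `session` is any
    object with a requests-style `get` method, e.g. `requests.Session()`.
    """
    response = session.get(url, timeout=timeout)
    # Raises an HTTP error for 4xx/5xx responses instead of quietly
    # handing back an error payload.
    response.raise_for_status()
    return response.json()
```

With requests installed, calling `fetch_json(requests.Session(), 'https://api.opendota.com/api')` would either return the parsed body or raise, so a failed call cannot go unnoticed.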

Make a call to the Dota 2 player API

myDota2ID = '296360583'

r = requests.get('https://api.opendota.com/api/players/{}/matches'.format(myDota2ID))

jsondata = pd.json_normalize(r.json())
jsondata.sample(5)

      match_id    player_slot  radiant_win  duration  game_mode  hero_id  start_time           version  kills  deaths  assists  skill  average_rank  leaver_status  party_size
157   6736613083  1            True         1967      22         119      2022-09-02 10:17:16  NaN      5      8       23       NaN    62.0          0              1.0
2010  4751540797  130          True         533       22         119      2019-05-14 17:03:25  21.0     1      1       3        1.0    NaN           3              5.0
284   6646342814  132          False        2635      22         96       2022-07-03 18:00:18  21.0     5      6       24       NaN    NaN           0              1.0
1569  5296101869  1            True         2232      22         128      2020-03-16 15:02:32  21.0     6      5       21       1.0    NaN           0              4.0
2504  3889956318  132          False        2760      22         68       2018-05-14 14:57:06  21.0     5      9       14       NaN    NaN           0              5.0
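The sample has no explicit win/loss column. In OpenDota's match data, `player_slot` values 0 to 127 belong to Radiant and 128 to 255 to Dire, so the result can be derived from `player_slot` and `radiant_win`. A small sketch follows; the helper `add_win_column` and the column name `win` are my own choices, not something the API returns.

```python
import pandas as pd

def add_win_column(matches: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a boolean `win` column (hypothetical name)."""
    out = matches.copy()
    is_radiant = out["player_slot"] < 128  # slots 0-127 are Radiant
    # The player won when their side matches the winning side.
    out["win"] = is_radiant == out["radiant_win"]
    return out

# Three rows taken from the sample above.
sample = pd.DataFrame({
    "match_id": [6736613083, 4751540797, 6646342814],
    "player_slot": [1, 130, 132],
    "radiant_win": [True, True, False],
})
print(add_win_column(sample)["win"].tolist())  # [True, False, True]
```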

jsondata.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3081 entries, 0 to 3080
Data columns (total 16 columns):
 #   Column         Non-Null Count  Dtype         
---  ------         --------------  -----         
 0   match_id       3081 non-null   int64         
 1   player_slot    3081 non-null   int64         
 2   radiant_win    3081 non-null   bool          
 3   duration       3081 non-null   int64         
 4   game_mode      3081 non-null   int64         
 5   lobby_type     3081 non-null   int64         
 6   hero_id        3081 non-null   int64         
 7   start_time     3081 non-null   datetime64[ns]
 8   version        2619 non-null   float64       
 9   kills          3081 non-null   int64         
 10  deaths         3081 non-null   int64         
 11  assists        3081 non-null   int64         
 12  skill          1483 non-null   float64       
 13  average_rank   282 non-null    float64       
 14  leaver_status  3081 non-null   int64         
 15  party_size     2543 non-null   float64       
dtypes: bool(1), datetime64[ns](1), float64(4), int64(10)
memory usage: 364.2 KB
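`info()` reports `start_time` as `datetime64[ns]`, while the API returns it as Unix epoch seconds, so a conversion step must have run that is not shown in the code above. One way to do it is sketched here; whether the original notebook used exactly this call is an assumption.

```python
import pandas as pd

# Toy frame standing in for `jsondata`; the values are epoch seconds.
# The third value corresponds to 2022-09-02 10:17:16 UTC.
matches = pd.DataFrame({"start_time": [0, 86400, 1662113836]})

# unit="s" tells pandas the integers are seconds since 1970-01-01 UTC.
matches["start_time"] = pd.to_datetime(matches["start_time"], unit="s")
print(matches["start_time"].dt.year.tolist())  # [1970, 1970, 2022]
```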

So there you go: a preview of my personal Dota 2 matches. Of course, you could gather more data by calling the match API for each match ID in this data. The match details are very granular: every game includes 10 players, and each player has their own stats.

You could try to fetch every match, but beware that it takes a long time.

Get the match details for every match ID in the personal data

matchlist = []
for match in jsondata['match_id']:
    r = requests.get('https://api.opendota.com/api/matches/{}'.format(match))
    matchlist.append(r.json())

pd.json_normalize(matchlist[0]).columns

Index(['match_id', 'barracks_status_dire', 'barracks_status_radiant', 'chat',
       'cluster', 'cosmetics', 'dire_score', 'dire_team_id', 'draft_timings',
       'duration', 'engine', 'first_blood_time', 'game_mode', 'human_players',
       'leagueid', 'lobby_type', 'match_seq_num', 'negative_votes',
       'objectives', 'picks_bans', 'positive_votes', 'radiant_gold_adv',
       'radiant_score', 'radiant_team_id', 'radiant_win', 'radiant_xp_adv',
       'skill', 'start_time', 'teamfights', 'tower_status_dire',
       'tower_status_radiant', 'version', 'replay_salt', 'series_id',
       'series_type', 'players', 'patch', 'region', 'replay_url'],
      dtype='object')
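The per-match loop above sends one request per match ID with no pause and no error check, so a failed call either stops the run or stores an error payload in `matchlist`. The sketch below spaces the calls out and records failures for a later retry. `fetch_matches`, `fetch_one`, and the one-second default delay are illustrative assumptions; check OpenDota's current rate limits before picking a delay.

```python
import time

def fetch_matches(match_ids, fetch_one, delay_seconds=1.0, sleep=time.sleep):
    """Call `fetch_one(match_id)` for each ID, pausing between calls.

    Returns (results, failed_ids). All names here are hypothetical;
    `fetch_one` is any callable that returns one match as a dict.
    """
    results, failed_ids = [], []
    for i, match_id in enumerate(match_ids):
        if i > 0:
            sleep(delay_seconds)  # simple fixed delay between requests
        try:
            results.append(fetch_one(match_id))
        except Exception:
            # Remember the ID so it can be retried later.
            failed_ids.append(match_id)
    return results, failed_ids
```

Here `fetch_one` could wrap the `requests.get` call from the loop above. At one request per second, the 3,081 matches in this data would take roughly 51 minutes.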

Create the DataFrame by normalizing the JSON data in matchlist, then save it to a .csv file.

# Normalize the whole list in one call. Concatenating inside a loop
# copies the growing frame on every iteration and repeats index 0 on each row.
matches_df = pd.json_normalize(matchlist)

matches_df.to_csv('Yourdataname.csv')
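Columns such as `players` hold a list of per-player dicts, which `to_csv` writes out as one long string per cell. If the per-player stats are what you are after, `pd.json_normalize` can expand them to one row per player through its `record_path` and `meta` arguments. This is a sketch on made-up data: the values are invented, and real responses carry many more keys.

```python
import pandas as pd

# Made-up stand-in for `matchlist`, reduced to a few keys.
matches = [
    {"match_id": 1, "radiant_win": True,
     "players": [{"player_slot": 0, "kills": 5},
                 {"player_slot": 128, "kills": 2}]},
    {"match_id": 2, "radiant_win": False,
     "players": [{"player_slot": 0, "kills": 1}]},
]

players_df = pd.json_normalize(
    matches,
    record_path="players",             # one output row per player
    meta=["match_id", "radiant_win"],  # match-level fields repeated per row
    record_prefix="player.",           # keeps player fields from clashing with meta names
)
print(players_df.shape)  # (3, 4)
```

The resulting frame is flat, so it survives a round trip through CSV without any nested cells.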

Gama Candra Tri Kartika
