How to Get Your Personal Dota 2 Data
This is going to be a short post, and a personally interesting one. As a data scientist and avid Dota 2 player, what could be better than analyzing Dota 2 matches? In this post, I use the API from opendota.com. The API is free to use, at least for your personal Dota 2 data, which I assume is small enough to stay within the free tier limits. For data collection and cleaning, I will use pandas and requests.
Get the necessary libraries
import pandas as pd
import numpy as np
import requests
Check the status of a test call, just to make sure the API is reachable.
r = requests.get('https://api.opendota.com/api')
r.status_code
If it shows 200, the API was accessed successfully. Now put in your personal Dota 2 ID. You can find it on your OpenDota profile or in your in-game profile.
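If you would rather not check the status code by eye every time, the check can be folded into a small helper. This is only a sketch: `get_json` is a name I made up for illustration, and the 10-second timeout is an arbitrary assumption, not something OpenDota requires.

```python
import requests


def get_json(url, timeout=10):
    """Fetch a URL and return the parsed JSON body.

    Hypothetical helper: raises requests.HTTPError on a 4xx/5xx status
    instead of leaving the status check to the caller.
    """
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # no-op for successful responses such as 200
    return response.json()
```

With this in place, `get_json('https://api.opendota.com/api/players/{}/matches'.format(myDota2ID))` either returns the data or fails loudly.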
Make a call to the Dota 2 player API
myDota2ID = '296360583'
r = requests.get('https://api.opendota.com/api/players/{}/matches'.format(myDota2ID))
jsondata = pd.json_normalize(r.json())
jsondata.sample(5)
| | match_id | player_slot | radiant_win | duration | game_mode | lobby_type | hero_id | start_time | version | kills | deaths | assists | skill | average_rank | leaver_status | party_size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 157 | 6736613083 | 1 | True | 1967 | 22 | 0 | 119 | 2022-09-02 10:17:16 | NaN | 5 | 8 | 23 | NaN | 62.0 | 0 | 1.0 |
| 2010 | 4751540797 | 130 | True | 533 | 22 | 0 | 119 | 2019-05-14 17:03:25 | 21.0 | 1 | 1 | 3 | 1.0 | NaN | 3 | 5.0 |
| 284 | 6646342814 | 132 | False | 2635 | 22 | 0 | 96 | 2022-07-03 18:00:18 | 21.0 | 5 | 6 | 24 | NaN | NaN | 0 | 1.0 |
| 1569 | 5296101869 | 1 | True | 2232 | 22 | 0 | 128 | 2020-03-16 15:02:32 | 21.0 | 6 | 5 | 21 | 1.0 | NaN | 0 | 4.0 |
| 2504 | 3889956318 | 132 | False | 2760 | 22 | 0 | 68 | 2018-05-14 14:57:06 | 21.0 | 5 | 9 | 14 | NaN | NaN | 0 | 5.0 |
jsondata.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3081 entries, 0 to 3080
Data columns (total 16 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 match_id 3081 non-null int64
1 player_slot 3081 non-null int64
2 radiant_win 3081 non-null bool
3 duration 3081 non-null int64
4 game_mode 3081 non-null int64
5 lobby_type 3081 non-null int64
6 hero_id 3081 non-null int64
7 start_time 3081 non-null datetime64[ns]
8 version 2619 non-null float64
9 kills 3081 non-null int64
10 deaths 3081 non-null int64
11 assists 3081 non-null int64
12 skill 1483 non-null float64
13 average_rank 282 non-null float64
14 leaver_status 3081 non-null int64
15 party_size 2543 non-null float64
dtypes: bool(1), datetime64[ns](1), float64(4), int64(10)
memory usage: 364.2 KB
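Note that start_time appears as datetime64[ns] in the output above, while the API returns it as Unix epoch seconds, so a conversion step happened that is not shown in the code. A minimal sketch of that step follows, together with a win column derived from player_slot and radiant_win (slots 0 to 127 are Radiant, 128 to 255 are Dire). The `sample` frame and the `is_radiant` and `win` column names are made up for illustration; on the real data you would apply the same lines to jsondata.

```python
import pandas as pd

# Tiny stand-in for the player-matches response; values are illustrative only.
sample = pd.DataFrame({
    "match_id": [1, 2, 3],
    "player_slot": [1, 130, 132],
    "radiant_win": [True, True, False],
    "start_time": [1662113836, 1557853405, 1656871218],  # epoch seconds
})

# Epoch seconds -> pandas timestamps (UTC, timezone-naive).
sample["start_time"] = pd.to_datetime(sample["start_time"], unit="s")

# Player slots below 128 are on Radiant, the rest are on Dire.
sample["is_radiant"] = sample["player_slot"] < 128

# The player won when their side matches the winning side.
sample["win"] = sample["is_radiant"] == sample["radiant_win"]
```

Once the win column exists, `sample["win"].mean()` gives a quick overall win rate.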
So there you go: a preview of the data gathered on my personal Dota 2 matches. Of course, you could gather more data by calling the match API for each match ID in this data. The match details are extensive: every match includes 10 players, and each player has their own stats.
You could try to fetch every match, but beware that it is going to take a lot of time.
Get the match details for every match ID in the personal data
matchlist = []
# One request per match ID; with thousands of matches this takes a while
for match in jsondata['match_id']:
    r = requests.get('https://api.opendota.com/api/matches/{}'.format(match))
    matchlist.append(r.json())
pd.json_normalize(matchlist[0]).columns
Index(['match_id', 'barracks_status_dire', 'barracks_status_radiant', 'chat',
'cluster', 'cosmetics', 'dire_score', 'dire_team_id', 'draft_timings',
'duration', 'engine', 'first_blood_time', 'game_mode', 'human_players',
'leagueid', 'lobby_type', 'match_seq_num', 'negative_votes',
'objectives', 'picks_bans', 'positive_votes', 'radiant_gold_adv',
'radiant_score', 'radiant_team_id', 'radiant_win', 'radiant_xp_adv',
'skill', 'start_time', 'teamfights', 'tower_status_dire',
'tower_status_radiant', 'version', 'replay_salt', 'series_id',
'series_type', 'players', 'patch', 'region', 'replay_url'],
dtype='object')
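Since the loop above fires thousands of requests back to back, it may be worth pausing between calls and keeping track of failures so that one bad response does not stop the whole run. Here is a rough sketch of that idea. `fetch_all` and its parameters are hypothetical names, and the one-second pause is my own assumption; check OpenDota's current rate limits before picking a value.

```python
import time


def fetch_all(match_ids, fetch, pause_seconds=1.0):
    """Call fetch(match_id) for each ID, pausing between calls.

    Hypothetical helper: returns (results, failed), where failed lists
    the IDs whose fetch raised an exception so they can be retried later.
    """
    results, failed = [], []
    for match_id in match_ids:
        try:
            results.append(fetch(match_id))
        except Exception:
            # Remember the ID and move on instead of aborting the run.
            failed.append(match_id)
        time.sleep(pause_seconds)
    return results, failed
```

For the real API, `fetch` could be something like `lambda m: requests.get('https://api.opendota.com/api/matches/{}'.format(m)).json()`, and `match_ids` would be `jsondata['match_id']`. Passing the fetch function in also makes it easy to try the loop with a fake function first.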
Create the DataFrame by normalizing the JSON data from matchlist, then save it to .csv.
matches_df = pd.DataFrame()
# Flatten each match's JSON into one row and append it to the DataFrame
for match in matchlist:
    matches_df = pd.concat([matches_df, pd.json_normalize(match)], axis=0)
matches_df.to_csv('Yourdataname.csv')
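As a side note, pd.json_normalize also accepts a whole list of dicts, and its record_path argument can unpack the nested players list into one row per player. The sketch below uses a made-up two-match sample with only a few fields (real match details carry many more), and the `match.` prefix is my own choice to keep the match-level columns apart from the player-level ones.

```python
import pandas as pd

# Made-up sample in the rough shape of the match details; not real data.
matchlist_sample = [
    {"match_id": 1, "duration": 1967, "radiant_win": True,
     "players": [{"player_slot": 0, "kills": 5},
                 {"player_slot": 128, "kills": 2}]},
    {"match_id": 2, "duration": 533, "radiant_win": False,
     "players": [{"player_slot": 0, "kills": 1},
                 {"player_slot": 128, "kills": 7}]},
]

# One call flattens the whole list: one row per match.
matches_sample_df = pd.json_normalize(matchlist_sample)

# One row per player, with selected match-level fields carried along.
players_df = pd.json_normalize(
    matchlist_sample,
    record_path="players",
    meta=["match_id", "radiant_win"],
    meta_prefix="match.",
)
```

When saving either frame, `to_csv('Yourdataname.csv', index=False)` leaves out the index column, which is usually not needed in the file.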