Ranking Counter-Strike 2 teams using Bradley-Terry model

Dec 29, 2024 · Gustavo De Mari Pereira · 5 min read

Introduction

When I was younger, I enjoyed playing Counter-Strike (CS) with my friends. I started with version 1.5, then moved on to 1.6, and later played CS:GO. Nowadays, I’m more of a spectator, though I occasionally analyze team performance data for fun.

I’m currently researching Reinforcement Learning (RL) and recently explored Reinforcement Learning from Human Feedback (RLHF). While reading about reward models used in the ‘post-training’ phase of large language models (LLMs), I discovered that one key approach is the Bradley-Terry model. This model is commonly used in sports like basketball, soccer, tennis, and even chess.

To better understand how reward models work, I delved deeper into the Bradley-Terry model and considered applying it to real-world data from e-sports, like CS2.

The Bradley-Terry model works by making pairwise comparisons between items and assigning a score that reflects the preference of one item over another ($i \succ j$). For example, it could represent the preference between Team 1 and Team 2, or, in the case of LLMs, between two generated responses.
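Formally, the model assigns each item $i$ a positive strength score $p_i$ and models the probability that $i$ is preferred over $j$ as:

$Pr(i \succ j) = \frac{p_i}{p_i + p_j}$

The stronger item $i$ is relative to $j$, the closer this probability gets to 1; two items of equal strength give $1/2$.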

For LLMs, the typical approach is to generate two responses based on a prompt and then ask a human to choose their preferred one. This process helps fine-tune the LLMs to produce responses that better align with user expectations, which is valuable since it’s difficult to define a function that evaluates response quality.

In contrast, sports have objective outcomes, like the number of wins and losses, to determine preferences between teams. This fits the task of evaluating how CS teams performed against each other during 2024.

The general steps involved in using the Bradley-Terry model are the following:

1. gathering win/loss data about the teams,
2. fitting the Bradley-Terry model to the data,
3. generating the rankings.

Data

The first step is to gather win/loss data between teams. In the case of CS, I collected the number of wins and losses, per map, for each team in the HLTV top 20 ranking for 2024.

HLTV top 20 teams for 2024

| rank | team_name | country_name | stats | kd_diff | hltv_rating |
|------|-----------|--------------|-------|---------|-------------|
| 1 | Spirit | Russia | 136 | +952 | 1.1 |
| 2 | Vitality | Europe | 132 | +777 | 1.1 |
| 3 | Natus Vincere | Europe | 159 | +614 | 1.06 |
| 4 | MOUZ | Europe | 134 | +326 | 1.05 |
| 5 | G2 | Europe | 159 | +404 | 1.05 |
| 6 | The MongolZ | Mongolia | 103 | +103 | 1.04 |
| 7 | Eternal Fire | Turkey | 122 | +183 | 1.03 |
| 8 | FaZe | Europe | 162 | +125 | 1.03 |
| 9 | MIBR | Brazil | 88 | +61 | 1.02 |
| 10 | Liquid | Other | 104 | +249 | 1.02 |
| 11 | Astralis | Denmark | 106 | +1 | 1.01 |
| 12 | HEROIC | Europe | 137 | -117 | 1.01 |
| 13 | Complexity | United States | 102 | -228 | 1.01 |
| 14 | Virtus.pro | Russia | 132 | -87 | 1 |
| 15 | FURIA | Brazil | 101 | -199 | 0.99 |
| 16 | BIG | Germany | 90 | -299 | 0.99 |
| 17 | paiN | Brazil | 115 | -345 | 0.98 |
| 18 | Imperial | Brazil | 84 | -337 | 0.97 |
| 19 | SAW | Portugal | 62 | -340 | 0.97 |
| 20 | Falcons | Denmark | 105 | -593 | 0.95 |

Subset of Win/Loss matrix for HLTV top 20 teams

|               | Vitality | Spirit | G2 | Natus Vincere | MIBR | Liquid | FURIA | paiN |
|---------------|----------|--------|----|---------------|------|--------|-------|------|
| Vitality      | 0 | 3  | 7  | 3  | 2  | 6 | 5 | 0  |
| Spirit        | 4 | 0  | 4  | 10 | 1  | 3 | 4 | 0  |
| G2            | 7 | 14 | 0  | 7  | 2  | 7 | 1 | 2  |
| Natus Vincere | 1 | 5  | 14 | 0  | 0  | 5 | 2 | 4  |
| MIBR          | 0 | 2  | 0  | 1  | 0  | 1 | 2 | 12 |
| Liquid        | 2 | 2  | 5  | 5  | 2  | 0 | 9 | 2  |
| FURIA         | 0 | 1  | 0  | 2  | 0  | 3 | 0 | 2  |
| paiN          | 0 | 0  | 0  | 0  | 11 | 0 | 1 | 0  |
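To make this concrete, the subset above can be loaded as a matrix. This is a sketch assuming the convention that entry $w_{ij}$ counts the maps row-team $i$ won against column-team $j$, with the counts as I read them from the table:

```python
import numpy as np

# Win counts for the 8-team subset (assumption: W[i, j] = maps team i won
# against team j, values transcribed from the table above).
teams = ["Vitality", "Spirit", "G2", "Natus Vincere", "MIBR", "Liquid", "FURIA", "paiN"]
W = np.array([
    [0,  3,  7,  3,  2, 6, 5,  0],  # Vitality
    [4,  0,  4, 10,  1, 3, 4,  0],  # Spirit
    [7, 14,  0,  7,  2, 7, 1,  2],  # G2
    [1,  5, 14,  0,  0, 5, 2,  4],  # Natus Vincere
    [0,  2,  0,  1,  0, 1, 2, 12],  # MIBR
    [2,  2,  5,  5,  2, 0, 9,  2],  # Liquid
    [0,  1,  0,  2,  0, 3, 0,  2],  # FURIA
    [0,  0,  0,  0, 11, 0, 1,  0],  # paiN
], dtype=float)

# Sanity check: a team plays no maps against itself.
assert (np.diag(W) == 0).all()
```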

Bradley-Terry model

Using the win/loss data, we can fit the parameters of the Bradley-Terry model using maximum likelihood estimation (MLE).

There is an iterative formula to do that:

$p_i \leftarrow \frac{\sum_j w_{ij}}{\sum_j (w_{ij} + w_{ji})/(p_i + p_j)}$

where $w_{ij}$ is the number of wins of team $i$ over team $j$.

We start with an initial guess such as $p_i = 1/N, \forall i \in \{1, 2, \ldots, N\}$ and apply the iterative formula.

At each iteration, we normalize the scores to satisfy $\sum_i p_i = 1$:

$p_i = \frac{p_i}{\sum_i p_i}$

After a number of iterations, the scores converge and we can obtain a ranking.
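The fitting procedure above can be sketched in a few lines of NumPy. The 3-team win matrix below is purely illustrative (made-up counts, not real match data):

```python
import numpy as np

def bradley_terry(W, n_iter=100):
    """Fit Bradley-Terry scores by the iterative MLE update,
    normalizing the scores to sum to 1 at each step."""
    n = W.shape[0]
    p = np.full(n, 1.0 / n)                    # initial guess: p_i = 1/N
    for _ in range(n_iter):
        pair_sums = p[:, None] + p[None, :]    # p_i + p_j for every pair
        denom = (W + W.T) / pair_sums          # (w_ij + w_ji) / (p_i + p_j)
        np.fill_diagonal(denom, 0.0)           # ignore self-comparisons
        p = W.sum(axis=1) / denom.sum(axis=1)  # iterative MLE update
        p /= p.sum()                           # normalize: sum_i p_i = 1
    return p

# Illustrative win matrix: W[i, j] = wins of team i over team j.
teams = ["A", "B", "C"]
W = np.array([
    [0, 7, 5],
    [3, 0, 6],
    [1, 2, 0],
], dtype=float)

p = bradley_terry(W)
ranking = [teams[i] for i in np.argsort(-p)]   # strongest first
print(ranking)  # ['A', 'B', 'C']
```

Note that the update only converges when the comparison graph is connected, i.e. every team can be linked to every other through chains of played matches, which holds for the HLTV top 20.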

Ranking

These are the final scores and the ranking for the HLTV top 20 teams of 2024 using the Bradley-Terry model.

Interestingly, it puts the Major winner Spirit in the 1st position.

| rank | team_name | score |
|------|-----------|-------|
| 1 | Spirit | 0.235284 |
| 2 | Vitality | 0.166654 |
| 3 | Natus Vincere | 0.130508 |
| 4 | G2 | 0.0786501 |
| 5 | MOUZ | 0.0673365 |
| 6 | Liquid | 0.0532813 |
| 7 | FaZe | 0.0430689 |
| 8 | Virtus.pro | 0.0262954 |
| 9 | MIBR | 0.0248231 |
| 10 | paiN | 0.0240706 |
| 11 | The MongolZ | 0.0239945 |
| 12 | Astralis | 0.0228869 |
| 13 | Eternal Fire | 0.0211036 |
| 14 | HEROIC | 0.0178517 |
| 15 | FURIA | 0.0159465 |
| 16 | Complexity | 0.0140515 |
| 17 | SAW | 0.0134657 |
| 18 | Falcons | 0.00873056 |
| 19 | Imperial | 0.00680508 |
| 20 | BIG | 0.00519306 |

To calculate the probability of team $i$ beating team $j$, we can use the following formula: $Pr(i \succ j) = \frac{p_i}{p_i + p_j}$. For example, $Pr(\text{Spirit} \succ \text{G2}) = \frac{0.235284}{0.235284 + 0.0786501} \approx 0.749$.
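Expressed in code, the win-probability formula is a one-liner; the scores here are taken from the ranking table above:

```python
def win_prob(p_i: float, p_j: float) -> float:
    """Bradley-Terry probability that the team with score p_i
    beats the team with score p_j."""
    return p_i / (p_i + p_j)

# Spirit vs G2, using the fitted scores from the ranking table.
print(round(win_prob(0.235284, 0.0786501), 3))  # 0.749
```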

Conclusion

The Bradley-Terry model is very versatile: it can be used in traditional sports, in e-sports, and even for LLMs. Furthermore, it is simple to understand and can be a valuable tool to assess team performance in e-sports like CS.

References

[1] M. E. J. Newman, “Efficient Computation of Rankings from Pairwise Comparisons,” Journal of Machine Learning Research, vol. 24, no. 238, pp. 1–25, 2023.

[2] R. A. Bradley, “Paired comparisons: Some basic procedures and examples,” in Handbook of Statistics, vol. 4: Nonparametric Methods, Elsevier, 1984, pp. 299–326. doi: 10.1016/S0169-7161(84)04016-5.

[3] L. B. Anderson, “Chapter 17 Paired comparisons,” in Handbooks in Operations Research and Management Science, vol. 6, Elsevier, 1994, pp. 585–620. doi: 10.1016/S0927-0507(05)80098-2.

[4] H. Turner and D. Firth, “Bradley-Terry Models in R : The BradleyTerry2 Package,” J. Stat. Soft., vol. 48, no. 9, 2012, doi: 10.18637/jss.v048.i09.

[5] C. Huyen, “RLHF: Reinforcement Learning from Human Feedback,” Chip Huyen. Available: https://huyenchip.com/2023/05/02/rlhf.html

Gustavo De Mari Pereira
Data Scientist & Machine Learning Engineer
M.S. in Computer Science from IME-USP, focused on Reinforcement Learning. Founder of 2 companies, 10+ years of experience working with large-scale databases and building end-to-end ML pipelines. Kaggle competitor and Scikit-learn contributor.