Niklas Weber | March 11, 2022

The most unnecessary risk in esports - And how taking it costs the entire industry millions of dollars


About the Author
Niklas is a working student here at Bayes Esports. Having started playing League of Legends back in season 1, he is one of our resident LoL Esports experts. If he is not at his PC playing games, chances are he is either sleeping or watching football.

Data and information are the lifeblood of the 21st century. As consumers, we can inform ourselves about everything and everyone through whichever outlet we prefer, from multinational news corporations to reviews and social media posts by other users.

As businesses, we can use data to get a better understanding of our target audiences, optimize our products and be one step ahead of our competitors thanks to advantages in intel.

This is no different in esports. Esports data can enhance the fan experience at every stage of a match: betting markets before the game, real-time visualizations during it, and in-depth, stats-based analyses and discussions afterwards. Few in the esports industry would dispute this.

What is disputed is the use of unofficial esports data. Such data is still used regularly, which suggests that even some of the biggest companies in the industry are either woefully unaware of its problems or simply accept them without a second thought about the risks, consequences, and alternatives.


It is about time to put an end to this. Let's break down what exactly unofficial data is, what kinds of issues companies using it can run into, and why using it might be the most unnecessary risk in all of esports.


What exactly are official and unofficial data?


Official data is generated by the servers the games are played on. Since esports operates entirely digitally, a record of every step and every action in a game already exists. This data can be used to retrace everything that happened, giving data consumers the ability to, in essence, perfectly recreate entire games. In other words, official data sources are meticulously accurate transcripts of what happened during an esports match.
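To make the "transcript" idea concrete, here is a minimal sketch of how a server-side event log could be replayed into a scoreboard. The `GameEvent` fields, the event kinds, and the sample events are all hypothetical and chosen for illustration; real official feeds define their own formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameEvent:
    # Hypothetical event shape, not the schema of any real data feed.
    timestamp_s: float  # seconds since the start of the game
    kind: str           # e.g. "kill" or "tower_destroyed"
    team: str           # team credited with the event


def replay(events):
    """Rebuild a per-team scoreboard by replaying events in time order."""
    scoreboard = {}
    for event in sorted(events, key=lambda e: e.timestamp_s):
        team_stats = scoreboard.setdefault(event.team, {})
        team_stats[event.kind] = team_stats.get(event.kind, 0) + 1
    return scoreboard


# Made-up sample log for illustration only.
events = [
    GameEvent(95.2, "kill", "blue"),
    GameEvent(130.0, "kill", "red"),
    GameEvent(410.7, "tower_destroyed", "blue"),
    GameEvent(415.1, "kill", "blue"),
]
scoreboard = replay(events)
print(scoreboard)
```

Because every event is present in the log, the scoreboard can be rebuilt for any point in the game simply by replaying the events up to that timestamp.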


Unofficial data sources, on the other hand, rely on collection methods such as Optical Character Recognition (OCR) and data scraping. Here the data is collected from the broadcast of the esports match (e.g. a live stream on Twitch or YouTube) rather than generated by the game server. Since these broadcasts go to great lengths to inform the audience about everything going on in the game, these data sources are largely accurate but seldom perfect: the games' observers might miss some of the action, and replays, overlays, and graphics can interfere with the data collection process.


Why does this matter though? Why is being close enough not good enough, and what are some of the issues companies using unofficial data sources might encounter?


For starters, unofficial data is slow. Since the data is recorded off a live stream, it can only be passed on once the action is visible on screen. These live streams, however, already run on a delay of between 15 and, on average, 40 seconds. For news articles and social media posts, being a couple of seconds behind is usually not a big deal. For the betting industry, though, it is crucial that data arrives as fast as possible, because fans attending esports events live see the games play out with significantly less delay. Many bets are no longer placed simply on which team wins a game or tournament, but on what happens next in a live game. Calculating accurate odds for who gets first blood, or which team wins the pistol round of the second half, becomes nigh on impossible with these delays, as betting companies rely on updating their odds faster than bettors can place their bets. Over time, betting operators can lose millions of dollars by using unofficial data, simply because they cannot react quickly enough.
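The timing problem can be put into rough numbers. The sketch below estimates how long a bettor watching live could know an outcome before an operator relying on a stream-derived feed does. The 15 and 40 second stream delays come from the figures above; the function name, the 2-second processing delay, and the venue delay parameter are assumptions made purely for illustration.

```python
def exposure_window_s(stream_delay_s, processing_delay_s=0.0, venue_delay_s=0.0):
    """Seconds during which a live spectator knows an outcome before the operator.

    stream_delay_s:     broadcast delay of the live stream
    processing_delay_s: assumed extra time for OCR/scraping and redistribution
    venue_delay_s:      assumed delay with which the spectator sees the action
    """
    operator_delay = stream_delay_s + processing_delay_s
    return max(0.0, operator_delay - venue_delay_s)


# Stream delays taken from the article; the 2.0 s processing delay is hypothetical.
for stream_delay in (15, 40):
    window = exposure_window_s(stream_delay, processing_delay_s=2.0)
    print(f"stream delay {stream_delay}s -> exposure window {window}s")
```

Under these assumptions, the window is always at least as long as the stream delay itself, which is why a faster OCR pipeline alone cannot close the gap.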


Furthermore, the inaccuracies of unofficial data sources add up over time. Every time the live stream cuts away from the action to show a replay, the commentators, players, or fans, the data feed stops momentarily, and every ad or overlay can reduce the accuracy of the collected data. A single missed bit of information might not seem like much on its own, but it may be exactly the piece that decides the outcome of a bet. These gaps also stack up rapidly over the course of an entire season or tournament, to the point where they can change the perception a person has of a player or team. Information is vital for fans and analysts alike to know which team to place a bet on or which player to keep an eye on. Not every piece of information has to be perfectly accurate to be valuable, but companies that claim to be reliable sources, and that people trust as such, should be aware of the responsibility they carry and should reduce their inaccuracies to a level well below what OCR data allows.

After all, as stated before, every user chooses which sources to trust and which content to consume. When the competition is fierce and users have plenty of choices, relying on inaccurate and unreliable data sources is a surefire way for media outlets to fall behind their competitors and lose their source of income.
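A back-of-the-envelope calculation shows how quickly small gaps accumulate. The sketch below assumes each event is missed independently with the same probability, which is a simplification: in practice misses cluster around replays and ad breaks. The miss rate and event counts are hypothetical, not measured values.

```python
def expected_missed(total_events, miss_rate):
    """Expected number of missed events, assuming a constant miss rate."""
    return total_events * miss_rate


def prob_at_least_one_miss(total_events, miss_rate):
    """Probability that at least one event is missed, assuming independent misses."""
    return 1.0 - (1.0 - miss_rate) ** total_events


# Hypothetical numbers for illustration: a 1% miss rate,
# 200 tracked events per game, 100 games in a season.
miss_rate = 0.01
events_per_game = 200
games_per_season = 100

season_events = events_per_game * games_per_season
print("expected misses per game:", expected_missed(events_per_game, miss_rate))
print("expected misses per season:", expected_missed(season_events, miss_rate))
print("chance of at least one miss in a game:",
      round(prob_at_least_one_miss(events_per_game, miss_rate), 3))
```

Even with a miss rate that sounds negligible, a game without a single gap becomes the exception under these assumptions, and the season totals grow in direct proportion to the number of events.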


Ultimately, unofficial data sources might be cheaper in the short term, but investing in their official counterparts is guaranteed to be more profitable in the long run. 

Bettors exploiting slowly updated betting odds, users losing interest in one's content because of repeated mistakes and inaccuracies, and even potential legal action by data rights holders against the scraping of data are all possible and very costly side effects of using unofficial data sources.

All of them can be avoided via the usage of official data. Data that already exists, that is out there and available on the market. While in traditional sports, we have to live with small inaccuracies because even with the most advanced cameras and technologies mistakes can be made and situations might still be unclear, we simply do not have to in esports. 

So why should we?


Why take the most unnecessary risk in esports?


For the long-term development of esports as a whole, it is of the utmost importance that the differences between official and unofficial data are understood by the community, so that we may engage with esports with the level of professionalism it needs and demands. 

Esports deserves better than for it to be dragged down by the companies and service providers that try to make a half-hearted attempt at cashing in on it through the usage of unofficial data. 

With official data, we can drive esports forward. We can shape the future of the industry.
