Messi and Machine Learning

	Author: Claire Walsh

May 15, 2019

Machine learning is a current ‘hot topic’ that has headlined technology discussions in recent years. But what really is machine learning? Many people do not understand this fundamental topic.

There’s a famous quote from Tom M. Mitchell: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E”. Even on a careful read this is hard to take in without breaking it down and possibly drawing it out. When I approach difficult definitions like this, I often look for real-life analogies. As a massive sports fan, I immediately think of famous athletes. Think about Lionel Messi as a “machine learning algorithm”. Let him be simple linear regression, a classifier, a simple neural network, or a complex deep neural network with 200 hidden layers.

Looking again at the quote above: Messi learns from training (the Experience, E) how to perform at his absolute best (the Task, T), and is measured by the number of goals and assists he records (the Performance measure, P). His performance, and therefore his goals and assists, will improve with training. Maybe it’s just me, but that is a lot easier to imagine than a computer program performing tasks and improving with consistent training and experience.

A fundamental aspect of machine learning is the split between the training and testing phases. In any sport, this lines up directly with training for an event and then performing in it. Whether it’s the Champions League final or just a local road race for a casual runner, the event itself is the equivalent of the testing phase for a machine learning algorithm. The testing phase basically asks the question: “How well did you train for this? Did you learn the relationships between the input features well enough to predict outcomes?”
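To make the training and testing phases concrete, here is a minimal sketch of holding back part of a dataset for the "race day". The function name `train_test_split`, the 25% test fraction and the made-up `sessions` data are all assumptions for illustration, not anything prescribed above.

```python
import random

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and hold back a fraction for testing."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = samples[:]              # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical data: (hours trained, performance score on the day)
sessions = [(hours, 2.0 * hours + 1.0) for hours in range(20)]

train, test = train_test_split(sessions)
print(len(train), "samples to train on,", len(test), "held back for testing")
```

The model only ever sees `train` while learning. The `test` samples play the role of the event itself: conditions it has not met before.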

Did you overfit? Those of you who have experience with machine learning will understand this term. Basically, it means that the model has fitted one particular dataset too closely and cannot adapt to new data. Relating this back to sport: did you only train in the sun? If so, you will probably be at your best in the sun. What happens if it’s raining on the day of your race? You will struggle to adapt. Similarly, a machine learning algorithm will struggle to adapt if it doesn’t have a wide variety of data to learn from. An important aspect of machine learning is generalisation, which means having enough variety in the training to cope with new situations when they arise. But be careful! You don’t want to underfit either, which means failing to capture the underlying trend of the data.
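A rough way to see overfitting and underfitting side by side is to compare three toy models on the same data. This is only a sketch under assumptions of my own: the data is invented (a straight line plus a fixed ±1 wobble), the "memoriser" that repeats the nearest training answer stands in for an overfitted model, and the "constant" that always predicts the average stands in for an underfitted one.

```python
def mse(model, data):
    """Mean squared error of a model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def fit_line(data):
    """Ordinary least squares for a single feature."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in data)
             / sum((x - mean_x) ** 2 for x, _ in data))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memoriser(data):
    """Overfit stand-in: repeat the answer of the nearest training point."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

def fit_constant(data):
    """Underfit stand-in: ignore x and always predict the average y."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y

def wobble(x):
    """Fixed, invented 'noise' so the example is repeatable."""
    return 1.0 if x % 4 in (0, 1) else -1.0

points = [(x, 2.0 * x + wobble(x)) for x in range(20)]
train = [p for p in points if p[0] % 2 == 0]   # the "sunny" sessions
test = [p for p in points if p[0] % 2 == 1]    # race day

errors = {}
for name, fit in [("memoriser", fit_memoriser),
                  ("line", fit_line),
                  ("constant", fit_constant)]:
    model = fit(train)
    errors[name] = (mse(model, train), mse(model, test))
    print(name, "train error:", errors[name][0], "test error:", errors[name][1])
```

The memoriser is perfect on the data it trained on and noticeably worse on race day, the constant is poor everywhere, and the straight line sits in between on training error while doing best on the unseen points.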

Take linear regression, and consider Messi deciding how much weight to put on a pass. In training, Messi will have passed the ball thousands of times, in all weather conditions, in different areas of the pitch, and against players of differing quality. Taking all of these “features” into consideration, he will decide to place weight X on the ball, as appropriate to the specific situation in the game. He will have a certain level of confidence in getting this right, based on his training. Of course, this is not the same as a linear regression algorithm learning from input data and predicting an output based on its “learning phase”, but I can’t help comparing the two in an extremely simple manner.
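For the curious, the pass-weight idea can be sketched as a tiny linear regression trained by gradient descent. Everything here is an assumption made up for illustration: the two features (distance in tens of metres, and whether the pitch is wet), the invented rule that generates the "right" weights, and the learning rate and step count.

```python
def predict(weights, features):
    """Linear model: weight on the ball from distance and pitch condition."""
    w_distance, w_wet, bias = weights
    distance, wet = features
    return w_distance * distance + w_wet * wet + bias

def mean_squared_error(weights, data):
    return sum((predict(weights, f) - y) ** 2 for f, y in data) / len(data)

def train(data, learning_rate=0.05, steps=5000):
    """Plain batch gradient descent on the mean squared error."""
    weights = [0.0, 0.0, 0.0]
    loss_history = [mean_squared_error(weights, data)]
    for _ in range(steps):
        grads = [0.0, 0.0, 0.0]
        for (distance, wet), y in data:
            error = predict(weights, (distance, wet)) - y
            grads[0] += 2 * error * distance / len(data)
            grads[1] += 2 * error * wet / len(data)
            grads[2] += 2 * error / len(data)
        weights = [w - learning_rate * g for w, g in zip(weights, grads)]
        loss_history.append(mean_squared_error(weights, data))
    return weights, loss_history

# Hypothetical training passes: ((distance in tens of metres, wet pitch?), weight)
passes = [((d, wet), 3.0 * d + 2.0 * wet + 1.0)
          for d in (1.0, 2.0, 3.0, 4.0)
          for wet in (0.0, 1.0)]

weights, loss_history = train(passes)
new_pass_weight = predict(weights, (3.0, 1.0))   # 30 m pass on a wet pitch
print("error before training:", loss_history[0])
print("error after training:", loss_history[-1])
print("weight for the new pass:", new_pass_weight)
```

The falling error is Mitchell's definition in miniature: performance P on task T improves with experience E, here one pass through the training data at a time.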

Take classification, and picture a free kick in the World Cup final. Messi has two choices: cross the ball or take a shot. He will consider every similar situation he has faced before, along with the weather conditions (especially the wind), the ability of the keeper, and the size of the wall. Based on this analysis, he will judge which option has the higher probability of success and choose accordingly.
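The free-kick decision can be sketched as a very small classifier. I have assumed a k-nearest-neighbours approach here purely because it mirrors "consider every historical situation like this"; the features (distance to goal in metres, wind speed), the `history` data and the choice of k = 3 are all invented for illustration.

```python
import math
from collections import Counter

def classify(history, situation, k=3):
    """Pick the option that worked most often in the k most similar past free kicks."""
    nearest = sorted(history, key=lambda past: math.dist(past[0], situation))[:k]
    votes = Counter(option for _, option in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical past free kicks: ((distance to goal, wind speed), option that worked)
history = [
    ((18, 2), "shoot"),
    ((20, 5), "shoot"),
    ((22, 3), "shoot"),
    ((30, 10), "cross"),
    ((33, 15), "cross"),
    ((35, 8), "cross"),
]

print(classify(history, (21, 4)))    # close to goal, light wind
print(classify(history, (32, 12)))   # further out, strong wind
```

A real classifier would weigh far more features, such as the keeper and the wall, and would usually output a probability for each option rather than a simple vote.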
