Predictive Learning - the name change redefining unsupervised learning
Unsupervised learning is getting a new name: predictive learning. The definition is the same: you train an AI with unlabeled data. So why is this name change so important? Who cares about a name change? Let me tell you: this is not just a change of name, it is a change of concept. Unsupervised learning is becoming much more than it used to be. So new name, same definition, but a conceptually new thing. To me that's interesting.
The frontrunner of this name change is Yann LeCun, the Chief AI Scientist at Facebook. Yann calls predictive learning the "next AI frontier". So where the last 10 years have seen a lot of work on supervised classifiers ("is this an image of a cat or not" problems) and on reinforcement learning beating world champions at board games, as AlphaGo did, the next 10 years could be mainly about unsupervised learning.
Just to be specific, predictive learning should not be confused with predictive analytics. Predictive analytics is closely related but is more statistics than machine learning and is more “supervised”.
A little background
Before we go into the name change and what it really means, I'd like to put unsupervised learning in context with the other two main machine learning categories. In a nutshell, and drawn up in a caricatured way, the three main categories look like this:
Supervised learning: The most widely used machine learning category for applied and business use. You need labeled data (and often a lot of it) in order to teach the model the right answers in given cases.
Reinforcement learning: Mostly used in experimental and research settings and not used a lot in business so far. AlphaGo is a great example of this, and teaching models to play games is generally done through reinforcement learning. (Disclaimer: if I understood correctly, AlphaGo also utilized supervised learning, but I don't know the exact architecture.) Reinforcement learning works through a feedback loop that gives the model positive or negative feedback depending on results.
Unsupervised learning: Often used simply to cluster and group data. Think Spotify song suggestions. I've used unsupervised learning in business applications, but my experience is that it's still rarely used. Unsupervised learning does not require feedback or labeled data, which makes data collection much easier.
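To make the classic clustering use of unsupervised learning concrete, here is a minimal 1-D k-means sketch in plain Python. The data and parameters are made up for illustration; the point is simply that the model groups points with no labels at all:

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    """Minimal 1-D k-means: group points by similarity, no labels needed."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[idx].append(p)
        # Update step: move each center to the mean of its cluster.
        for i, c in enumerate(clusters):
            if c:
                centers[i] = sum(c) / len(c)
    return sorted(centers)

# Two obvious groups, around 1 and around 10 -- found without any labels.
data = [0.9, 1.1, 1.0, 9.8, 10.2, 10.0]
print(kmeans_1d(data, k=2))  # centers land near 1 and 10
```

Nothing ever tells the algorithm what the groups mean; it only discovers that the data falls into two clumps, which is exactly the "passive" grouping role unsupervised learning has traditionally played.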
I definitely expect both reinforcement and unsupervised learning to become more widely used in business applications given the current development. I also think we will see more architectures that combine all of these categories at once. They seem to play well together, and I have been part of projects that successfully used both supervised and unsupervised learning models in one architecture.
The supervised problem
Supervised learning has seen great developments in recent years. There are several reasons, such as training on GPUs (we have more computing power), more available labeled training data, and technical advances such as ReLU (a more cost-effective activation function for neural networks).
Getting labeled data is still a problem though. It's very costly, and from personal experience I'd claim that becoming good at collecting, labeling and storing data can become a bit of a science. You will have to make decisions on the tradeoffs between cost and quality in your data, and no matter what you do, you will always have biased data. Biased data will bias the models, so you will always need to be aware of edge cases in which your models behave strangely.
So if we could avoid using labeled data and instead use unlabeled data, we would have access to much more data at a lower cost and with a lower risk of bias.
If you compare human intelligence to artificial intelligence, you might also realize that much of human intelligence is unsupervised. Of the things we understand, very few have to be taught with labeled examples. We just look at the world and make pure observation the basis for understanding it and generating our imagination.
So even though supervised learning is giving us great results, it has its limits.
The change
So as I mentioned in the beginning, unsupervised learning used to be, in a nutshell, simple clustering and categorizing. It's not anymore. As Yann LeCun puts it, unsupervised learning is about "filling the void". Filling the void is not just putting similar things into categories; filling the void is like imagination. The use cases are actually also very similar to imagination.
When training predictive learning models the aim is to understand the current world and then add to that world.
A good example is image completion. You have an image that is incomplete, and you would like to "fill the void". Predictive learning can do that.
This makes a lot of sense now, right? The model has seen so many examples, without any labeling, that it can predict what might be missing from the context.
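The fill-the-void idea can be sketched with a toy model: learn which words tend to follow which from raw, unlabeled text, then guess a missing word from its neighbors. The corpus and words here are made up for illustration and are nowhere near a real model, but the mechanic is the same:

```python
from collections import Counter

# Tiny unlabeled corpus -- the model only ever sees raw text.
corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat ate the fish ."
).split()

# Learn which word tends to follow which -- no labels involved.
follows = Counter(zip(corpus, corpus[1:]))

def fill_void(prev_word, next_word):
    """Pick the word w that best fits between prev_word and next_word."""
    vocab = set(corpus)
    def score(w):
        # How often w followed prev_word, times how often next_word followed w.
        return follows[(prev_word, w)] * follows[(w, next_word)]
    return max(vocab, key=score)

# "the ??? sat" -> guessed purely from observed co-occurrences.
print(fill_void("the", "sat"))
```

The model was never told what any word means; it has just observed enough raw text to predict what plausibly belongs in the gap, which is "filling the void" in miniature.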
I like to think about it like this: think about when you see something odd. Maybe a person on the street in unusual clothes, a price on a product in the supermarket that is too cheap or too expensive, or someone eating ice cream in the snow. It just doesn't make sense, but no one has told us exactly which clothes are correct or what every product in the supermarket should cost. This is how I imagine predictive learning.
Another really good example is GPT-3 from OpenAI, which I should definitely write an article on soon. GPT-3 is a language model that has been trained unsupervised on a LOT of data, and with a little supervised learning on top of it you can have a very effective model. This last part is known as one-shot or few-shot learning. I'll write an article about this later, but it's basically having a pretrained model that you can train on a few examples and still get very good results. The alternative is needing tens of thousands of training examples for each model you build, even though the problems are very similar.
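A toy sketch of that idea (this is not GPT-3's actual method; everything here is a made-up, minimal stand-in): "pretrain" word representations from unlabeled text by counting context words, then classify a new word with only two labeled examples on top:

```python
from collections import Counter
import math

# --- "Pretraining": build context-count vectors from unlabeled text. ---
corpus = [
    "we loved the great food",
    "we loved the excellent food",
    "we hated the awful food",
    "we hated the terrible food",
]

def context_vectors(sentences, window=2):
    """Represent each word by counts of the words seen around it."""
    vecs = {}
    for s in sentences:
        words = s.split()
        for i, w in enumerate(words):
            ctx = words[max(0, i - window):i] + words[i + 1:i + 1 + window]
            vecs.setdefault(w, Counter()).update(ctx)
    return vecs

vecs = context_vectors(corpus)

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# --- Few-shot step: just two labeled examples on top of pretraining. ---
labeled = {"great": "positive", "awful": "negative"}

def classify(word):
    """Label a new word by its pretrained similarity to the few examples."""
    return max(labeled, key=lambda ex: cosine(vecs[word], vecs[ex]))

print(labeled[classify("excellent")])
```

"excellent" was never labeled, but because the unsupervised step learned that it appears in the same contexts as "great", two examples are enough to classify it. That is the cost saving in miniature: the expensive understanding came from free, unlabeled text.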
The common theme for predictive learning is that the models are deep learning models based on neural networks. Older unsupervised models were rarely neural networks.
GANs
GANs (generative adversarial networks) have to be mentioned here. OpenAI has a big focus on GANs, and they are a big driver behind these changes.
I won't go into detail here, but the basic idea is that you train two neural networks. One tries to complete the task you are trying to solve, and the other tries to trick the first with fake examples. So in effect they train each other in a closed loop.
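A minimal sketch of that closed loop, assuming the simplest possible setup: a linear "generator" and a logistic "discriminator" on 1-D data, with hand-derived gradient updates. Real GANs use deep networks; this toy only shows the two-player training dynamic:

```python
import math, random

rng = random.Random(0)

def sigmoid(x):
    # Clamp to avoid overflow in math.exp.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, x))))

# Discriminator d(x) = sigmoid(w*x + c); generator g(z) = a*z + b.
w, c = 0.5, 0.0          # discriminator parameters
a, b = 1.0, 0.0          # generator parameters
lr = 0.01
real_mean = 4.0          # "real" data lives around 4; fakes start around 0

for step in range(5000):
    x_real = rng.gauss(real_mean, 1.0)
    z = rng.gauss(0.0, 1.0)
    x_fake = a * z + b

    # Discriminator update: push d(real) up and d(fake) down.
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    w += lr * ((1 - d_real) * x_real - d_fake * x_fake)
    c += lr * ((1 - d_real) - d_fake)

    # Generator update: push d(fake) up, i.e. fool the discriminator.
    d_fake = sigmoid(w * x_fake + c)
    grad = (1 - d_fake) * w
    a += lr * grad * z
    b += lr * grad

print(round(b, 2))  # the fake data's mean should have drifted from 0 toward real_mean
```

Neither player ever sees a label in the supervised sense: the discriminator's "labels" (real vs. fake) are generated for free by the setup itself, which is exactly why this family of models sidesteps the labeling cost.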
The impact
Why is this such a big deal? Well, unsupervised learning has been used as a more passive approach. It clusters, it groups, and it makes sense of the existing world. When we need an active solution, we usually go for supervised learning. But as stated, supervised learning is expensive. Predictive learning can give us active and very value-producing models with far less need for investment. And as the world changes around us, we don't need to collect more labeled data.
I used the image completion problem before to illustrate how predictive learning works. It's a good illustration of what predictive learning is, but from a business or impact perspective it doesn't show the impact this technology will have.
This will not just be used for removing people from our vacation pictures or objects in the background when we are filming a commercial. It's going to fill the void in a lot of places. It's going to write articles in a given style better than most people. It's going to be able to detect when something is out of the ordinary: a worker making a crucial mistake, or a traffic situation that isn't normal. My expectation for the next 1-3 years is that it will drastically lower the monetary barriers to working with AI on all sorts of problems.
So the impact is that we can now have active AIs based on almost free training data. If that is not a big thing in AI, I don't know what is.
Why now?
Actually, it's not happening right now; it's been happening for a while, and it's not even certain that the name will stick. Things change quickly in AI, but one thing is for sure: this technological advance is going to have a huge impact, and we are right now seeing examples like GPT-3 emerge based on it.