Transfer Learning from a business perspective
From a business perspective, Transfer Learning is one of the most impactful inventions in AI. It lets you train neural networks to perform tasks with far less labelled data than you would normally need. Since labelled data is usually the biggest cost in applied AI, this can significantly improve your business case.
The basic idea is that, in many cases, we have already trained neural networks on tasks that are similar to the new task we are trying to solve. Image recognition is one example. A network that can already identify animal species has many of the abilities that a network for recognizing buildings would need. The reason is that the layers in a neural network often learn general concepts such as shapes. Both cats and houses have shapes, so once a network has learned to identify different shapes, why learn it all over again?
How Transfer Learning works
Transfer Learning starts from a pretrained network. Let’s use the animal species example. Someone has already spent hours training, fine-tuning and testing this network so that it performs well. The network could look like this.
As you can see, the neural network has an input layer, some middle layers and an output layer. This is a perfectly ordinary setup.
Now let’s see what it looks like with Transfer Learning. We take one or more of the last layers off, put new layers in their place and train only those new layers from scratch. The earlier layers typically keep the weights they have already learned.
Notice that you have to keep the same input layer. This means that you cannot use Transfer Learning to suddenly go from image recognition to a completely different kind of problem, such as a language problem.
The purple layers at the end are the new layers. We added them after throwing the old ones out, and we trained them from scratch.
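To make the idea concrete, here is a minimal sketch in plain Python. It is a toy illustration and not how a real image network is built: the fixed numbers in PRETRAINED_WEIGHTS are made up and simply stand in for layers that someone has already trained, and the names frozen_features, train_head and small_dataset are hypothetical. What it shows is the principle described above: the old layers are kept as they are, and only a new last layer is trained on a small labelled dataset.

```python
import math

# Hypothetical "pretrained" layers: made-up fixed weights standing in for a
# network that was already trained on the large dataset (the animal problem).
PRETRAINED_WEIGHTS = [[0.9, -0.4], [-0.3, 0.8], [0.5, 0.5]]

def frozen_features(x):
    """Run the input through the frozen layers; these weights are never updated."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in PRETRAINED_WEIGHTS]

def predict(head, x):
    """New output layer: a single sigmoid unit on top of the frozen features."""
    feats = frozen_features(x)
    z = head["bias"] + sum(w * f for w, f in zip(head["weights"], feats))
    return 1.0 / (1.0 + math.exp(-z))

def train_head(data, epochs=200, lr=0.5):
    """Train only the new layer, from scratch, with plain gradient descent."""
    head = {"weights": [0.0, 0.0, 0.0], "bias": 0.0}
    for _ in range(epochs):
        for x, y in data:
            feats = frozen_features(x)
            error = predict(head, x) - y  # gradient of the log loss w.r.t. z
            head["weights"] = [w - lr * error * f
                               for w, f in zip(head["weights"], feats)]
            head["bias"] -= lr * error
    return head

# A tiny labelled dataset for the new task (the building problem).
small_dataset = [([1.0, 0.0], 1), ([0.9, 0.2], 1),
                 ([0.0, 1.0], 0), ([0.1, 0.8], 0)]

weights_before = [row[:] for row in PRETRAINED_WEIGHTS]
head = train_head(small_dataset)
```

After training, the pretrained weights are unchanged and only the values in head have moved. In a real project you would do the same thing with a deep learning framework: load a pretrained model, mark its existing layers as not trainable and swap in a new output layer.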
When to use Transfer Learning
Now that you know how it works, you might want to know when it is an appropriate choice. There are a few rules of thumb here.
If you have more data (probably a lot more) on the pretrained network’s problem (the animal problem) and less data on the new problem (the building problem), then Transfer Learning might be the answer. From a business perspective, it could also be that labelled data for the two problems comes at very different price points, and that getting a lot of labelled data for the new problem is simply too expensive. You might be able to find plenty of labelled data online for identifying animal species, but if your problem is to identify old train models, images of those might be very expensive to find, and you might even have to go and take some pictures yourself.
To sum up, Transfer Learning can greatly reduce what you spend on data. It should be considered when data for your current problem is hard to come by but you can get access to pretrained models, or cheap data, for similar problems.