Google explains how it uses machine learning in web search
Google’s John Mueller gave one of the clearest and most accessible explanations of how Google uses machine learning in web search. He said Google uses it for “specific issues” where automation and machine learning can help improve results. The example he gave was canonicalization, and it makes the point concrete.
This is from the Google Webmaster Hangout, at the 37:47 mark. The example is: “So, for example, we use machine learning for canonicalization. What that means is that we have all of these factors that we talked about before. And we give them individual weights. That’s kind of the traditional way of doing things. And we say that rel canonical has so much weight and a redirect has so much weight and internal links have so much weight. And the traditional approach would be to say we’re just going to set those weights, those numbers, and see if that works. And if we see that things aren’t working out, we’ll tweak those numbers a bit. And with machine learning what we can basically do is say that this is what we want to achieve and that machine learning algorithms should determine these weights on their own.”
This was the first part of the answer on how Google debugs its search algorithm.
Here is the full transcript of that part.
Machine learning is part of the Google search algorithm and I can imagine that it is getting smarter and smarter every day. As an employee with access to secret files, do you know the exact reason why some pages rank better than others, or is the algorithm now making decisions and evolving in a way that makes it impossible for humans to understand?
John’s complete answer:
We get this question from time to time and we are not allowed to provide an answer because the machines tell us not to talk about this topic. So I really can’t answer. No, I’m kidding.
This is something where we use machine learning in a lot of ways to help us understand things better. But machine learning isn’t just this black box that does it all for you, where you feed the internet in on one side and the search results come out on the other side. It is a tool for us. It’s basically a way to test things out a lot faster and try to figure out what the right solution is.
So, for example, we use machine learning for canonicalization. What that means is that we have all of these factors that we’ve talked about before. And we give them individual weights. That’s kind of the traditional way of doing things. And we say that rel canonical has so much weight and a redirect has so much weight and internal links have so much weight. And the traditional approach would be to say we’re just going to set those weights, those numbers, and see if that works. And if we find that things aren’t working out, we’ll tweak those numbers a bit. And with machine learning what we can basically do is say that this is what we want to achieve and that machine learning algorithms should determine these weights on their own.
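The contrast between hand-set weights and learned weights can be sketched in a few lines of Python. This is only an illustration of the idea, not Google’s system: the signal names, the hand-picked numbers, the toy examples, and the choice of logistic regression trained by gradient descent are all assumptions made for the sketch.

```python
import math

# Hypothetical signals for one candidate URL (names are illustrative only).
SIGNALS = ["rel_canonical", "redirect", "internal_links"]

# Traditional approach: weights chosen by hand and tweaked when results look wrong.
HAND_WEIGHTS = {"rel_canonical": 0.5, "redirect": 0.3, "internal_links": 0.2}


def score(weights, features):
    """Weighted sum of the signals for one candidate URL."""
    return sum(weights[name] * features[name] for name in SIGNALS)


def probability(weights, bias, features):
    """Logistic model: estimated chance that this candidate is the canonical."""
    return 1.0 / (1.0 + math.exp(-(score(weights, features) + bias)))


def learn_weights(examples, epochs=2000, lr=0.5):
    """Fit the weights from labeled examples with plain gradient descent.

    `examples` is a list of (features, label) pairs; label is 1 when the
    candidate is the outcome we want to achieve, 0 otherwise.
    """
    weights = {name: 0.0 for name in SIGNALS}
    bias = 0.0
    for _ in range(epochs):
        grad = {name: 0.0 for name in SIGNALS}
        grad_bias = 0.0
        for features, label in examples:
            err = probability(weights, bias, features) - label
            for name in SIGNALS:
                grad[name] += err * features[name]
            grad_bias += err
        for name in SIGNALS:
            weights[name] -= lr * grad[name] / len(examples)
        bias -= lr * grad_bias / len(examples)
    return weights, bias


# Made-up training data: desired outcomes, not real measurements.
TOY_EXAMPLES = [
    ({"rel_canonical": 1, "redirect": 1, "internal_links": 0.9}, 1),
    ({"rel_canonical": 0, "redirect": 0, "internal_links": 0.1}, 0),
    ({"rel_canonical": 1, "redirect": 0, "internal_links": 0.1}, 0),
    ({"rel_canonical": 0, "redirect": 1, "internal_links": 0.8}, 1),
    ({"rel_canonical": 1, "redirect": 0, "internal_links": 0.9}, 1),
    ({"rel_canonical": 0, "redirect": 0, "internal_links": 0.3}, 0),
]

learned_weights, learned_bias = learn_weights(TOY_EXAMPLES)
print("hand-set score:", score(HAND_WEIGHTS, TOY_EXAMPLES[0][0]))
print("learned weights:", learned_weights)
```

In both cases the result is a set of numbers attached to the same signals. What changes is who picks them: a person adjusting values by trial and error, or a training loop that is told the desired outcome.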
So it’s not so much that machine learning does its own thing with canonicalization, but rather that it has this well-defined problem. It’s about figuring out what these numbers should be that we use as weights, and repeatedly trying to relearn this system and understand that, on the web, this is how people do it and this is where things go wrong, and that is why we should choose these numbers.
So when it comes to debugging this, we still have those numbers, we still have those weights there. It’s just that they are determined by machine learning algorithms. And if we see that things are going wrong, we have to find a way to tell the machine learning algorithm that in this case we should have taken into account, I don’t know, phone numbers on a page more than just the pure content, to separate things like local versions, for example. And that’s something we can do when we sort of train these algorithms.
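The debugging step Mueller describes, giving the algorithm a signal it was missing and training again, can be illustrated with a second small sketch. Everything here is hypothetical: the page pairs, the two features (`content_similarity` and `same_phone`), and the use of a simple perceptron are assumptions chosen to keep the example short.

```python
def train_perceptron(rows, labels, epochs=1000):
    """Train a basic perceptron; returns (weights, bias)."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            activation = sum(w * v for w, v in zip(weights, x)) + bias
            update = y - (1 if activation > 0 else 0)  # 0 when already correct
            weights = [w + update * v for w, v in zip(weights, x)]
            bias += update
    return weights, bias


def accuracy(weights, bias, rows, labels):
    """Share of rows the trained model classifies correctly."""
    correct = 0
    for x, y in zip(rows, labels):
        activation = sum(w * v for w, v in zip(weights, x)) + bias
        correct += int((1 if activation > 0 else 0) == y)
    return correct / len(rows)


# Made-up page pairs: (content_similarity, same_phone).
# Label 1 = treat the pair as duplicates, 0 = keep the pages separate.
PAIRS = [(0.95, 1), (0.96, 0), (0.30, 0), (0.97, 1), (0.94, 0), (0.20, 1)]
LABELS = [1, 0, 0, 1, 0, 0]

# Before the fix: the model only sees the pure content signal, so near-identical
# local versions with different phone numbers look like duplicates.
content_only = [(sim,) for sim, _ in PAIRS]
w1, b1 = train_perceptron(content_only, LABELS)
acc_content_only = accuracy(w1, b1, content_only, LABELS)

# After the fix: the phone-number signal is added and the model is retrained.
w2, b2 = train_perceptron(PAIRS, LABELS)
acc_with_phone = accuracy(w2, b2, PAIRS, LABELS)

print("content only:", acc_content_only, "weights:", w1)
print("with phone signal:", acc_with_phone, "weights:", w2)
```

The weights stay visible at every stage, which is the point of the quote: when results look wrong, the numbers can be inspected, a missing signal can be added, and the model can be trained again.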
So with all of this machine learning stuff, it’s not that there’s just one black box and it’s doing everything and no one knows why it’s doing things. Instead, we’re trying to apply it to specific issues where it makes sense to automate things a bit, in a way that saves us time and helps extract patterns that we might not have recognized manually if we had looked at them.
Here is the video embed:
Here’s how Glenn Gabe summed it up on Twitter:
More from @johnmu: Machine learning helps us extract patterns that we may have missed. And for debugging, Google can see the weights determined by ML algorithms. If there is anything that needs to be improved, Google can work to train the algorithms: https://t.co/J6rDeA68KP pic.twitter.com/Su2pqPKYww
– Glenn Gabe (@glenngabe) December 16, 2019
Forum discussion at Twitter.