The Microsoft Bing team made a big announcement at the 2019 SMX West conference regarding their spider: they are transitioning away from an active spider to a more passive one.
Basically, they will stop spidering your pages without a prompt from the site. This puts the responsibility for triggering a Bingbot visit on the page or site owner, not Microsoft.
The announcement sent some ripples through the search community as this may signal a large disruption to the way
search engines consume and evaluate web site content.
It also amazed many, who have a hard time understanding how a search engine that depends on evaluating public content can do so if it stops actively discovering and reevaluating that content.
The announcement also included a change to the Bing URL submission tool. Now sites can earn trust, allowing them to submit up to 10,000 URLs per day to be crawled.
Oddly enough, the more I study SEO, the more I had been wondering whether Google would do something similar to what Bing announced.
To me it makes sense because the amount of CPU and network resources required by both the bot and the website is pretty high, especially for sites with thousands of pages.
How will this potentially change the way SEO works? And why is this a great disruption to the search optimization world?
How Bing Will Spider Content
Traditionally search spiders have aggressively felt their way through the maze of links connecting pages together on
the web. That is why we call them spiders.
This means companies like Google, Bing, DuckDuckGo and others need little software agents that operate at a massive
scale to consume every page published on the Internet. These agents are called bots, but they are just the first
pass in the entire search ranking workflow.
Bots just find URLs and consume the content. They are not responsible for evaluation; that is handed off to a different process. Ultimately the content is processed, scored, and we eventually see results for search queries.
Bots consume a lot of resources, both on the search engine side and your web server, plus bandwidth between the two
points.
I know from evaluating site logs over the years and looking at reports in the Google and Bing search consoles that there is a lot of bot activity for many sites.
We know there are over 2 billion public websites today, which alone means there are a staggering number of pages these spiders must check.
But here's the thing: most URLs never change. That means spidering each one, even once a month, is a lot of wasted effort for everyone.
What Bing has decided to do is change to a passive spidering process where they will check your pages when you tell them to do so.
Now, instead of Bing wasting time and resources spidering the same page with the same content a few dozen times a year, it will only spider a URL when the owner notifies the search engine to do so.
There are two ways to notify Bing: the manual submission tool in the Bing Webmaster Tools portal or the Bing URL Submission API.
While manually updating through the portal will work for a small site, it will not scale for larger sites. This is why API integration is important.
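To give a sense of what that integration involves, here is a minimal sketch of a batch call to Bing's URL Submission API. It assumes the JSON SubmitUrlBatch endpoint and an API key generated in the Bing Webmaster Tools portal; the site URL, key, and page URLs are placeholders, so confirm the exact endpoint and payload against Bing's current documentation before relying on it.

    import json
    import urllib.request

    # Placeholder values -- use your own site URL and the API key generated
    # in the Bing Webmaster Tools portal.
    BING_API_KEY = "YOUR_BING_WEBMASTER_API_KEY"
    SITE_URL = "https://www.example.com"

    def submit_urls_to_bing(urls):
        # Batch-submit updated URLs so Bingbot knows which pages to revisit.
        endpoint = ("https://ssl.bing.com/webmaster/api.svc/json/SubmitUrlBatch"
                    "?apikey=" + BING_API_KEY)
        payload = json.dumps({"siteUrl": SITE_URL, "urlList": list(urls)})
        request = urllib.request.Request(
            endpoint,
            data=payload.encode("utf-8"),
            headers={"Content-Type": "application/json; charset=utf-8"},
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            return response.status  # HTTP 200 means the batch was accepted

    if __name__ == "__main__":
        submit_urls_to_bing(["https://www.example.com/blog/new-post"])

With a small helper like this in place, the same call can be wired into whatever system publishes your content.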
While there are ways to notify both Google and Bing when a sitemap or RSS feed updates, using an API to explicitly
notify when a page updates is much better.
The ideal scenario will have your publishing workflow automatically notify all search engines about new content or
updates.
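As a rough sketch of that automation, assuming a Python-based publishing pipeline and reusing the submit_urls_to_bing helper above, a post-publish hook could ping the sitemap endpoints and then explicitly push the changed URLs. The hook name and sitemap URL here are illustrative, and the Google and Bing ping endpoints shown were what the engines accepted at the time of writing, so verify them before using this in production.

    import urllib.parse
    import urllib.request

    SITEMAP_URL = "https://www.example.com/sitemap.xml"

    # Sitemap ping endpoints the major engines accepted at the time of writing.
    PING_ENDPOINTS = [
        "https://www.bing.com/ping?sitemap={}",
        "https://www.google.com/ping?sitemap={}",
    ]

    def notify_search_engines(updated_urls):
        # Post-publish hook: announce the fresh sitemap, then explicitly
        # submit the changed URLs to Bing (see submit_urls_to_bing above).
        encoded_sitemap = urllib.parse.quote(SITEMAP_URL, safe="")
        for endpoint in PING_ENDPOINTS:
            urllib.request.urlopen(endpoint.format(encoded_sitemap))
        submit_urls_to_bing(updated_urls)

Hooking something like this into the publish step means every new or updated page is announced the moment it goes live, with no manual work required.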
URL Submission Limit and Earning Trust
Don't think you can just go and notify Bing that your entire 100,000 URL website has updated all at once. They impose limits, which start small and increase the longer a site has been verified with Bing Webmaster Tools.
After six months a site can submit up to 10,000 URLs per day, far more than almost any website could need, even a large news site with a million articles or a big e-commerce site.
There are only a handful of sites that have 10,000 daily page updates. I think Amazon might be an example of a site needing more than the 10,000 per day limit, and I am sure they have an arrangement with both Microsoft and Google.
The rate limits scale with how long the site has been verified, starting small and working up to the 10,000 URL per day maximum after six months.
Don't think this gives you liberty to resubmit your entire site every day just to trigger a bot visit. While not mentioned in the Bing blog, I am sure they will begin to ignore repeated requests for pages that show little sign of change.
The goal for Microsoft is to reduce the amount of resources needed on both ends. You will still have the bot visit and evaluate your content, just more efficiently.
Benefits for Business and Online Marketing
Some of the online reactions to this policy change are predictable. But I see it as a positive move.
The obvious benefit is less computing power required to index the web.
But it also means the quality of content indexed should be better.
Think about it: only sites that take the time to verify with the search engine will have the opportunity to have their pages indexed.
And only sites with a built-in mechanism, or an owner who takes the time to manually notify the search engine of an update, will be indexed.
This means mountains of thin and weak content will simply be ignored, reducing the chance that thin and obsolete content will clutter search results.
It also means sites making an intentional effort to rank for keywords will be able to rank a little easier.
This is why I think this will be a positive disruption to the organic search marketing process.
While Microsoft Bing commands only a small percentage of the online search market, the recent spidering announcement may be a key part of the search engine market's future.
If this works, and I think it will, I see Google and other search engines adopting a similar policy.
In the end everyone who cares will win. Search companies can use their resources to improve their service offerings
and site owners can reduce their hosting bills.
But the big win will go to online marketers, because weak content will be ignored, reducing the noise currently crowding search results.