Meet Mojeek, the UK search engine tackling Google’s ‘unhealthy monopoly’
Ask anyone to name the first search engine that comes to mind and chances are they’ll say Google. It has not always been so. Over the past three decades, the answer might equally have been Yahoo, Ask Jeeves, WebCrawler, AOL Search or Netscape. British search engine company Mojeek hopes to be the next name people turn to and is developing its own crawling technology to challenge Google’s hegemony.
“Google’s monopoly on search is not healthy,” says Colin Hayhurst, CEO of Sussex-based Mojeek. “Can you imagine if we got all our news from The New York Times, The Washington Post and that was it – do you want two American sources?”
Mojeek plans to challenge Google’s search dominance by building its own catalog of Internet content in a way that requires as little data collection as possible.
But with Google’s search engine market share hovering around 90%, and Microsoft’s Bing the second most popular choice, Mojeek faces a mountain to climb.
Search engines are responsible for finding pages on the web (crawling) and then storing what they find in a searchable form (indexing). Crawling is the “relatively easy part,” says Hayhurst, but indexing forced Mojeek to rewrite its entire software architecture because the original system became too slow at around a billion pages. To date, Mojeek has crawled and indexed six billion pages.
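To make the crawling and indexing distinction concrete, here is a minimal sketch in Python. It is not Mojeek's implementation: the in-memory `PAGES` table, the example URLs and the function names are all hypothetical, and a real crawler would fetch pages over the network and cope with billions of documents.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example/": ("independent search engine", ["a.example/about", "b.example/"]),
    "a.example/about": ("privacy focused search", ["a.example/"]),
    "b.example/": ("news and maps", []),
}

def crawl(seed, fetch):
    """Crawling: follow links breadth-first from a seed, visiting each URL once."""
    seen, queue, fetched = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        page = fetch(url)
        if page is None:  # unreachable or unknown page
            continue
        text, links = page
        fetched[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched

def build_index(fetched):
    """Indexing: build an inverted index mapping each word to the URLs containing it."""
    index = defaultdict(set)
    for url, text in fetched.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

fetched = crawl("a.example/", PAGES.get)
index = build_index(fetched)
```

Calling `search(index, "privacy search")` intersects the URL sets for both words. The crawl loop stays simple as the web grows, while the index is the structure that has to be stored and queried at scale, which is consistent with Hayhurst's remark that indexing is the harder part.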
Mojeek was founded in 2004 by Marc Smith, initially as a personal project after he became frustrated with the direction Google was heading. It was the same year that Google launched Gmail, “a move, presumably, to collect more information,” jokes Hayhurst.
Mojeek has raised just over £3m in angel investment – pocket change for Google. Its main source of revenue is licensing its search API to businesses, such as publishing companies. It also offers a site search API and ads.
Its core mission is to give consumers another choice when it comes to searching the web – and a choice that doesn’t track users.
“Diversity of information is important,” says Hayhurst, who has previously held leadership positions in web infrastructure startups. “We have one or two places where we go to discover information on the web, browse the web, transact on the web, decide which companies we want to do business with, and we actually have two US companies to decide who you go see on the front page. It’s really unhealthy.”
Mojeek: “We are anti-personalization”
Where Mojeek wants to differentiate itself from other players in the space is privacy. Hayhurst says privacy-focused search engines like DuckDuckGo and Ecosia “are not search engines” and that “most of them are actually proxies for Bing.”
“Most of them take your search query and send it to Microsoft’s Bing API,” adds Hayhurst. “Bing then returns the search results and adds relevant ads.”
Yahoo used to have its own independent engine, but that also uses Bing now, says Hayhurst. Mojeek’s philosophy is that a true privacy-focused search engine should index its own pages. But this is a colossal task. Hayhurst cannot say precisely what percentage of the web Mojeek has indexed because the total number of web pages is unknown.
The focus on privacy is why Mojeek builds its own servers, which are located in a secure room in a data center in Kent. Mojeek has 313 servers that house 666TB of storage, and another 86 servers are to be activated before the end of the year.
Mojeek is currently working on a contextual ads product, maps and a news service.
In its early days, Mojeek received many medical queries. According to Hayhurst, Smith thinks it was because users didn’t want to search for sensitive topics on Google.
Mojeek states that it collects only the information necessary to serve a search, namely country and language, with the IP address converted into a country code. It has no login option, so searches cannot be linked together.
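As an illustration of what converting an IP address into a country code can look like, here is a minimal Python sketch. It is an assumption-laden example, not Mojeek's code: the `NETWORKS` table, the `country_code` and `log_record` names and the country assignments are hypothetical, and the address ranges are reserved documentation ranges rather than real allocations. A production system would rely on a geolocation database.

```python
import ipaddress

# Hypothetical lookup table. These are reserved documentation ranges,
# assigned to countries here purely for illustration.
NETWORKS = [
    (ipaddress.ip_network("192.0.2.0/24"), "GB"),
    (ipaddress.ip_network("198.51.100.0/24"), "FR"),
    (ipaddress.ip_network("2001:db8::/32"), "DE"),
]

def country_code(ip):
    """Map an IP address to a two-letter country code, or "XX" if unknown."""
    addr = ipaddress.ip_address(ip)
    for network, code in NETWORKS:
        if addr in network:
            return code
    return "XX"

def log_record(ip, language):
    """Build the record to keep: the coarse country code replaces the raw IP."""
    return {"country": country_code(ip), "language": language}
```

With this table, `log_record("198.51.100.7", "fr")` keeps only a country and a language. Because the record contains no address and no account identifier, two searches from the same person cannot be tied together from the stored data alone.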
It’s an example of the trade-offs between convenience and privacy, which Hayhurst acknowledges. But he adds that once search engines start down this path, they end up collecting more and more data for user personalization in an endless cycle.
“If two people perform the same search query in the same place, at the same time, with the same language, then, in our opinion, they should get the same results,” says Hayhurst. “We’re anti-personalization because I think it leads to all sorts of toxic side effects.”