Yandex will present its new search. Yandex has launched a new version of the search engine. How the Korolev algorithm works

Yesterday, some people in our country suddenly discovered that. It was temporarily blocked by providers TTK, Akado, Avax and Sumtel at the direction of Roskomnadzor. But a significant proportion of the subscribers of these providers did not notice the blocking, as they use the domestic search engine.

In April 2017, 43 million people searched for something on Yandex. If you are one of them, this short article is for you.

P.S. For those who prefer Google and DuckDuckGo, there are links in the last section.

1. How to search among the sites of a certain city, region, federal district or country?

This is how you can search for "graduation ball" among sites from the city of Bratsk:

graduation ball cat:11000976

To work out the number that follows the cat: operator, add the region code from Yandex.Catalog to 11000000. For example:

  • Moscow - 11000001;
  • Chernihiv - 11000966;
  • Voronezh - 11000193;
  • Volga region - 11000040;
  • Kyrgyzstan - 11000207;
  • CIS countries - 11000166.

Yandex.Catalog already lists over 117 thousand sites. In the same way, you can restrict a search to resources devoted to a specific topic. To do this, use topic codes instead of region codes and add them to 9000000 instead of 11000000.
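The arithmetic behind the cat: operator can be sketched in a few lines of Python. This is only an illustration: the function name and constants are hypothetical, and the region offset of 11000000 is an assumption taken from the Bratsk example (11000976 = 11000000 + 976).

```python
# Offsets are assumptions inferred from the examples in the article.
REGION_OFFSET = 11_000_000  # 11000976 for Bratsk = 11000000 + region code 976
TOPIC_OFFSET = 9_000_000    # used with topic codes instead of region codes


def cat_query(text: str, code: int, offset: int = REGION_OFFSET) -> str:
    """Build a query restricted to a catalog region (or topic) code."""
    return f"{text} cat:{offset + code}"


if __name__ == "__main__":
    print(cat_query("graduation ball", 976))  # graduation ball cat:11000976
```

Passing TOPIC_OFFSET as the third argument gives the topic variant of the same query.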

2. How to trick Yandex about your location?

With the Manual Geolocation extension for Chrome, you can mark any point on the map, and the search engine will assume you are there and adjust the results accordingly. For example, you can search for places near your home in St. Petersburg while you are actually in Moscow. This is convenient when planning trips.

This tip applies to every site that uses your location data.

3. How to search for pages in a specific domain zone and in a specific language?

This is how you can find what Ukrainian sites write about zebras (in the ua domain zone) in Ukrainian:

zebra domain:ua lang:uk

In the same way, you can find out what sites in other countries write on various topics. Language codes for Yandex:

  • Russian (ru);
  • Ukrainian (uk);
  • Belarusian (be);
  • English (en);
  • French (fr);
  • German (de);
  • Kazakh (kk);
  • Tatar (tt);
  • Turkish (tr).
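As a small illustration, the domain: and lang: operators can be combined programmatically. The helper below is a hypothetical sketch. It only assembles the query string and checks the language code against the list above, and it does not contact Yandex.

```python
# Language codes copied from the list above.
LANG_CODES = {
    "ru": "Russian", "uk": "Ukrainian", "be": "Belarusian",
    "en": "English", "fr": "French", "de": "German",
    "kk": "Kazakh", "tt": "Tatar", "tr": "Turkish",
}


def zone_query(text: str, domain: str, lang: str) -> str:
    """Build a query limited to a domain zone and a page language."""
    if lang not in LANG_CODES:
        raise ValueError(f"unknown language code: {lang}")
    return f"{text} domain:{domain} lang:{lang}"


if __name__ == "__main__":
    print(zone_query("zebra", "ua", "uk"))  # zebra domain:ua lang:uk
```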

4. How do I search for pages on a specific site?

This is how you can search for pages only on the site:

zebra site: site

And this is how you can search only within certain sections of a site. For example, among the questions on the Rescue service website:

messages url:site/iNotes/q/*

And here's how to get a list of all the tags that are used on the site:

5. How to search for pages created on a specific date?

This is how you can find pages created on a specific day:

steve jobs date:20170617

And this is how to search within an interval between two dates:

steve jobs date:20170610..20170617

The idate: operator lets you search for pages by the date they were last indexed.
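The date format in these queries is YYYYMMDD, with two dots between the ends of an interval. A minimal Python sketch (the function name is hypothetical) that produces both forms from date objects:

```python
from datetime import date
from typing import Optional


def date_query(text: str, start: date, end: Optional[date] = None) -> str:
    """Build a date: query for a single day or for an interval."""
    stamp = start.strftime("%Y%m%d")
    if end is not None:
        stamp += ".." + end.strftime("%Y%m%d")
    return f"{text} date:{stamp}"


if __name__ == "__main__":
    print(date_query("steve jobs", date(2017, 6, 17)))
    print(date_query("steve jobs", date(2017, 6, 10), date(2017, 6, 17)))
```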

6. How to search for files of a certain type?

Search for a book in PDF format to download to iBooks:

flowers for algernon mime:pdf

And this is how you can find all MS Word documents that mention the word "declaration" on the FTS website:

declaration mime:docx site:nalog.ru

Types of documents that Yandex indexes:

  • html;
  • docx;
  • xlsx;
  • pptx;

7. How to search only in page titles?

With this operator:

It is very convenient when you need to find an article by its exact title.

8. How to search by image file name?


You saved the picture to your computer, want to use it with the indication of the source, but do not remember where it came from? The search operator by the exact name of the image will help:

Operators for searching by the attribute values of HTML tags:

applet: - the code attribute of the applet tag;
script: - the src attribute of the script tag;
object: - all attributes of the object tag;
action: - the action attribute of the form tag;
profile: - the profile attribute of the head tag.

9. How do I find links to a specific page?

Yandex has an operator that searches for the query text inside links. This lets you find links to a specific page.

inlink:"www.site/iNotes/533552"

10. How do I use widgets and tips?

If you type one of the four phrases below into the search, a widget will appear under the search bar:

  • "Calculator";
  • "Currency Converter";
  • "Unit Converter";
  • "Translation".

And for some queries, the answers are displayed directly in the search bar. Examples.

While hundreds of comments about the merits and shortcomings of the new algorithm pile up on the Yandex blog and on Habr, we will cover the main points: what it means for users, how to turn on the new search, and what has actually changed.

"Korolev" as a new search

The main thing that users who are not deeply immersed in the topic need to know about the Korolev algorithm is that it is smart. It was introduced like this: "Korolev is a machine intelligence that understands you." The search is built on a specially trained neural network. It no longer searches by words, but by meaning. "Thanks to this, the search understands exactly what the user needs and answers difficult questions even more accurately," the developers say.

For example, if you enter the query "Darth Vader appears to this music," the search will first suggest listening to the "Imperial March" and will also show information about the Star Wars character. Logical? Quite. For the query "a film in which an elderly man came to get a job," the first result is a link to a review of the film "The Intern". That is the one that was meant, and there was no need to remember the year or the actors, or even to choose the words and their order carefully.

The same goes for image search. Previously it relied on keywords from the image descriptions; now the algorithm analyzes the image itself. So if you enter the query "cat in space", you will be shown not only funny creative works on the topic but also, for example, a cat in a washing machine, simply because all the components are similar in meaning: there is a cat, the door looks like a porthole, and the body looks like a rocket.

Last year Yandex took the first step toward search by meaning by introducing the Palekh algorithm. It could match the meaning of a query with the title of a web page. Korolev analyzes not just the title but the entire page. The number of pages that the search compares with the query by meaning has grown from 150 to 200 thousand. Another feature of Korolev is that it also takes into account the meaning of other queries that bring people to the page.

Why does Yandex say that it did it with my help?

Everything we do on Yandex is taken into account in search statistics: what queries we enter, which pages we open, and whether we stay there or leave because we did not find what we needed. If you entered a query, followed a link in the results and stayed on the page for a while, you probably found the information you needed and read it. Data on the behavior of millions of users helps the neural network learn to understand how close in meaning a query and a found page are.

Assessments of answer quality also matter for training. Previously, Yandex evaluated search quality with the help of expert assessors. Now the ratings of volunteers, the users of Yandex.Toloka, are also taken into account. This is a service where anyone can complete tasks, help improve search, and receive a reward for it.

How do I start using the new search?

Nothing special is required of you. The new search will work on its own either way. But if you want to understand what is going on, open the Yandex home page, scroll to the "starry sky" and click "Start". You will get to know your own search behavior and can watch a video about Korolev that explains how everything works. You can also click the Yandex logo to the left of the search bar and watch a clear, attractive presentation.

Why do "semantic" queries not always work?

Naturally, the neural network first learns to handle popular queries, for example about movies or music. These are the queries the search engine has the most data on, because many people ask them. Korolev will learn more specific things too, but a little later, once enough information for analysis has been collected.



Yandex has launched a new version of its search. It is based on the Korolev search algorithm. The algorithm uses a neural network to compare the meaning of queries and web pages, which allows Yandex to answer complex queries more accurately. The new version of search is trained on search statistics and the ratings of millions of people. Thus, not only developers but all Yandex users contribute to the development of search.

Words and meanings

Before talking about the present and future of the search, let us recall its past. The first search engines appeared in the mid-1990s, when the Internet was very small - there were thousands of sites. To help a person find what he was looking for, it was enough to make a list of web pages containing words from a search query. Complex ranking - that is, ordering pages according to the degree of relevance to a query - was out of the question. It was believed that the more often words from a query are found in a document, the better it fits.
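The word-counting approach of the early search engines is easy to reproduce. The following Python sketch is a toy illustration of the idea described above, not a reconstruction of any real engine: it scores each document by how many times the query words occur in it.

```python
def term_frequency_score(query: str, document: str) -> int:
    """Count how many times the query words occur in the document."""
    words = document.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def rank(query: str, documents: list) -> list:
    """Order documents so that those with more query-word hits come first."""
    return sorted(documents,
                  key=lambda doc: term_frequency_score(query, doc),
                  reverse=True)


if __name__ == "__main__":
    docs = ["dog food", "cat food and cat toys", "weather report"]
    for doc in rank("cat food", docs):
        print(term_frequency_score("cat food", doc), doc)
```

A document that repeats the query words often wins, whether or not it actually answers the question, which is why this criterion stopped being enough as the Internet grew.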

The Internet grew rapidly and additional selection criteria were required. Search engines began to take into account links to documents, learned to determine the region where the request came from, and began to pay attention to user behavior.

At some point there were so many ranking factors (signals that indicate how well a page answers a query) that it became clear they could not all be written down as explicit instructions. It is better to teach the machine to decide for itself which signals to use and how to combine them. For this purpose Yandex invented Matrixnet, a machine learning method that builds the ranking formula.

Search, however, still relies on words. Before applying the complex ranking formula, search engines compile a list of "pre-matched" web pages, those that contain the words from the query. We humans understand that the same meaning can be expressed in different words. A web page may not contain all the words from the query and still answer it very well. Explaining this to a machine, however, is rather difficult.

Yandex took its first step toward search by meaning last year, when the company introduced the Palekh search algorithm. It is based on a neural network. Neural networks show excellent results on tasks that people have traditionally handled better than machines, such as speech recognition or object recognition in images.

When launching Palekh, the company taught a neural network to convert search queries and web page titles into groups of numbers called semantic vectors. An important property of such vectors is that they can be compared with each other: the more similar the vectors, the closer the query and the title are in meaning.
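The text does not say how Yandex measures the similarity of two semantic vectors. A common choice for comparing such vectors is cosine similarity, so the sketch below uses it as an assumption. The three-number vectors are invented for illustration; real semantic vectors are produced by the neural network and are much longer.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Hypothetical vectors, made up for this example.
    query_vec = [0.9, 0.1, 0.3]
    title_close = [0.8, 0.2, 0.4]
    title_far = [0.1, 0.9, 0.0]
    print(cosine_similarity(query_vec, title_close))
    print(cosine_similarity(query_vec, title_far))
```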

How the Korolev algorithm works

The Korolev search algorithm compares the semantic vectors of search queries with those of entire web pages, not just their titles. This brings the understanding of meaning to a new level. Imagine hearing about Leo Tolstoy's novel War and Peace for the first time. Of course, you can draw some conclusions from the title, for example that the book has many battle scenes. But to learn all the intricacies of the plot and give thorough answers to questions about the novel, you need to read it in full.

As with Palekh, a neural network converts the texts of web pages into semantic vectors. This operation is computationally intensive. Compare: reading the title of a book takes a matter of seconds, but reading the whole book from cover to cover takes hours, days, or even weeks. Therefore Korolev calculates page vectors not in real time but in advance, at the indexing stage. When a person enters a query, the algorithm compares the query vector with the page vectors it already knows.
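The split between indexing time and query time can be sketched as follows. Everything here is an assumption made for illustration: the class and function names are hypothetical, and embed() is a toy bag-of-words stand-in for the neural network, so it matches words rather than meaning. What the sketch does show is the structure: page vectors are computed once when a page is added, and only the query is converted at search time.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for the neural network: a word-count vector."""
    return Counter(text.lower().split())


def similarity(a, b):
    """Cosine similarity of two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class VectorIndex:
    def __init__(self):
        self._vectors = {}

    def add_page(self, url, text):
        # The expensive step runs once, at indexing time.
        self._vectors[url] = embed(text)

    def search(self, query, top_k=3):
        # At query time only the short query is converted to a vector.
        query_vec = embed(query)
        ranked = sorted(self._vectors.items(),
                        key=lambda item: similarity(query_vec, item[1]),
                        reverse=True)
        return [url for url, _ in ranked[:top_k]]


if __name__ == "__main__":
    index = VectorIndex()
    index.add_page("page-a", "imperial march music from star wars")
    index.add_page("page-b", "a cat in a washing machine")
    print(index.search("imperial march", top_k=1))
```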

This scheme makes it possible to start selecting web pages that match the query in meaning at the early stages of ranking. In Palekh, semantic analysis is one of the final stages: only 150 documents pass through it. In Korolev, it is performed on 200 thousand documents, that is, more than a thousand times as many. In addition, the new algorithm not only compares the text of a web page with the search query but also pays attention to other queries that bring people to that page. This makes it possible to establish additional semantic connections.
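The idea of using other queries that lead to a page can be illustrated with a toy score. The source does not describe how Korolev combines the two signals, so taking the maximum is purely an assumption, and the word-overlap measure below stands in for a comparison of semantic vectors.

```python
def toy_similarity(a: str, b: str) -> float:
    """Jaccard word overlap, a stand-in for comparing semantic vectors."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def page_score(query: str, page_text: str, incoming_queries: list) -> float:
    """Score a page by its text and by the queries that bring people to it."""
    text_score = toy_similarity(query, page_text)
    query_score = max((toy_similarity(query, q) for q in incoming_queries),
                      default=0.0)
    # Assumption: keep the stronger of the two signals.
    return max(text_score, query_score)


if __name__ == "__main__":
    page = "pallas cat manul facts"
    print(page_score("lazy cat from mongolia", page, []))
    print(page_score("lazy cat from mongolia", page, ["lazy cat mongolia"]))
```

In this toy case the page text shares only one word with the query, but a similar earlier query that led people to the page raises its score.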

People teach machines

Yandex believes that machine learning, and neural networks in particular, will sooner or later teach search to work with meaning at a human level. But this cannot be done without people's help. For a machine to understand how to solve a particular problem, it has to be shown a huge number of examples, both positive and negative. Yandex users provide these examples.

The neural network used by the Korolev algorithm is trained on anonymized search statistics. The statistics systems record which pages users open for particular queries and how much time they spend there. If a person opens a web page and stays there for a long time, he probably found what he was looking for, which means the page answers his query well. This is a positive example. Negative examples are much easier to find: just take a query and any random web page.
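The way positive and negative examples are assembled can be sketched in Python. The record layout, the function name and the 30-second dwell threshold are all hypothetical, since the source does not say how long counts as "a long time". The negatives are random pages paired with the same query, as described above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Click:
    query: str
    url: str
    dwell_seconds: float


def build_examples(clicks, all_urls, min_dwell=30.0, seed=0):
    """Return (query, url, label) triples: 1 = positive, 0 = negative."""
    rng = random.Random(seed)
    examples = []
    for click in clicks:
        if click.dwell_seconds < min_dwell:
            continue  # short visits are not used as positives in this sketch
        examples.append((click.query, click.url, 1))
        # Negative example: the same query paired with a random other page.
        other_urls = [url for url in all_urls if url != click.url]
        if other_urls:
            examples.append((click.query, rng.choice(other_urls), 0))
    return examples


if __name__ == "__main__":
    log = [Click("imperial march", "page-a", 120.0),
           Click("imperial march", "page-b", 2.0)]
    for example in build_examples(log, ["page-a", "page-b", "page-c"]):
        print(example)
```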

Matrixnet, which builds the ranking formula, also needs help from people. For search to develop, people must constantly evaluate its work. At one time only Yandex employees, the so-called assessors, assigned the ratings. But the more ratings, the better, so the company decided to involve everyone and launched the Yandex.Toloka service. More than a million users are now registered there: they assess the quality of search and help improve other Yandex services. Tasks on Toloka are paid, and the amount you can earn is shown next to each task. In the more than two years the service has existed, tolokers have given about two billion ratings.

Modern search is built on complex algorithms. Developers invent the algorithms, and millions of Yandex users teach them. Every query is an anonymous signal that helps the machine understand people better. So Yandex will not be wrong in saying that the new search is a search we built together.

This week, on August 22, Yandex launched a new version of search based on the Korolev algorithm. It relies on a neural network that compares the meaning of a query and a web page, which lets it answer complex and ambiguous queries many times more accurately. The new version of search is trained on search statistics and the ratings of millions of people, so not only developers but all users contribute to the development of the system.
The presentation of Korolev took place, symbolically, at the Moscow Planetarium. The speakers were Andrey Styskin, Head of Yandex Search, Alexander Safronov, Head of the Relevance Service at Yandex Search, and Olga Megorskaya, Head of the Data Processing Department at Yandex Search.

From Matrixnet to neural networks

Search engines appeared in the mid-1990s, when the Internet was very small, with only a few thousand sites. At first, search engines simply compiled a list of pages containing the given words and did not bother with ranking them by how well they matched the query. The more often the words from the query appeared in a document, the better. Clearly, with the global network in its current state, this no longer works.

To process queries, Yandex came up with Matrixnet, a machine learning method used to build its own ranking formula. However, search continued to rely on words. But what about queries that users phrase allegorically or by association? In that case the page being sought does not have to contain every word from the query. But how do you explain that to a machine? If only it understood us the way a person does...




In the end, scientists came up with something at the intersection of technology and biology: the artificial neural network (ANN). In Wikipedia's wording, it is "a mathematical model, as well as its software or hardware implementation, built on the principle of the organization and functioning of biological neural networks - networks of nerve cells of a living organism". Neural networks can process information the way we do and, most importantly, learn and hone skills like living beings. In fact, they are the basis of full-fledged artificial intelligence, whose appearance is a matter of time.

Last year Yandex introduced the Palekh search algorithm, based on a neural network. Neural networks have shown excellent results on tasks that used to be feasible only for humans, such as recognizing speech and objects in images. Palekh learned to convert search queries and web page titles into groups of numbers called semantic vectors. Their important property is that they can be compared with each other: the more similar the vectors, the closer in meaning the query and the title are.




"Korolev", the one that understands

The next stage in the development of a search engine based on neural networks is the Korolev algorithm, which analyzes not only the title but the entire page. The number of pages that the search compares with the query by meaning has grown from 150 to 200 thousand. Among other things, Korolev also takes into account the meaning of other queries that bring people to the page.

The neural network learns like a child, and to do so it needed a huge number of examples. In effect, all users of the service took part in training Korolev in one way or another: search statistics and the ratings of millions of people were used. Yandex is gradually learning to recognize semantic connections more and more accurately, such as: [a picture where the sky swirls] is about a painting by Van Gogh, and [a lazy cat from Mongolia] is Pallas's cat.


Search is a very complex system. Thousands of engineers are working to make sure it understands a person and helps solve his problems. In Korolev, we have combined machine intelligence and the efforts of millions of people. Our users improve search together with us by asking questions and helping to train our algorithms.
Andrey Styskin, Head of Yandex Search.
In addition to everyday usage statistics, training a search engine requires ratings of answer quality. The more complex the system, the more ratings are needed. Previously a relatively small group of expert assessors, members of the Yandex team, evaluated search quality, but now the volume had to grow substantially. This is how the Yandex.Toloka service appeared (toloka is a form of mutual aid once practiced by villagers). Any enthusiast interested in a small reward and, of course, a sense of contributing to something important can do simple tasks there. There are now more than a million such tolokers, and the number of their ratings has exceeded 2 billion.




"Modern search is based on complex algorithms. Developers invent the algorithms, and millions of Yandex users teach them. Every query is an anonymous signal that helps the machine understand people better. So we will not be wrong in saying that the new search is a search we built together," reads a post on the Yandex blog.

Over the more than two years of Yandex.Toloka's history, its most productive and diligent participant has emerged: Ilya Mikhalenko from Chelyabinsk. He came to the presentation of Korolev in Moscow to receive a well-deserved award from the search team.




The new search in action

How does the improvement in Yandex show up in practice? You can now talk to it almost as you would to a clever, well-read friend (even by voice). For example, what do you do if you need to recall the title of a film when you remember a scene from it but the names of the actors and the director have slipped your mind? You can ask your friends or seek help on a thematic forum. Or you can ask Korolev!

Image search has improved significantly. Images have usually been a headache: the search engine either mindlessly returns every image whose title contains words from the query, or relies on the text of the article that the picture illustrates. If you were looking for something to satisfy a vague longing, you could expect to be disappointed. Korolev analyzes what is actually shown in the picture, so it can pleasantly surprise with a non-trivial approach.






As an example, the tests used a far from obvious query: [cat in space]. Dogs went into orbit quite often, but cats never made disciplined space explorers. Only one attempt is reliably known: in 1963, the French launched the cat Felicette on a suborbital flight. Romantic, but short-sighted: as soon as the scientists opened the hatch of the landed capsule, the cat was gone. The ceremonial photo session did not take place.

For this query, the search engine returns not only animals in spacesuits and surreal photoshopped images but also a photo of a cat in a washing machine, whose door looks quite like the hatch of a spaceship. Yet none of this is mentioned in the image description.

For the ceremonial launch of the new search, the entire Yandex Search team took the stage. A short countdown and... off we go! Now everyone can try out the capabilities of the perceptive Korolev. Most importantly, its current capabilities are not static but constantly developing.

To close the evening, the organizers had something completely unexpected in store: a live link with real cosmonauts in orbit. They personally answered some popular space-related search queries and took questions from the audience.