What Is the Yandex Search Engine (yandex.ru): Mathematical Models of Search

Hello, dear friends! In this article we continue to look at the Yandex search engine. As you remember, previous articles covered the history of this great company, which ranks first among its competitors in Russia and beyond.

That is all well and good, but both newcomers and seasoned site builders care most about one question: how to bring their projects to the top of the search results.

So let's look at how the Yandex search engine works, in order to understand what pitfalls you can run into and what to expect from the Russian search engine.

The topic of the last article turned out to be quite interesting and useful, so I decided to supplement and deepen it.

I probably got carried away with the question "Why does a search engine index documents?", since the answer is obvious. What remains is to clarify the question of "how".

Website ranking algorithms

First, let's take a look at some of the algorithms that are fundamental to any search engine:

- Algorithm for direct search.

Here is what it means. You remember reading a wonderful story in one of your books, and you start looking through them one by one. You take one book, leaf through it, find nothing, then take another... The principle is clear, but this method is extremely slow.

- Inverted (reverse) index search.

For this algorithm, a text file is created from every page of your blog. This file lists, in alphabetical order, ALL the words you have used, along with the position of each word in the text (its coordinates).

This is a fairly fast method, but the search now works with some error.

The main thing to understand is that this algorithm searches neither the Internet nor your blog. It searches a separate text file that was created some time ago, when the robot visited you, so it may lag behind the live page. These files (inverted indexes) are stored on Yandex servers.
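To make the reverse (inverted) index described above more concrete, here is a minimal sketch in Python. It assumes the simplest possible tokenization (lowercasing and splitting on whitespace), and the names `build_inverted_index` and `lookup` are made up for illustration; this is not how Yandex actually stores its indexes.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each word to {page_id: [positions of the word in that page]}."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        # Assumption: words are separated by whitespace and case is ignored.
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(page_id, []).append(position)
    # Keep the words in alphabetical order, as the article describes.
    return dict(sorted(index.items()))


def lookup(index, word):
    """Answer a one-word query from the index, without rereading the pages."""
    return index.get(word.lower(), {})


pages = {
    "post-1": "yandex is a search engine",
    "post-2": "a search robot visits the blog",
}
index = build_inverted_index(pages)
print(lookup(index, "search"))  # {'post-1': [3], 'post-2': [1]}
```

Note that `lookup` never touches `pages`: the answer comes entirely from the file built earlier, which is why it is fast and also why it can be out of date.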

So, these were the basic search algorithms, that is, how Yandex simply finds the documents it needs. There should be no problems with this.

But Yandex knows not one document, or even 100: according to the latest data from my sources, Yandex knows about 11 billion documents (10,727,736,489 pages).

Out of all of these, you need to select the documents that match the request. More importantly, you need to rank them somehow, that is, order them by importance, or rather by usefulness to the reader.

Search Mathematical Models

To solve this issue, mathematical models come to the rescue. We will now talk about the simplest models.

Boolean model: if the word occurs in the document, the document is considered found. A simple match, nothing complicated.

But there are problems here. For example, if you enter some popular word, or better yet the preposition "in", which is the most common word in Russian and is found in practically EVERY document, you will get more results than you could ever look through. That is why the next mathematical model appeared.

Vector model: this model determines the "weight" of the document. A match alone is not enough; the word should also appear several times. The more often a word occurs, the higher the relevance (correspondence).

It is the vector model, in one form or another, that ALL search engines use.
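The difference between the Boolean and vector models can be sketched in a few lines of Python. This is a deliberately simplified illustration: the "weight" here is just a count of how often the query words occur, whereas real vector models usually also take into account how rare a word is across the collection (for example, TF-IDF). The function names and sample documents are invented for the example.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace, no morphology.
    return text.lower().split()


def boolean_match(query, document):
    """Boolean model: the document is found if every query word occurs in it."""
    words = set(tokenize(document))
    return all(term in words for term in tokenize(query))


def tf_score(query, document):
    """Simplified vector-style weight: total occurrences of the query words."""
    counts = Counter(tokenize(document))
    return sum(counts[term] for term in tokenize(query))


docs = {
    "b": "a short note that mentions search once",
    "a": "yandex search and more yandex search tips",
    "c": "nothing relevant here",
}
query = "search"

# Boolean model: only says which documents match, in arbitrary order.
found = [name for name, text in docs.items() if boolean_match(query, text)]
# Vector-style model: orders the matches by weight.
ranked = sorted(found, key=lambda name: tf_score(query, docs[name]), reverse=True)

print(found)   # ['b', 'a']
print(ranked)  # ['a', 'b']
```

The Boolean model returns both matching documents without any order, while the weighted version puts document "a" first because the query word occurs in it twice.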

Probabilistic model: more complex. The principle is this: the search engine picks a reference page itself. For example, you are looking for information about the history of Yandex. Yandex has some kind of standard; let's say it is my previous article about Yandex.

It then compares all other documents with this article. The logic is this: the more your blog page resembles my article, the more LIKELY it is that your page also tells the history of Yandex and will be useful to the reader.

To reduce the number of documents that need to be shown to the user, the concept of relevance, i.e. correspondence, was introduced: how well your blog page really matches the topic of the request. This matters a great deal for search quality.

Assessors - who they are and what they are responsible for

Relevance is also needed to assess the quality of the algorithms.

For this there is a special team called assessors. These are people who review the search results by hand.

They have instructions on how to check sites, how to rate them, and so on. They manually determine, one by one, whether your pages suit the search queries or not.

The quality of the search algorithms is judged by the opinion of the assessors. If all the assessors say that the search results do not match the queries, then the ranking algorithm is wrong, and only Yandex is to blame.

If the assessors say that only one site does not match the request, then that site drops far down in the search results. More precisely, not the entire site but only one article, but that is beside the point.

Of course, assessors cannot view and evaluate ALL articles by hand. That is understandable.

So other parameters come to the rescue, by which pages are ranked.

There are many of them, for example:

  • page weight (vIC, PageRank, and similar metrics);
  • domain authority;
  • the relevance of the text to the request;
  • the relevance of the texts of external links to the request;
  • as well as many other ranking factors.

The assessors leave their comments, and the people responsible for tuning the mathematical ranking model edit the formula accordingly, so that the search engine works better.

The main criteria for evaluating the work of the formula:

1. Accuracy (precision) of search results: the percentage of returned documents that match the request (are relevant). That is, the fewer non-matching pages are present, the better.

2. Completeness (recall) of search results: the ratio of the relevant web pages returned for a given request to the total number of relevant documents in the collection (the set of pages in the search engine).

For example, if there are more relevant pages in the entire collection than in the search results, the results are incomplete. This can happen when some of the relevant web pages have been caught by a filter.
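The accuracy and completeness criteria above correspond to what is usually called precision and recall. A minimal Python sketch, with made-up page identifiers and a hypothetical helper `precision_and_recall`, shows how the two numbers are computed once the assessors have marked which pages are relevant:

```python
def precision_and_recall(returned, relevant):
    """Return (precision, recall) for one query.

    returned: pages shown in the search results
    relevant: all pages in the collection that really match the query
    """
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant  # relevant pages that were actually shown
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative data: 4 pages were shown, 5 pages in the collection are relevant.
returned = ["p1", "p2", "p3", "p4"]
relevant = ["p1", "p2", "p5", "p6", "p7"]

precision, recall = precision_and_recall(returned, relevant)
print(precision, recall)  # 0.5 0.4
```

Here half of the shown pages are relevant (accuracy 0.5), but only two of the five relevant pages made it into the results (completeness 0.4), for example because the other three were filtered out.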

3. Relevance (freshness) of search results: how well the web page corresponds to what is written in the snippet. For example, a document may have changed substantially or may no longer exist at all, yet still be present in the SERP.

The freshness of the results depends directly on how often the search robot rescans the documents in its collection.

The collection is gathered (site pages are indexed) by a special program, the search robot.

The search robot receives a list of addresses to index and copies the pages; the contents of the copied web pages are then sent to an algorithm that converts them into inverted (reverse) indexes.

Well, that is, in a nutshell, how a search engine works.

Let's summarize:

  1. A search robot comes to your blog.
  2. The robot saves an inverted (reverse) index of the page for later retrieval.
  3. Using a mathematical model, the document is processed and shown in the search results according to the formulas, taking the assessors' opinions into account.

This is very, very simplified, just enough to get a basic understanding of how the Yandex search engine works.

I have written a lot of text here, and some things may still be unclear. So I suggest that you return to this article a little later and watch this video.

This is an excellent guide that I studied from at one time.

I hope this information helps you better understand why your sites hold the search positions they do, and what you can do to improve them.

With that I say goodbye. If you have any questions, I am always happy to answer them in the comments. Or maybe you want to add something to the article?

In any case, share your opinion!

In this article, I will explain what the Yandex search engine is and how it works, and give examples of the kinds of sites that Yandex limits in ranking.

In terms of popularity, the Yandex search engine is ranked 20th worldwide and 1st in Russia. Yandex was officially announced on September 23, 1997. Its development began within the company CompTek International, and in 2000 Yandex became a separate company.

The founders of the company are Arkady Yuryevich Volozh, its CEO, and Ilya Valentinovich Segalovich (1964-2013), its director of technology and development. Now that we are a little familiar with the history of Yandex, let's talk about its search engine.

Yandex's main line of business is its search engine, whose distinctive feature is fine-grained tuning of the search query. The Yandex search engine lets you search in Russian, Ukrainian, Belarusian, Tatar, Kazakh, English, Turkish, German and French, taking the morphology of each language into account.

Yandex has also developed a thorough algorithm for assessing relevance and a mechanism for checking documents that excludes copies of the same document in different encodings. In contrast to Google and its PageRank (PR) ranking algorithm, another important feature of the Yandex search engine is its thematic citation index, TIC.

Yandex search engine

http://www.yandex.ru
The Yandex search engine has robots: dedicated programs that check sites for relevance. Search robots reach a site by following direct links, index new pages, and save them to their database. For an indexed page to reach the TOP, which is very important, you need to take into account such aspects of indexing as the frequency of keywords on the page, the number of external links leading to your site, and the total weight of the site, which is measured by the Yandex TIC.

Examples of sites that Yandex limits in ranking

Sites with non-unique content that has been copied or rewritten from other sites.

Sites that link to each other intensively in groups.

Sites with meaningless content.

Websites that use deceptive technology.

Forums and message boards that contain a lot of link spam.

Websites that try to gain relevance by placing external links that are not a genuine recommendation by the author to visit the linked resource.

Search engines have long been an integral part of the Russian Internet. They are now huge and complex mechanisms that serve not only as a tool for finding information but also as an attractive platform for business.

Most search engine users have never thought (or have thought, but found no answer) about how search engines work, how user requests are processed, what these systems consist of, and how they function...

This master class is designed to answer the question of how search engines work. However, you will not find the factors that influence document ranking here. Nor should you count on a detailed explanation of Yandex's algorithm. According to Ilya Segalovich, director of technology and development at Yandex, it could be learned only by putting Ilya Segalovich himself "under torture"...

2. The concept and functions of the search engine

A search engine is a software and hardware complex designed to search the Internet. It responds to a user's request, given as a text phrase (a search query), with a list of links to information sources ordered by relevance (correspondence to the request). The major international search engines are Google, Yahoo, and MSN. On the Russian Internet, these are Yandex, Rambler, and Aport.

Let's take a closer look at the concept of a search query, using the Yandex search engine as an example. The user should formulate the search query according to what they want to find, as briefly and simply as possible. Let's say we want to find information in Yandex on how to choose a car. To do this, we open the Yandex home page and enter the query text "how to choose a car". Our next task is to open the links to information sources that were returned for our request. However, it is quite possible that we will not find the information we need. If this happens, then either the request needs to be rephrased, or the search engine's database really contains no relevant information for it (this can happen with very "narrow" queries, such as "how to choose a car in Arkhangelsk").

The primary task of any search engine is to deliver to people exactly the information they are looking for. It is not possible to teach users to make "correct" requests, i.e. queries that match the principles on which search engines work. Therefore, developers create algorithms and principles of operation that allow users to find the information they are looking for.

This means the search engine must "think" the way the user thinks when looking for information. When a user makes a request to a search engine, they want to find what they need as quickly and easily as possible. Having received the result, the user assesses the work of the system by several basic parameters. Did they find what they were looking for? If not, how many times did they have to rephrase the query to find it? How up to date was the information they found? How quickly did the search engine process the request? How conveniently were the search results presented? Was the desired result the first or the hundredth? How much junk was found along with the useful information? Will the needed information be found when they turn to the search engine again, say, in a week or in a month?

To answer all these questions well, search engine developers constantly improve their algorithms and principles of search, add new functions and capabilities, and try in every possible way to speed up the system.

3. The main characteristics of the search engine

Let's describe the main characteristics of search engines:

  • Completeness

    Completeness is one of the main characteristics of a search engine: the ratio of the number of documents found for a request to the total number of documents on the Internet that satisfy this request. For example, if there are 100 pages on the Internet containing the phrase “how to choose a car”, and only 60 of them were found for the corresponding query, then the completeness of the search is 0.6. Obviously, the more complete the search, the less likely it is that the user will fail to find the document they need, provided that it exists on the Internet at all.

  • Accuracy

    Accuracy is another main characteristic of a search engine, determined by the degree to which the found documents match the user's request. For example, if the query “how to choose a car” returns 100 documents, 50 of which contain the phrase “how to choose a car” while the rest simply contain these words (“how to choose the right radio tape recorder and install it in a car”), then the search accuracy is 50/100 (= 0.5). The more accurate the search, the faster the user finds the documents they need, the less "garbage" they encounter among them, and the less often the found documents fail to match the request.

  • Relevance (freshness)

    Relevance, in the sense of freshness, is an equally important component of search. It is characterized by the time that passes from the moment documents are published on the Internet until they are entered into the search engine's index. For example, the day after an interesting piece of news appears, a large number of users turn to search engines with related queries. Objectively, less than a day has passed since the news was published, yet the main documents have already been indexed and are available for search, thanks to the so-called "quick base" that large search engines maintain and update several times a day.

  • Search speed

    Search speed is closely related to the search engine's resistance to load. For example, according to Rambler Internet Holding LLC, the Rambler search engine currently receives about 60 queries per second during working hours. Such a workload requires reducing the processing time of each individual request. Here the interests of the user and the search engine coincide: the visitor wants to get results as quickly as possible, and the search engine must process the query as quickly as possible so as not to delay the processing of subsequent queries.

  • Visibility

4. A brief history of search engine development

In the early period of the Internet's development, the number of its users was small, and so was the amount of available information. For the most part, only researchers had access to the Internet. At that time, the task of searching for information on the Internet was not as pressing as it is now.

One of the first ways to organize access to the network's information resources was to create open catalogs of sites, in which links to resources were grouped by topic. The first such project was the site Yahoo.com, which opened in the spring of 1994. After the number of sites in the catalog grew significantly, the ability to search the catalog was added. It was not yet a search engine in the full sense, since the search covered only the resources present in the catalog rather than all Internet resources.

Link directories were widely used in the past but have almost completely lost their popularity, since even modern catalogs, huge in volume, contain information about only an insignificant part of the Internet. The largest web directory, DMOZ (also called the Open Directory Project), contains information about 5 million resources, while Google's search database consists of over 8 billion documents.

In 1995, the search engines Lycos and AltaVista appeared. The latter was for many years the leader in Internet search.

In 1997, Sergey Brin and Larry Page created the Google search engine as part of a research project at Stanford University. Google is currently the most popular search engine in the world!

In September 1997, the Yandex search engine was officially announced; it is the most popular search engine on the Russian-speaking Internet.

Currently, there are three main search engines (international) - Google, Yahoo and with their own databases and search algorithms. Most of the other search engines (of which there are a large number) use in one form or another the results of the three listed. For example, AOL search (search.aol.com) uses a Google base, while AltaVista, Lycos, and AllTheWeb use a Yahoo base.

5. The composition and principles of the search engine

In Russia, the main search engine is Yandex, followed by Rambler.ru, Google.ru, Aport.ru, and Mail.ru. At the moment, Mail.ru uses the Yandex search engine and database.

Almost every major search engine has its own structure that differs from the others. However, it is possible to single out the main components common to all search engines. The structural differences come down to how the interaction between these components is implemented.

Indexing module

The indexing module consists of three auxiliary programs (robots):

Spider: a program designed to download web pages. The spider downloads a page and extracts all internal links from it. The HTML code of each page is downloaded. Robots use the HTTP protocol to download pages. The spider works as follows: the robot sends the server a "GET /path/document" request along with some other HTTP request headers. In response, the robot receives a text stream containing service information and the document itself. For each downloaded page, the following is saved:

  • Page url
  • the date the page was downloaded
  • server response http header
  • page body (html-code)
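The exchange described above can be sketched without touching the network. In the Python example below, `build_request` composes the text of a minimal GET request and `parse_response` splits a raw server reply into the fields listed above. Both function names, the record layout, and the sample reply are assumptions made for illustration; a real spider would also handle redirects, encodings, chunked bodies, and errors.

```python
from datetime import datetime, timezone


def build_request(host, path):
    """Compose the text of a minimal HTTP/1.1 GET request."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"


def parse_response(url, raw, fetched_at):
    """Split a raw HTTP response into the record a spider might store."""
    # Headers and body are separated by an empty line.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "url": url,                                # page URL
        "fetched_at": fetched_at,                  # date the page was downloaded
        "status": int(status_line.split()[1]),     # e.g. 200
        "headers": headers,                        # server response HTTP header
        "body": body,                              # page body (HTML code)
    }


# A hand-written reply standing in for what a server would send back.
raw = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "\r\n"
    "<html><body>Hello</body></html>"
)
record = parse_response(
    "http://example.com/path/document",
    raw,
    datetime(2024, 1, 1, tzinfo=timezone.utc),
)

print(build_request("example.com", "/path/document").splitlines()[0])
print(record["status"], record["headers"]["content-type"])
```

The "service information" mentioned in the text is the status line and headers; everything after the first empty line is the document itself.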

Crawler (a "traveling" spider): a program that automatically follows all the links found on a page. It picks out every link present on the page, and its task is to determine where the spider should go next, based on those links or on a predefined list of addresses. Following the links it finds, the crawler searches for new documents that are still unknown to the search engine.
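As a rough illustration of the crawler's job, the Python sketch below extracts links from a page with the standard `html.parser` module and keeps only those that are internal and not yet known. The class `LinkExtractor`, the function `next_urls`, and the rule "same host means internal" are assumptions for the example, not a description of any real search robot.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page address.
                self.links.append(urljoin(self.base_url, value))


def next_urls(base_url, html, known):
    """Return internal links from the page that the crawler has not seen yet."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    host = urlparse(base_url).netloc
    fresh = []
    for link in parser.links:
        # Assumption: a link is "internal" if it points to the same host.
        if urlparse(link).netloc == host and link not in known and link not in fresh:
            fresh.append(link)
    return fresh


html = (
    '<a href="/about">About</a> '
    '<a href="post.html">Post</a> '
    '<a href="http://other.example/">Out</a>'
)
known = {"http://blog.example/about"}
print(next_urls("http://blog.example/index.html", html, known))
# ['http://blog.example/post.html']
```

The already known page and the external link are skipped, so the spider is sent only to the one new document.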

Indexer is a program that analyzes web pages downloaded by spiders. The indexer parses the page into its constituent parts and analyzes them using its own lexical and morphological algorithms. Various page elements are analyzed, such as text, headings, links, structural and style features, special service html tags, etc.

Thus, the indexing module allows you to crawl a given set of resources by links, download pages encountered, extract links to new pages from received documents and perform a complete analysis of these documents.

Database

A database, or an index of a search engine, is a data storage system, an information array that stores specially converted parameters of all documents downloaded and processed by the indexing module.

Search Server

The search server is an essential element of the entire system, since the quality and speed of search directly depends on the algorithms that underlie its functioning.

The search engine works as follows:

  • The request received from the user undergoes morphological analysis. The information environment of each document contained in the database is generated (it will later be displayed as a snippet, that is, text information on the search results page that corresponds to the request).
  • The resulting data is passed as input parameters to a special ranking module. The data for all documents is processed, and for each document its own rating is calculated, characterizing how relevant the document's various components stored in the search engine index are to the query entered by the user.
  • Depending on the user's choice, this rating can be adjusted by additional conditions (for example, the so-called "advanced search").
  • Next, a snippet is generated: for each found document, the title, a short annotation that best matches the request, and a link to the document itself are extracted from the document table, and the found words are highlighted.
  • The resulting search results are returned to the user as a SERP (Search Engine Result Page), the search results page.
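The steps above can be sketched as a tiny pipeline in Python. This is only an illustration under strong assumptions: morphological analysis is replaced by lowercasing, the rating is a plain count of query words, and highlighting is done with `**` markers. The functions `score`, `make_snippet`, and `search`, as well as the sample documents, are hypothetical.

```python
import re


def score(text, query_words):
    """Rating: how many times the query words occur in the text."""
    words = text.lower().split()
    return sum(words.count(w) for w in query_words)


def make_snippet(text, query_words, width=40):
    """Cut a window around the first match and highlight the found words."""
    lowered = text.lower()
    starts = [lowered.find(w) for w in query_words if w in lowered]
    if not starts:
        return text[:width]
    begin = max(min(starts) - width // 2, 0)
    window = text[begin:begin + width]
    pattern = re.compile("|".join(re.escape(w) for w in query_words), re.IGNORECASE)
    return pattern.sub(lambda m: f"**{m.group(0)}**", window)


def search(documents, query, limit=10):
    """Return a list of results (title, url, snippet) ordered by rating."""
    # Stand-in for morphological analysis: lowercase and split the query.
    query_words = query.lower().split()
    results = []
    for doc in documents:
        rating = score(doc["text"], query_words)
        if rating > 0:
            results.append({
                "title": doc["title"],
                "url": doc["url"],
                "snippet": make_snippet(doc["text"], query_words),
                "rating": rating,
            })
    return sorted(results, key=lambda r: r["rating"], reverse=True)[:limit]


documents = [
    {"title": "Radio guide", "url": "http://example.com/radio",
     "text": "How to install a radio in a car"},
    {"title": "Car guide", "url": "http://example.com/car",
     "text": "Choose a car and learn how to choose a car wisely"},
]
results = search(documents, "choose car")
print([r["title"] for r in results])  # ['Car guide', 'Radio guide']
```

Each entry in `results` carries the three things the text says a snippet is built from: the title, a short highlighted excerpt, and the link to the document.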

As you can see, all these components are closely related to each other and work in interaction, forming a clear, rather complex mechanism of the search engine operation, which requires a huge expenditure of resources.

6. Conclusion

Now let's summarize all of the above.

  • The primary task of any search engine is to deliver people exactly the information they are looking for.
  • The main characteristics of search engines:
    1. Completeness
    2. Accuracy
    3. Relevance (freshness)
    4. Search speed
    5. Visibility
  • The first full-fledged search engine was the WebCrawler project, launched in 1994.
  • The search engine includes the following components:
    1. Indexing module
    2. Database
    3. Search Server

We hope that our master class will allow you to get a closer look at the concept of search engines, to better know the main functions, characteristics and the principle of operation of search engines.

We are not as unique as we think: millions of people before us have puzzled and millions after us will puzzle the search engine with almost the same questions. On the other hand, we are too unpredictable: the formulation of our request is influenced by a huge number of factors that we do not understand. And at least for this reason, the request of each of us, no matter how banal it may be, requires an individual approach.

In fact, the entire work of the search engine "Yandex" is reduced to two simple things: to understand what a person really wants to know, and in a few seconds to find suitable documents for him among the billions of documents on the Web.

Take prints

The search engine's system is somewhat similar to the Matrix, and the search robot (a complex program that makes decisions on its own) is like Agent Smith.

So as not to search the entire Internet every time someone needs to find something out, the search engine does part of the work in advance: with the help of thousands of search robots, it checks what is on the Web and where it is located. The robots are of two types: main and fast. The main robot crawls and processes the Internet as a whole, while the fast robot handles documents that appeared a minute or even a couple of seconds ago. The task of the robot programs is to select information that is useful for users and to process it, filtering out everything outdated and unnecessary. In some ways this resembles sorting garbage: paper in one container, glass in another, plastic in a third, food waste in a fourth...

The information collected by the robots forms the so-called “snapshot of the Internet”. It is stored on thousands of Yandex servers and is constantly updated. The snapshot is like a list that tells you where to find which information. In this list, each keyword has not one but millions of "pages". For all updates to the snapshot to become available to users, they are transferred from the repository to the "basic search". Data from the main robot is transferred every few days, and data from the fast robot in real time.

Bringing things to light



ILLUSTRATION: EUGENE TONKONOGIY

When looking for an answer to a question in the prepared base, the machine faces two main difficulties. The first is language. Before looking for an answer, the machine needs to understand which language to search in. For example, for a Russian speaker the query "дружина князя Игоря" ("Prince Igor's druzhina") will return documents about his armed retinue, while for a Ukrainian speaker the same query will also return documents mentioning Princess Olga, his spouse, since in Ukrainian "дружина" means "wife". Even within the rich Russian language, the same word form can mean different things. For example, "стали" is both a form of the noun "сталь" (steel) and a form of the verb "стать" (to become). The second difficulty is human psychology. When we enter a request, we expect a quick and accurate answer, without worrying, of course, about whether our wording fits the principles of mathematical analysis by which the machine's brain works. For example, when typing the word "napoleon" into the search bar, what does a person want: a recipe for a cake or a biography of the French emperor, to buy brandy or to find the address of a mental hospital?


In such situations, several technologies come into play at once. The search engine can offer a few suggestions under the search bar that refine your query, as if to say: choose what you need, Napoleon recipes or Napoleon Bonaparte. If the user does not respond to the machine's prompt and adds no words to "napoleon", the "Spectrum" technology helps: without counting on help, the machine immediately searches in several categories (the cake, the emperor, the brandy...). In addition, personalization mechanisms help to understand the user, that is, the machine's knowledge of what this user searched for from their computer a day, two, three, or a month ago: if you have often asked Yandex questions about cooking, the machine will first show you results saying that Napoleon is a cake.

Combinations: clubs of interest

The task of a search engine is not limited to simply selecting documents that contain the words and phrases from a search query. The machine needs to understand which documents meet our conflicting requirements and why. Perhaps we want information about the Napoleon cake, or perhaps we have been going to a fitness club with a pretentious name for a couple of years, or perhaps we are preoccupied with the complexes of short people. In any case, solving the problem requires a non-trivial approach.


The creators of the Yandex search program found this approach by delegating the choice to the machine. On the one hand, a soulless but very fast and intelligent machine neither knows nor wants to know anything about us as individuals; on the other, it tries to find out as much as possible about each of us.

In addition to the user's geographic location and the linguistic analysis of their queries, the search engine uses several thousand criteria that are not at all obvious to humans.

The trick is that the machine develops and updates these criteria on its own.

It simply takes data on the preferences and behavior of millions of people and ties this “arithmetic mean” to our query history. The principles that guide the Matrix internally, as it compares the thousands of categories of user interests it has developed, often do not fit traditional human notions of what “interests” can be. There are tens of thousands of these categories. They form different, sometimes funny, combinations with each other. For example, one such combination may mean that the search results match the interests of a person who breeds newts: someone who is not merely interested in newts but already breeds them, though only for the first year.

Assessments. Helping hands


The Matrix, of course, decides for itself (with the help of higher mathematics) what to show users and in what order, based on tens of thousands of criteria. But the Matrix also uses living people: 1000 Yandex employees, the so-called assessors, evaluate the search results for particular queries (of course, not every query is evaluated, and this is not done in real time) for how well they meet the expectations of an ordinary user, who is not as rational as a machine, not as precise in wording, and is contradictory and emotional.

Today we set off on another long journey along the ornate paths of search engine development ( Яndex, Yandex). I think that the domestic giant of network search has long ago grown to such a level that it is not too lazy to dig all its lobbies, remember how the Yandex search engine developed, what was interesting over the years of its existence.

Moreover, it receives a lot of visitors from the Yandex search engine. Many of them leave by contextual advertising, I recently accepted the blog, so I think this company is more than worthy of a big post about it.

If we take into account the Russian Internet, then here Yandex is the undisputed leader. In Russia, this is the first search engine in terms of importance. There are regional search engines, a kind of branches in Belarus, Ukraine, Kazakhstan. Yandex is very popular among residents of these countries. I can judge this at least by statistics, seeing that a lot of visitors come from other regions.

Currently Yandex is not only a search engine, it is also numerous services that can be accessed by absolutely all users of this search engine. Here you can find and necessary information, and navigate the choice of leisure, find pictures, goods, compare prices, watch the weather, communicate on a social network, watch the schedule of TV programs, transport. There are numerous corporate solutions. You can even go to Narod.ru. Yandex has a convenient system that implements functionality for working with your sites. Among the latest available innovations of the service - which remained paid for a long time, but in December 2011 this service became available to absolutely everyone.

You can go on about the wonderful technologies and useful services of Yandex for a very, very long time. Therefore, to make the information easier to take in, I will break our journey into parts and describe the entire path of the search engine in chronological order, year by year, from its creation to the present.

Yandex development history

1980s - 1990s

The history of Yandex has its roots in the now distant 1980s, when the USSR still existed. It was then that the Arcadia company first began developing search software. The work was carried out under the leadership of Arkady Borkovsky and Arkady Volozh. That first search technology received the name "Яndex". The Yandex site itself, the same one that we can see today, appeared in 1996. The developments carried out at that time were recognized as promising, and as a result the management of CompTek (a seller of computers and components) and the developers of the system decided that it made sense to develop the technology further and bring it to the masses. To this end, a concept for developing the project was prepared, aimed at a wide audience.

Yandex was officially announced only on September 23, 1997. In fact, at first it was one of the divisions of CompTek International. That is, it had next to no independence. Only in 2000 did Yandex become the company you can see today, in the sense that it became completely independent.

By the way, the company came up with a name long before the search engine was announced. Yandex means "Language index" or, in English, "Yet Another indexer". However, later, as the search engine developed, other interpretations began to appear. For example, if you take the English word Index and translate its first letter from English into Russian (I is "Я"), you get "Яndex".

The name "Yandex" was coined by Ilya Segalovich (the current director of technology) and Arkady Volozh.

A year before the company's official launch, on October 18, 1996, the Netcom'96 exhibition took place, at which CompTek presented the first products of the developing search engine. These were Yandex.Site and Yandex.Dict. Six months later Yandex.CD appeared, a search for documents on CD-ROM, and then the Yandex.Lib project started. It was a Yandex library package designed to be embedded in all sorts of applications and databases.

At the time when Yandex.ru was officially presented to the public, the following features stood out as interesting:

    Assessment of the relevance of documents. At that time, Yandex was pretty good at finding copies and excluding them. At the same time, documents were searched in various encodings.

    Search by exact word form. Yashka knew how to search based on morphology.

    Search based on distance. Yandex was able to search within a paragraph and by exact phrases.

    A working core for assessing the relevance of pages. For each query, documents were selected according to their relevance to the query. In addition, the frequency (density) of the keyword on the page was taken into account when selecting documents for search results. By the way, precisely because this algorithm was imperfect at that time, pages densely packed with keywords and virtually meaningless appeared at the top of the search results.

    Also, the search took into account the distance between words, and how the words are located in the document.
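To make the last two points more tangible, here is a minimal sketch in Python of two such signals: keyword density and the distance between query words. This is only an illustration under my own assumptions, not Yandex's actual code; the function names and the sample page are made up for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def keyword_density(text, keyword):
    """Share of tokens in the text that equal the keyword (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return tokens.count(keyword.lower()) / len(tokens)

def min_word_distance(text, first, second):
    """Smallest gap, in tokens, between any occurrences of the two words.

    Returns None if either word is missing from the text.
    """
    tokens = tokenize(text)
    first_pos = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_pos = [i for i, t in enumerate(tokens) if t == second.lower()]
    if not first_pos or not second_pos:
        return None
    return min(abs(a - b) for a in first_pos for b in second_pos)

# A hypothetical keyword-stuffed page, made up for the example.
page = "Buy cheap phones. Cheap phones here. Phones, phones, phones!"
density = keyword_density(page, "phones")
distance = min_word_distance(page, "cheap", "phones")
missing = min_word_distance(page, "cheap", "laptops")
```

A ranking that rewards density alone would put this stuffed page high in the results, which is exactly the weakness described above.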

Yandex website design

The very first design of the Yandex site was rather primitive and imperfect. It was developed by the well-known Artemy Lebedev. It looked like this

By the way, the Yandex forum was opened in the same year. It was intended for communication between the system's users and its developers. The idea was good and the forum functioned well. True, it existed only until 2008. Then there was a slight reshuffle of priorities. As far as I can tell, preference was given to socialization. Yandex began to actively develop its own social network, and on its basis the current blog appeared, where all Yandex announcements are published and where, in fact, users and developers now communicate. You can see for yourself: the old forum URL ( http://forum.yandex.ru/yandex/) today redirects to the well-known http://webmaster.ya.ru/.

1998

The project, once started, showed good potential, and work on it continued. In 1998 the search engine was improved and many new features were introduced for users. In particular, it became possible to search within the results, search for similar documents and much more. Work also continued on the design of the Yandex home page. Now it changed a little

As you can see, outwardly nothing changed much. For the most part, the work carried out was technical.

1999

Over the year, the audience of the Russian segment of the Internet grew significantly. Along with it, the quality and technology of Yandex grew, and the developers introduced many improvements. The Yandex search engine got a new search bot, which significantly increased the speed of crawling documents on the web.

The innovations that affected the user-facing functionality were as follows:

    Now you can search more specifically - by annotations, signatures, pictures, titles

    We have introduced a search restriction for a group of sites

    Separately highlighted documents in Russian

By the way, it was in 1999 that the concept now well known to everyone, the TIC (thematic citation index), was first introduced. True, back then it was calculated rather primitively: the authority of a site (its TIC) depended largely on the number of sites that linked to the domain in question.
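The "primitive" idea of authority as the number of linking sites can be sketched in a few lines of Python. This is a toy illustration under my own assumptions, not Yandex's real TIC formula; the function name and the example domains are hypothetical.

```python
from urllib.parse import urlparse

def linking_site_count(links, target_domain):
    """Count distinct external domains that link to target_domain.

    links is an iterable of (source_url, target_url) pairs.
    """
    sources = set()
    for source_url, target_url in links:
        source = urlparse(source_url).hostname
        target = urlparse(target_url).hostname
        # Ignore links to other domains and a site's links to itself.
        if target == target_domain and source and source != target_domain:
            sources.add(source)
    return len(sources)

# Hypothetical link graph, made up for the example.
links = [
    ("http://a.example/page1", "http://blog.example/"),
    ("http://a.example/page2", "http://blog.example/post"),
    ("http://b.example/", "http://blog.example/"),
    ("http://blog.example/about", "http://blog.example/"),
    ("http://c.example/", "http://other.example/"),
]
authority = linking_site_count(links, "blog.example")
```

Note that two links from the same site count once here: the sketch counts linking sites, not individual links.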

The design of the main page, by the way, also changed. Now it became something more similar to the current one.

1999 also saw another significant event. It was then that a free site builder appeared, better known to all of us as Narod.ru (free hosting and file hosting). By the way, this project still exists. The motto of this project was as follows - in 60 seconds.

2000

Perhaps it was the introduction of new services that allowed Yandex to reach a fundamentally new level of development. Over time, the search engine firmly cemented its status, which made it possible significantly. In fact, it was already a new project, not the one that started under the auspices of CompTek.

In 1999, Arkady Volozh, realizing the project's prospects, began to devote himself solely to promoting Yandex. The difficulty was that he needed to find experienced partners with corporate-building skills who would invest in the development of the project but would not demand that management be handed over to them completely.

And such a partner was found. It was the company ru-Net Holdings. In the spring of 2000, an investment agreement was signed with this company. Here, however, there were some sacrifices: a certain share of the search engine still had to be given away. According to the agreement, one third of the search engine went to the investor. From that moment on, Yandex ceased to be a structural division of CompTek and became an independent company with its own offices, its own management, its own budget, etc. Arkady Volozh became the company's General Director.

I think Yandex was very lucky with its first head, because Volozh turned out to be not only a specialist in finding potential partners but also a good innovator. Once the company set out on its independent "voyage", tremendous changes began. The staff increased significantly, and the resource itself received a new impetus from its leaders.

In total, ru-Net Holdings invested about $5 million. What can I say, the deal turned out to be very profitable, especially considering that today the number one search engine of the Russian Internet would cost at least several hundred million dollars. And that is by the most conservative estimate.

The year 2000 was also significant because it was then that Yandex's multi-portal nature began to emerge more clearly: many services appeared that were not directly tied to search. These include Yandex.News, Yandex.Mail, Postcards and a search bar at ya.ru. In addition, there were many services that later merged into what we know today as Yandex.Market. Another significant innovation was the introduction of specialized software for integration into user browsers - Yandex Bar.

2001

This year became a turning point, because in 2001 Yandex became the leader of the Russian Internet in terms of traffic. In addition, the amount of information stored on the company's servers grew to 1 terabyte. By the way, Yandex.Pictures appeared this year as well, and so did an electronic payment system, Yandex.Money.

In addition, the design of the Yandex home page has been significantly improved. Links to new services and news appeared here. We can say that, in general, the outlines of today's Yandex have already appeared.

2002

This year, the developers worked actively to improve the communication service, Yandex.Mail. A lot of work was done on filtering correspondence. 2002 was also the year when three services, Goods, Guru and Pick, merged into one - Yandex.Market. By the way, you can see for yourself that this service is very relevant even today. In 2002, perhaps for the first time in all the years of investment, a goal appeared: to reach self-sufficiency. It was necessary to develop a strategy for monetizing the project, one that would bring stable and large profits in the future. An advertising model became that strategy, and it was precisely the income the company began to receive from this advertising model that made it possible to reach self-sufficiency much earlier than expected. Therefore, we can say that 2002 was a turning point in the move to a business-oriented model, which, moreover, had already begun to bear fruit.

2003

Active work on the Yandex.Mail service continued this year. Another round of massive changes was introduced that affected all users of the system. Of course, Yandex.Mail became more functional and convenient. Looking ahead, I want to say that the service kept developing very actively afterwards, and its users saw many interesting new features more than once. In particular, users received an unlimited mailbox size and a new spam filter, "Spam Defense". In 2003, the Yandex design was renewed again.

By the way, each design corresponded to a certain version. The 2003 version of the design was the eighth in a row, and it looked like this

The rollout of any new design version initially goes through a beta testing period. If before that beta tests had been conducted in closed mode, this time, during two weeks of trial testing, anyone could get access to the new interface. True, a year later the main page was upgraded again, this time more successfully, and it existed in that form until 2007.

Even at that time Yandex was already a fairly reputable company: in 2003 the Yandex search engine was successfully integrated into the presidential website. In the fall of 2003, the developers rolled out the next product updates: Yandex.Publisher and Yandex.Server (Яndex.Server), which became the successor to Yandex.Site.

2004

The business model employed by the number one search engine of the Russian Internet worked very well, and as a result the company's profit in 2004 already amounted to tens of millions of dollars. This gave impetus to the development of new services, for example, a map search service, blogs and forums. 2004 is also notable for the fact that it was then that a serious competitor, Google, appeared on the Russian market. There was an urgent need to join the struggle for leadership, so the management of Yandex decided to increase the number of employees tenfold. Initially there were 200 employees; after the expansion there were 2,000. But the main thing is that after the expansion nothing changed for the worse. The traditions remained, and the technologies were also up to par. On the whole, we can say that Yasha did not turn into a dry corporation.

The battle of technology: Yandex vs Google

2005

This year passed under the banner of the company's geographical expansion: a Ukrainian representative office of Yandex appeared - Yandex.Ukraine. By the way, the director of this representative office is Sergey Petrenko, the founder of the well-known serch and the author of the interesting blog BloGnot.

2005 was also significant because a service I am very fond of was opened. It is a service built on the "kolotibablo webmaster" principle or, in plain language, a service that allows webmasters to place advertisements on their sites.

Yandex.Dictionaries appeared in the same year. Changes have also taken place in the Yandex.Money service. Now all users have the opportunity to manage their account through the Internet wallet.

2006

This year will be remembered for the appearance of the now well-known service blogs.yandex.ru. It is a kind of marketing tool that made it possible to study public opinion and reviews in blogs and forums. Yandex.Maps introduced a traffic display tool.

From 2006 to 2010 Yandex was located in the old office on Samokatnaya street in Moscow.





As you can see, things were a bit cramped back then. Nothing like now: a huge building with 2,000 employees.

In 2006 there was another interesting event - the first remote development office was opened in St. Petersburg. Back then, of course, the scale was not yet the same. Nothing like today, when Yandex has 11 offices in Russia, Ukraine, Turkey and even California. The offices are differentiated by type of activity: there are development offices, sales offices and offices that work on product localization.

2007

This year there were more webmaster-oriented events. In particular, the Yandex.Photo service appeared. But for me, as a webmaster, a more interesting event was the emergence of the Yandex.Metrica service. True, at that time it was a completely raw service, and it was aimed not at webmasters but at Yandex.Direct advertisers. In the same year, a Ukrainian representative office was opened - Yandex.ua. Today, according to LiveInternet data, almost 14% of Russian-language traffic falls on yandex.ua.

Also in 2007, a project known to all webmasters was launched, one that today probably only the lazy do not use.

2008

We can say that this year Yandex's sphere of influence grew so much that it was decided to open a branch of the search engine in the USA, in California. At the same time, significant additions were made to the algorithms. In particular, the international standards Sitemap, MediaRSS, etc. began to be supported. That is, as you can see, the company's interests went far beyond the Russian Internet: the number one search engine of the Russian Internet now turned to English-language sites. Before that, the domestic search engine did not support international standards, which caused problems with indexing sites from the burzhunet (the foreign Internet), but after the 2008 upgrade this problem was solved. And after that, the Yandex logo began to be written entirely in Russian.

2009

This year was significant because before it there was no division of search by region. That is, before this change the algorithm was built on the principle of uniformity: you entered the same query in the search bar in Moscow and in Novosibirsk, and you got the same results. Now everything has changed, and the search results are mixed with results selected on the principle of geo-dependence. Simply put, if you search with Yandex in Moscow and in Novosibirsk, the results will be different.
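The principle of geo-dependence can be illustrated with a small Python sketch. It is only a toy model under my own assumptions: the base scores, the size of the regional bonus and the example sites are hypothetical, and the real algorithm is certainly more complex.

```python
def rank_for_region(results, user_region, boost=0.5):
    """Order results by base score plus a bonus for matching the user's region.

    results is a list of dicts with "url", "score" and "region" keys;
    a region of None marks a result that is not tied to any place.
    """
    def adjusted(result):
        matches = result["region"] is not None and result["region"] == user_region
        return result["score"] + (boost if matches else 0.0)
    return sorted(results, key=adjusted, reverse=True)

# Hypothetical results for the same query, made up for the example.
results = [
    {"url": "pizza-msk.example", "score": 0.9, "region": "Moscow"},
    {"url": "pizza-nsk.example", "score": 0.8, "region": "Novosibirsk"},
    {"url": "pizza-wiki.example", "score": 0.7, "region": None},
]
moscow_top = rank_for_region(results, "Moscow")[0]["url"]
novosibirsk_top = rank_for_region(results, "Novosibirsk")[0]["url"]
```

With the same query and the same set of documents, the user in Moscow sees the Moscow site first and the user in Novosibirsk sees the Novosibirsk one.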

In 2009, work continued in the previously selected western direction. In particular, a service was tested with the help of which foreign sites were translated. Then this service evolved, and in 2011 it became known as Yandex.Translation.

An equally important event was the introduction of a new machine learning method, Matrixnet. This technology builds its assessment from various patterns and takes into account various ranking factors. But the main thing is that the technology is self-learning. When it learns from the assessors' evaluations, only real patterns are picked up; finding nonexistent ones is completely excluded.

The revolutionary nature of this technology lies in the fact that Matrixnet uses an incredibly complex ranking formula that takes into account a huge number of factors. This, on the one hand, makes it possible to achieve better search results, and on the other hand, it does not allow webmasters to understand this pattern, and, therefore, to influence it in their own interests.
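To give a feel for what "self-learning from assessors' evaluations" means, here is a minimal Python sketch of gradient boosting over one-split decision trees (stumps), the general family of methods Matrixnet belongs to. It is a toy under my own assumptions and in no way Yandex's implementation: the two features (text match and link score), the labels and all names are hypothetical.

```python
def fit_stump(rows, residuals):
    """Find the single-feature threshold split that best fits the residuals."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [r for row, r in zip(rows, residuals) if row[feature] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[feature] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, feature, threshold, left_mean, right_mean)
    return None if best is None else best[1:]

def predict(model, row):
    """Score a page: the base value plus the scaled votes of all stumps."""
    base, rate, stumps = model
    score = base
    for feature, threshold, left_mean, right_mean in stumps:
        score += rate * (left_mean if row[feature] <= threshold else right_mean)
    return score

def fit_boosting(rows, labels, rounds=20, rate=0.5):
    """Each round fits a stump to the errors left by the previous rounds."""
    model = (sum(labels) / len(labels), rate, [])
    for _ in range(rounds):
        residuals = [y - predict(model, row) for row, y in zip(rows, labels)]
        stump = fit_stump(rows, residuals)
        if stump is None:  # no usable split, e.g. all rows are identical
            break
        model[2].append(stump)
    return model

# Hypothetical pages: (text match, link score) with assessor labels in [0, 1].
rows = [(0.9, 0.8), (0.8, 0.1), (0.2, 0.9), (0.1, 0.1)]
labels = [1.0, 0.7, 0.3, 0.0]
model = fit_boosting(rows, labels)
good_score = predict(model, rows[0])
bad_score = predict(model, rows[3])
```

Even in this toy the final "formula" is a sum of twenty small rules, which hints at why a webmaster cannot easily work out a real formula built from a huge number of factors.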

MatrixNet technology in detail:

2010

The old office on Samokatnaya street is a thing of the past, and the whole company has moved to new premises. Actually, this became the main event of 2010.