Basic Concepts of Search Engines

Search engines include several basic concepts such as crawling, indexing, retrieval, coarse ranking, fine ranking, and re-ranking.

Crawling & Indexing

Search engines first crawl a huge number of pages across the web, evaluate the basic quality of these pages, and keep the higher-quality ones. The search engine then builds an inverted index so that users can retrieve relevant documents through keyword queries.
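For intuition, here is a minimal Python sketch of the inverted-index idea, assuming a toy document set and a naive whitespace tokenizer (real engines segment text, especially Chinese, far more carefully):

```python
from collections import defaultdict

def tokenize(text):
    # Naive whitespace tokenizer; real engines use far more
    # sophisticated word segmentation.
    return text.lower().split()

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

docs = {
    1: "search engines crawl and index pages",
    2: "an inverted index maps terms to documents",
    3: "users retrieve documents through keyword queries",
}

index = build_inverted_index(docs)
print(index["index"])      # {1, 2}
print(index["documents"])  # {2, 3}
```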

Retrieval

When a user issues a search request, the search engine first corrects the query, then segments it into terms and looks up documents containing those terms in the index. In this way, it recalls a subset of documents related to the user's search terms for display.
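A rough sketch of this correct-then-segment-then-look-up pipeline, assuming a tiny hard-coded correction table and inverted index:

```python
# Toy retrieval pipeline: correct the query, split it into terms,
# then intersect the posting lists from an inverted index.

CORRECTIONS = {"serch": "search"}          # assumed, tiny correction table

INDEX = {                                  # assumed toy inverted index
    "search":  {1, 3},
    "engine":  {1},
    "ranking": {2, 3},
}

def correct(query):
    return " ".join(CORRECTIONS.get(w, w) for w in query.lower().split())

def segment(query):
    # Whitespace split stands in for real word segmentation.
    return query.split()

def retrieve(query):
    terms = segment(correct(query))
    postings = [INDEX.get(t, set()) for t in terms]
    if not postings:
        return set()
    return set.intersection(*postings)     # docs containing every term

print(retrieve("serch ranking"))           # {3}
```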

Coarse ranking

From the massive set of retrieved documents, the search engine removes highly duplicated content and then selects the most relevant documents to show to the user. Typically, only about 760 documents remain available for the user to view.
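The sketch below imitates this coarse-ranking idea with a crude word-set similarity check for duplicates and a cutoff of 760 candidates; the similarity measure and scores are illustrative assumptions, not how any real engine deduplicates:

```python
def near_duplicate(a, b, threshold=0.8):
    """Crude Jaccard similarity on word sets as a stand-in for
    real duplicate detection (e.g. shingling / simhash)."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) >= threshold

def coarse_rank(candidates, limit=760):
    """candidates: list of (doc_text, cheap_relevance_score)."""
    kept = []
    for text, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        if any(near_duplicate(text, k) for k, _ in kept):
            continue                      # skip highly duplicated content
        kept.append((text, score))
        if len(kept) == limit:
            break
    return kept

candidates = [
    ("fast seo ranking tips", 0.9),
    ("fast seo ranking tips 2024", 0.85),   # near-duplicate of the first
    ("how search engines index pages", 0.7),
]
print([t for t, _ in coarse_rank(candidates)])
```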

Fine ranking

Sorting the roughly 760 documents that will be shown to users is called fine ranking. The goal is to make the ordering match users' expectations as closely as possible, increase the likelihood of clicks, and improve the search engine's business value. This step is more complex than coarse ranking and involves multiple techniques such as data mining, machine learning, user behavior analysis, and user intent recognition.
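As a simplified illustration of how several signals might be combined, the sketch below scores documents with a hand-weighted sum of a few made-up features (relevance, quality, predicted click-through); production systems learn such models from data rather than using fixed weights:

```python
# Illustrative fine-ranking score: a weighted combination of features.
# The feature names and weights are assumptions for demonstration only.

WEIGHTS = {"relevance": 0.5, "quality": 0.3, "predicted_ctr": 0.2}

def fine_rank(candidates):
    """candidates: list of dicts with the feature keys above plus 'url'."""
    def score(doc):
        return sum(WEIGHTS[f] * doc[f] for f in WEIGHTS)
    return sorted(candidates, key=score, reverse=True)

docs = [
    {"url": "a.example", "relevance": 0.9, "quality": 0.4, "predicted_ctr": 0.2},
    {"url": "b.example", "relevance": 0.7, "quality": 0.9, "predicted_ctr": 0.6},
]
print([d["url"] for d in fine_rank(docs)])  # ['b.example', 'a.example']
```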

Re-ranking

After fine ranking, a re-ranking step adjusts the results in real time based on the user's search scenario and current media hotspots. For example, searching from different network environments or devices may produce different results.
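A toy example of such context-based adjustment, assuming a made-up boost for mobile-friendly pages when the query comes from a mobile device:

```python
def re_rank(ranked_docs, context):
    """Adjust fine-ranked scores using the search context.
    ranked_docs: list of dicts with 'score' and 'mobile_friendly'.
    context: dict such as {'device': 'mobile'} -- both are assumptions."""
    adjusted = []
    for doc in ranked_docs:
        score = doc["score"]
        if context.get("device") == "mobile" and doc.get("mobile_friendly"):
            score *= 1.2                      # illustrative context boost
        adjusted.append({**doc, "score": score})
    return sorted(adjusted, key=lambda d: d["score"], reverse=True)

docs = [
    {"url": "a.example", "score": 0.70, "mobile_friendly": False},
    {"url": "b.example", "score": 0.65, "mobile_friendly": True},
]
print([d["url"] for d in re_rank(docs, {"device": "mobile"})])  # ['b.example', 'a.example']
```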

How to study rankings (one's own opinion)

Study inclusion vs Study ranking

In the past, studying rankings may have been more challenging than studying inclusion, though dealing with issues such as advertiser problems and domain name resolution can be even more challenging. For large websites the probability of participating in rankings is higher, while for the mass of small websites the chances of even being recalled are smaller.

Study large websites vs Study small websites

The methods for researching rankings differ between large and small websites. Large websites may focus more on how to push their keywords onto the first page, while small websites care more about which keywords they can realistically get onto the first page.

Positive Push

Positive push means assuming a series of conditions, testing them one by one, and checking whether the expected results are achieved. When studying the rules of Baidu SEO rankings, the threshold for positive push is relatively high.

Reverse Push

Reverse push means analyzing existing results to find patterns. When studying the rules of Baidu SEO rankings, reverse push tends to be more effective, but it requires a great deal of observation and analysis of existing results.

Generally speaking, the basic concepts of search engines include the crawling, indexing, retrieval, coarse ranking, fine ranking, and re-ranking stages. These are introductory common knowledge about search engine systems.

Introduction to Reverse Push

In website optimization, reverse push is easier to apply than positive push, and long-tail keywords play a crucial role in SEO.

The importance of long-tail keywords

In the past, the keyword-carrying program used in site-cluster construction did not support broad (wildcard) resolution, so it could only carry a limited number of keywords. If the keyword library therefore contains keywords that can never rank on the first page, it is simply a waste of system resources. For the limited set of new domain names we want indexed, we naturally hope the indexed pages have a higher probability of ranking.

A traffic word's search results can show one of several characteristics, each pointing to a different situation:

1) High Aizhan weight across the top-10 results: high competition.
2) Low Aizhan weight across the top-10 results: the word may be banned, or simply not yet discovered.
3) Low search result count: a banned word whose results Baidu has already deleted.
4) Mixed high and low search result counts: a normal word.

For keywords that are prioritized for ranking, the cleaning measure is to identify words of the first and third types and delete them, as in the sketch below.
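A minimal Python sketch of this cleaning rule, assuming each keyword record carries a top-10 Aizhan weight and a search result count; the thresholds and field names are illustrative assumptions, not Baidu's actual criteria:

```python
def classify(word):
    """word: dict with 'top10_aizhan_weight' and 'result_count'.
    Thresholds below are illustrative assumptions only."""
    if word["top10_aizhan_weight"] >= 5:
        return "high_competition"        # type 1: delete
    if word["result_count"] < 1000:
        return "deleted_banned"          # type 3: delete
    return "keep"                        # possibly undiscovered or normal word

def clean(keywords):
    return [w for w in keywords if classify(w) == "keep"]

keywords = [
    {"word": "w1", "top10_aizhan_weight": 6, "result_count": 2_000_000},
    {"word": "w2", "top10_aizhan_weight": 1, "result_count": 500},
    {"word": "w3", "top10_aizhan_weight": 2, "result_count": 300_000},
]
print([w["word"] for w in clean(keywords)])  # ['w3']
```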

Therefore, when selecting keywords, clean them first and only then put them online. With the same number of domain names, this cleaning step can increase traffic by roughly 25%.

Core word ranking strategy

Earlier optimization work includes a method of keyword brushing, which improves rankings by brushing core words while creating new words at the same time. By adjusting the ratio between the two and extending the brushing period appropriately, the original core words and the newly created words become associated, which increases the probability that the core words will rank.

When search engines process user queries, error correction comes first, followed by word segmentation and recall. Because of this, adding irrelevant symbols after a search term can change the sorting results. Sites whose rankings fluctuate in this way are easily affected by click weighting, so appending a few irrelevant symbols to the query is a simple way to check for it.
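A sketch of that check, assuming a hypothetical get_rank(query, domain) helper backed by your own SERP-checking tool; the rank-gap threshold is an arbitrary illustration:

```python
def click_weight_suspected(get_rank, query, domain, noise="【】"):
    """Compare a site's rank for the plain query and for the query with
    irrelevant symbols appended. get_rank(query, domain) -> position or
    None is a hypothetical helper you would implement yourself.
    A large gap between the two ranks suggests click-weighting effects."""
    plain = get_rank(query, domain)
    noisy = get_rank(query + noise, domain)
    if plain is None or noisy is None:
        return plain != noisy
    return abs(plain - noisy) >= 5       # threshold is an illustrative assumption

# Example with a fake rank lookup (replace with your own SERP checker).
fake_ranks = {("example query", "site.example"): 3,
              ("example query【】", "site.example"): 40}
get_rank = lambda q, d: fake_ranks.get((q, d))
print(click_weight_suspected(get_rank, "example query", "site.example"))  # True
```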

When dealing with multiple core words, we need to consider how to implement ranking operations efficiently. By analyzing the frequency of Baidu's search-box drop-down suggestions, we can choose the suffixes that appear most often yet have few search results whose titles completely match the keyword, and use them to supplement the webpage title. This strategy can effectively improve the ranking of multiple core words.
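One way such an analysis could be scripted is sketched below, assuming you have already collected the drop-down suggestions and the exact-title-match result counts yourself; the function and its thresholds are illustrative, not a Baidu API:

```python
from collections import Counter

def pick_suffixes(core_words, dropdown, exact_match_counts,
                  max_results=100_000, top_n=3):
    """dropdown: {core_word: [suggested phrases starting with that word]}.
    exact_match_counts: {suffix: number of exact-title-match results}.
    Both inputs are assumed to have been gathered manually or by your own
    tooling; max_results is an illustrative threshold."""
    suffix_freq = Counter()
    for word in core_words:
        for phrase in dropdown.get(word, []):
            suffix = phrase[len(word):].strip()
            if suffix:
                suffix_freq[suffix] += 1
    # Prefer suffixes that appear often but have few exact-match results.
    candidates = [s for s, _ in suffix_freq.most_common()
                  if exact_match_counts.get(s, 0) < max_results]
    return candidates[:top_n]

dropdown = {
    "seo": ["seo tutorial", "seo tools", "seo tutorial for beginners"],
    "baidu seo": ["baidu seo tutorial", "baidu seo tools"],
}
exact_match_counts = {"tutorial": 50_000, "tools": 30_000}
print(pick_suffixes(["seo", "baidu seo"], dropdown, exact_match_counts))
```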

Overall, in SEO optimization the reverse-push strategy can save time and effort; with reasonable keyword selection and ranking strategy, better optimization results can be obtained.