Reverse Article Search

About Alerts and Matching

About matching

What is matching?

Matching is the process of determining which content on a URL is sufficiently similar to your original article. Each match results in a snapshot. If a snapshot meets a number of requirements, it becomes an Alert and is displayed to the customer.

Several variables play a role in this. We discuss them step by step below.

Reverse Article Search checks whether there is content on the internet that is similar to the articles you have supplied to us. The search is based on six input variables: match-start, match-interval, match-language, match-region, match-depth and match-rhythm.

Match variables linked to the article

Match-engine

Match-engine is the search engine that is used. By default, this is Google Custom Search.

Match-method

Match-method is the way of searching. The available methods are reverse, domain, source, source-domain, string and website. The default method is reverse.

Match-frequency

Match-frequency is the number of times that matching takes place per Article.

Match-start

Match-start is the moment when the first (and perhaps only) matching takes place after the moment of publication of an Article.

Match-interval

Match-interval is the time between successive matching moments for an Article.

Match-language

Match-language is the language in which the matching takes place. By default, it is the language in which the article is written.

Match-region

Match-region is the area that the matching targets.

Match-depth

Match-depth is the number of candidate URLs examined per Article.

Match-rhythm

Match-rhythm determines the protocol with which the matching takes place.
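As an illustration only, the way match-frequency, match-start and match-interval fit together can be sketched in Python. The class name, field names, units and most default values below are assumptions made for this example and do not describe the app's real settings. Only the default engine, the default method and the list of methods come from the text above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Methods named in the text; "reverse" is the documented default.
MATCH_METHODS = {"reverse", "domain", "source", "source-domain", "string", "website"}


@dataclass
class MatchSettings:
    """Hypothetical container for a few of the match variables."""
    engine: str = "Google Custom Search"     # documented default
    method: str = "reverse"                  # documented default
    frequency: int = 1                       # assumed value: number of matching runs
    start: timedelta = timedelta(hours=1)    # assumed value: delay after publication
    interval: timedelta = timedelta(days=1)  # assumed value: gap between runs
    depth: int = 10                          # assumed value: URLs examined per Article

    def __post_init__(self):
        if self.method not in MATCH_METHODS:
            raise ValueError(f"unknown match method: {self.method!r}")
        if self.frequency < 1:
            raise ValueError("frequency must be at least 1")

    def matching_moments(self, published_at: datetime) -> list[datetime]:
        """Return the moments at which matching would run for one Article."""
        first = published_at + self.start
        return [first + i * self.interval for i in range(self.frequency)]


settings = MatchSettings(frequency=3)
for moment in settings.matching_moments(datetime(2024, 1, 1, 12, 0)):
    print(moment.isoformat())
```

With a frequency of 1, only the first moment (publication plus match-start) is returned, which corresponds to the "first (and perhaps only)" matching described under Match-start.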

What happens during the matching process

At one or more moments, snapshots are made of the URLs found that could contain content similar to the Article. Data is collected about each URL, and the original content is compared step by step with the content on the URL. When in doubt, additional research is carried out so that it can still be determined whether there is an Alert that meets the criteria. The customer determines what the criteria are.

Match variables linked to the alert

Match% (match percentage)

The Match% indicates how many of the words in the original also appear on the URL, expressed as a percentage of the number of words in the original. If the original article consists of 100 words and the found Alert contains 68 identical words, the Match% is 68 percent.

Match-amount

Match-amount shows the number of matching words between the original and the URL.
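The relationship between Match-amount and Match% can be sketched as follows. This is a simplified illustration, not the app's actual matching logic: the function names are made up for this example, and both the way text is split into words and the way shared words are counted are assumptions.

```python
import re
from collections import Counter


def words(text: str) -> list[str]:
    """Lowercase the text and split it into words (assumed tokenization)."""
    return re.findall(r"\w+", text.lower())


def match_amount(original: str, found: str) -> int:
    """Count words shared by both texts, respecting how often each occurs."""
    shared = Counter(words(original)) & Counter(words(found))
    return sum(shared.values())


def match_percent(original: str, found: str) -> float:
    """Shared words as a percentage of the words in the original."""
    total = len(words(original))
    if total == 0:
        return 0.0
    return 100.0 * match_amount(original, found) / total


original = "the quick brown fox jumps"
found = "a quick brown dog jumps high"
print(match_amount(original, found), match_percent(original, found))
```

In this example three of the five original words also occur in the found text, so the amount is 3 and the percentage is 60.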

Match-spread

The Match-spread indicates how consecutive the matching words of the Alert are, expressed as an index. 100 means that all matching words are consecutive. 1 means there are 100 or more interruptions. With a high Match%, the Match-spread is not very relevant. With a low Match%, the Match-spread is relevant. The spread can be used to determine whether the Alert is about the same subject (low spread) or whether part of the text has been copied (high spread).

Match-aspects

Match-aspects is a 10-digit code that indicates which aspects an Alert meets. The first digit is the number of aspects that have been met. Each of the following 9 digits represents one specific aspect. If the aspect is not met, the digit is 0. If it is met, the digit is 1.

    1. The total of Match-aspects the Alert has met;
    2. A substantial string of the original title and the title on the URL are similar (match_title_complete);
    3. The original title and the content of the URL as determined by the search engine (used to detect the URL) correspond substantially (match_title_content);
    4. The copyright holder is mentioned in the title of the URL (match_title_copyrightholder);
    5. The original article and the content of the URL as determined by the search engine (used to detect the URL) correspond substantially (match_text_content);
    6. The search engine used to detect the URL cites the copyright holder as the source (match_text_copyrightholder);
    7. The original text and the text on the URL contain at least 1 string that is substantially similar (match_details);
    8. The copyright holder is mentioned in the text on the URL (match_details_copyrightholder);
    9. An image of the copyright holder is posted on the URL (match_image);
    10. Relevant keywords are listed in the URL or title (match_keyword).
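
To show how such a code can be read, here is a small Python sketch. It assumes the code is available as a string of ten digits in the order of the list above. The function name is hypothetical, and the aspect names are taken from the list.

```python
# Aspect names for digits 2 to 10, in the order of the list above.
ASPECT_NAMES = [
    "match_title_complete",
    "match_title_content",
    "match_title_copyrightholder",
    "match_text_content",
    "match_text_copyrightholder",
    "match_details",
    "match_details_copyrightholder",
    "match_image",
    "match_keyword",
]


def decode_match_aspects(code: str) -> dict[str, bool]:
    """Split a 10-digit Match-aspects code into one flag per aspect."""
    if len(code) != 10 or not (code.isascii() and code.isdigit()):
        raise ValueError("expected a 10-digit code")
    flags = code[1:]
    if any(digit not in "01" for digit in flags):
        raise ValueError("aspect digits must be 0 or 1")
    # The first digit should equal the number of aspects that are met.
    if int(code[0]) != flags.count("1"):
        raise ValueError("first digit does not equal the number of aspects met")
    return {name: digit == "1" for name, digit in zip(ASPECT_NAMES, flags)}


aspects = decode_match_aspects("3100101000")
print([name for name, met in aspects.items() if met])
```

For the example code 3100101000, three aspects are met: match_title_complete, match_text_content and match_details.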

Match-snapshot

Match-snapshot is the moment in time the Alert was detected and captured.

Alerts and settings

You can filter on Match% yourself in the app. Filtering on Match-spread and Match-aspects is set up by us, because the correct settings are more complex. By default, all Alerts are shown in the app. Separate settings can then be applied to each custom report.

Do you want to know more about reports? Click here

Frequently asked questions

Is it possible that an Alert is not always found?

Yes, that’s possible.

  • A snapshot is made when matching. If the URL with the Alert is created after the snapshot is taken, it will not be indexed.
  • We can find and match any URL that is indexed by the search engine. If a website blocks the search engine, we will not find the URL via the Reverse Search method. It is also possible that the URL can be indexed based on, for example, the title, but its content cannot be read.

Can a URL with an overview of Articles contain a real Alert?

Sure! The condition is that the string found meets the minimum match criteria. Because the Alert is a snapshot, there is a chance that the overview page has changed when you view it and no longer contains the matching content.

What do you do with a URL whose match% cannot be determined?

We always look at other data as well. For example: is the copyright holder mentioned, what does the search engine say about the content, and how similar are the titles?

By default, the Alert is shown if one of the match-aspects (see the part about aspects) is met. The Alert will then have the minimum match percentage and a match-amount of 10.

It is also possible that the app sets the Match% to 50 or 100 based on the available data. In that case, the Match-amount is also set to 50 or 100. You can recognize an Alert that was matched by estimation rather than by word count, because its Match% and Match-amount are the same.
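If you export Alert data, this rule of thumb can be written as a small check. The sketch below is only an illustration: the `Alert` record and its field names are hypothetical and do not reflect the app's data model.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical record holding two of the values shown for an Alert."""
    match_percent: int
    match_amount: int


def looks_estimated(alert: Alert) -> bool:
    """True when Match% and Match-amount are equal, as described above."""
    return alert.match_percent == alert.match_amount


print(looks_estimated(Alert(match_percent=50, match_amount=50)))
print(looks_estimated(Alert(match_percent=68, match_amount=120)))
```

Keep in mind that an original of exactly 100 words also produces equal values when it is matched by word count (68 matching words give a Match% of 68), so the check is an indication rather than proof.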

Does it make sense to report that I find an Alert that is not in the app?

Please do! With that information we can determine whether we can improve the matching process or whether Articles have been delivered incorrectly. The condition is that you are sure your information is complete and correct, so that we don’t waste time verifying it. Click here to contact us via the contact form. In addition to the URL found, please also provide at least the title of the original Article, otherwise we will not process your question.