What this checker can - and cannot - detect
Every plagiarism checker has blind spots. Most bury them. This page lists ours in full, because a report you cannot interrogate is a report you cannot trust.
How the engine works
The check runs in two phases. First, retrieval: we sample distinctive 8-12 word sequences from every region of your text - preferring rare words, which retrieve far better than common openers - and search each one on the live web as an exact quoted phrase. That produces a set of candidate pages.
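As a rough illustration of the retrieval phase, the sketch below picks one quoted phrase per region of a text, preferring windows that contain the most uncommon words. The function names `rarity` and `sample_queries`, the small `COMMON` word list, and the 8-word default window are assumptions made for this example; it is not the engine's actual code, and a real system would more plausibly score rarity from corpus frequencies.

```python
import re

# Assumption: a tiny stand-in list of common words, for illustration only.
COMMON = {
    "the", "a", "an", "of", "and", "to", "in", "is", "it", "that", "this",
    "for", "on", "with", "as", "was", "are", "be", "by", "at", "over",
}

def rarity(words):
    """Count the words in a window that are not in the common-word list."""
    return sum(1 for w in words if w.lower() not in COMMON)

def sample_queries(text, window=8, regions=3):
    """Return at most one exact-phrase query per region of the text."""
    # Note: \w+ drops punctuation, so phrases are sequences of bare words.
    words = re.findall(r"\w+", text)
    n_starts = len(words) - window + 1
    if n_starts < 1:
        return []  # text is shorter than one window
    queries = []
    for r in range(regions):
        lo = r * n_starts // regions
        hi = (r + 1) * n_starts // regions
        if lo == hi:
            continue  # text too short to fill this region
        # Pick the window start in this region with the most uncommon words.
        best = max(range(lo, hi), key=lambda i: rarity(words[i:i + window]))
        queries.append('"' + " ".join(words[best:best + window]) + '"')
    return queries
```

Splitting the text into regions before choosing windows keeps the queries spread across the whole document, so a copied passage near the end is as likely to be probed as one near the start.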
Second, verification: we download each candidate page and align every sentence of your text against the page's actual content, locally. Word-for-word overlap is detected with rolling word-window comparison; close paraphrase with token-overlap and edit-similarity thresholds. A search engine ranking a page is never treated as evidence by itself - only verified text comparison counts, and each match carries a confidence figure in the report.
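The verification idea can be sketched as follows, assuming a sentence-length window rolled across the page's words. The function `classify`, the thresholds of 0.6, the rule that both thresholds must pass, and the use of the lower of the two scores as the confidence figure are all assumptions chosen for illustration. Edit similarity here comes from Python's standard `difflib.SequenceMatcher`, which may differ from the measure the engine uses.

```python
import re
from difflib import SequenceMatcher

def tokens(text):
    """Lowercase word tokens, punctuation dropped."""
    return re.findall(r"\w+", text.lower())

def classify(sentence, page_text, jaccard_min=0.6, edit_min=0.6):
    """Return (label, confidence) for one sentence against one page."""
    sent = tokens(sentence)
    page = tokens(page_text)
    n = len(sent)
    if n == 0 or not page:
        return ("none", 0.0)
    best = 0.0
    # Roll a window of the sentence's length across the page's words.
    for i in range(max(1, len(page) - n + 1)):
        window = page[i:i + n]
        if window == sent:
            return ("exact", 1.0)  # word-for-word overlap
        a, b = set(sent), set(window)
        jaccard = len(a & b) / len(a | b)  # token overlap
        edit = SequenceMatcher(None, " ".join(sent), " ".join(window)).ratio()
        # Assumption: a near match must clear both thresholds.
        if jaccard >= jaccard_min and edit >= edit_min:
            best = max(best, min(jaccard, edit))
    return ("near", best) if best else ("none", 0.0)
```

Nothing in this comparison looks at where the page ranked in search results, which mirrors the point above: only the text comparison contributes to the label and the confidence.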
Before any of that, your text is normalized: Unicode lookalike characters (a standard trick for fooling checkers) are collapsed back to their plain equivalents, so swapped Cyrillic letters do not hide matches.
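A minimal sketch of that normalization step, assuming a two-stage approach: Unicode NFKC normalization (which folds compatibility forms such as fullwidth letters) followed by a lookup table for Cyrillic lookalikes, which NFKC alone does not convert. The `HOMOGLYPHS` table is a small illustrative subset and the function name `normalize` is hypothetical.

```python
import unicodedata

# Assumption: a small illustrative subset of Cyrillic-to-Latin lookalikes.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0445": "x",  # Cyrillic small ha
    "\u0443": "y",  # Cyrillic small u
    "\u0456": "i",  # Cyrillic small Byelorussian-Ukrainian i
})

def normalize(text):
    """Fold compatibility forms, then map known lookalikes to plain Latin."""
    text = unicodedata.normalize("NFKC", text)
    return text.translate(HOMOGLYPHS)
```

Applied blindly, a table like this would also rewrite genuine Cyrillic text, so a real implementation would probably restrict the mapping to words that mix scripts.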
When we will miss matches (false negatives)
- Paywalled and subscription sources: academic journals, news archives behind logins.
- Private databases, including academic submission archives like Turnitin's. No public tool can see them; any free checker implying otherwise is lying to you.
- Offline sources: printed books and papers that were never put on the web.
- Very fresh content: pages published minutes or hours ago that search engines have not indexed yet.
- Heavy paraphrase: rewriting that changes most words falls below the near-match threshold. Detecting ideas (rather than wording) is beyond any text matcher.
When we will flag innocent text (false positives)
- Correctly quoted material - a quote is supposed to match its source. Check the citation, not the highlight.
- Common phrases and boilerplate: stock expressions, legal formulae, methodological descriptions repeated across a whole field.
- Bibliographies and reference lists, which naturally match the works they cite.
- Coincidental wording in short, factual sentences.
This is why the report shows the matched source text beside yours, sentence by sentence, with exact and near matches marked separately: the tool finds overlap; a person judges what it means.
What the score means
The matched percentage is the share of your words that sit in sentences aligned to a retrieved source. The gauge bands - 0-5% looks original, 6-20% some matches, 21%+ significant matching - are review guidance calibrated for typical prose, not accusation thresholds. A 4% report can still contain one fully copied paragraph worth fixing; a 25% report of properly quoted material can be perfectly honest work.
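The arithmetic behind the score can be sketched like this. The function names, the `(text, is_matched)` input shape, and rounding to a whole percent are assumptions for illustration; the band boundaries are the ones stated above.

```python
def matched_percentage(sentences):
    """sentences: list of (text, is_matched) pairs.

    Returns the share of words that sit in matched sentences,
    as a whole-number percentage (rounding is an assumption).
    """
    total = sum(len(text.split()) for text, _ in sentences)
    if total == 0:
        return 0
    matched = sum(len(text.split()) for text, hit in sentences if hit)
    return round(100 * matched / total)

def band(pct):
    """Map a percentage to the gauge band used as review guidance."""
    if pct <= 5:
        return "looks original"
    if pct <= 20:
        return "some matches"
    return "significant matching"

# A 1000-word document containing one fully copied 40-word paragraph.
report = [("word " * 40, True), ("word " * 960, False)]
pct = matched_percentage(report)
print(pct, band(pct))
```

The usage example shows why the bands are guidance only: a 40-word copied paragraph in a 1000-word document scores 4% and still lands in the lowest band.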
We also tell you how many sources were actually examined for your check - the real number, printed on the report, because coverage claims you cannot verify are marketing, not accuracy.
Results are indicative, not conclusive. We compare your text against publicly accessible web pages at the moment you run the check. We cannot detect matches in sources that are offline, paywalled, unindexed, or held in private databases (including academic submission archives). Common phrases, correctly quoted material, and coincidental wording can appear as matches. Use this report as a guide for review and citation - not as standalone proof that text was or was not plagiarized.