Authors: Glenn Yamada and Remu Ogaki

The humble typo can be an unexpectedly steep challenge in eDiscovery.

A simple typo can be confounding because many of the technologies that make eDiscovery practice more efficient rely on keyword searching, or on word frequencies in statistically based AI analysis models.

When the term “Widgets XYZ Corporation” appears near “pricing,” that may have particular significance in a litigation. A statistical model or a keyword model might flag such documents as more likely to be important. Both terms might be highlighted within Relativity or other review platforms. Such highlighting helps reviewers spot important terms more quickly and identify key evidence.

But if the words are spelled “Wigets” or “Pircing,” these simple typos can throw off such measures. A technology-assisted review may fail to push such a document to the top of the pile as a document of importance, or the term may not be highlighted on the page and might be overlooked.
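To make the failure concrete, here is a minimal Python sketch comparing exact keyword matching with approximate matching from the standard library's `difflib`. The term list, the function names, and the 0.8 similarity cutoff are assumptions chosen for illustration. This is not how Relativity or any particular review platform works internally.

```python
import difflib
import re

# Hypothetical key terms, taken from the example above.
KEY_TERMS = ["widgets", "pricing"]

def exact_hits(text, terms=KEY_TERMS):
    """Return the key terms that appear verbatim (case-insensitive) in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted(t for t in terms if t in words)

def fuzzy_hits(text, terms=KEY_TERMS, cutoff=0.8):
    """Map each word in the text to the key term it closely resembles, if any."""
    hits = {}
    for word in re.findall(r"[a-z]+", text.lower()):
        # get_close_matches returns the best matches at or above the cutoff.
        match = difflib.get_close_matches(word, terms, n=1, cutoff=cutoff)
        if match:
            hits[word] = match[0]
    return hits

message = "Wigets XYZ Corporation sent new pircing today"
print(exact_hits(message))  # exact matching finds neither misspelled term
print(fuzzy_hits(message))  # approximate matching recovers both
```

A looser cutoff catches more typos but also flags more unrelated words, so any such threshold would need tuning against real review data.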

The more casual the conversation (internal emails, Slack or WeChat messages, cellphone text messages), the more frequently such issues arise. Ironically, the least guarded and most casual conversations often contain the most important evidence in investigations and cases.

While the typo presents eDiscovery challenges in English, the issue is far more confounding in a language like Japanese. This is owing to the unique way that Japanese is typed on a keyboard.

Japanese makes frequent use of a type of character called kanji. Kanji are a type of logographic writing, as opposed to the English alphabet, which is phonemic.

Phonemic writing systems use a letter to represent a sound. Those sounds are combined to create words, conveying meaning through how they sound when read aloud.

By contrast, logographic writing systems like kanji assign a unique character to each meaning.

For example, the character for poetry in kanji is 詩 (poem). The character can be pronounced shi or uta depending on context. It can be combined with other characters to create words. For example, the character for person (人) can be combined with poem (詩) as 詩人 (shi-jin, poem-person, i.e., a poet).

One of the most difficult things about learning to read Japanese is that there are literally thousands of kanji to memorize. There are 2136 kanji that are designated by the Japanese government as “common usage” kanji that students must learn to be functionally literate.1 However, this merely gives you a middle-school reading level—to read at a business or university level, you need to learn an estimated 4200 kanji.2 A survey of published literature found that there are 8474 different kanji used at least once in published literature in a single year, the year 2000.3

For obvious reasons, the volume of kanji characters makes it impractical to assign a key on a keyboard for each kanji. A typical computer keyboard only has around 100 keys.

The most common way to write Japanese on a computer is to use a Western-style phonetic alphabet keyboard. The keyboard is used to type out the Japanese word phonetically. The writer is then prompted to choose the kanji that the computer should use.

For example, to type “poem” (shi) like the example given above, you would write “s-h-i” on the keyboard. The computer then prompts you to select the kanji you want that corresponds to that phonetic sound.

[Image: kanji writing]

Each of the five selections listed above is also pronounced “shi” but has a different meaning than “poem.”
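The conversion step can be pictured as a lookup from a typed reading to a list of candidate kanji. The Python sketch below is a deliberately tiny illustration: the table holds only kanji mentioned in this article, the candidate order is arbitrary, and the names `CANDIDATES`, `convert`, and `choose` are hypothetical. A real input method draws on a large dictionary and ranks candidates by context and usage.

```python
# Hypothetical, heavily abbreviated conversion table (order is arbitrary).
CANDIDATES = {
    "shi": ["詩", "死", "史", "誌", "市"],
    "shu": ["手", "種", "主", "朱", "酒", "腫"],
}

def convert(reading):
    """Return the kanji candidates offered for a romanized reading."""
    return CANDIDATES.get(reading.lower(), [])

def choose(reading, index):
    """Simulate the writer picking one candidate from the list by position."""
    return convert(reading)[index]

print(convert("shi"))    # every candidate shares the reading "shi"
print(choose("shi", 0))  # the writer's selection decides which kanji is written
```

The keystrokes are identical for every candidate. Only the selection step determines which meaning ends up on the page.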

This brings us to “Japanese typos.”

One common typo in Japanese is called a “conversion error” (変換ミス). This occurs when a writer correctly inputs the phonetic sound of the word but chooses the wrong kanji from the list. Perhaps the writer meant to type “shi” (poem) but instead chose the second option: 死, also read as “shi” but meaning death. Or 史 (“shi,” meaning history), 誌 (“shi,” meaning magazine), or 市 (“shi,” meaning city).

It can be remarkably easy in Japanese to mean to say “here is a poem” (これは詩です) but, through a simple typo, to write “here is death” (これは死です).

The complexities presented by typos in Japanese expand when the person mistypes the phonetic sound, and then converts to a kanji without looking carefully.

Figuring out when someone writes Shu instead of Shi is simple enough in English, but the possible intended meanings can expand geometrically in Japanese.

When someone makes the same typo in Japanese, mistyping “shi” (poem) as “shu,” the result could be any of 手 (hand), 種 (type), 主 (master), 朱 (crimson), 酒 (alcohol), 腫 (tumor), or numerous other possibilities (suggested conversions for “shu” below). A sentence that appears to read “bring me the tumor” could have been intended as “bring me the poem.”

What this means is that a “simple typo” requires a reader to quickly assess two questions:

  1. Was there a phonetic typo?
  2. Was there a conversion typo?

The reader must then consider from context the possible meanings that the writer intended. When a crucial text message that might serve as important evidence at trial includes a typo in Japanese, extensive knowledge of Japanese language and culture can be critical in assessing what the writer may have meant.
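The two questions above can be sketched as a candidate generator. Given the kanji that actually appears in a message, it lists what the writer might have intended under each kind of typo. This Python sketch rests on several assumptions: the `READINGS` table is a hypothetical excerpt containing only kanji and glosses from this article, a phonetic typo is modeled as a single substituted keystroke, and all function names are invented for illustration. It only enumerates possibilities. Choosing among them still takes a reviewer who can read the context.

```python
# Hypothetical excerpt: reading -> {kanji: English gloss}.
READINGS = {
    "shi": {"詩": "poem", "死": "death", "史": "history", "誌": "magazine"},
    "shu": {"手": "hand", "種": "type", "主": "master",
            "朱": "crimson", "酒": "alcohol", "腫": "tumor"},
}

def one_key_off(a, b):
    """True if two equal-length readings differ in exactly one keystroke."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def conversion_candidates(kanji):
    """Other kanji sharing a reading with the observed one (conversion typo)."""
    out = {}
    for table in READINGS.values():
        if kanji in table:
            out.update({k: g for k, g in table.items() if k != kanji})
    return out

def phonetic_candidates(kanji):
    """Kanji whose reading is one substituted keystroke away (phonetic typo)."""
    out = {}
    for reading, table in READINGS.items():
        if kanji not in table:
            continue
        for other, other_table in READINGS.items():
            if one_key_off(reading, other):
                out.update(other_table)
    return out

# "Bring me the tumor": what else could the writer have meant by 腫?
print(conversion_candidates("腫"))  # same reading "shu", different kanji
print(phonetic_candidates("腫"))    # reading mistyped, e.g. "shi" intended
```

Inserted, dropped, or transposed keystrokes are not modeled here, so the real space of candidates is larger than this sketch suggests.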

What’s more, when typos exist, there can be room for legal arguments as to what the proper translation should be, or what the writer intended. These types of challenges underscore the importance of cultural and linguistic expertise to properly assess evidence in foreign languages.

About the Authors

Glenn Yamada, Counsel; Foreign Language Review Services

Glenn Yamada is a seasoned e-discovery attorney at Hilgers, where he brings over a decade of experience in high-stakes, complex litigation. Follow him on LinkedIn here.


Remu Ogaki, Counsel; Foreign Language Review Services

Remu has well over a decade of experience managing large teams of foreign language attorneys, paralegals and translators in the US and Japan. Follow him on LinkedIn here.


1 See https://www.bunka.go.jp/prmagazine/rensai/kotoba/kotoba_009.html
2 See https://kanjibunka.com/kanji-faq/history/q0006/#:~:text=%E7%A8%AE%E9%A1%9E%E5%88%A5%E3%81%AB%E5%88%86%E3%81%91%E3%82%8B%E3%81%A8%E3%80%81%E5%AE%9F%E3%81%AB,%E3%81%97%E3%81%8B%E3%81%97%E3%80%81%E3%81%82%E3%81%8D%E3%82%89%E3%82%81%E3%81%AA%E3%81%84%E3%81%A7%E3%81%8F%E3%81%A0%E3%81%95%E3%81%84%E3%80%82
3 See https://kanjibunka.com/kanji-faq/history/q0006/#:~:text=%E7%A8%AE%E9%A1%9E%E5%88%A5%E3%81%AB%E5%88%86%E3%81%91%E3%82%8B%E3%81%A8%E3%80%81%E5%AE%9F%E3%81%AB,%E3%81%97%E3%81%8B%E3%81%97%E3%80%81%E3%81%82%E3%81%8D%E3%82%89%E3%82%81%E3%81%AA%E3%81%84%E3%81%A7%E3%81%8F%E3%81%A0%E3%81%95%E3%81%84%E3%80%82