Expect the Unexpected: Challenges in Foreign Language Key Word Searches and Language Expertise
Author: Remu Ogaki
If someone were to walk up to you and say “Japanese and English are really different,” I think a common reaction might be to stare back as if they had commented that one is greater than zero, or that the surface of the sun is a hot place.
Litigators generally understand that working with foreign languages presents unique challenges, and that specialized support is required to grapple with these complexities. However, even those who recognize that a challenge exists consistently underestimate how it affects litigation and discovery. Knowing that challenges exist, without understanding exactly how they play out in a specific foreign language, can lead to unexpected pitfalls and inefficiencies.
Let’s examine a seemingly simple task: identifying key actors within a dataset using keyword searches.
If someone were to call you on the phone and tell you that they need you to identify “Jeffrey Johnson” in the dataset, you might hang up and realize there are a few possible ways this could be spelled.
Jeffrey might be Jeff, Jeffrey, Geoff, or Geoffrey.
Johnson might be Johnson, Johnston, or Jonsson.
When working with English, which is written in a phonetic alphabet, there may be variations in the spelling of names, but the number of possible variations is limited. You might, quite reasonably, assume that names in other languages also have limited alternate spellings.
That assumption can get you into unexpected trouble.
Imagine your key actor is named Hiroshi Kato. Of course, not being fluent in Japanese, you know that after you have identified the body of evidence related to this key actor, you will need to bring in a Japanese language expert to help you interpret it. And this is where you have made a grave miscalculation.
Kato is the 10th most common family name in Japan.[1] Hiroshi is one of the most common given names in Japan. It would be quite reasonable for you to assume that there are at most a few variations on spelling this name in Japanese, just like “Jeff Johnson” in English. On this assumption, you might simply ask your team to run a search for all the plausible expressions of “Hiroshi Kato”.
One problem: there are an astonishing number of commonly accepted ways to write “Hiroshi Kato.”
There are 31 different ways to write “Kato”:[2]
加藤、加登、加唐、加東、加当、加統、加頭、加當、加籐、下藤、何東、加斗、可藤、嘉東、嘉藤、嘉籐、家登、歌藤、歌頭、河東、河藤、河隝、花等、華藤、迦統、賀藤、鴨東、香東、香藤、鹿藤、鹿頭
And there are 378 commonly accepted ways to write “Hiroshi,”[3] including:
博、宏、弘、浩、洋、寛、博司、広志、博史、宏司、優、博之、哲・・・
And many, many more.
Once you start considering every possible combination, you get:
31 x 378 = 11,718 possible variations
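The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not a description of any particular review platform: the function name `full_name_terms` is hypothetical, the given-name list contains only the 13 spellings excerpted above rather than all 378, and the sketch assumes the name appears family name first with no separator, which real documents will not always follow.

```python
from itertools import product

# The 31 spellings of "Kato" listed above.
KATO_VARIANTS = (
    "加藤、加登、加唐、加東、加当、加統、加頭、加當、加籐、下藤、何東、加斗、可藤、"
    "嘉東、嘉藤、嘉籐、家登、歌藤、歌頭、河東、河藤、河隝、花等、華藤、迦統、賀藤、"
    "鴨東、香東、香藤、鹿藤、鹿頭"
).split("、")

# Only the 13 spellings of "Hiroshi" excerpted above, not the full set of 378.
HIROSHI_SAMPLE = "博、宏、弘、浩、洋、寛、博司、広志、博史、宏司、優、博之、哲".split("、")


def full_name_terms(family_names, given_names):
    """Pair every family-name spelling with every given-name spelling.

    Assumption: family name first, no space between the two parts.
    """
    return [family + given for family, given in product(family_names, given_names)]


terms = full_name_terms(KATO_VARIANTS, HIROSHI_SAMPLE)

print(len(KATO_VARIANTS))        # 31 family-name spellings
print(len(terms))                # 31 * 13 = 403 terms from the excerpt alone
print(len(KATO_VARIANTS) * 378)  # 11718, the figure quoted in the text
```

Even with only the excerpted given names, the term list is already in the hundreds, which is why narrowing the spellings before running the search matters.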
To a litigator unused to the vagaries of Japanese names, it may come as a shock that simply knowing how a key actor’s name is written in the English alphabet and how it sounds may not provide sufficient information to identify evidence related to that individual. However, any person well-versed in Japanese culture and language will immediately recognize this insufficiency. A Japanese language expert would be positioned to help curate the request in such a way as to target the necessary information, potentially saving an inordinate amount of time and energy.
It’s one thing to be aware that challenges and unknowns exist when working with foreign languages, and that expertise is needed. But it is even more important to recognize that being unaware of the particularities of those challenges can lead to assumptions that can be quite costly, in terms of time and effort.
The only way to avoid the pitfalls of what you don’t know when working with foreign languages is to consult with an expert in that language early, and continuously through every stage of litigation.
About the Author

Remu Ogaki,
Counsel; Foreign Language Review Services
Remu has well over a decade of experience managing large teams of foreign language attorneys, paralegals and translators in the US and Japan.
[1] https://myoji-yurai.net/prefectureRanking.html




















