A corpus is a body of documents. You can use a term extraction process (also called term harvesting) to find terms in a corpus. So far, we have created a “Sports” termbase, but we started with terms related to cricket and have neglected other sports. In this course, we will use a corpus-based process to find terms related to the sport of soccer.
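To give a rough sense of what term extraction involves, here is a minimal sketch in Python. It is not the process or tool used in this course. It only counts how often single words and two-word phrases occur across a small corpus and keeps the ones that recur. The function names, the tiny stop-word list, the thresholds, and the three sample sentences are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "from", "for", "on", "by"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic (optionally hyphenated) words."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def extract_candidates(documents, max_len=2, min_count=2):
    """Return (term, count) pairs for n-grams that occur at least min_count times.

    Frequency counting is only one simple heuristic for finding term candidates;
    the results still need review by a person before they go into a termbase.
    """
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # Skip n-grams that begin or end with a stop word.
                if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                    continue
                counts[" ".join(gram)] += 1
    return [(term, count) for term, count in counts.most_common() if count >= min_count]


# Made-up sample corpus: three short "documents" about soccer.
corpus = [
    "The referee awarded a penalty kick after the foul.",
    "She scored from the penalty kick in the second half.",
    "A corner kick led to the goal.",
]

for term, count in extract_candidates(corpus):
    print(term, count)
```

On this sample, only "kick", "penalty", and "penalty kick" recur often enough to be kept. "Corner kick" appears once and is dropped, which shows why a real corpus needs to be much larger than three sentences.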
The sport of cricket is famous for its colorful terminology. Words like “googly,” “chin music,” “gardening,” “teapot,” and “dibbly dobbly” come to mind. Perhaps this is because the sport has such a rich cultural history. We wonder whether soccer, also a cultural phenomenon, is similar in this respect. Let’s find out.