Call for Papers: Motivation and Background
Despite the wide experience gained in compiling written language corpora, working with spoken language data is not straightforward, because spoken language raises many additional issues that must be addressed. The transience of spoken language is sometimes offered as an explanation for why spoken data are more difficult to collect than written data. However, capturing the data is not the only step that is far from trivial. Once the (audio) data have been collected and stored, the next step is to produce some kind of transcript (whether orthographic or phonetic). Further annotations such as POS tagging, lemmatisation, syntactic annotation, and prosodic annotation may then build upon this transcription. Among the problems encountered in the processing of spoken language data are the following: