
Thinking of a grammar I dream...

When I think about the next grammar I'm going to write, I often start to dream - and nothing ever gets written. But then, making my next grammar wouldn't just be a question of writing. Paper, even though it smells nice, feels good and looks impressive, cannot begin to do what I want my next grammar to do. I need a computer for that - but what doesn't need a computer, nowadays?

Of course, I'm talking about a real grammar, a grammar based on real data, collected through months, even years of patient research, whether in the field or among a pile of dusty manuscripts from a tomb in lower Baluchistan, not about some fancy connect-the-dots tree-view in a generative rehash of a nouvelle-cuisine portion of Teach Yourself English exercise sentences.

The first demand my new grammar will have to satisfy is that it offers all available data. The enormous amount of storage available on a CD-ROM makes that possible. I don't want to be fobbed off with a few hundred words in a list at the back of a book, when the author has thousands more in his personal database - to withhold a fact is to publish a falsehood. Presenting all the data will make all statements verifiable.

Data exist in several forms. Most fieldworkers carry tape recorders and have a large store of taped conversations in their drawers. Those tapes are allowed to degrade slowly into mere rusty dust clinging to flimsy plastic, and nobody but the original fieldworker ever has access to them. But with the advent of good-quality audio compression formats, it should be possible to compress all that data and present it along with the analysis.

Of course, I could opt for the easy way out, and present a few RealAudio files in my chapter on phonology; but that's just laziness. Every bit of text in my grammar should be clickable; and every click should result in the playing of the relevant audio sample. If you click on a convenient icon at the start of the example, I want you to hear the whole example. If you click on the space before the word, I want you to hear the word. If you click on the morpheme, I want you to hear what I thought was the discrete sound associated with that bit of meaning. I want you to have a taste of my cake, and eat it, too.
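
One way to picture the three click targets is as nested, time-aligned spans over a single recording. The sketch below is only an illustration under assumptions: the class names, the `clip` method, the file name and the forms "ka", "ta" and "mi" are all invented placeholders, not data from any real language or tape.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Morpheme:
    form: str
    start: float  # seconds into the recording (hypothetical timings)
    end: float


@dataclass
class Word:
    morphemes: List[Morpheme]

    @property
    def start(self) -> float:
        # A word begins where its first morpheme begins.
        return self.morphemes[0].start

    @property
    def end(self) -> float:
        # A word ends where its last morpheme ends.
        return self.morphemes[-1].end


@dataclass
class Example:
    audio_file: str
    words: List[Word]

    def clip(self, word: Optional[int] = None,
             morpheme: Optional[int] = None) -> Tuple[str, float, float]:
        """Return (file, start, end) for the clicked unit.

        No arguments: the whole example. Only `word`: that word.
        Both `word` and `morpheme`: a single morpheme.
        """
        if word is None:
            return (self.audio_file, self.words[0].start, self.words[-1].end)
        w = self.words[word]
        if morpheme is None:
            return (self.audio_file, w.start, w.end)
        m = w.morphemes[morpheme]
        return (self.audio_file, m.start, m.end)


# Invented placeholder data: two words, three morphemes, one tape.
example = Example(
    audio_file="tape01.wav",
    words=[
        Word([Morpheme("ka", 0.0, 0.3), Morpheme("ta", 0.3, 0.5)]),
        Word([Morpheme("mi", 0.6, 0.9)]),
    ],
)

icon_click = example.clip()                       # the whole example
word_click = example.clip(word=0)                 # the first word
morpheme_click = example.clip(word=0, morpheme=1) # its second morpheme
```

Each click then resolves to a file and a pair of timestamps, which a player could use to cut the relevant stretch out of the compressed recording.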

As for the data I painstakingly collected in the frozen vaults of Yob Vombis, those priceless manuscripts, centuries old - they don't make a sound. But if I present an example taken from those manuscripts, I don't want to refer to some microfiche - I want to refer to a scan of the manuscript itself - and let you see what I saw. A tall order? Of course. Problems with copyright? Naturally. But it is silly that a museum can claim copyright on the manuscript of a monk who would be four hundred years old if he hadn't been human and died of old age three hundred and something years ago.

I want a flexible mechanism to link between my example, other examples, my lexicon, the text my examples came from, the audio files, the manuscript scans and any other interesting tidbit, too. That's perhaps the most difficult part, but the Computer is my Friend, and will help me with this bit. Besides, thinking of the way data are interlinked is exactly what the job of a scholar is.

And when I present you with my example, I want you to be able to view it on all its levels of abstraction - from the example sec, so to say, via the syntactical analysis, all the way down to the phonetic details as deduced from the actual utterance - which you can hear, of course.

Now that I have given you all my data, I want you to be able to do something with it - no really, I do! So there has to be a structure to the data. Of course, a table of contents is mandatory: that shows the structure I had in mind. But a flexible search engine that lets you save the results will enable you to impose the structure you need on my data. And a lexicon that's only available in the form of an alphabetic list is just too passé. Underneath the data there should be an intelligent lexicon that can be used from examples and texts, that can bind together audio data, manuscript data, analyses and translations. That's a database application, and just what a computer is good at - see? He's our friend after all.
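
To make the "database application" a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption made for illustration: the table layout, the function names `build_lexicon` and `examples_for`, the file names, and the forms and glosses, which are invented placeholders rather than real data.

```python
import sqlite3


def build_lexicon() -> sqlite3.Connection:
    """Create a tiny in-memory lexicon with hypothetical sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE lexeme (
            id INTEGER PRIMARY KEY, form TEXT, gloss TEXT);
        CREATE TABLE example (
            id INTEGER PRIMARY KEY, text TEXT, translation TEXT,
            audio_file TEXT,   -- recording, if the example was spoken
            scan_file TEXT);   -- manuscript scan, if it was written
        CREATE TABLE occurrence (
            lexeme_id INTEGER REFERENCES lexeme(id),
            example_id INTEGER REFERENCES example(id),
            position INTEGER); -- index of the word within the example
    """)
    db.executemany("INSERT INTO lexeme VALUES (?, ?, ?)",
                   [(1, "ka", "water"), (2, "mi", "see")])
    db.executemany("INSERT INTO example VALUES (?, ?, ?, ?, ?)", [
        (1, "ka mi", "sees water", "tape01.wav", None),
        (2, "mi", "sees", None, "folio12.png"),
    ])
    db.executemany("INSERT INTO occurrence VALUES (?, ?, ?)",
                   [(1, 1, 0), (2, 1, 1), (2, 2, 0)])
    return db


def examples_for(db: sqlite3.Connection, form: str) -> list:
    """Return every example containing `form`, with its audio and scan links."""
    rows = db.execute("""
        SELECT e.text, e.translation, e.audio_file, e.scan_file
        FROM lexeme AS l
        JOIN occurrence AS o ON o.lexeme_id = l.id
        JOIN example AS e ON e.id = o.example_id
        WHERE l.form = ?
        ORDER BY e.id
    """, (form,))
    return rows.fetchall()


db = build_lexicon()
hits = examples_for(db, "mi")
```

The point of the separate `occurrence` table is that a lexicon entry is no longer a line in an alphabetic list: it is a hub from which every example, recording and scan that attests the word can be reached, and a saved search is just a stored query over the same tables.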

And then, if that database could be used to link the data from my language to the data from your language to produce some kind of offspring - for instance in the form of an ancestral language - I would know that my dream has come true.


© 1999 Boudewijn Rempt - Optimized for Lynx