Germanic Lexicon Project
Message Board
Author: Ondrej Tichy (Faculty of Arts, Charles University in Prague)
Email: ondrej dot tichy at jitro dot net
Date: 2007-05-02 17:22:58
Subject: Re: LaTeX version of B-T
> > Is anyone working on a LaTeX conversion of B-T, so we can get a properly
> > typeset pdf file? If it would not be duplicating any existing work, I might
> > have a go. My experience gained on my edition of charters
> > (keithbriggs.info/AS_charters.html) would make this fairly easy.
> > It might additionally be useful to have an index of every word in the entries
> > as an extra cross-reference tool.
>
> As far as I know, nobody is working on typesetting the text. This would be great if you wanted to undertake it.
>
> If you come up with a typesetting solution, I recommend hanging onto the scripts or code or whatever you come up with, so that you can run it again later. The major hand-corrections are done, but we're not at a final product.
>
> The Greek, Hebrew, and Runic letters are in the process of being filled in. Ondrej Tichy is overseeing this.
>
> Future work which hopefully will eventually be done includes:
>
> -Joining the entries which are broken across two pages
>
> -Marking up the whole text with XML tags to indicate headword, etymology, definition, etc.
>
> -A really good thing to do would be to shake out errors by automatically aligning the Old English passages with the Toronto corpus of Old English. We'd need to work out the exact way to do this, but I'm sure the increased accuracy would make this worth doing.
>
> --Sean
Hi
As Sean said, there is still work that needs to be done, though having a typeset, printable version would be very nice.
I would like to point out that although the main corrections are finished, the text is still inconsistent, so processing it requires either a good deal of flexibility or a prior "normalisation" pass.
In my opinion, it would be best to wait until we fill in the non-Latin characters, go through the ERRORS and UNCERTAIN readings, check the reported errors, join the entries and re-tag the data structurally (or semantically) rather than "visually" (as they are now).
This would make the typesetting easier and would open up some further possibilities, such as improving on the original typesetting strategies.
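To illustrate what normalising and then re-tagging "structurally" could look like, here is a minimal Python sketch. It is only an assumption about the data: the `<B>...</B>` headword markup, the `<entry>`, `<headword>` and `<body>` tag names, and the sample line are all hypothetical and are not taken from the actual B-T files.

```python
import re
import unicodedata


def normalise(line: str) -> str:
    """Bring a line into one consistent form before any further processing."""
    # Compose characters so that, e.g., "a" + combining acute becomes one code point.
    line = unicodedata.normalize("NFC", line)
    # Collapse runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", line).strip()


# Hypothetical "visual" convention: an entry starts with a bold headword.
ENTRY = re.compile(r"^<B>(?P<head>.+?)</B>\s*(?P<rest>.*)$")


def retag(line: str) -> str:
    """Turn visual markup into (hypothetical) structural tags."""
    match = ENTRY.match(normalise(line))
    if match is None:
        # Not recognised as an entry: return it unchanged for manual review.
        return line
    return "<entry><headword>{}</headword><body>{}</body></entry>".format(
        match.group("head"), match.group("rest")
    )


if __name__ == "__main__":
    print(retag("<B>abbod</B>   an abbot"))
```

A typesetting script could then select on `<headword>` and similar tags instead of guessing what each bold or italic span means. The sketch does no XML escaping and handles only the one pattern shown.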
I have a grant proposal pending for the editing of the data, so I'll keep you informed about our progress. I'd also be quite happy to provide any other information that might help with any future processing of the data.
Good luck!
Ondrej