Results of the August 2004 Technical Meeting in Philadelphia
Location and Date:
University of Pennsylvania, Philadelphia, 22-27 August 2004
Participants:
-
Robert Casties, Max-Planck-Institute for the History of Science, Berlin
-
Peter Damerow, Max-Planck-Institute for the History of Science, Berlin
-
Madeleine Fitzgerald, University of California
-
Malcolm Hyman, Harvard University
-
Steve Tinney, University of Pennsylvania
-
Dirk Wintergrün, Max-Planck-Institute for the History of Science, Berlin
Critical Issues:
-
Web presentation (access system):
-
FileMaker Solution
-
We will continue to use this solution until a better solution is working (e.g., eXist, Zope-SQL).
-
We will not migrate to FM 7 Server Advanced (replaces FM Unlimited) at this time because the effort is about the same as for migrating to another solution and FM Server Advanced is not yet available anyway.
-
UCLA will add a separate FM server and use FMWVCS and Apache on port 80 for stability, security, and to resolve the XP/Explorer issue.
-
The Berlin CDLI server arrangement (see diagram) is not appropriate for UCLA.
-
Zope-SQL Solution
-
Will be developed in parallel with current CDLI system by Berlin team.
-
Features will include login facilities for editing user privileges, preferences, and profiles.
-
RDF trees would be nice.
-
Will integrate zogilib (digilib) and Berlin's version control technology.
- Extended Zope solution with rdf-generated tree is now available on-line.
-
eXist Solution
-
Will be developed in parallel with current CDLI system by Steve Tinney.
-
Will integrate multiple projects with templates based on identifier for contributing institutions/projects (branding).
-
Dynamic preparation of static solutions.
-
May integrate zogilib (digilib) and Berlin's version control technology.
-
Workflow:
-
Catalogues
-
Decision made to review and document CDLI catalogue database (last defined in October 2001). Madeleine has begun on-line documentation of exemplar and composite catalogues.
-
The system for incorporating additions and updates to the centrally administered catalogues at UCLA was discussed. Finding the differences between versions of the catalogue is very difficult, and the problem has yet to be resolved.
-
How to handle joins in the catalogue was discussed: fragment-specific information should not be lost, and all joins should be presented together or linked. The issue was not resolved.
-
Latest catalogue of unpublished Jena Ur III texts with numerous corrections was integrated into the full CDLI catalogue.
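The catalogue comparison problem noted above might be approached along the following lines. This is only a minimal sketch: it assumes that each catalogue version can be exported as a list of records and that every record carries a stable identifier. The field name `id_text` and the function name `diff_catalogues` are hypothetical, and nothing here reflects a decision taken at the meeting.

```python
def diff_catalogues(old_rows, new_rows, key="id_text"):
    """Compare two catalogue versions given as lists of dicts.

    Returns (added, removed, changed), where `changed` maps a record
    identifier to {field: (old_value, new_value)} for differing fields.
    The key field name is an assumption made for illustration.
    """
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}

    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())

    changed = {}
    for rec_id in sorted(old.keys() & new.keys()):
        fields = sorted(set(old[rec_id]) | set(new[rec_id]))
        diffs = {
            field: (old[rec_id].get(field), new[rec_id].get(field))
            for field in fields
            if old[rec_id].get(field) != new[rec_id].get(field)
        }
        if diffs:
            changed[rec_id] = diffs
    return added, removed, changed
```

A field-level report of this kind only helps if identifiers stay stable between versions; records that were renumbered or merged would show up as unrelated additions and removals.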
-
Transliterations
-
Discussion of workflow problems to be resumed at December meeting.
-
Berlin version control system will probably be very useful for transliteration correction.
-
Madeleine will provide full inventory of current DCCLT witness and composite transliterations to Steve shortly.
-
Web presentation
-
Problems with automatic mirroring in Berlin were resolved.
-
ATF to web presentation is currently done by formatting ATF files for import into the FileMaker transliteration database using a Perl script. The ATF files may or may not produce valid XML files. Steve Tinney is developing tools to present transliterations in their original ATF format if they do not conform to the DTD and in a cleaner format using CSS if they do.
-
Currently an image file database is maintained to display images in the text display pages. Steve Tinney is working on a system of mining the image directories to produce lists of available images for web presentation. He is also experimenting with zogilib.
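The idea of mining the image directories, as opposed to maintaining a separate image file database, could look roughly like the sketch below. It is not the tool under development. It assumes that image file names begin with a text identifier of the form P plus six digits; the pattern, the extension list, and the function name `list_available_images` are all assumptions made for illustration.

```python
import os
import re
from collections import defaultdict

# Assumed naming convention: file names start with an identifier such as P000001.
ID_PATTERN = re.compile(r"^(P\d{6})")
# Assumed set of image file types.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff"}


def list_available_images(root):
    """Walk `root` and return {text_id: [relative image paths]}."""
    images = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            stem, ext = os.path.splitext(name)
            match = ID_PATTERN.match(stem)
            # Keep only image files whose names start with an identifier.
            if match and ext.lower() in IMAGE_EXTENSIONS:
                full_path = os.path.join(dirpath, name)
                images[match.group(1)].append(os.path.relpath(full_path, root))
    return {text_id: sorted(paths) for text_id, paths in images.items()}
```

The resulting mapping can be regenerated whenever the directories change, so the list of available images does not have to be kept in step with the files by hand.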
-
Backup and maintenance
-
Robert Casties and Madeleine Fitzgerald established an automatic daily backup of the UCLA CDLI web site to the archival server at UCLA, as the basis for automatic backup and mirroring to be established in Berlin.
-
Rsyncing of the data was also established between the UCLA archival server and the PSD server in Philadelphia by Robert Casties and Steve Tinney.
-
Tools:
-
Arboreal doc specs now include "Show page image" function for CDLI texts
-
Web service under development to run PSD linguistic services on ATF documents
- Tinney-Hyman Action List
-
Arboreal requirements
- must be able to send out entire XML document in "raw" format
-
ping Donatus to find out what the backend wants?
-
Donatus requirements
- XML-RPC integration
- capable of farming request to another server using XML-RPC
-
SumKit requirements
-
XSL: maps existing parser output to Donatus backend return format
-
parser needs to produce a single analysis string
-
add slot numbers to analysis
-
XML-RPC integration
-
accepts parse request via XML-RPC
-
raw input is base64 datatype in XML-RPC (XTF file sent as a single base64 blob)
-
return analysis/analyses via XML-RPC
-
a single hash consisting of the following key/value pairs
-
morphHits context-free morph identifications (.hits file) : base64
-
morphContext context-analysis morph identifications (.context file) : base64
-
morphMisses misses list (.misses) : base64
-
morphErrs messages/stderr junk : string
-
documentation of XML-RPC method name and URI
-
Future work
-
integrate addAnalyses function with SumKit hints database
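The XML-RPC exchange listed under the SumKit requirements can be sketched as follows, using Python's standard xmlrpc modules. This is an illustration under stated assumptions, not the agreed interface: the method name `sumkit.parse` is hypothetical (the method name and URI are still to be documented), and `run_parser` is a stand-in for the real parser, which is not described in these notes. Only the four result keys and their datatypes are taken from the list above.

```python
from xmlrpc.client import Binary, ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def run_parser(xtf_bytes):
    """Stand-in for the real parser (an assumption, not the actual tool).

    Returns (hits, context, misses, errs); the first three are bytes.
    """
    return b"", b"", b"", "parser stub: received %d bytes" % len(xtf_bytes)


def parse(xtf):
    """Handle one parse request; `xtf` arrives as an XML-RPC base64 value."""
    hits, context, misses, errs = run_parser(xtf.data)
    # A single hash with the key/value pairs listed in the requirements.
    return {
        "morphHits": Binary(hits),        # .hits file, base64
        "morphContext": Binary(context),  # .context file, base64
        "morphMisses": Binary(misses),    # .misses file, base64
        "morphErrs": errs,                # messages/stderr, string
    }


def make_server(host="127.0.0.1", port=0):
    """Create a server exposing the hypothetical method name sumkit.parse."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(parse, "sumkit.parse")
    return server


def request_parse(url, xtf_bytes):
    """Send an XTF file as one base64 blob and unpack the returned hash."""
    with ServerProxy(url) as proxy:
        result = proxy.sumkit.parse(Binary(xtf_bytes))
    return {
        "morphHits": result["morphHits"].data,
        "morphContext": result["morphContext"].data,
        "morphMisses": result["morphMisses"].data,
        "morphErrs": result["morphErrs"],
    }
```

A backend that farms requests out to another server, as listed under the Donatus requirements, could call a function like `request_parse` with the URL of that server and pass the returned hash on unchanged.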
-
Next Meetings:
-
Early December 2004 in California
-
Semi-technical meeting to be attended by Robert Englund, Madeleine Fitzgerald, Steve Tinney, and Niek Veldhuis in preparation for full technical meeting in Spring 2005.
-
Define and document exemplar and composite catalogues
-
Work on join handling issues
-
Discuss coordination of projects
-
March or April 2005 in Los Angeles
-
Full technical meeting to be attended by Robert Casties, Peter Damerow, Robert Englund, Madeleine Fitzgerald, Malcolm Hyman, Steve Tinney, and Dirk Wintergrün.
-
Demonstration of FileMaker alternatives
-
Demonstration of latest versions of zogilib and version control technology
-
Finalization of catalogue definition and documentation and join handling