Cuneiform Digital Library Notes
2011:4
The State of CDLI’s Ur III Transliterations

Robert K. Englund
University of California, Los Angeles

In October of 2008, Dan Foxvog posted an announcement through the Agade list registering his completion of additions and collations to the then 2051 ED IIIb administrative texts in CDLI files. Through his efforts, the ED IIIb administrative corpus thus became, after that of the Late Uruk period, the second set of reliable transliterations in CDLI’s full dataset. Our Ur III transliteration files are on a different order of magnitude, consisting of slightly less than 59,000 entries and 822,000 lines of text in conventional, so-called ASCII transliteration format (ATF; note that Oracc describes transliterations formatted for CDLI archival storage and use as Canonical ATF, or C-ATF [<http://oracc.museum.upenn.edu/doc/builder/cdli>]). The transliterated corpus represents 62% of the 95,781 Ur III texts currently catalogued in CDLI. This communication is to announce the reintroduction to the CDLI website of a more standardized full set of Ur III transliterations that has been cleansed of many irregularities, but that remains a work in progress.

A number of factors played into the realization that our Ur III transliterations were in need of greater attention. The major hurdle we needed to overcome to increase the usability of this very large data set was the fact that the transliterations were gathered together from a wide variety of legacy sources, including the initial capture of all electronic transliterations prepared in the 1990s by the Leiden researchers Bram Jagersma and Remco de Maaijer (following a data migration to CDLI, their pioneering site was retired in September of 2003). Conversion of the Leiden data, at the time 270,000 lines of text (incl. text header and format lines), was slow, and in the end incomplete for a number of reasons, in particular due to CDLI’s policy that numerical notations must, as closely as feasible, adhere to a strict mirroring of actual cuneiform notations and therefore should not reflect the often non-standardized decimal interpretations of various cuneiform specialists. In many ways, the Leiden files were in this conversion our easiest target, since they had an internal, if still imprecise method of numerical transcription, and, as the creation of a closed and professional set of collaborators, followed limiting rules in the interpretation of signs and words. However, the CDLI continued to gather electronic transliterations of Ur III texts wherever we could locate them, thus bringing in at the same time all forms of idiosyncratic computer work by a variety of Assyriologists, including files submitted for publication with all the horrors of Microsoft formatting. In terms of text structure and font description, such files invariably contain a myriad of greater and lesser irregularities that, if not found and eliminated directly, can hide in dark corners, multiply, and eventually emerge as incompatible characters in text parsing.
In the critical case of numerical notations, capacity or surface measurement notations may be interpreted by the conversion program as sexagesimal, or may disappear altogether, misidentified by the processor as text line numbers rather than counts of objects. The standardization of signs and words, on the other hand, is in itself just a matter of some pattern recognition, and a measure of perseverance. Quite a number of the choices made in this process will be challenged by experts, but the eventual correction of a unified, if incorrect, reading is substantially less onerous than would be the correction of multiple variants, and the search string via sign names is very much facilitated.

Some of these errors can, in a large data set like that of our Ur III files, be detected using some of the same inline markers that Steve Tinney used in the initial cleansing of legacy transliterations. Where, for instance, the signs sze (sz = š in CDLI ATF) and gur (“barley” and “kor,” respectively) were located in one line, the parser would generally assume that the numerical notation preceding these signs was a grain capacity notation and therefore that 12.4;4,3 should be converted to 1(gesz'u) 2(gesz2) 4(asz) 4(barig) 3(ban2), but in many hundreds of instances the conversion program interpreted slightly differing notations, or true grain capacity notations without inline word markers, as sexagesimal, and wrote to our files 1(szar2) 2(gesz'u) 4(gesz2) 4(u) 3(disz).
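The intended conversion of the example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the conversion program actually used: the function name capacity_to_atf and the input pattern "gur;barig,ban2" are hypothetical, the gur count is taken to be written in sexagesimal place notation (12.4 = 12 × 60 + 4), and the usual Ur III relation of 5 barig to the gur and 6 ban2 to the barig is assumed for the range checks.

```python
import re

# Gur counts in the sexagesimally structured capacity system:
# (value in gur, ATF sign name), largest unit first.
GUR_SIGNS = [(3600, "szar2"), (600, "gesz'u"), (60, "gesz2"), (10, "u"), (1, "asz")]

def capacity_to_atf(notation: str) -> str:
    """Convert a legacy 'gur;barig,ban2' notation such as '12.4;4,3' to ATF.

    Hypothetical helper: the input format is an assumption based on the
    example in the text, not a documented legacy standard.
    """
    m = re.fullmatch(r"(\d+(?:\.\d+)*);(\d+)(?:,(\d+))?", notation)
    if m is None:
        raise ValueError(f"not a gur;barig,ban2 notation: {notation!r}")
    barig, ban = int(m.group(2)), int(m.group(3) or 0)
    if barig > 4 or ban > 5:
        raise ValueError("assumed ranges exceeded: 5 barig = 1 gur, 6 ban2 = 1 barig")
    # '12.4' is place notation: 12 sixties plus 4 units = 724 gur.
    gur = 0
    for place in m.group(1).split("."):
        gur = gur * 60 + int(place)
    tokens = []
    for value, sign in GUR_SIGNS:
        count, gur = divmod(gur, value)
        if count:
            tokens.append(f"{count}({sign})")
    if barig:
        tokens.append(f"{barig}(barig)")
    if ban:
        tokens.append(f"{ban}(ban2)")
    return " ".join(tokens)

print(capacity_to_atf("12.4;4,3"))
```

A notation without the semicolon, such as 12.4.43, is rejected by this sketch rather than silently read as sexagesimal, which is the kind of misreading described above.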

The next and remaining obstacle in correcting such files rests in the fact that specialists have dispensed with preparation and publication of hand copies of texts in publications, most notably in the Ur III texts that make up the bulk of both unpublished artifacts in established collections, and of texts that since the Kuwait War have left Iraq via the antiquities market. We must have understanding for the decision of colleagues to dispense with the time-consuming hand-copying of Ur III texts, but we are less sympathetic with the too easily reached decision that no image documentation whatsoever was needed to support published transliterations, where these publications exhibited a weak adherence to the principle that the original cuneiform should be directly and correctly reconstructable with nothing more than a set of rules attached to the text transliterations. This publication policy has, admittedly, probably facilitated the appearance of text transcriptions more quickly than would otherwise have been the case, given the very frustrating guidelines often imposed by collections officials, that all imaging should be performed in-house by photography departments and that these images should enter the public domain only under the strictest of conditions; but the experience of CDLI has demonstrated that the initial scanning of tablets can be efficient and inexpensive, and that no book need wait for CD insertions (that are rightly anathema to librarians anyway), since the Web is an ideal medium for such file dissemination–indeed, CDLI is happy to host such images in its pages, and to care for their permanent free access through its University of California and Max Planck Society partners.
Without such image documentation, the correction of existing transliterations is hindered; and in all of this, we should restate the charter of CDLI and other web services in cuneiform studies that the primary purpose of images rests in their exploitation to prepare exacting, searchable transliterations–paleographic studies are, really, a distant second.

Some elements in our project history, though, could be exploited to ease the necessary work on CDLI’s Ur III files. First and foremost, we have enjoyed what must be considered an unprecedented level of support from national and private funding agencies in the United States and in Europe, and the continuing institutional support of UCLA and the Max Planck Institute for the History of Science, Berlin, with which project staff and collaborators have digitized a substantial number of physical cuneiform artifacts in a variety of collections worldwide, as well as a set of published hand copies that, in the case of Ur III documents, is nearing completion. Some few major collection administrators, or the legal officers they consulted, have resisted efforts to access, catalogue, and scan their artifacts, in particular defending caches of tablets still unpublished after many decades, in some cases more than a century of museum storage (we cannot address the shielding from public view of such cultural heritage artifacts as unique witnesses of Babylonian history, now often reduced to Old Babylonian administrative archives, without asking, who is harmed by their dissemination to the heirs of those early cultures?), while to date only one Ur III specialist has threatened legal action to, sadly with success, thwart CDLI’s posting of his published hand copies, a practice of research data dissemination that CDLI contends falls under the realm of fair use of limited content of publications for non-commercial, academic purposes–and one that, we might add, is simply correct and proper. Second, we have, in the Madrid project site BDTNS (<http://bdtns.filol.csic.es/>) directed by Manuel Molina, a stable and growing source of specifically neo-Sumerian data that acts as an invaluable corrective to lapses or hypercorrections in our files. 
My own computer work on CDLI files has a BDTNS browser window running in the corner of my screen, brought up regularly to check the Madrid team’s collated readings, or simply to learn what our pre-conversion numerical notations were without having to sort back through our old files.

What began, with an announcement in June of 2010 to CDLI collaborators that I would, for purposes of correction, block access to Ur III transliterations in our ATF management system for a period of perhaps half a year, was, after thirteen months of time-devouring work, completed last week with the re-entry of my file by Robert Casties and Dirk Wintergrün of CDLI’s Berlin offices. Final cleansing of the updated files was facilitated by the use of Steve Tinney’s ATF processor at <http://oracc.museum.upenn.edu/util/atfproc.html>, and of Perl scripts written for me by Wenjae Chang, a computer science graduate student at UCLA who is currently programming CDLI’s SQL data management software. The scripts written by Ms. Chang will be uploaded shortly to a new interface for transliteration entry that identifies the period of Babylonian history of some set of new texts, and runs their sign readings, and their words through a full grapheme/lexeme glossary of Ur III ATFs in CDLI, challenging new, and therefore probably incorrect or non-standardized, transliteration. In this, there will be some few inconsistencies with the grapheme/lexeme lists employed in the Oracc toolkit that will need to be addressed regularly–but I think so few (for instance, the seven introduced readings noted below) as to raise no serious alarms. While this update represents a major standardization and improvement of CDLI’s Ur III files, still the process of converting the many irregular legacy files to ATF did create quite a lot of corruption, and we call again on collaborators to offer their time to collate individual transliterations using available images on CDLI pages, or, where these are not available, cross-checking CDLI files against those in BDTNS or in original publications.
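The kind of glossary check described above might look roughly as follows in Python. This is a minimal sketch and not a rendering of the Perl scripts mentioned: the function names graphemes and challenge are hypothetical, the tokenizing rules cover only the simplest ATF conventions (line numbers, damage flags, brackets, hyphens, braces), and the glossary passed in stands in for the full grapheme list derived from CDLI’s Ur III files.

```python
import re

# Counted notations such as 2(disz) or 1(gesz'u) are not sign readings.
NUMBER = re.compile(r"[\dn/]+\([^)]+\)")

def graphemes(line: str) -> list:
    """Split one ATF text line into its sign readings (simplified)."""
    text = re.sub(r"^\S+\.\s+", "", line)    # drop a leading line number such as "3'."
    text = re.sub(r"[#?!*\[\]]", "", text)   # drop damage/collation flags and break brackets
    signs = []
    for word in text.split():
        signs.extend(s for s in re.split(r"[-{}]", word) if s)
    return signs

def challenge(line: str, glossary: set) -> list:
    """Return the readings of a line that are absent from the glossary."""
    return [s for s in graphemes(line)
            if s not in glossary
            and s not in ("...", "x")
            and not NUMBER.fullmatch(s)]

# Hypothetical three-entry glossary; a real one would hold the full list.
print(challenge("1. mu us2-sa2", {"mu", "us2", "sa"}))
```

In this sketch an unexpected reading is only reported, not altered, leaving the decision on whether it is new or simply non-standardized to the editor.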

I offer below some remarks that are directed more to CDLI ATF contributors, but that might be of interest to other Ur III specialists and to cuneiformists generally who use CDLI datasets in their research. Comments, criticisms and/or corrections are of course welcome. Specialists will recognize our close adherence to the readings of Borger/Ellermeier’s SG and ABZ, and of Borger’s MZL, with some very few additions. The principle behind these, and generally the readings of German Assyriology, is that our transliterations should, so far as possible, reflect the best estimate of how Sumerian would have sounded at the time of the recovered text artifacts; thus short values are preferred where they are indicated in Proto-Ea and in 3rd millennium orthography (primarily phonetic glosses, allography, and Auslaut continuation), and we avoid recently proposed new sign readings where they, no less than long values, would not inform specialists, and would confuse more general users (to cite one of innumerable instances, a recent volume of ZA carried the Sumerian value bešeŋ, that in Google search brings up [5 August 2011] only this ZA reference [vol. 101, p. 5 n. 13] and one other page written in Chinese, and that draws a blank in ePSD search, itself a confusion of “b/pisaĝ1-3” ’s eventually leading to the common value pisan of the sign GA2 best known in the header “pisan dub-ba” of tablet baskets; Ur III specialists will follow, and ultimately reject, the lexical reasoning in bešeŋ and many other uncommon readings, while, through its use instead of common pisan, specialists from related fields, academic generalists, and the informal learners who finance such exotic research are unnecessarily excluded from participation in Assyriological discussions). Nasal-g, further, has not been introduced to our files, and, at least in the case of Sumerian 1st singular possessive, should be (other nasal-g readings are given by the common and unique readings [gar, ga2, kin, nigar, etc.] and by published lists in circulation, and can be globally corrected in the future); yet even here the variant characters chosen by authors and editors to represent this phoneme make its citation in publications, and certainly in a data repository like that of CDLI or BDTNS, undesirable.

CDLI’s current Ur III transliterations may be accessed in full (with the exception of some few files not freed for distribution) at our download page (<http://cdli.ucla.edu/downloads.html>) together with copies of the grapheme and lexeme lists deriving from those transliterations. The current work concerned itself above all with the standardization of graphemic and less so with that of lexemic readings–it is, for instance, often a challenge to decide whether zi3 sig15 should be considered one (as in BDTNS) or two words (in our files), in CDLI practice signaled by the use of hyphens and other boundary characters; cf. Tinney’s primer at <http://oracc.museum.upenn.edu/doc/builder/cdli>.

The corrections of the current update have, as stated, been dependent either on hand copies and images of originals, or, lacking these aids, on the “rule of an imposing majority,” when this appeared prudent. Where, for example, SAT, BPOA, UTI or Nisaba appeared to be in singular conflict with sign readings found in other texts with images, they were corrected to show the more likely readings, usually with a question mark added, or, where the evidence was not entirely clear from other texts, they were left as is, again with a question mark. For instance:

  • of the 11 recorded cases of mu us2-sa2 (instead of expected mu us2-sa), 10 derive from the transliteration publications of Yale texts by M. Sigrist in the series SAT, and one is in the text ZA 53, 61 6–without exception published without image documentation; one text–SAT 2, 594–includes a comment correcting written sa2 to sa. BDTNS corrected the SAT readings to us2-sa without notice;
  • there were 19 instances of the PN “ugu-dul” in our files, most from publications without image documentation; those that had images were in all cases “ugu2-dul” in accordance with 30 instances of ugu2-du6 in CDLI Ur III (often as the father of Šeškalla). These were usually given correctly as A.KA.DUL in BDTNS;
  • “lah4” is another example of readings retained where there is no likelihood that the sign [DU over DU] was used, resulting from careless transliteration of texts with no image documentation–images in all cases confirmed the reading lah5 [DU followed by DU]; thus, undocumented lah4 are now “lah5?”; there is only one clear example of lah4 in our files, in the unedited Michigan text KM 89348 obv. 3 (PN ma2-lah4-e), but the boatman Ur-Damu is in the other attestation of his name with ma2-DU.DU written ma2-lah5 (ASJ 19, 226 72 obv. iii 6');
  • “{d}li-si4” (li instead of NE = li9) was found in published copies of three Nippur texts;
  • a final example is ad6 (LU2×BAD) vs. ad7 (LU2šesig; KWU 82 and 81, resp.), and potential uses of KWU 81 for dim3 (“doll”) in the field name sur3-KWU 81 (with a ‑ma Auslaut continuation in one case); ad8 is AD7(LU2šesig)×BAD (see below).
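The “rule of an imposing majority” applied in the cases above can be illustrated with a short Python sketch. It is an assumption-laden simplification of what was in practice an editorial judgment: the function name flag_minority, the 0.9 threshold for what counts as “imposing,” and the text identifiers T1 to T4 are all hypothetical. Readings from texts with images serve as the evidence; readings from undocumented texts that conflict with them are either corrected to the majority reading or left as is, in both cases with a question mark.

```python
from collections import Counter

def flag_minority(attestations, threshold=0.9):
    """Return {text_id: revised reading} for undocumented, conflicting readings.

    attestations: list of (text_id, reading, has_image) tuples.
    threshold: assumed share of imaged attestations needed before the
    majority reading is imposed; below it the reading is kept with '?'.
    """
    imaged = Counter(reading for _, reading, has_image in attestations if has_image)
    if not imaged:
        return {}
    majority, count = imaged.most_common(1)[0]
    imposing = count / sum(imaged.values()) >= threshold
    revised = {}
    for text_id, reading, has_image in attestations:
        if has_image or reading == majority:
            continue
        revised[text_id] = (majority if imposing else reading) + "?"
    return revised

# Hypothetical attestations modelled on the lah4/lah5 case.
sample = [("T1", "lah5", True), ("T2", "lah5", True),
          ("T3", "lah5", True), ("T4", "lah4", False)]
print(flag_minority(sample))
```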

CDLI ATF notations for “sub-totals” up to “grand totals” now comprise:

   |SZU+LAGAB| → |SZU+NIGIN| → nigin2-ba → nigin-ba
   (a good example is Amorites 18 rev. iii 22 - v 13).

The sign LAGAB interpreted as nigin2 in the context of bundles of wood and reed removes kilib as a value of LAGAB from CDLI’s Ur III sign readings list. LAGAB has been left as lagab in cases of na4, “stone,” but we might wonder what formulations like n(umber) “nigin2” {gesz}ma-nu mean. It seems that nigin2-ba-bi can represent the total of 2 nigin-ba’s, for which see SAT 2, 163 end–no image is available–and note M. Civil, Fs Sigrist 36 rev. iii:
   23. gu-nigin2-ba 1(u) 3(disz) sa-ta
   25. gu-nigin2-ba 1(u) 2(disz) sa-ta
   26. gu-nigin2-ba-bi 2(gesz2) 5(u) 5(disz) (= 5× 25, but this seems to be just counting the numbers of bundles)
as well as UET 3, 1058 rev. 4.
This complex, still unresolved (as is the reading of ŠU+LAGAB vs. ŠU+NIGIN[=ŠU+LAGAB+LAGAB], both true ligatures [complex signs made up of two or more graphemes that, in cuneiform, share at least one wedge] in the Ur III text corpus and therefore not to be transliterated šu-nigin2/nigin), is to be noted to Heimpel 2003.

Note to chronological notations: CDLI ATF isolates month and year names on one line; thus, where a tablet might indicate full year names over several tablet line cases, CDLI merges them to one; where, as is often the case with spacious left edge notations or with one-sign month names such as diri or RI (CDLI “dal”), month names and year names are formally in one line, CDLI adds a line number to isolate both, preceded by comment line “# text moved to next line”; to disambiguate long lines that might include content formally part of numerical notations together with month and year names, one formal case line might be cut into three or four.

CDLI ATF introduces the readings saga (for the sign SIG5), nigar (for NIGIN3, clear since Krecher 1966: 128-129), udru (for AŠ2), ad7-8, nag4 and šakkan. Only saga and nigar are otherwise not found in the standard sign lists.

  • the reasons for the reading saga of SIG5 (instead of sag10, and this should not conflict with Borger’s saga from SAG since that is a nasal-g Auslaut) are: the already often noted numerous instances of seal legend sa6-ga corresponding to SIG5 in personal names of recipients found in records (most recently Wilcke 2010: 12 n. 28, and confer Waetzoldt 2010); SAT 1, 434, obv. i 22 with dutu-SIG5sa6-ga; there is no instance in available files of a /g/ continuation of SIG5; its reading /seg/ or /sag/ is indicated beginning only in the OB period (PrEa 411 SIG5 = sa3-ag [vars. sa6-ga, sa3-a]; MSL 3, 38 351-352 has SIG5 = se-eg);
  • udru is a new entry based on the various readings of AŠ2 in the month usually read ZIZ2-A but corrected by Cohen (1993: 118-119) to ud2-duru5 (add to his remarks the instance of u32 in AUCT 2, 28 obv. 5)–and following the udra series and likely amissible-u d(u)ru value of a in haydru and so on (we wait yet a moment before correcting iti GAN2-maš to Cohen’s iti burux(GAN2)-maš(2) [Cohen 1993: 43-44])
  • Carcasses in current Ur III ATFs:
       ad3 is LU×BAD
       ad6 is LU2×BAD
       ad7 is LU2šesig
       ad8 is (LU2×BAD)šesig
  • nag4 avoids an abundance (and exclusive use) of readings naga4ga2 in munu4/mun/ŠIM nag4-ga2-de3 / al-nag4-ga2; I find no instance in 40 relevant attestations that would indicate an intended 3rd millennium reading naga4 of KUM (despite PrEa 607);
  • šakkan for ŠAGAN (szakkan and SZAGAN in ATF) derives from an attempt to make sense of Borger’s splitting of kk and k values in the šakan series in MZL, whereby for GIR3 he allows both
  • (NB: qur8 might be helpful to distinguish its use in qur8-ad and zi-qur8 from that in ma-gur8)

Note the reading du (meaning “geläufig” or the like) of DU instead of gin or gen as a qualifier of animals or commodities, indicated by the unlikelihood of a nasal-g Anlaut of /gi(n)/ “firm,” which seems to be the thinking of most who use it; by the lack of a consonantal Auslaut indication in the texts; and see ITT 3, 5235 obv. 5 with SZIM du2 (though note possible reference to du8 as in CT 10, pl. 48, BM 19067 obv. 9 [3(gesz2) 3(u) la2 3(disz) SZIM du8] in sequence saga, du, du8, also used with ninda); parallel to saga, now used instead of sig5/sag10.

To uruda/zabar: where uruda stands before sexagesimally counted objects, often followed by ki-la2-bi, transliterate as a semantic gloss {uruda}x except in cases of uruda {d}nn, uruda e2 nn; if weighed, then just uruda; where zabar stands after an object it qualifies, transliterate simple zabar; where zabar is before the object, consider it, like uruda, to be a semantic gloss ({zabar}).

The sign dub = kišib3 is written with two verticals at beginning, and the reading dub is now reserved for dub gid2-da and dub didli, then dub = šap?kum.
The sign mes (= kišib) is written with one, used in ur-mes, mes-lam-ta, gilgames3, etc.
The sign um is usually written with no initial vertical (see Allred [forthcoming]).

The complicated /kudr/, to enter, will be dealt with systematically by Dahl (forthcoming). In short,
   lil = “fool” (lil2/kid/e22 = “wind”)
   ku4 = “to (cause to) enter”
   tu/du2 = “to give birth”
      (watch for potential instances of ŠE+LIL, such as Old Akkadian RTC 142 rev. ii 1)
Ur III
   ku4 = “ku4?” in ATF, uncertain since the sign occurs in 700+ instances in texts with no image documentation
   kux(KWU636) (=ŠE.ŠU)
   kux(KWU147) (=LIL)
   kux(DU)
   kux(TU) (all copies need collation; if correct, this may be ku4)
   du2(d) = TU
Old Babylonian
   ku4 = TU
   ku4 = ŠE+TUG2 in some 6N-T texts

The reading of “e2-a” is in many Ur III accounts not sufficiently justified, and thus now read E2-A in CDLI, including esir2 E2-A, where A might be duru5 for “fluid” as opposed to had2 “dry,” weighed out where esir2 E2-A is in capacity notations; e2-a might otherwise often refer to “affiliated workgang,” “village,” usually e2-duru5.

CDLI ATF uses DIB = dab, not dib, and retains for now nin in nin-dingir, etc.

CDLI follows in its transliteration of MS 2064 the section numbers of the Ur-Namma code adopted by Civil 2011: 237-246.

dusu2-munus etc. is now used (instead of dusu2 munus) because of parallels to u8 udu-nita2 etc.; nita/nita2 are generally left as in their originating file–they do alternate with no apparent semantic nuance.

There are four instances of “sag nig2-gar-ra” (instead of expected sag nig2-gur11(GA)-ra) in CDLI files, none with image documentation.

We note that the very confused matter of dug dida versus kasz dida in ATFs is not resolved in this update–this includes both the use of diš (vertical stroke) and aš (horizontal) and the readings of dug or kaš. The ca. 3200 instances of dida (U2-SA) include very nearly 1500 each of the qualifier following dug or kasz, respectively; the labor involved in resolving this matter seemed too great to me, given also that a great number of the notations cannot be clarified in the absence of image documentation, and the presence of less than careful transliterations of DIŠ as opposed to AŠ in publications.

Again, many decimal notations such as MVN 9, 140 rev. 9. sze-bi 134 sila3 appear to be defective, but cannot with final certainty be corrected (to 2(barig) 1(ban2) 4(disz) sila3) in the absence of image documentation.
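The arithmetic behind the suggested correction can be checked with a small Python sketch, offered here only as an illustration: the function name sila_to_atf is hypothetical, and the relations 1 barig = 60 sila3 and 1 ban2 = 10 sila3 are assumed, as is the restriction to amounts below one gur (300 sila3).

```python
def sila_to_atf(sila: int) -> str:
    """Rewrite a decimal sila3 count below one gur as barig/ban2/sila3 in ATF.

    Hypothetical helper; assumes 1 barig = 60 sila3 and 1 ban2 = 10 sila3.
    """
    if not 0 < sila < 300:
        raise ValueError("expected a count between 1 and 299 sila3")
    barig, rest = divmod(sila, 60)
    ban, rest = divmod(rest, 10)
    tokens = []
    if barig:
        tokens.append(f"{barig}(barig)")
    if ban:
        tokens.append(f"{ban}(ban2)")
    if rest:
        tokens.append(f"{rest}(disz) sila3")
    return " ".join(tokens)

print(sila_to_atf(134))
```

Such a rewrite shows only what the expected notation would be; whether the tablet actually carries it can be settled by collation alone.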

CDLI’s UET 9 transliterations are incomplete, as are many ITT 2 entries.

A final note to ATF text structure:

To eliminate inconsistencies such as “some number missing,” “a few lines missing,” “around 4 or 5 lines missing,” and so on that, as free text, plagued earlier transliterations, Ur III ATFs follow now simplified rules for describing preservation of damaged artifacts. Thus, if it is clear how many lines are missing, reconstruct them with 1. [...], 2. [...] etc. If not, use only “$ beginning broken” or “$ rest broken” at the beginning and end of surface or column. Within surface or column, either reconstruct broken lines (for instance, 12. [...], 13. [...], etc.) or if the number of missing lines is unclear, use “$ n lines broken” (use “n”) and number the following lines accordingly, though after break not restarting with 1'., 1''., etc., but successively with the number that follows the last preserved line number. This should reduce break variables to a manageable number and should give strict rules in assigning numbering and level of preservation to (partially) preserved lines, but even if still found to be flawed it results in consistent IDs that can be systematically corrected.
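A consistency check along the lines of these rules could be sketched as follows in Python. The function name check_numbering and the wording of the reported problems are hypothetical, and the set of permitted “$” lines is simply collected from this note and its examples, not taken from a formal ATF definition. The sketch flags free-text state lines and line numbers that do not continue successively within a surface or column.

```python
import re

# State lines used in this note and its examples (an assumed closed set).
ALLOWED_STATES = {"$ beginning broken", "$ rest broken", "$ n lines broken",
                  "$ broken", "$ blank space"}
LINE_NO = re.compile(r"^(\d+)('*)\.\s")

def check_numbering(lines):
    """Return (position, problem) pairs for one text; position counts from 1."""
    problems = []
    expected = 1
    for i, line in enumerate(lines, 1):
        if line.startswith("@"):
            expected = 1                      # numbering restarts per surface or column
        elif line.startswith("$"):
            if line.strip() not in ALLOWED_STATES:
                problems.append((i, "free-text state line"))
        else:
            m = LINE_NO.match(line)
            if m:
                if int(m.group(1)) != expected:
                    problems.append((i, f"expected line number {expected}"))
                expected = int(m.group(1)) + 1
    return problems

bad = ["@obverse", "1. a", "$ around 4 or 5 lines missing", "1'. b"]
print(check_numbering(bad))
```

Header lines beginning with “&” or “#” are ignored by this sketch, and the use of primes is not validated.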

@bottom is allowed only where used for subtotals (for instance, in LoC 11); ATF definition must clarify use of further qualification added to @bottom to avoid confusion of same-surface duplication.

Some imagined examples follow.

A single-column tablet:
&P500000 = JCS 89, 222 no. 12
#atf: lang sux
@tablet
@obverse
$ beginning broken
1'. [...] 2(disz) [x]
2'. [...] SIG7-a giri3 [...] nu2-a
$ n lines broken
$ blank space
$ n lines broken
3'. [...] ga6 [...] dumu-gi7-lil#-[la-am3]
$ rest broken
@reverse
$ broken

means: the beginning of a one-column obverse is broken, and an unclear number of lines are missing. Lines are numbered successively 1'. ff., including over breaks of an unclear number of lines. In the example above, the reverse is completely missing.

A multi-column tablet, first column preserved:

&P500001 = JCS 89, 222 no. 13
#atf: lang sux
@tablet
@obverse
@column 1
$ beginning broken
1'. [... ki] sumun
2'. 5(bur3) nu-banda3
@column 2
$ beginning broken
1'. [...]
2'. [...] dub-sar
3'. 1(asz) dub-sar
@reverse
@column 1
1. a-sza3# u3-x-ku-tum
2. 2(barig) ugula
$ rest broken
@column 2
1. [... ki] sumun
$ rest broken

A multi-column fragment, left column(s) missing:
&P500002 = JCS 89, 222 no. 14
#atf: lang sux
@obverse
$ beginning broken
@column 1'

means: some unclear number of columns are broken, followed by @column 1' etc.

If the tablet is reconstructable, do so:
&P500003 = JCS 89, 222 no. 14
#atf: lang sux
@tablet
@obverse
@column 1
$ broken
@column 2
$ broken
@column 3
$ beginning broken
1'. [... ki] sumun
2'. 5(bur3) nu-banda3
@column 4
$ beginning broken
1'. [...] dub-sar
2'. 1(asz) dub-sar
$ rest broken
@column 5
$ broken
@column 6
$ broken
@reverse
@column 1
$ broken
@column 2
$ broken
@column 3
1. a-sza3# u3-x-ku-tum
2. 2(barig) ugula
$ rest broken
@column 4
1. [... ki] sumun
$ rest broken



BIBLIOGRAPHY

Allred, Lance
forthcoming   Review of M. Sigrist and T. Ozaki, Neo-Sumerian Administrative Texts from the Yale Babylonian Collection I-II. BPOA 6-7 (2009)
Civil, Miguel
2011   “The Law Collection of Ur-Namma.” In A. George, ed., Cuneiform Royal Inscriptions and Related Texts in the Schøyen Collection. CUSAS 17. Bethesda, MD: CDL Press, pp. 221-286
Cohen, Mark E.
1993   The Cultic Calendars of the Ancient Near East. Bethesda, MD: CDL Press
Dahl, Jacob
forthcoming   “A Paleographic Study of Sumerian ku(dr), ‘to enter’.”
Heimpel, Wolfgang
2003   “gu-nigin2, “bale”.” CDLN 2003:003
Krecher, Joachim
1966   Sumerische Kultlyrik. Wiesbaden: Harrassowitz
Waetzoldt, Hartmut
2010   “Die Bedeutung von igi–sag̃/sag̃5/sag9/sag10.” In A. Kleinerman and J. M. Sasson, eds., Why Should Someone Who Knows Something Conceal It? Cuneiform Studies in Honor of David I. Owen on His 70th Birthday. Bethesda, MD: CDL Press, pp. 245-255
Wilcke, Claus
2010   “Sumerian: What We Know and What We Want to Know.” RAI 53, 5-76
ISSN 1546-6566    © Cuneiform Digital Library Initiative | Archival: 2011-08-20