The Technologies of the HyperStack

The Technologies of the Saint Patrick’s Confessio Hypertext Stack by Randall Cream

1. Introduction

The Saint Patrick’s Confessio HyperStack, considered at its most basic, delivers an updated edition of Saint Patrick’s Confessio and transposes the printed page into the electronic medium; in this, it follows in the well-established genre of Digital Editions.

But the HyperStack Confessio project undertakes more than merely transposing the printed page into the new medium. In addition to offering all that the printed page provides, the project works to build layers of accessibility to the Confessio, allowing users to negotiate the byzantine documentary history of this important text.

But why construct these layers at all? Why not just provide unencumbered access to the thing itself? Unfortunately, the Confessio that we have enjoyed for 1500 years is just that—a layered text. All of the surviving manuscript copies have sophistications, errors, emendations, and other artifices that require the reader to navigate the text as a layered document.

Comparing the surviving witnesses reveals the sophistications and errors of this ancient text more easily than it reveals the text itself. What is it that Patrick actually wrote? With the oldest surviving manuscript more than three hundred years younger than Patrick himself, this question is difficult, if not impossible, to answer.

But answering this question is the aim of the Confessio HyperStack project.

2. Technologies of the HyperStack

The technologies of the Confessio HyperStack project can be divided into four components: Technologies of the Text, Technologies for Images, Multimedia Technologies, and Server-Side Technologies.

2.1. Textual Technologies

At the core of the HyperStack is the text of the Confessio, written by Patrick in the fifth century. The project takes the text of both the Confessio and the Epistola from Ludwig Bieler’s authoritative edition of 1950.

Much of the work of the HyperStack project has consisted of generating the highly-specialized XML files that anchor the Confessio and Epistola, allowing for easy connection with the other elements of the HyperStack. In order to facilitate a granular level of connection with each individual word of the original text, both the Confessio and the Epistola were tokenized, with every single word receiving a unique XML ID. Constructed by Dr Franz Fischer, these XML files represent the backbone of the project and coordinate each of the other elements of the stack.

By tokenizing the documents, the project facilitates a word-by-word connection between the text, the rich manuscript history, multiple layers of commentary, and the detailed critical apparatus. Each and every word can be linked to the bibliographical applications, the imaging applications, and the other resources of the Stack. Beyond the original Latin text, tokenization also allows users to move back and forth between any of the translations on a passage-by-passage basis.
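The idea of word-level tokenization can be illustrated with a minimal Python sketch. It is not the project's actual encoding: the `<w>` element, the `work.paragraph.position` ID scheme, and the function names are assumptions made for illustration, and the sample words stand in for the real edition text.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the standard xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def tokenize_paragraph(text: str, work: str, paragraph: int) -> ET.Element:
    """Wrap each whitespace-separated word of `text` in a <w> element with a unique xml:id."""
    p = ET.Element("p", {"n": str(paragraph)})
    for position, word in enumerate(text.split(), start=1):
        w = ET.SubElement(p, "w")
        # Hypothetical ID scheme: work.paragraph.position, e.g. "C.01.003".
        w.set(XML_ID, f"{work}.{paragraph:02d}.{position:03d}")
        w.text = word
        w.tail = " "  # keep a space between words when serialized
    return p


def index_by_id(root: ET.Element) -> dict[str, str]:
    """Map every word ID to its word, so other resources can point at single words."""
    return {w.get(XML_ID): w.text for w in root.iter("w")}


if __name__ == "__main__":
    paragraph = tokenize_paragraph("Ego Patricius peccator rusticissimus", "C", 1)
    print(ET.tostring(paragraph, encoding="unicode"))
    print(index_by_id(paragraph))
```

Once every word carries a stable identifier, a commentary note, an apparatus entry, or an image region only needs to store that identifier in order to refer to the word.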

The significance of this level of work should not be underestimated: it represents a substantial investment in the future of the project, allowing the work of this three-year grant-funded activity of the Royal Irish Academy to enjoy a lifespan far beyond the exhaustion of the original grant terms.

Users of the project, of course, do not interact directly with the XML files. Instead, server-side transformations dynamically structure the text for viewing, presenting the pages of this 1500-year-old document in a format that conforms to the expectations of 21st-century users and modern web technologies. XSL transformations have been written, or planned, to transform the underlying document into all of its forms, including each of the manuscript witnesses with distinct layout, distinct errata, and distinct problems of interpretation. This work is ongoing, but lies within the reach of the project.
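The kind of transformation described here can be sketched in a few lines. The project itself uses XSL transformations; the Python below is only a stand-in that shows the same principle (tokenized XML in, browser-ready HTML out). The input markup, the `word` class name, and the function name are assumptions, not the project's own.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical tokenized source: one paragraph, each word with its own xml:id.
SOURCE = (
    '<p n="1">'
    '<w xml:id="C.01.001">Ego</w> '
    '<w xml:id="C.01.002">Patricius</w>'
    "</p>"
)


def render_paragraph(xml_text: str) -> str:
    """Turn a tokenized paragraph into an HTML paragraph of <span> elements."""
    source = ET.fromstring(xml_text)
    html_p = ET.Element("p", {"class": "confessio"})
    for w in source.iter("w"):
        # Carry the word ID over to the HTML so scripts and links can target it.
        span = ET.SubElement(html_p, "span", {"id": w.get(XML_ID), "class": "word"})
        span.text = w.text
        span.tail = " "
    return ET.tostring(html_p, encoding="unicode", method="html")


if __name__ == "__main__":
    print(render_paragraph(SOURCE))
```

Because the word identifiers survive the transformation, the rendered page can still link each word back to the underlying XML.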

Drawing from the best exemplars of modern digital editions—The Codex Sinaiticus is but one stellar example among many—the HyperStack project uses TEI-compliant technologies to present the text, to search the text, and to transform the text. But users should be aware that innovation in the Digital Humanities is a rapid and untidy process; in the light of this, the project has endeavoured to compromise between delivering future-compliant technologies (XML that conforms to ever-evolving best practices) and delivering to users a streamlined web application for interacting with the documents.

One choice that serves this compromise well is the project’s decision to provide PDF packages for all of its texts, freely available for download. With the ability to imitate the printed page exactly, the PDF files represent for many users the most comfortable means of interacting with the texts of the HyperStack project.

But the design of the project is best realized in the web application. The collaborators of the project humbly suggest that users experiment with the web application, taking advantage of its ability to coordinate multiple texts, formats, technologies, and ancillary materials in order to facilitate a rich exploration of the monumental document at its core.

2.2. Technologies for Images

The Confessio HyperStack project has the goal of enabling users to read the words written by this iconic figure. Therefore, it invests heavily in the technologies of this text. However, the characters that one reads on the page can’t adequately represent everything that is involved in the act of reading. There are innumerable features of the manuscript that are not represented by the rich XML transformation of the document that we described above: the smell of the book, the feel of the page, the uncertainty of the scribe’s quill on the vellum, the differing thicknesses of the characters as the scribe refreshes the ink, or turns the feather ever-so-slightly in his hand. These are all readily apparent to readers of the original manuscripts, but are not strictly speaking elements of the Confessio itself. The project endeavours to provide full access to the difficult and worthwhile act of interpretation, and therefore embraces not just the words of the Confessio, but also the appearance of the documents that comprise its manuscript history.

The eight surviving manuscripts, held in repositories across Europe, that allow scholars such as Ludwig Bieler to assemble the Confessio and Epistola are significant and complex works of art in themselves. Whether damaged by fire, excision, water, or other accidents in its thousand-year history, such a document (itself only a copy of something else, something prior) takes on a life of its own. For each of these eight surviving witnesses, the particularities of the document are important. Some of these particularities are embodied in the textual code of the project: column and line breaks, missing pages, uncertain characters. But most of these particular accidents of history are only knowable by interacting directly with the document in question.

The Confessio HyperStack project facilitates this level of interaction by providing detailed, high-resolution images of surviving manuscripts of the Confessio (see footnote 1). Using the best software available (the open-source ImageMagick application), the project constructed pyramid JPGs and a browser-based viewing application that allow users of the project to view the manuscripts easily and seamlessly, whether at maximum zoom or in full page. The application uses pre-fetching algorithms and intelligent tiling to send only a subset of the enormous image files to the browser.
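How tiling limits what is sent to the browser can be sketched as follows. This is a generic illustration of an image pyramid, not the project's viewer: the 256-pixel tile size, the halving scheme, and all names are assumptions, and pre-fetching is left out.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    level: int   # 0 = full resolution in this sketch
    column: int
    row: int


def pyramid_levels(width: int, height: int, tile_size: int = 256) -> int:
    """Count levels when each level halves the previous one until it fits in one tile."""
    levels = 1
    while width > tile_size or height > tile_size:
        width = math.ceil(width / 2)
        height = math.ceil(height / 2)
        levels += 1
    return levels


def tiles_for_viewport(left: int, top: int, right: int, bottom: int,
                       level_width: int, level_height: int,
                       level: int = 0, tile_size: int = 256) -> list[Tile]:
    """List the tiles that intersect a viewport given in pixels of the chosen level."""
    # Clamp the viewport to the image so that no tiles outside it are requested.
    left, top = max(0, left), max(0, top)
    right, bottom = min(level_width, right), min(level_height, bottom)
    if right <= left or bottom <= top:
        return []
    columns = range(left // tile_size, (right - 1) // tile_size + 1)
    rows = range(top // tile_size, (bottom - 1) // tile_size + 1)
    return [Tile(level, column, row) for row in rows for column in columns]


if __name__ == "__main__":
    print(pyramid_levels(4096, 4096))
    print(tiles_for_viewport(300, 0, 700, 200, 4096, 4096))
```

A viewer built on this principle requests only the handful of tiles under the current viewport at the current zoom level, rather than the whole image file.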

These technologies are still evolving, and undoubtedly will be quickly superseded in the months and years ahead. HTML5 code should be able to replace much of the PHP and JavaScript that runs these applications, allowing the manuscript images to function more readily on mobile devices such as tablets and smartphones. Nevertheless, the underlying work of the project is stable as far as these images are concerned. Because the project uses open-source technologies and platform-independent browser-based standards, these valuable artefacts will be easily adaptable as the technologies evolve.

Users who have difficulty with the image application should consider visiting the site in either Google Chrome or Firefox, both of which offer multi-platform, standards-compliant browsing.

For ease of off-line viewing, low-resolution PDFs are provided for those images that accurately mimic the printed page of a facsimile edition. The best experience, of course, will be gained through the browser and its web application.

2.3. Multimedia Technologies of the HyperStack

Although principally a textual project, the HyperStack does provide some multimedia files in order to enhance the interpretative experience, not least for vision-impaired users: these include a dialogue performance of part of the Confessio, and a substantial, specially commissioned audio novel (also available in text form).

While the project has a principled commitment to open-source technologies, audio files represent a conundrum for responsible Digital Humanities projects. Currently, there exists no convenient open-source format for audio files that will readily play in browser-based web apps. When we considered the rapid growth of tablet computing and smartphones, the necessity for access quickly came to outweigh our commitment to open source.

The audio files of the project, therefore, are available in MP3 format, easily playable on all platforms. Because MP3 has a widely recognized MIME type (audio/mpeg), the files are playable within the browser, readily downloadable, and easily recognized by platform-specific software. We are unaware of any platform that cannot stream or download MP3 files, despite the proprietary history of the format, which originated with the Moving Picture Experts Group (MPEG).
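The role of the MIME type can be shown with a small sketch using Python's standard `mimetypes` module. The file name and the helper function are hypothetical; the point is only that a server labels an MP3 file with a content type that browsers recognize.

```python
import mimetypes


def audio_response_headers(filename: str, size_in_bytes: int) -> dict[str, str]:
    """Build minimal HTTP response headers for serving a media file."""
    content_type, _encoding = mimetypes.guess_type(filename)
    if content_type is None:
        # Unknown extension: fall back to a generic binary type.
        content_type = "application/octet-stream"
    return {
        "Content-Type": content_type,
        "Content-Length": str(size_in_bytes),
    }


if __name__ == "__main__":
    # Hypothetical file name, used only for illustration.
    print(audio_response_headers("confessio-dialogue.mp3", 1_048_576))
```

With the content type set to `audio/mpeg`, a browser can decide on its own whether to play the file in place or offer it for download.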

We believe this compromise represents the wisest choice for the present and the future, as MP3 files are so ubiquitous that any emerging standard would have to offer a translation from the existing format into the new one.

2.4. Server-Side Technologies for the HyperStack Project

Large projects such as this inevitably involve a degree of compromise regarding funding, technologies, and deliverables. For the HyperStack project, the backend technologies adopted represent a significant trade-off between securing a workable experience in the present and retaining the ability to grow as new technologies and platforms emerge.

Our project uses Drupal, a very common PHP and MySQL-based content-management system. While this platform is extremely common on the World Wide Web, it is not a widely adopted platform for Digital Humanities projects.

First, a word about the benefits of using Drupal for this project. Drupal is such a commonly used web platform that the project was easily able to identify and recruit talented technology specialists to develop the web presence using Drupal. Drupal's large presence online means that the project is comfortable knowing that whatever difficulties we encounter will probably be experienced by a large user-base and will rapidly be addressed. Drupal is modular, with extensions, patches, and plug-ins that meet a wide array of needs. And Drupal is easy to work with, rapid, and technically simple.

Nevertheless, Drupal does present some liabilities to the project. By using an SQL database to store all of its information, Drupal requires a level of abstraction from the XML code that is the heart of the project. These XML files are not stored natively, but rather require transformations at each stage of the process. Because of this level of abstraction, Drupal renders XSL transformations problematic—but not impossible.
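The abstraction described above can be made concrete with a minimal sketch. It does not reproduce Drupal's schema: SQLite stands in for MySQL, and the `documents` table and both function names are hypothetical. What it shows is that XML kept in a relational database is just a string column, which has to be fetched and parsed again before any transformation can run.

```python
import sqlite3
import xml.etree.ElementTree as ET


def store_document(conn: sqlite3.Connection, name: str, xml_text: str) -> None:
    """Save an XML document as plain text in a (hypothetical) documents table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents (name TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO documents (name, body) VALUES (?, ?)",
        (name, xml_text),
    )


def load_document(conn: sqlite3.Connection, name: str) -> ET.Element:
    """Fetch the stored text and parse it back into an XML tree."""
    row = conn.execute(
        "SELECT body FROM documents WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    # The database only knows a string; the XML structure is rebuilt on every read.
    return ET.fromstring(row[0])


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    store_document(connection, "confessio", "<p><w>Ego</w> <w>Patricius</w></p>")
    root = load_document(connection, "confessio")
    print([w.text for w in root.iter("w")])
```

A native XML database would let queries and transformations address the document structure directly; here that structure is opaque to the storage layer.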

In exchange for its ease of use, Drupal presents some unwieldy choices to the project. Customizing pages in the HyperStack is a clunky affair at best, and impossible at worst. The node-based architecture creates a somewhat isolated experience for parts of the website. And it limits some of the technologies we could embrace in the browser.

Nevertheless, the project adopted Drupal for its security, its rapid evolution, its standards-compliant architecture, and its (hopefully) future-proof pedigree. With all of the source files for the project in XML, JPG, MP3, and PDF, we feel that the project can rapidly evolve in another direction should a strong platform present itself.

At present, despite its limitations, we feel that Drupal offers a handsome experience for the widest range of users, internal and external.

Questions or comments about the project and its technologies are extremely welcome. Please contact us by e-mail to DMLCS at ria.ie.


Footnotes

1 Unfortunately, the two Salisbury manuscripts are digitized from microfilm only.