Category Archives: ECCO

1748: ‘Fiction’ in the Database:

Not so long ago I was reviewing a lecture I regularly present to students studying Samuel Richardson’s Clarissa. Looking back, I had no idea that this would lead me to speculate about how bibliographic data relating to English literary history is recorded in electronic databases.

It was a lecture that aimed to give some context about the ‘rise of the novel’ and I always had fun by reminding them just how illegitimate the novel was in the first half of the eighteenth century, and how even literary works formed a tiny proportion of what was published. But this time around, I thought I would actually present them with evidence. Some actual quantities. I came up with the idea of homing in on the year Clarissa was first published: 1748. My first attempt was quick and dirty.

‘Gasp!’

Using the English Short Title Catalogue (ESTC) I typed in the year 1748, left everything else blank, and noted how many publications it returned (2550). I then thought it would be instructive to see how many of these were literary (in the loosest sense), so I went to Eighteenth-Century Collections Online (ECCO) and narrowed the 1748 search down via their category ‘Literature and Language’ (c.250 hits).[1] Now to find how many of those were novels. Both databases yielded results with the subject ‘fiction’ and I then – to ram home the point – narrowed that list down to new titles published that year. Only 0.5% of all works produced in 1748 could be classified as new fiction.[2] In a culture which perceives imaginative writing as practically synonymous with the novel, the result was a gratifying gasp of surprise from my student audience.
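The proportions in this quick exercise can be checked with a few lines of arithmetic. The sketch below is only illustrative: it uses the approximate counts quoted above, and the number of new fiction titles is back-calculated from the 0.5% figure rather than taken from either database. The variable names are invented for the example.

```python
# Approximate counts quoted in the post for the year 1748.
estc_total_1748 = 2550        # all ESTC records returned for 1748
ecco_literature_1748 = 250    # c.250 ECCO 'Literature and Language' hits
new_fiction_share = 0.005     # 'only 0.5%' classed as new fiction


def share(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100 * part / whole


# Rough share of 1748 output that counts as 'literary' in the loosest sense.
literary_pct = share(ecco_literature_1748, estc_total_1748)

# Assumption: the 0.5% is taken against the ESTC total, so this is an
# estimate of the title count, not a figure reported by the databases.
implied_new_fiction = new_fiction_share * estc_total_1748

print(f"Literary share: {literary_pct:.1f}%")
print(f"Implied new fiction titles: about {round(implied_new_fiction)}")
```

On these rough figures, literary works are just under a tenth of the year's output and new fiction amounts to roughly a dozen titles.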

However, this rough-and-ready exercise set me on a different path, and made me think about how these databases, upon which we rely so trustingly, categorise our literary heritage. The simple exercise above revealed clear disparities between these databases in both the numbers and the titles returned, and some odd things about the way ESTC and ECCO had tagged these works. For a detailed breakdown of tags and titles I found, see my spread-sheet here.

A quick bit of history. The page images available in ECCO are digital scans of microfilm photographs of the original physical copies; in other words, as Ben Pauley has pointedly remarked, a remediation of a remediation.[3] The original microfilming was contracted out by the British Library in the mid- to late-twentieth century. The microfilms were then purchased and sold to research libraries by a US company called ‘Research Publications’ (I have a sudden flash of memory from my postgraduate days, seeing that name on the microfilm boxes as I painstakingly loaded a film into the reader). In the 1990s that company was then bought by Gale.[4] By 2002 the microfilms had been scanned and ECCO was launched as a commercial database in 2003. A second tranche of material (ECCO Part II) was published in 2009.

ECCO got its bibliographic meta-data (for example, details about printers, publishing history, physical description, holding libraries) from the ESTC. However, the ESTC itself has a tangled history. It began life as the Eighteenth-century Short Title Catalogue in 1977. In 1987 it extended its remit to include material from c.1472 to 1700 (incorporating data from the Short Title Catalogue of books printed in England, Scotland, and Ireland, and of English books printed abroad, 1475-1640 and the Wing catalogue which covered the period 1641-1700), and was then renamed the English Short Title Catalogue.[5] Indeed, the precise relationship between ECCO and the palimpsest that is the ESTC is an obscure one, echoing (if you’ll forgive the pun) that between ProQuest’s database Early English Books Online, the ESTC and the Short Title Catalogue, as Bonnie Mak has elegantly pointed out.[6]

When it comes to the question of how subject headings were assigned, there are few hard facts. However, Gale-Cengage gives some clues about this metadata on their website FAQs. At some point around 2009, just before the second tranche of digitized texts was published, the MARC (Machine Readable Catalogue) records for ECCO were ‘enhanced’ by adding Library of Congress (LoC) subject headings.[7] These were obtained from ‘existing’ records at libraries which held the physical copy. However, where this was not possible, ‘ESTC licensed the work of adding LoC headings.’ This process resulted in ‘[o]ver 274,000 subject headings’ being added; Gale notes that these ‘were added through the combination of harvesting and manual assignment.’[8]

It seems there was at least considerable potential for divergence between these two systems of gathering and assigning subject headings, driven as they were by different organisations and groups of people. This might well have led the bibliographers or cataloguers at Gale to adopt a different way of tagging and searching for subject headings.

Returning to the oddities I encountered in preparing my lecture: ESTC enables a search via ‘Subject (genre)’ and ‘Subject;’ ECCO has a drop-down option for ‘Subject.’ However, while ESTC tagged the genre field with ‘novel’ or ‘fiction’ and its subject field ‘fiction’, ECCO tagged the subject fields as ‘fiction’ and/or ‘English fiction’ (note the ‘and/or’ for further confusion). In all, this yields five different sets of results. Moreover, just looking at the widest set of results for the subject heading of ‘fiction’ (including reprints and new editions), the most striking aspect was the far larger number of results returned by the ESTC than by ECCO. There are no instances where ECCO identifies a work as fiction that the ESTC does not also mark as fiction in some way: even where ECCO tags A spy on Mother Midnight: or, the Templar Metamorphos’d as ‘fiction’ and the ESTC does not, the ESTC nevertheless tags it as ‘novels.’ However, there are some notable instances where ECCO does not follow the ESTC’s lead.[9] For example, where the ESTC rightly categorises Henry Carey’s Cupid and Hymen: a voyage to the isles of love and matrimony as ‘fiction’ it is not listed as such in ECCO. Even more obviously missing as ‘fiction’ in ECCO is Henry Fielding’s canonical novel The History of the Adventures of Joseph Andrews! Conversely, someone at ECCO must have thought that tagging Ovid’s Heroides. English Ovid’s epistles … Translated into English verse as ‘fiction’, as the ESTC did, was at best misleading.
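This kind of comparison between two catalogues is, at bottom, a comparison of sets. The following minimal sketch assumes each search result has been reduced to a set of short titles; the sets hold only the three examples named above, not the real search results, and the variable names are invented for the illustration.

```python
# Illustrative sets built ONLY from the examples discussed above; they are
# not the actual result lists returned by either database.
estc_fiction_or_novels = {
    "A spy on Mother Midnight",
    "Cupid and Hymen",
    "The History of the Adventures of Joseph Andrews",
}
ecco_fiction = {
    "A spy on Mother Midnight",
}

# Titles the ESTC marks as fiction (in any field) that ECCO does not.
only_in_estc = estc_fiction_or_novels - ecco_fiction

# Titles ECCO marks as fiction that the ESTC does not mark at all.
only_in_ecco = ecco_fiction - estc_fiction_or_novels

# Titles on which the two catalogues agree.
agreed = estc_fiction_or_novels & ecco_fiction

print("ESTC only:", sorted(only_in_estc))
print("ECCO only:", sorted(only_in_ecco))
print("Both:", sorted(agreed))
```

With real exports from both databases, the same three operations would show at a glance where the tagging diverges, provided the titles are first normalised to a common form.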

Perhaps this goes beyond the issue of the management of data? It is intriguing to speculate on the human intelligence behind the original LoC headings and how they were assigned. Are we talking about individuals who were re-interpreting the nature of the actual texts themselves? How else to account for some of these idiosyncrasies?

Let’s go back to Fielding’s The History of the Adventures of Joseph Andrews (first pub. 1742; 4th ed. 1748) which is tagged by the ESTC as ‘Tobacco-fiction,’ a subject heading that is at least consistent across the ESTC and ECCO. But this is assigned to just three texts in the whole catalogue; the other two are novels by Tobias Smollett: The Adventures of Peregrine Pickle (1751) and The Expedition of Humphry Clinker (1773). Now, it’s true that there are people who smoke in these novels; but there are plenty of other protagonists from the fiction of the period who smoke too and it’s not as if tobacco is a significant plot-device. To take one more example, the anonymous Suite des lettres d’une Peruvienne. Again the subject heading is consistent across the two databases: ‘Epistolary fiction, French-18th century;’ but it is the only title from the entire database that is associated with this subject heading.

More interesting still is what happens to the two variants of Nehemiah How’s A narrative of the captivity of Nehemiah How. For the first on my list (ESTC Number W014008) ECCO seems to agree with its status as fictional, although its ESTC category ‘novels’ has been changed to the less contentious ‘fiction.’ Was someone working for Gale more astute in their reading of eighteenth-century narrative form? Human interpretation in the database is also evident when it comes to the other variant (ESTC number W34168), which looks to have been added later since it appears in ECCO Part II. Notably any tags formally declaring its fictionality have gone: in the ESTC they are replaced with the more precise genre tag of ‘captivity narrative.’ However, ECCO ignores even this slight hint of narrative and instead opts to follow ESTC’s more historical-sounding subject tag of ‘Indian captivities.’

More anomalies could be found (help yourself!) but these few examples are intriguing. How this metadata has been assigned seems to have been the result of a tangled history of cataloguing and bibliography, machines and human agency, and the messy process of translation between academic projects and commercial digital publishing. It’s a warning – just in case we need another – about how we use the meta-data available to us via resources like ECCO, EEBO and the ESTC. While these resources are invaluable, using them carefully also requires knowledge about the historical processes behind their creation. We might also say that human database bibliographers faced the same problems of interpreting and categorising the eighteenth-century novel as literary scholars do, and as critics in the eighteenth century clearly did. So one more thing: it’s easy to forget that behind the search interface on your computer screen, that black box of the database, what we are looking at is evidence everywhere of human intelligence, diligence, error, and above all, interpretation.

 

 

[1] Characteristically, ECCO returns slightly different numbers even when the same search is repeated. See Joseph Dane, What is a Book? The Study of Early Printed Books (University of Notre Dame Press, 2012), pp.224-7.

[2] In this essay I make no claim for a comprehensive list of fiction published in 1748 or even to define what fiction is or was. For example, Jerry Beasley’s Check List of Prose Fiction Published in England 1740-1749 (University Press Virginia, 1972), might also be a good place to start. But would we want to include, for example, chapbooks as fiction? Quite possibly, but neither Beasley’s checklist nor ECCO includes them, and the ESTC’s coverage of this genre is unclear.

[3] Thanks to Ben Pauley; also to Scott Gibbons, Giles Bergel, and Elizabeth Grumbach for helpful conversations.

[4] See Laura Mandell, ‘The Business of Digital Humanities: Capitalism and Enlightenment’, Scholarly and Research Communication, 6.4 (2015). http://www.src-online.ca

[5] http://www.bl.uk/reshelp/findhelprestype/catblhold/estchistory/estchistory.html

[6] Bonnie Mak, ‘Archeology of a Digitization.’ Pre-print, pp.10-11. http://illinois.edu/ds/search?search_type=userid&search=bmak

[7] For the Library of Congress subject headings and genre terms see http://loc.gov/aba/cataloging/subject/

[8] http://gdc.gale.com/products/eighteenth-century-collections-online/acquire/faqs/#marc-enhance

[9] As well as a number of texts which do not exist in ECCO at all.

Bring your laptop: BSECS 2016 session

The following post contains links and materials used for the demonstration workshop ‘Bring your laptop: working with digital texts from the Text Creation Partnership.’

Slides: Bring your laptop-Presentation

Text Creation Partnership [home]

Oxford Text Archive

Text Creation Partnership [Oxford Text Archive]

Early Modern Print [graphs don’t always display properly on Firefox]

18thConnect

Voyant Cirrus

Graph generated using ‘Artemis’ via ECCO

 

 

 

 

Liberate the Text @BSECS conference 2015

I was extremely pleased with such a positive response to my workshop on digital editing, ECCO, 18thConnect, the Oxford Text Archive, and EEBO-TCP (whew!). Thanks to all who attended and especially for the fascinating discussion that ensued. As promised here is the PDF of the slide show liberate-bsecs-2015.pdf (thanks to Laura Mandell for some of the images).

I’m already thinking about next year at BSECS – maybe a session that really is a workshop, called ‘Bring your laptop’?

Eighteenth-century literature and the digital undergraduate

[This is a slightly amended version of a post that originally appeared on the blog of the North American Conference of British Studies]

Over the past couple of years I’ve been guiding some final year undergraduate students to create online digital editions of literary texts from the eighteenth century (see here, here, and here). To me, getting students to work with digital technology alongside eighteenth-century British Literature is now an exciting, but also essential, facet of my teaching. So I thought I would share how I got here with a brief overview of some developments, exercises and courses I’ve picked up in my own browsing over the past few years that teach eighteenth-century literature and are inspired by digital humanities.[1]

Digitisation

The huge acceleration of the digitisation of historical texts in the past decade and a half has been the catalyst for a trickle-down effect from research to teaching practices. Released in 2003, and as one of the biggest databases of eighteenth-century material, Eighteenth-century Collections Online (ECCO) arguably generated some of the first reflections on using digital resources to teach eighteenth-century literature at undergraduate level: see my own 2007 paper and the many posts on teaching with ECCO on Anna Battigelli’s Early Modern Online Bibliography blog. The issue of cost and accessibility aside, the exponential rise of such resources – such as the Burney Newspapers database, English Broadside Ballads, and Old Bailey Online – has enabled students to enrich their knowledge of eighteenth-century literary culture: they were able to see unusual and non-canonical texts, to examine literary works in the light of historical or cultural ideas specific to the period or even decade, and to pose invigorating questions about literary value.

Blogging and wikis

This initial phase crossed over with tutors and professors experimenting with writing assignments and the different engagement with literary texts that might be enabled by digital platforms such as the wiki or the blog post. See for example, the work of Tonya Howe (Marymount University); the course run by Emily M. N. Kugler (Colby College) Histories and Theories of the 18thC British Novel; and Prison Voices 1700-1900, which has for example, this piece on Daniel Defoe’s Moll Flanders (this via Helen Rogers, Liverpool John Moores University). Adrianne Wadewitz (now sadly deceased) was also a leading experimenter using Wikipedia as a teaching tool. In this vein, Ula Klein has also recently written about her summer course on eighteenth-century women poets that involves the creation of wikis (here).

Beyond the blog

Sharon Alker (Whitman College) and Benjamin Pauley (Eastern Connecticut SU) reflected on using a variety of tools to teach Defoe including Second Life and Google maps. Laura Linker (High Point University) asks her Gothic novel students to use Google Earth to map narrative journeys, and even Second Life as a way of entering into characterization. In a course entitled ‘Remediating Samuel Johnson’, John O’Brien (University of Virginia) set up a collaborative digital anthology of Samuel Johnson’s works using texts accessed via 18thConnect (significantly, a platform that begins to deal with the problem of access). John’s aim was explicitly student-centred: ‘[m]y hunch is that students will have a good idea of what students like themselves need to know to make sense of challenging eighteenth-century texts.’ Students of Rachel Sagner Buurma (Swarthmore College) experience hands-on work with a wonderful digital resource, the Early Novels Database – see the students’ own blogs here. In a different course Rachel asks students to create experimental and imaginative bibliographical descriptions of unusual and non-canonical eighteenth-century novels, see here.

Media shifts

Also fascinating are those courses and projects that use the very medium of digital technology to enable students to grasp the eighteenth century’s own preoccupation with changing forms and media. As Rachael Scarborough King (New York University) suggests: ‘[d]rawing such connections between the experimentation and advances of eighteenth-century print culture and our own period of media transformation can offer a crucial foothold for students encountering eighteenth-century texts for the first time.’ Rachael asks students to write blog posts incorporating different adaptations of English literature as a way of getting a sense of these texts’ original meaning, form and transmission. In a course devised by Mark Vareschi (Wisconsin-Madison), an ‘experimental assignment in digital composition and adaptation’ tasks students to tweet, 140 characters at a time, Samuel Richardson’s Pamela as they read the novel. The course designed by Evan C. Davis (Hampden-Sydney College), Gutenberg to Google: Authorship and the Literature of Technology, also pays close attention to the form of literature in this period. In ‘Friday assignments’ there are intriguing tasks such as comparing how we read via print and via e-readers, and using online resources about typography and the Letter M Press app to enable students to re-create and reflect upon the physicality of print in the hand-press era.

I’m about to run my own digital literary studies course focusing on the eighteenth century this coming academic year, and I’ve found the work of others in this field fascinating and tremendously inspiring.[2] My thanks to everyone for letting me link to their courses and students’ projects.

[1] See Rachel Schneider’s blog post Eighteenth-Century Literature meets Twenty-First Century Tech, which reviewed the SHARP roundtable at ASECS 2014, organised by Katherine M. Quinsey, ‘Wormius in the Land of Tweets: Archival Studies, Textual Editing, and the Wiki-trained Undergraduate.’ Quotations in this post are from the authors’ proposals for the Digital Humanities Caucus panel ‘Digital Pedagogies’, organised by Benjamin Pauley and Stephen H. Gregg. The phrase ‘inspired by digital humanities’ is my deliberately broad definition that covers the wide variety of uses of digital technology and digital resources across the courses I’ve found. Since my particular interest is in eighteenth-century literature, if you are interested in syllabi that are focused on digital humanities beyond literature, or beyond the eighteenth century, then there are superb bibliographies here. Because I’m most interested in how these tools have been brought into the undergraduate classroom, I’ve not discussed here the (impressive and exemplary) graduate work in courses run by Lisa Maruca (see Mechanick Exercises), or Allison Muri’s Grub Street Project. For an excellent set of tips and examples see Adeline Koh’s essay ‘Introducing Digital Humanities Work to Undergraduates.’

[2] In this context I should acknowledge my debt to George Williams (University of South Carolina Upstate). Although George is an eighteenth-centuryist, his own course is focused on an earlier media shift, and is organized around Sir Gawain and the Green Knight.

Using ECCO in the Undergraduate Classroom: Reviewing Gale Cengage’s Trial Access

More excellent remarks by Anna Battigelli on using ECCO at undergraduate level!

Early Modern Online Bibliography

Gale Cengage gave SUNY schools a great opportunity this semester by offering free trial access to ECCO, Burney, and NCCO.  I, for one, learned a lot from working with undergraduates in my Gothic Novels course as they searched ECCO for relevant material for their final research papers.  Those papers were mixed, with some outstanding essays and some less successful attempts.  I  summarize my experience below:

  • ECCO must be part of a strong digital collection in order to be fully useful.  Spotty digital holdings make using ECCO difficult.  For instance, without a subscription to the Oxford Dictionary of National Biography, new users find it difficult both to identify the author of a lesser known work and to assess that work’s historical or literary significance.
  • Using ECCO requires both competency with secondary sources and access to those sources.  Though some students used many secondary sources, even ordering books on interlibrary loan…


Digital editing with undergraduates: some reflections

Digital Editing Project outline and Digital editions criteria

[Added 2015]

In 2012 I started supervising an English undergraduate dissertation: this was an online digital edition and it was my first experience of supervising a student’s digital project. What follows is a joint blog post of two parts – one from me and the other from Jess McCarthy (the student) – that reflects upon our experiences. You can see the final online edition here:

pillory banner

 

Thoughts from me, the supervisor

A couple of years ago, I decided to learn a little more about the back end of digitized primary resources. I attended a boot-camp on the why and how of encoding, using XML encoding and the protocols of the TEI, at the Digital Humanities Summer School at Oxford University. Just over a year ago (late Spring 2012) I decided that the best way to learn is to teach. Simultaneously, I wanted to conduct a trial on producing a digital edition of a Defoe text that used up-to-date protocols of digital editing as well as the open-access ethos of the great majority of current digitization projects. So I asked our 3rd year English undergraduates whether anybody would be willing to do this for their dissertation project. Luckily, I had a volunteer, Jessica McCarthy.

I left it up to Jess to decide which Defoe texts she would like to work on: like any large-scale project, sustaining enthusiasm is essential. But it also meant that Jess would find a lot out for herself about Defoe’s writings. However, an important factor was that I was not expecting Jess to spend time transcribing the text and so we had to source a reliable electronic copy in plain text. This would give Jess the freedom to decide how she wanted to encode it and how it would be presented online. However, it also occurred to me that the question of a ‘reliable’ electronic copy in plain text was an interesting issue of discussion in itself: what different kinds of texts and what kind of reliability are offered by, for example, Project Gutenberg, Google Books, Jack Lynch’s Eighteenth-Century Resources, or Romantic Circles? Examples that directly raised other questions were close by: at Bath Spa University we are lucky enough to have access to the large-scale digital resources of EEBO and ECCO. Texts accessed via these different resources come in various forms: digital facsimiles, plain text transcriptions from post-1800 print editions, hyperlinked and encoded texts, or a combination of plain text and facsimile texts. So this first stage of the project actually involved a deeper understanding of the nature of existing electronic resources, databases and archives, and would more effectively immerse Jess in important questions concerning the format, usability and access to historical literary texts. How are issues of access related to the kind of texts one was accessing? What does the format of these texts have to say about how they can be used and who are using them? What processes are involved with the type of text available on these resources? What is a ‘text’ in a digital context anyway?

Such questions are important, first, because undergraduate students do not often understand why different online resources look and feel the way they do. So I try to make explicit to students the differences between a facsimile, an edition, and an encoded text and the significance of those differences for how the text is to be used and for whom. The facsimile is usually easy to understand; although, for example in the case of ECCO, the relation between the image and the text (unseen and what one actually searches) is not fully grasped by many undergraduates, which provokes some interesting discussion. Second, this contextual understanding is essential for students to decide what kind of edition they are going to create. In this I ask students to consider their readership or, as Dan Cohen put it in ‘The Social Contract of Scholarly Publishing’, the ‘demand side’ of scholarly publishing. Cohen argued that the print model has built-in assumptions about value and audience: ‘The book and article have an abundance of these value triggers from generations of use, but we are just beginning to understand equivalent value triggers online.’ Jess, for her own project – as you can see – decided to provide two editions to appeal to a variety of readerships: one an online edition with hyperlinked notes and a textual commentary; the other an encoding of that text. (In this, we looked to an edition on Romantic Circles as our model).

So, back to an earlier stage of decision-making. If we were after plain text copies of eighteenth-century editions, and not texts that were edited at some point later, that left two options for sources: the Oxford Text Archive and 18thConnect. There are currently 728 texts attributed to Defoe available via 18thConnect and 121 via OTA. Despite the ease with which one can download texts in a variety of file formats from OTA, I deliberately steered Jess towards 18thConnect because of its use of TypeWright. This software enables users to correct a number of individual 18c texts released to 18thConnect by ECCO (as frequent users of ECCO will know, the text that users are able to search is a rather mangled version, the product of now dated OCR software trying to decipher 18c typography via microfilm).

TypeWright

I may well continue to use this, since the advantage for any student is not only the knowledge gained about the workings and limitations of large-scale digital resources like ECCO that might be normally taken for granted, but also the added perspective gained on the processes of transformation from material document to electronic text.

Why encode and why TEI/XML?

Most databases allow one to perform searches based on a variety of categories (author, place of publication, title, date etc) because the texts have been ordered and sorted according to these categories. One can also perform ‘all text’ searches. But I struggled, at first, to explain the limitations of this kind of markup to my students. So I’ll give you a similar kind of example I gave to Jess in relation to ECCO. Let’s imagine I’m searching some works by Defoe and I want to find references to High Church clergyman Henry Sacheverell (bap. 1674, d. 1724). Unsurprisingly the search returns quite a few, but it misses a number of important Defoe poems. Now I happen to know Sacheverell is mentioned in More Reformation and in The Double Welcome but ECCO didn’t find these. Why? Because in The Double Welcome his name is spelt ‘Sachevrel’, and in More Reformation it is ‘Sachavrell’. We could of course put in alternative spellings or use fuzzy searching. But this wouldn’t find more oblique references such as the one in Hymn to the Pillory where his name is pseudo-anonymously presented as ‘S———ll’. A machine does not know this is Henry Sacheverell. Similarly, it would not correctly identify this if Defoe had ever called him ‘Henry’ or ‘old Sacha,’ or something more figurative like ‘the Devil in a pulpit’ that we human readers would be able to interpret. More importantly, what if we didn’t know how Defoe alluded to Sacheverell at all?

A machine searches for strings of symbols and cannot recognise that one string of symbols represents another different string of symbols unless we tell it that each of those particular combinations of symbols represents the same named entity. As Lou Burnard put it, “only that which is explicit can be digitally processed,” or to put it another way, encoding is to “make explicit (for a machine) what is implicit (to a person)”.
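The difference between searching for a string and searching for a named entity can be sketched in a few lines of Python. This is a toy illustration, not how ECCO or any TEI tool works. The sample texts are placeholders rather than quotations from Defoe, and the function names and the mapping table are hypothetical; only the spellings of the name come from the discussion above.

```python
import re

# Placeholder strings standing in for searchable text. The wording is
# invented; only the forms of the name are those discussed above.
samples = {
    "The Double Welcome": "placeholder line naming Sachevrel",
    "More Reformation": "placeholder line naming Sachavrell",
    "A Hymn to the Pillory": "placeholder line naming S———ll",
}


def exact_search(term, texts):
    """Return titles whose text contains the literal string `term`."""
    return [title for title, text in texts.items() if term in text]


# What encoding makes explicit: each surface form is tied to one entity.
ENTITY_VARIANTS = {
    "Sacheverell": "Henry Sacheverell",
    "Sachevrel": "Henry Sacheverell",
    "Sachavrell": "Henry Sacheverell",
    "S———ll": "Henry Sacheverell",
}


def entity_search(entity, texts, variants=ENTITY_VARIANTS):
    """Return titles containing any recorded surface form of `entity`."""
    forms = [form for form, name in variants.items() if name == entity]
    pattern = re.compile("|".join(re.escape(form) for form in forms))
    return [title for title, text in texts.items() if pattern.search(text)]


print(exact_search("Sacheverell", samples))
print(entity_search("Henry Sacheverell", samples))
```

The literal search finds nothing in these samples, while the entity search finds all three, but only because a person has already decided which strings refer to Sacheverell. In a TEI edition that human decision is recorded in the markup itself rather than in a separate lookup table.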

For me, then, the project has enabled me to reflect upon strategies for teaching digital technology and identifying – or beginning to – what issues are essential to introduce to students: the how and why of digital editing.

Jess McCarthy’s perspective: decentering authority?

I’m going to be going on a slightly different track; I’ll be talking about how in some ways my edition decentres some of the authority of a traditional printed edition of a text.

It wasn’t until I’d started researching my reflective essay that I realised that my edition achieves this, to an extent, through my encoding of variants in the XML version. Most modern scholarly editions of texts work on the basis of editorial interpretation and intervention in creating a definitive edition which most closely presents the editor’s understanding of the author’s intentions. These editions are usually created through extensive use of textual apparatus, such as tables of variants and considered reasoning supporting the inclusion of one variant and the exclusion of another. Digital methods of presenting texts have brought into sharper focus how this approach to assembling an edition is based largely on limitations of its publication media. Marilyn Deegan and Kathryn Sutherland pointed out that,

for some the new technology has prompted the recognition of the prescriptive reasoning behind such editions as no more than a function of the technological limits of the book, less desirable and less persuasive now that the computer makes other possibilities available; namely, multiple distinct textual witnesses assembled in a virtual archive or library of forms. [1]

I aimed to achieve a presentation of multiple textual witnesses in my own edition by encoding variant readings into my XML document. This made it possible to present the different states of the text without privileging one state over another. This approach questions the idea of an ideal or more representative version of the text by presenting each state as equally valid and as existing simultaneously. Although I was able to present variants within my encoding without making any claims as to which witness was more authoritative, this was only really achievable within the encoded document. For example:

<l n="19">The undistinguish’d Fury of the Street,</l>
<l n="20"><app>
<rdg wit="#Q2">With</rdg>
<rdg wit="#Q1">which</rdg>
</app> Mob and Malice Mankind Greet:</l>
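Because neither reading is privileged in the encoding, either witness can be reconstructed from the same source. The sketch below is a minimal illustration using Python's standard library, not the method used for the edition itself. It assumes the two lines are wrapped in a single `<lg>` element so that they parse as one document, that attributes use straight quotation marks, and that no TEI namespace is declared; `line_text` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The two encoded lines, wrapped in <lg> (an assumption) so they parse
# as a single well-formed XML document.
snippet = """<lg>
<l n="19">The undistinguish’d Fury of the Street,</l>
<l n="20"><app>
<rdg wit="#Q2">With</rdg>
<rdg wit="#Q1">which</rdg>
</app> Mob and Malice Mankind Greet:</l>
</lg>"""


def line_text(line, witness):
    """Rebuild one verse line using the reading from the given witness."""
    parts = [line.text or ""]
    for child in line:
        if child.tag == "app":
            # Keep only the reading attributed to the requested witness.
            for rdg in child.findall("rdg"):
                if rdg.get("wit") == "#" + witness:
                    parts.append(rdg.text or "")
        # Text that follows the child element inside the line.
        parts.append(child.tail or "")
    # Collapse the line breaks and spaces left over from the markup.
    return " ".join("".join(parts).split())


root = ET.fromstring(snippet)
for witness in ("Q1", "Q2"):
    print(witness)
    for line in root.findall("l"):
        print("  ", line.get("n"), line_text(line, witness))
```

Run against the snippet, line 20 begins ‘which’ for Q1 and ‘With’ for Q2, while line 19 is identical in both, so the choice of a reading text becomes a decision made at display time rather than one fixed in the encoding.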

To present the text on the website I had to choose a copy text based on what I considered to be the most complete representation of Daniel Defoe’s intentions in A Hymn to the Pillory. I based my edition of the text on the second edition, corrected with additions. This decision was reached early in the project and it was based on the logic that this was the earliest edition available that presented a fuller version of the text. Given the common editorial practice of selecting either the first available edition or the last edition known to have been produced by the author, I would reconsider my choice of copy text were I to start again. However, despite being an unorthodox approach to a copy text, contemporary editions of A Hymn to the Pillory based on the first edition include the later additions found in the second edition, and given that variants between the two texts have been included, I don’t think that my earlier decision undermines the authority of the text presented in a significantly damaging way.

This concern might seem to conflict with my encoding of variants. There I have deliberately not identified a lemma and chosen instead to present multiple, simultaneous witnesses that destabilise the assumption that there are readings that are more valid. This approach works well if you are concerned with textual criticism or data mining to create distant readings of texts. However, I wanted my edition to be as useful as possible to the widest possible audience, so I also had to consider the traditional concern of the humanities with close reading and interpretation, which depends on a stable text. Marilyn Deegan and Kathryn Sutherland acknowledge this, pointing out that ‘the editor’s exercise of proper expertise may be more liberating for more readers than seemingly total freedom of choice.’[2] Although digital technologies are highlighting how text can be treated differently in electronic formats, the primary concern for most readers of literature is still in interpreting the meaning of the text (rather than how it was composed or its variant states); and to interpret the meaning rather than the textual history, a stable edition needs to be presented.

I wanted to support the authority of my edition as a serious scholarly work so I included all of the textual apparatus that you would expect to find in a scholarly print edition. C. M. Sperberg-McQueen argues that ‘electronic editions without apparatus, without documentation of editorial principles, and without decent provisions for suitable display are unacceptable for serious scholarly work.’[3] While this doesn’t necessarily mean that apparatus for digital editions has to work in the same way or with the same concerns as print editions, it situates intellectual integrity as remaining a key concern for supporting the authority of an online edition.

I used hyperlinks as a way to discreetly point to textual annotations from A Hymn to the Pillory and also to direct readers to further online points of interest, either from the annotations themselves or from further reading. Phillip Doss argues that ‘by allowing escape from the context of a single documentary sequence, hypertext allows a reader to escape the linearity imposed by print media.’[4] There are positive and negative implications to the use of hypertext links that I tried to consider within my edition. An obvious limitation of using hypertext is exactly that it allows readers to escape the linearity of the text. On the other hand, by using hyperlinks I have been able to provide easy access to extra-textual material that it would not be possible to include in a print edition. For instance, where I have been able to find them, I have included works by people who are mentioned in A Hymn to the Pillory. This has meant that intertextual relationships can be explicitly explored, rather than simply acknowledged. In this way the text is shown to be the product of many various influences in a way that is more difficult to achieve using physical means of publication, and although the text is still the main focus of the edition, it is presented less in isolation.

Lisa Spiro’s essay ‘“This Is Why We Fight”: Defining the Values of the Digital Humanities’ argues that ‘for the Digital Humanities, information is not a commodity to be controlled but a social good to be shared and reused.’ This is very much an attitude that I adopted in my approach to this project. My website is open access, making it freely available to anyone who wants to use the information presented. However, although this project is not formally associated with Bath Spa University, as an undergraduate studying there I had the privilege of institutional access to specialist resources that I would not otherwise have been able to use to support my research. Access to services such as the Dictionary of National Biography (DNB) and Eighteenth Century Collections Online (ECCO) allowed me to work using facsimiles of the copy text and to research biographical annotation with confidence in the reliability and authority of my sources. I chose to hyperlink these sites where I have relied on them for my research, to maintain the integrity of my sources. Although this means that some users may not be able to access the sites at the end of the hyperlinks, I believe that being able to present information based on what these resources provide goes a small way to democratising the information that they contain. Working with the knowledge that not all users will be able to reference my sources, I tried to make my annotations as comprehensive as possible while still maintaining a focus on how they are relevant to the text.

At its core this project has an engaged interest in making specialist information freely available in the most useful, reliable form possible. It has supported ongoing work to make other scholarly resources more reliable by using 18thConnect’s TypeWright, and it hopes to engage with the widest possible audience not only by providing what is traditionally expected from an authoritative edition of a text but also by incorporating the formats that digital encoding supports for more specialist pursuits and longevity.


[1] Marilyn Deegan and Kathryn Sutherland, Transferred Illusions: Digital Technology and the Forms of Print (Farnham: Ashgate, 2009), p.87.

[2] Transferred Illusions, p.71.

[3] C. M. Sperberg-McQueen, ‘Textual Criticism and the Text Encoding Initiative’, The Literary Text in the Digital Age, ed. Richard J. Finneran (Michigan: University of Michigan Press, 1999), p.41.

[4] Phillip E. Doss, ‘Traditional Theory and Innovative Practice: The Electronic Editor as Poststructuralist Reader’, The Literary Text in the Digital Age, p.218.

Teaching with ECCO

A fantastically informed and informative post on using ECCO in eighteenth-century teaching, with a really useful set of follow-up comments.

Early Modern Online Bibliography

As posted yesterday, Gale Cengage is providing SUNY colleges with trial access to ECCO (Eighteenth Century Collections Online) and NCCO (Nineteenth Century Collections Online) this fall. Gale Cengage is also sponsoring essay contests for SUNY students using these tools. This is a great opportunity to test these products, to think about how best to teach with them, and to evaluate students’ responses to them. So how best to introduce these resources?

Thinking about my undergraduate Gothic Novel class this fall, I decided that short videos would be the most effective way to introduce students unfamiliar with eighteenth-century texts to ECCO. I prepared three brief videos (below). I would love to hear how others introduce students to these tools.

There are a number of other videos on using ECCO. Below are a few from Virginia Tech:


Digital Humanities and Archives @ ASECS 2012

I think it’s fair to say that this year’s annual meeting attracted more panels on digital humanities than ever before (and that doesn’t even include the pre-meeting THATCamp workshops: for a good review of that see Lisa Maruca’s post on Early Modern Online Bibliography). I’ve posted already on the use of digital technology in teaching 18thC culture, but there were still quite a large number of panels that included discussions of digital humanities – whether explicitly labelled ‘digital humanities’ or not. What interested me were the issues that kept cropping up about how digital archives design data to be searched and how they are actually searched.

I was especially intrigued, in the roundtable ‘Digital Humanities and the Archives’, by Randall Cream’s (West Chester) call for digital archives to try to mimic the joyful moment of “serendipitous discovery” in traditional archives: such “interpretive moments” produced through unexpected answers to “unthought” problems may be difficult to reproduce in digital archives which depend so much upon naming, cataloguing, and tagging. Michael Gavin addressed how one manages the digitization of plays, given the special nature of a play as both text and theatrical performance. For Gavin, this is not addressed in the current tagging models of TEI, and he outlined how he modified the tagging to produce an archive whose searches can be sensitive to these two play-contexts. Clearly, all were agreed that the move towards semantic tagging would enable a more human and sustainable interaction with digital data (semantic tagging, using XML for example, has the ability to describe concepts and meanings, as opposed to HTML, which describes the nature of the document and its relation to other documents; if anybody wants to correct me on this very rough definition, I’m perfectly willing to be corrected). In the ‘Poetry and the Archive’ roundtable, questions of use and searchability were again implicit. Jennifer Batt’s (Oxford) description of how the Digital Miscellanies Index could be searched was a good example of a digital resource that, perhaps paradoxically, is a more open-ended research tool: since this is an index of first and last lines and not a digital archive of texts, researchers are perhaps left to their own intuition.
It is, of course, arguable: both Andreas Mueller (Worcester, UK) and Kyle Roberts (Loyola, Chicago), in the panel ‘Digital Approaches to Library History’, outlined digital archives that were, in effect, archives with a thesis and so imagined ways of searching that would be directed towards research problems specific to their archives (in this case, library collections that are extant or are now dispersed). Roberts, on the Dissenting Academies Online project, aimed to create a “virtual library” system able to comprehend multiform library catalogues and records including author catalogues, short list catalogues, and borrowing registers for 12,000 titles, 45,000 borrowings and over 600 borrowers. What was described was a process of tagging that enables the user to track borrowing by individual “borrower profiles” and the borrowing of individual books; to profile the development and use of a particular library collection over time; and to reveal shelving habits and systems. Mueller’s collaboration with the Hurd Library (the still-extant library of Bishop Richard Hurd (1720-1808)) also aimed at a “virtual” library, but through digital visualization. Using shelving catalogues and the few surviving original shelf marks together with digital images of the shelves and a digital schematic loaded with data may enable users to research how this man of letters interacted, not only with the books in his collection, but also with the space of his library. The data mapped into the visualization would be garnered from Hurd’s annotations, letters and entries in his commonplace books. While I have to declare an interest in the Hurd Library collaboration, it seems to me that these two projects have an important contribution to make in rethinking library history.
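
To make the idea of tracking borrowing by “borrower profiles” and by individual books a little more concrete, here is a rough sketch in Python. It is only an illustration of the general technique of indexing one register in two directions: the record fields, the sample entries and the function names are all invented, and it makes no claim about how Dissenting Academies Online is actually built.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Borrowing:
    """One line of a hypothetical borrowing register."""
    borrower: str
    title: str
    year: int


# Invented placeholder records, not data from any real register.
REGISTER = [
    Borrowing("Borrower A", "Title X", 1790),
    Borrowing("Borrower A", "Title Y", 1791),
    Borrowing("Borrower B", "Title X", 1791),
]


def borrower_profiles(register):
    """Group loans by borrower, each profile sorted by year then title."""
    profiles = defaultdict(list)
    for entry in register:
        profiles[entry.borrower].append((entry.year, entry.title))
    return {name: sorted(loans) for name, loans in profiles.items()}


def title_histories(register):
    """Group loans by title, each history sorted by year then borrower."""
    histories = defaultdict(list)
    for entry in register:
        histories[entry.title].append((entry.year, entry.borrower))
    return {title: sorted(loans) for title, loans in histories.items()}


if __name__ == "__main__":
    print(borrower_profiles(REGISTER))
    print(title_histories(REGISTER))
```

The same list of records answers both kinds of question (who borrowed what, and which books circulated when), which is roughly what consistent tagging of a register makes possible.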

But design is only one half of the process, and while designing digital archives involves thinking carefully about the questions a user asks of the archive, two panellists on the ‘Digital Humanities and the Archives’ roundtable raised interesting questions about the ways and results of searching a digital archive from the user’s perspective (in both cases here, this was ECCO). Bill Blake (NYU) asked “what makes a good keyword search”, and produced a list of popular search terms (“slavery” coming top). He suggested that many users had an impulse to “retrieve” rather than “search” and that the poorest keyword search terms effectively reproduced what was in the archive (one of the most popular search terms, “slavery”, was a good example of this). He argued that the best searches operated on a conceptual level. Indeed, that is what I’ve been training my own students to do, many of whose first try at ECCO was using a broad topic-based search term: they discover that the results of such search terms are useless and relatively quickly begin to think about the processes involved in deciding on a better search term (a factor I thought Bill Blake’s paper rather underplayed). Sayre Greenfield (Pittsburgh) posed a rather different problem with search results: what about “interpreting lack of results”? He argued that one can only “confirm the validity of negative results” by comparison to positive results elsewhere. His example was a phrase search for “Ay, there’s the rub”, which resulted in only two (!) hits in ECCO; searching the Burney Collection resulted in a much larger number of hits, evidence that in the eighteenth century this particular phrase of Shakespeare’s inhabited the “cultural micro-climate” of journalism and not literary discourse (ECCO doesn’t include journals and newspapers).

Managed serendipity anyone?

The present and future of digitisation projects: an interview with George Williams and Seth Denbo

I was very lucky to have the chance to talk to two of the leading voices on digital humanities when they very kindly agreed to take part in a filmed discussion at the ASECS annual meeting, in San Antonio, March 2012. George Williams is an associate professor of English (specialising in the 18thC) at the University of South Carolina and will be familiar to many from the ProfHacker pieces in The Chronicle of Higher Education; Seth Denbo is a historian of eighteenth-century England and involved with MITH, Project Bamboo, the IHR Seminar in Digital History and is on the faculty of the Maryland Institute for Technology in the Humanities. (Using iMovie to film the discussion in my hotel room was a bit of an experiment – which is by way of an apology for any impairment in sound and/or visual quality. The interview is split into two parts.)

ECCO in teaching 2012

What happened after 2007? Well, the move to using ECCO just for presentations worked better with the exploratory aim of this strand of the module, but the module itself ceased to exist shortly after a wide-ranging re-organisation of Bath Spa’s undergraduate degree system.

I ended up devising a second-year module on Gender and Eighteenth-Century Fiction and – surprise, surprise – I wanted to embed the use of ECCO within that. This was a very different proposition to the introductory level module within which I first experimented using ECCO. This was a more demanding module with a more focused topic, and it resulted in some successes and partial set-backs.

On the one hand it enabled me to still use ECCO ‘Mark lists’: this time I had list-titles such as ‘Early Feminisms’, ‘Femininity and Sexuality’, ‘Femininity and Manners’, ‘Gentlemanliness’, ‘Unmanly behaviours’ and ‘Sensibility and Sentiment’. These lists, usually containing 4-6 titles, included poems, conduct essays, tracts and sermons. The aim was to give students the materials to historicize their readings of the novels on the course (Fantomina, Roxana, Joseph Andrews, A Sentimental Journey, Evelina, The Wrongs of Woman). And, as in my first-year module, aware that some of the texts were very long, I advised that judicious use of the search function and the ‘e table of contents’ might help navigation.

The problem with that was that, given the course’s primary object of study had to be the novels, any use of ECCO was necessarily subordinated to the understanding of the module’s themes of gender and 18thC fiction. Related to this, students’ use and understanding of this material had to be more than a rather fun exploration of 18thC culture (as was the case in the 2007 first-year module): students at this level and for this type of thematic module would have to demonstrate an ability to relate the novels to this historical material in a meaningful way. What this meant in practice was, first, that the use of ECCO material had to be restricted to the one written assessment large enough to enable students to produce that kind of historicized reading (a 2,500 word essay – and even then I think that may be too small). Secondly, I had to allow seminars in the course to be set aside for students to look specifically at this 18thC contextual material, but not so many as to take time away from the study of the novels.

On the whole, using ECCO in this way has again enabled students to see with their own eyes (‘long s’ and all – although one student has recently claimed to have stopped noticing it!) what the eighteenth century was thinking and writing about male and female behaviour. In seminars there is quite a lot of goggling at the attitudes towards women, some lovely critical comparisons between 18thC feminisms and 21stC feminisms, and some initial surprise that male behaviour had its own policing too. In their written work, those students who took some time over selecting their contextual material produced more sophisticated essays; those students who relied rather too much on key-word searching tended to drop in unsuccessful or uncontextualised quotations.

At this stage, I have a nagging feeling that there’s a better way of embedding ECCO. Watch this space.