This article is available at the URI http://dlib.nyu.edu/awdl/isaw/isaw-papers/20-4/ as part of the NYU Library's Ancient World Digital Library in partnership with the Institute for the Study of the Ancient World (ISAW). More information about ISAW Papers is available on the ISAW website.

©2021 Gabriel Bodard; text and images distributed under the terms of the Creative Commons Attribution 4.0 International (CC-BY) license.


ISAW Papers 20.4 (2021)

Linked Open Data for Ancient Names and People

Gabriel Bodard, Institute of Classical Studies, University of London

In: Sarah E. Bond, Paul Dilley, and Ryan Horne, eds. 2021. Linked Open Data for the Ancient Mediterranean: Structures, Practices, Prospects. ISAW Papers 20.

URI: http://hdl.handle.net/2333.1/zs7h4fs8

Abstract: This chapter discusses the kinds of information that are recorded about persons and names from antiquity and other periods of pre-modern history, and the ways in which this information can usefully be modelled in Linked Open Data and integrated with the Linked Ancient World Data graph. It begins by introducing some key concepts, in particular the importance of understanding data ‘modelling,’ and limiting the scope of the discussion to fairly basic information about historical persons. The body of the paper summarises the main trends in recording and encoding prosopographical, onomastic and other personal data in previous and current scholarship, both traditional and digital. As an example of the use of Linked Open Data to encode person- and name-data, the recommendations of the SNAP:DRGN project are outlined, noting that these are designed only to represent a small set of disambiguation data to enable interoperability and cross-platform searching among projects, rather than the full richness of prosopographical and onomastic data. The chapter concludes by pointing out the limitations of the current model, and suggesting some areas for future work and development.

Library of Congress Subjects: Prosopography--Data processing; Linked data.

Introduction

In this chapter I shall consider the kinds of information that are recorded about persons and names from antiquity, and the ways in which this information can usefully be modelled in Linked Open Data, and therefore integrated with the Linked Ancient World Data graph.1 I shall discuss the concrete recommendations of the SNAP:DRGN project, although these are designed only to represent a small set of disambiguation data to enable interoperability and cross-platform searching across projects, rather than the full richness of prosopographical and onomastic data.

An essential starting point to this discussion will be the, perhaps evident, observation that persons and names from the ancient world are known through sources (such as literary texts, inscriptions and papyri, coins, personal seals), which themselves need to be read, recorded, interpreted, modelled and related to other sources, objects and data.

A historical source may contain a name, assuming that we are reasonably confident about a reading and interpretation of the characters – even the fairly secure presence of an illegible or incomplete name is often worthwhile evidence. We may make linguistic and onomastic observations about the name, including its etymology, dating, gender assignment, relation to names in the same or other languages, religious or ethnic associations, sometimes even hints as to class or profession. With a greater degree of interpretive subjectivity, we may also associate the name with a particular individual from history, either known from other sources or only attested in this single text, but about whom one can adduce geographical, linguistic, dating, and other information, including perhaps relationships with other persons referenced in the same or other sources.

There are also persons referenced in literary sources who are generally thought to be fictional, that is, not having a referent in the real historical world; or names that refer to religious or mythological entities that may or may not be thought to exist or have existed, but about whom dating and geographical statements are less commonly made. There are of course also explicitly expressed direct relationships between figures we would consider mythological and ahistorical (Aeneas, say) and figures we consider historical (Augustus, in this example), making the distinction between “real” and “unreal” persons rather less clear than we might like. Nevertheless, I do not intend to discuss the issue of fictional or mythological persons in this chapter, leaving them for discussion elsewhere. (See in the meantime: Lawrence 2007; Lawrence, Jewell and Rissen 2010; Bodard et al. 2017.)

Although this chapter focuses on ancient – and primarily ancient Mediterranean – historical persons, we should note that there are very few issues relating to the encoding or linking of classical person records that would not be equally applicable to discussions of prosopography relating to almost any other period of history or part of the world. The sources may differ in quality and quantity, types of person and degrees of certainty may be unique to some cultures, and even interpretations of the concepts of individuals, groups, relationships and mortality may be contingent, but the shape of a person database and the things a researcher may want to do with the data are not likely to be so different as to warrant a completely different discussion for each context. Substitute “ancient” with “historical” throughout this chapter, and a scholar of ancient China, Pre-Columbian Mesoamerica, or eighteenth-century Persia will find this discussion as useful as a historian of ancient Greece or Rome.

Some distinctions should be noted, although they will not be discussed at any length here. There is a significant difference in the needs of and modelling performed by scholars attempting to record modern and contemporary historical individuals and populations. The scarcity of data about ancient (and most historical) populations, which drives the historian’s interest in every scrap of information, fragmentary text, allusion, speculation and emendation, makes for a qualitative difference from historical periods where census data and similar very rich sources survive at a much higher rate. A prosopographer is interested in the smallest piece of evidence for a person from, say, the fifth century BCE (even without an attested name), whereas a similar datum would be trivial or irrelevant to most scholars of, for example, twentieth-century CE history. If one adds to this the further issue of privacy and sensitivity toward persons who are living or whose immediate family are, it becomes clear that the interests and therefore the design of prosopographical and population databases for the modern period will be quite different in several significant ways.

A word is in order about the types of database containing information about ancient persons, each of which provides somewhat different data, and therefore has implications for the way persons will need to be expressed as Linked Open Data. Most large datasets of ancient names or persons that I shall discuss in this chapter are not prosopographies in the technical sense – studies of historical populations and societies that collate and interpret genealogical, onomastic and demographic information on individuals and groups, with particular reference to primary sources and scholarly discussion (cf. Keats-Rohan 2007). All of these types of dataset are nonetheless extremely rich as records of people and their associated names and data, and offer different facets of information to express in LOD. Prosopographical datasets more properly include the Prosopographia Imperii Romani (PIR), Prosopography of the Byzantine World (PBW), and Hellenistic Babylonia: Texts, Images, Names (HBTIN), which tend to focus on specific and delimited time and place, language(s), and perhaps social class or categories of person. Equally valuable, and often orders of magnitude larger, are projects such as the Lexicon of Greek Personal Names (LGPN) or Trismegistos People (TM), which, while they do not gather the richness of biographical and other information typical of a prosopography, do identify individuals and core information about them, including relationships with others, while often focusing principally on primary source citations and gathering onomastic data.

A third category that is also a source of very large numbers of names and person references is the institutional or collection catalogue: the Perseus Catalog, the British Museum person database, the Virtual International Authority File (VIAF), the Zenon Catalogue of the Deutsches Archäologisches Institut library, or other authority lists of persons such as potters (Kerameikos.org) or coin issuers (Nomisma.org). Datasets such as these are a very valuable source of disambiguation records, since librarians have historically put a lot of effort into creating authority records and adhering to interoperable standards as far as possible, but they usually contain very little or no contextual information such as dating, geography, onomastics and interpersonal relationships. While the geographical and chronological specificity of prosopographies and personal name databases means that the overlap in population between any two databases is likely to be very small, the presence of a Western canon means that the overlap in ancient persons between any two library catalogues is generally very large, even almost complete.

One of the most interesting elements of collecting large bodies of person data is the relationships of various kinds that the entities therein bring with them. Persons are at the core of a large network of places (of birth, death, residence, travel), events (participated in, witnessed), objects (created, owned), texts (authored, copied, attested in), and other people (related to, interacted with, corresponded with, co-occurred in texts with). In fact most statements that can be made about a person in a biographical or prosopographical resource can helpfully be expressed through structured data as typed (and attributed, dated and otherwise qualified) relationships with and between other entities of various kinds – if there are identifiers for such entities that can be used to express these relationships.

In this chapter I am talking about “expressing” or “modelling” person-data as Linked Open Data (LOD), on which more in a moment. I should say a few words about the concept of modelling data, or turning organic knowledge into structured, consistent, and therefore limited information for processing – by computer or otherwise at scale. The body of information about any given person or collection of people is often expressed, or held in the scholar’s mind, as a fluid, more or less chronological narrative, or as a collection of events and statements given to us by sources and placed in a historical context. Adding structure, or applying a model, to this information inevitably means distorting it: forcing some kinds of information into controlled vocabularies or restrictive typologies, leaving out details or elements that don’t fit the model, making assumptions where required fields are missing from history. A model, although (and indeed because) structured, is always misleading, distorting, imperfect.

We have to mitigate this degrading of information by: assiduously documenting the model and the decisions made to fit our data into it; by supplementing the structured data with free-text descriptions for the human reader of features not expressed in LOD; by ensuring that the use we or a computer will make of the economies of scale or pattern detection made possible by implementing the model are worth the signal-loss; and by being explicit about the historical (and secondary) sources that feed into our data, so that a reader can go back to the start and check the raw data that we are thus perverting.2 In other words, if we extract information from a variety of sources to populate a uniform database record with names, dates, titles, relationships, and other data about a person, all of the sources, definitions, decisions and mechanisms for populating the individual fields should be preserved.

Having said this, the nature of person-data as involving relationships between entities, places, persons, and other sorts of data and vocabularies, means that once structured or modelled as described above, this data can be very conveniently and effectively expressed in LOD, using the conceptual language RDF. As its name suggests, Linked Open Data is all about modelling data in terms of relationships between entities (people, places, events, things), properties (things we can say about entities, qualifiers, relationship types), or classes (vocabularies of predicates or descriptors, such as occupations, types of settlement, etc.). RDF is formed of statements that have three parts: entity A → has relation or property B → with entity or predicate C (where A, B and C are explicitly referenced by identifiers). Two examples of this “triple” format might look like:


	Aristophanes (LGPN:V2-9254)
			is from (SNAP:associatedPlace)
					Athens (Pleiades:579885).
					
	Lepcis Magna (Pleiades:344448)
			has type (PleiadOnt:hasFeatureType)
					settlement (PleiadType:settlement).

You will notice that all the entities (as well as properties and classes) are defined by identifiers; they would in practice be URLs, but I have abbreviated them for this example.3 This means that for RDF to make a statement about people, places, or anything else, the entities first have to be defined, and a list of the identifiers for them should ideally exist online (the LGPN, SNAP and Pleiades websites in the example above). If there are also informative pages about the entity in question at the URL – i.e. if the identifiers are dereferenceable – this potentially enables the discovery of new resources via information on the web pages about these entities. Likewise, the properties and classes should be defined in vocabularies, thesauri or ontologies, which also assign to each one a unique identifier and a web address. This is where the “linked” in Linked Open Data gets its power: each RDF statement or triple makes a connection between entities and datasets, a link that can be followed back to sources, definitions, indexes, and collections of entities. This fits very nicely onto the definition of person-data I started with above.
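
The first of the two triples above can be sketched in a few lines of Python, to show how abbreviated identifiers expand into full URIs and how a triple is then written out as a single statement. This is an illustration only: the namespace expansions in `PREFIXES` are my assumptions about each project's URI pattern and should be checked against the projects' own documentation, and the helper names are hypothetical.

```python
from typing import NamedTuple

# Prefix-to-namespace map. These expansions are assumptions for illustration;
# consult each project's documentation for its canonical URI pattern.
PREFIXES = {
    "LGPN": "http://www.lgpn.ox.ac.uk/id/",
    "SNAP": "http://data.snapdrgn.net/ontology/snap#",
    "Pleiades": "https://pleiades.stoa.org/places/",
}


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def expand(abbreviated: str) -> str:
    """Expand an identifier such as 'Pleiades:579885' into a full URI."""
    prefix, _, local = abbreviated.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix}")
    return PREFIXES[prefix] + local


def make_triple(subject: str, predicate: str, obj: str) -> Triple:
    """Build a triple in which all three parts are full URIs."""
    return Triple(expand(subject), expand(predicate), expand(obj))


def to_ntriples(triple: Triple) -> str:
    """Write the triple as one line, each URI in angle brackets."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


aristophanes = make_triple("LGPN:V2-9254", "SNAP:associatedPlace", "Pleiades:579885")
print(to_ntriples(aristophanes))
```

Because every part of the statement is an identifier rather than a label, two datasets that both point to the same Pleiades URI are linked without any further matching of place names.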

Modelling People And Names

Prosopographies

Even traditional prosopographies and catalogues explicitly model person-data, which is to say, they do not simply recount all the known facts in an organic and unique way, but present a structured record, selecting certain elements of interest across the corpus or database, and conversely adding, in some cases quite speculatively, those required elements that are missing from the historical record for a given person. (See Varga 2017 for recent discussion of these issues.)

One of the most widely cited prosopographical works relating to classical antiquity is the Prosopographia Imperii Romani (“prosopography of the Roman Empire”), the second edition of which was published in Berlin in fourteen fascicles between 1933 and 2015. In this work, the structured data about individual persons (which Verboven et al. 2007 call the “questionnaire”) are embedded in a paragraph of prose text (in Latin), and so the presence or absence of any given element is not clearly marked. For example, the entry for the second century proconsul Titianus (PIR² VIII.1, p. 76) reads:

248      [–] TITIA[NVS], [proconsul?] (Cyprus) saeculo secundo t. mutilus in tessera scriptus Salamini rep. I. Salamis 24.

Sed res valde incerta, cf. Thomasson, qui eum non in Laterculum suum recepit. Str

Another huge prosopography, the Prosopographie chrétienne du Bas-Empire (PCBE), currently in six fascicles (organized by region) published between 1982 and 2013, also uses a prose-based approach (in a modern language – French – at least) to the “questionnaire” of each person record, but uses a certain amount of page formatting to make a few key fields easy to identify: date, bibliography and the like. A very brief example, the priest Pateras (Destephen 2008, p. 759) reads:

PATÈRAS 2, prêtre de Savatra ? (Lycaonie)       IVe s.

Il est mentionné avec son père, le personnage précédent (→ Patèras 1).

W. M. Cᴀʟᴅᴇʀ et J. M. R. Cᴏʀᴍᴀᴄᴋ, MAMA, VIII, p. 46, no 253.

Both examples list name, titulature, location (of residence/citizenship/office) and some indication of date; they then give some prose narrative of history or life, which may include primary and secondary bibliographic sources and cross-references to other persons, whether in the same prosopography or not. These highly structured records, abbreviated and designed for reading by an (initiated) human, can of course be readily generated from structured data, but the reverse is not the case – in other words, they are not easily machine-actionable data.

Factoid Model

Digital publications of prosopographies make use of databases for structuring, sorting and presenting data in an explicitly structured way, and of advanced search and faceted browsing for organization and filtering of records. Among the more popular recent approaches to this digital representation of prosopographical data, and most suitable for expressing as Linked Open Data, is the so-called “Factoid Model” (see Bradley & Short 2005 and Bradley 2016). In this model, the core entity in the database, as in the study of historical persons discussed above, is the attestation or source – that is, the text in which an event, relationship, or trait of one or more people is mentioned. The attested event or other piece of information (“factoid” rather than “fact” because counter-factual or internally inconsistent attestations are valid entities in a prosopographical dataset) is both the glue that links together persons, primary and secondary texts, places, dates, and the other tables of the database, and one of the building blocks from which all person records are built.

For example, a factoid relating to a database of person information relating to ancient Lepcis Magna might be the funerary inscription of Publius Lucretius Crescens, which was set up by his father at some point in the Imperial period.4 This source would then be linked from several places in the database: as source for the existence, name and (very rough) date of Crescens himself; as one of several sources for the existence, name etc. of P. Lucretius Rogatinus, the erector of the monument; and as source for the relationship between the two.

One “factoid” may therefore be a component of several person records, and almost every person record will be made up of several such attestations. Their name, their family relationships, professions or titles, marriage, participation in battles, treaties, coronations, and other historical events, are only part of our person record because they are attested in one or more sources. This model is therefore not only intellectually rigorous, making sure that all claims about a person are backed up in primary sources or secondary scholarship, but admirably transparent in communicating these sources, and other bases for statements, to the reader. It is possible to browse the factoid database (see e.g. PBW) by any of the top-level entities, including – in addition to people – places, events, occupations, names, and indeed sources and factoids themselves.
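
A minimal sketch of this structure, using the Lepcis Magna epitaph described above, may make the point concrete. It is not the schema of PBW or of any other factoid database: the class names, the person keys and the source identifier are all hypothetical, chosen only to show one source feeding several person records.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Factoid:
    source: str               # the attesting text (hypothetical identifier here)
    kind: str                 # the sort of statement: "name", "relationship", ...
    persons: Tuple[str, ...]  # every person record the factoid contributes to
    statement: str            # what the source asserts, whether true or not


@dataclass
class FactoidStore:
    factoids: List[Factoid] = field(default_factory=list)

    def add(self, factoid: Factoid) -> None:
        self.factoids.append(factoid)

    def person_record(self, person_id: str) -> List[Factoid]:
        """A person record is every factoid that mentions the person."""
        return [f for f in self.factoids if person_id in f.persons]

    def sources_for(self, person_id: str) -> List[str]:
        """The distinct sources on which a person record rests."""
        return sorted({f.source for f in self.person_record(person_id)})


EPITAPH = "epitaph-of-crescens"  # hypothetical source identifier

store = FactoidStore()
store.add(Factoid(EPITAPH, "name", ("crescens",), "Publius Lucretius Crescens"))
store.add(Factoid(EPITAPH, "name", ("rogatinus",), "P. Lucretius Rogatinus"))
store.add(Factoid(EPITAPH, "relationship", ("rogatinus", "crescens"), "father of"))

for factoid in store.person_record("crescens"):
    print(factoid.kind, "-", factoid.statement)
```

The relationship factoid appears in both person records, and every statement in either record can be traced back to the source that attests it.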

Text Encoding

A much simpler scheme for encoding information about historical persons – although one in which it is possible to model many of the same entities and relationships as I have been discussing above – would be in encoded text, for example in TEI (Text Encoding Initiative) XML. XML is a strictly hierarchical language, so a single <person> element must completely encompass those elements containing core information about that person: name, dates, occupations, locations, etc. However, the TEI content model also defines several attributes for linking to other elements or even external files, allowing the “stand-off” annotation of sections of text, or entities in a separate hierarchy from the person-list. A <person> element might be referenced from several <relation> elements, for example, which define family relationships with several other persons. Several <person> elements might all point to the same <event> element defining an event in which they all participated in some way.

A simple person record (with external <relation> given for comparison) in TEI XML might look something like:


	<person xml:id="person7">
	   <persName xml:lang="grc">Εὐθήριος</persName>
	   <floruit notBefore="0392" notAfter="0393"/>
	   <occupation>κόμης</occupation>
	   <affiliation ref="https://pleiades.stoa.org/places/226564">Cherson</affiliation>
	</person>
	<relation active="#person7" ref="snap:BrotherOf" passive="#person8"/>

While this is a much flatter and more linear way of encoding person-data, it would probably be possible (if perhaps not the most efficient) to encode a basic factoid prosopography in this format.
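
To illustrate how such a record becomes machine-actionable, the following sketch reads a fragment like the one above with Python's standard library and separates the person data from the stand-off relation. The wrapping `<listPerson>` element is my addition so that the fragment has a single root, and, like the example above, the fragment omits the TEI namespace that a full document would declare.

```python
import xml.etree.ElementTree as ET

# The xml: prefix (xml:id, xml:lang) is bound to this namespace by definition.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

# Assumption: a <listPerson> wrapper is added here to give the fragment a root.
TEI_FRAGMENT = """
<listPerson>
  <person xml:id="person7">
    <persName xml:lang="grc">Εὐθήριος</persName>
    <floruit notBefore="0392" notAfter="0393"/>
    <occupation>κόμης</occupation>
  </person>
  <relation active="#person7" ref="snap:BrotherOf" passive="#person8"/>
</listPerson>
"""


def parse_people(xml_text):
    """Return (people, relations) extracted from a TEI-like fragment."""
    root = ET.fromstring(xml_text)
    people = {}
    for person in root.iter("person"):
        name = person.find("persName")
        floruit = person.find("floruit")
        people[person.get(XML_NS + "id")] = {
            "name": name.text if name is not None else None,
            "lang": name.get(XML_NS + "lang") if name is not None else None,
            "floruit": (
                (int(floruit.get("notBefore")), int(floruit.get("notAfter")))
                if floruit is not None
                else None
            ),
            "occupation": person.findtext("occupation"),
        }
    # Stand-off relations point at persons by "#id"; strip the leading "#".
    relations = [
        (r.get("active").lstrip("#"), r.get("ref"), r.get("passive").lstrip("#"))
        for r in root.iter("relation")
    ]
    return people, relations


people, relations = parse_people(TEI_FRAGMENT)
print(people["person7"]["floruit"], relations)
```

The relation comes out as a subject, a relationship type and an object, which is already very close to the triple format introduced earlier.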

Catalogues

Although they typically contain less information about each person, another very useful source of lists of persons is the library catalogue, or similar authority list of persons involved in a bibliographical, archival, or museum item database. Although the catalogue may say very little more than each person’s name and a bare minimum of biographical data (life or work dates, language, genre, nationality), the purpose of the list is to be completely unambiguous and very clear about the identity of individual persons, so that a book (or item, or record, etc.) can be ascribed to the correct person no matter how many homonyms may exist. However the data itself is encoded, library catalogues also follow very strict metadata schemas and vocabularies, and often expose their identifiers for the purpose of cross-referencing with other libraries. It is useful to know that the Library of Congress’s record for Alexander of Aphrodisias has the same referent as records in the British Library, the Hellenic and Roman Library, or Zenon catalogues.

The very useful task of collating the many library catalogue authority lists in use is performed by the Online Computer Library Center’s Virtual International Authority File (VIAF), which includes approximately 1,000 author records from the ancient world. VIAF and other authorities serve as the glue in the prosopographical data-space, rather than primarily as providers of new person and historical data.

Onomastics

Other databases contain authority information about names themselves, in addition to or instead of identification of individual persons. This onomastic data – listing of unique names along with philological and historical information such as etymology, geographic or religious significance, morphology and variant spellings or language and script versions – does not disambiguate between several persons all called Diomedes, but provides a record to which one can point to clarify that a name form “Diomedes,” “Diomède,” “Διομήδης,” “دیومدس” or “Диомед” (each with one or more language codes attached) is a variant form of a particular ancient name.

Person data resources such as Trismegistos, the Lexicon of Greek Personal Names, and Celtic Personal Names of Roman Britain all list unique names as separate entities from the person data. As with persons, name records may be expressed as fields in a database record, elements in TEI XML, or RDF statements, and the more formalized the data and entities that are attached to each name, the easier it will be to connect these names to the web of open data, other projects, person and name records. Such onomastic data enriches and enables discovery in and interoperability between individual person-databases in a combined ecosystem.
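
As a rough sketch of how a name record of this kind supports lookup, the following gathers the variant forms of “Diomedes” mentioned above under one identifier. The identifier `name:diomedes` and the language codes are illustrative assumptions, and the normalisation (Unicode composition plus case folding) is only one of several reasonable choices.

```python
import unicodedata

# One name record with language-tagged variant forms. The identifier and the
# language codes are illustrative assumptions, not taken from any database.
NAME_RECORD = {
    "id": "name:diomedes",
    "variants": [
        ("Diomedes", "la"),
        ("Diomède", "fr"),
        ("Διομήδης", "grc"),
        ("دیومدس", "fa"),
        ("Диомед", "ru"),
    ],
}


def normalise(form: str) -> str:
    """Compose accents consistently and ignore differences of letter case."""
    return unicodedata.normalize("NFC", form).casefold()


def build_index(records):
    """Map every normalised variant form to the identifier of its name record."""
    index = {}
    for record in records:
        for form, _lang in record["variants"]:
            index[normalise(form)] = record["id"]
    return index


INDEX = build_index([NAME_RECORD])


def lookup(form: str):
    """Return the name identifier for a variant form, or None if unknown."""
    return INDEX.get(normalise(form))


print(lookup("DIOMEDES"), lookup("Plato"))
```

Note that the lookup identifies the name, not the person: every Diomedes in every dataset resolves to the same name record.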

Disambiguating Person References

In many of the kinds of database discussed above, the process of compiling an unambiguous person authority often begins by disambiguating the many possible duplicate person references identified in a text or corpus. Each string of text that contains a personal name or other appellation refers directly or indirectly to a person. Identifying a person-reference is relatively unambiguous, but disambiguating the person or entity to whom it refers is an intellectual activity, which may involve uncertainty and the citation of evidence or argument, and may be facilitated by the existence of a list of already known persons. For example, the Trismegistos database of text and person data from ancient Egypt collects data in several related but separate databases:

  1. Texts (c. 680,000 records), the texts, mostly papyri, from ancient Egypt and the Nile valley; 
  2. References (c. 500,000 records), listing each attestation of a person; the same person will appear more than once, even if, for example, they are clearly named twice in the same line of the same text;
  3. People (c. 370,000 records), listing individual persons once only, however many times they are attested; 
  4. Names (c. 33,000 records), giving unique names, many of which are of course used by several different persons (there are also larger tables of name variants and declined name forms).

This workflow results in an important collection of data that is both a reflection of the activity of moving from source texts to disambiguated name and person data, and a working tool that can be used in the process of further refining and adding new texts and person records to the database. Each of these database tables is exposed on the project website, and each record in them has a stable identifier, which means that all texts, attestations, persons and names can be used as an authority for disambiguating or relating from other internal or external data. Person records can be related to each other (or to those in LGPN, PIR or other databases), name variants can be used to expand the value of poorer records, and so forth. The person references in this model are an example of an entity broadly analogous to the “factoid” in the source-centered model discussed above.
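
The relationship between the four tables can be sketched with a handful of invented rows. The field names and identifiers below are hypothetical and do not reproduce the Trismegistos schema; the point is only that the reference table, with one row per attestation, is what ties texts, persons and names together.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Reference(NamedTuple):
    ref_id: str     # one row per attestation
    text_id: str    # the text in which the attestation occurs
    person_id: str  # the disambiguated person to whom it is assigned
    name_id: str    # the unique name to which the attested form belongs


# Invented rows for illustration only.
REFERENCES = [
    Reference("ref1", "text1", "personA", "name1"),
    Reference("ref2", "text1", "personA", "name1"),  # same person, same text
    Reference("ref3", "text2", "personB", "name1"),  # a homonym of personA
    Reference("ref4", "text2", "personA", "name2"),  # same person, second name
]


def attestations_by_person(refs) -> Dict[str, List[str]]:
    """Collapse many references into one entry per person."""
    grouped = defaultdict(list)
    for ref in refs:
        grouped[ref.person_id].append(ref.ref_id)
    return dict(grouped)


def persons_by_name(refs) -> Dict[str, Set[str]]:
    """List the distinct persons who bear each unique name."""
    grouped = defaultdict(set)
    for ref in refs:
        grouped[ref.name_id].add(ref.person_id)
    return dict(grouped)


print(attestations_by_person(REFERENCES))
print(persons_by_name(REFERENCES))
```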

Modelling Person-Data in Linked Open Data

As yet there is no single, widely accepted ontology for the representation of the full richness of prosopographical data in RDF. Indeed, as I discussed above, there is no single, widely accepted database format or even print-based approach to the representation of name and person information. Since the value of Linked Open Data is above all the massive interoperability between many discrete bodies of data with differing origins and structures, the development in the meantime of an ontology for interchange of the essential elements of person-data common to many such datasets would serve an important role.

This was the premise of the Standards for Networking Ancient Prosopographies: Data and relations in Greco-Roman names (SNAP:DRGN) project, which in 2014 proposed an ontology and set of guidelines (the “SNAP Cookbook”) for representing core person disambiguation data in RDF (Bodard et al. 2014; id. 2017). The information represented in the SNAP Ontology is by design minimalist, the intention being to provide those data that would help a user disambiguate between the potentially thousands of persons with the same name in our records, and then to refer them to the original (digital) publication for the full prosopography or biography.

The SNAP project recommends RDF representation of person-data by means of eleven categories of information: four are required for any meaningful person-record; two are strongly recommended, but may be impossible with some categories of data; five are optional, but of course the more information the greater the likelihood of successful disambiguation.5

Required Fields

  1. URI. Each person record needs to have a Uniform Resource Identifier, as discussed above. If the originating dataset does not provide a unique, stable and digital identifier (e.g. it is a print prosopography) then a digital surrogate will be needed to mint identifiers and serve as a target for links to point to. The URI is the subject node in all the RDF triples below.
  2. Type. The RDF needs to be told that the URI refers to a person (or deity, monster, group, or other person-like entity or agent). The LAWD ontology provides a vocabulary for agent types that is used by SNAP.6
    
    	<http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658>
    	    rdf:type lawd:Person .
    
  3. Citation. The identifier that would be used to uniquely refer to this person entry in a traditional publication, given as a text string which may or may not be recognizably related to the URI. Need not be unique, but should be understood by and useful to a scholar in the field.7
    
    	<http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658>
    	    dct:bibliographicCitation "Leon 103" .
    
  4. Publisher. The identity (disambiguated by URI or web address) of the publisher or database responsible for providing the information herein about the person entity. Together with the citation, above, this should be enough to unambiguously identify the person record in an original digital or print resource.
    
    	<http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658>
    	    dct:isPartOf <http://www.pbw.kcl.ac.uk/> .
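
A simple check that a record carries the four required fields might be sketched as follows. This is an illustration rather than part of the SNAP recommendations: the function name and the dictionary layout are hypothetical, and the property names are those used in the examples above.

```python
# The three required properties, alongside the URI itself.
REQUIRED_PROPERTIES = ("rdf:type", "dct:bibliographicCitation", "dct:isPartOf")


def missing_required(uri: str, properties: dict) -> list:
    """Return the names of any required fields that are absent or empty."""
    problems = []
    if not uri.startswith(("http://", "https://")):
        problems.append("uri")
    problems.extend(p for p in REQUIRED_PROPERTIES if not properties.get(p))
    return problems


leon = {
    "rdf:type": "lawd:Person",
    "dct:bibliographicCitation": "Leon 103",
    "dct:isPartOf": "http://www.pbw.kcl.ac.uk/",
}

print(missing_required("http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658", leon))
print(missing_required("Leon 103", {"rdf:type": "lawd:Person"}))
```

The second call reports that a bare citation string is not a URI, and that the citation and publisher properties are missing.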
    

Recommended Fields

  5. Name. The name or names by which the person is known in the original sources, in scholarship, or any other useful context. Names may be expressed as simple text strings (ideally with a language code attached), or as a URI which points to further detailed information, such as multiple language-forms of the name, in RDF or in an originating database. This information would be required, were it not for the many prosopographies or databases that contain entries for anonymous persons.8
    
    	<http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658>
    	    foaf:name "κυρῶ Λέοντι"@grc .
    
  6. Attestations. A link to the specific information resource (publication, database record, inscription, etc.) in which the evidence for the existence of the historical person appears. This may be a digital resource, such as the “References” table in Trismegistos, as discussed above, or may be represented as a Citation in plain text, e.g. “IG IX (1)² (1) 145”. This is invaluable data for disambiguating homonymous persons in prosopographical data, but many person datasets, such as library catalogues, will not contain this sort of information.9
    
    	<http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658>
    	    lawd:hasAttestation
    		   <http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658#ref> .
    		   
    	<http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658#ref>
    	    lawd:hasCitation
    		   <http://db.pbw.kcl.ac.uk/pbw2011/entity/boulloterion/1688> .
    		   
    	<http://db.pbw.kcl.ac.uk/pbw2011/entity/boulloterion/1688>
    	    cnt:chars "Boulloterion 1688" .
    

Optional Fields

  • 7-9. Disambiguators. While SNAP does not attempt to model the full prosopographical record for a person, some information is so useful for the purpose of disambiguating records, that it is modelled in simple form. This includes:
  7. Associated place. A link, ideally to a Pleiades or similar gazetteer place URI, indicating the primary geographical association of the person record (their home, city of citizenship, seat of their power, etc.). There may be more than one place associated with a person, for example a religious official who held different bishoprics over their career.10
    
    	<http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658>
    	    snap:associatedPlace
    		   <http://www.geonames.org/729147/mesemvriya.html> .
    
  8. Associated date. A simple (W3C formatted) date expression giving the best indicator available for the dates of the person – dates of reign, floruit, known or estimated birth and death dates, or simply the date of attestation, such as a “probably second century CE” inscription. More detailed dates and periods are possible in other ontologies, in many cases expressed as URIs, but SNAP currently requires only a simple delimiting range. This property is expressed as a single date or range in a text string, such as the example below which means “some time between 1067 CE and 1133 CE” (presumably “late 11th to early 12th century” in the original edition).
    
    	<http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658>
    	    snap:associatedDate "1067/1133" .
    
  9. Occupation. In the absence of a useful vocabulary of ancient titulature, a simple text string recording the title, occupation, or other epithet commonly attached to the person’s name, for purposes of disambiguating, for instance, Plato philosophus from Plato dramaticus in a library catalogue.
    
    	<http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658>
    	    snap:occupation "Protovestes" .
    

Other disambiguating terms, or fields in a source database that do not distinguish between the above categories, may be indicated simply as “disambiguator,” the super-class to which all three belong.
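
As an illustration of how a consumer of SNAP data might use the associated date for disambiguation, the following is a minimal Python sketch, not part of the SNAP specification. It assumes the string holds either a single year or two years separated by a slash, as in the “1067/1133” example above; the names YearRange, parse_associated_date and may_be_contemporary are hypothetical, and fuller date forms (months, days, BCE years) are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YearRange:
    """Inclusive range of years (hypothetical helper type)."""
    start: int
    end: int


def parse_associated_date(value: str) -> YearRange:
    """Parse 'YYYY' or 'YYYY/YYYY' into a YearRange.

    Assumption: only whole years are present; anything else raises ValueError.
    """
    parts = value.strip().split("/")
    if len(parts) == 1:
        year = int(parts[0])
        return YearRange(year, year)
    if len(parts) == 2:
        start, end = int(parts[0]), int(parts[1])
        if start > end:
            raise ValueError(f"range runs backwards: {value!r}")
        return YearRange(start, end)
    raise ValueError(f"not a single date or a range: {value!r}")


def may_be_contemporary(a: YearRange, b: YearRange) -> bool:
    """True when the two ranges share at least one year."""
    return a.start <= b.end and b.start <= a.end
```

For example, may_be_contemporary(parse_associated_date("1067/1133"), parse_associated_date("1100")) is true, whereas a record dated "1200/1250" would not overlap. An overlap only keeps two records as candidates for the same person; it does not identify them.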

  10. Relationships. Family relationships and other strong bonds with other persons in the historical record are expressed by the creation of a “bond,” which has properties including the two persons linked, and the type of the relationship, multi-classed to allow labels such as “foster,” “in-law,” “half” and similar qualifications. The SNAP ontology provides classes for the various kinds of (mostly family) relationships that were attested in our pilot datasets.11
    
    	<http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658>
    	    snap:hasBond pbw:cousin-107466 .	   
    	 pbw:cousin-107466 rdf:type snap:CousinOf ;
    	    snap:bond-with
    		   <http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107466> ;
    	    rdfs:label "Cousin of Kale 102" .
    
  11. Other identifiers. If the person-record is annotated with a URI or other identifier from a common authority or vocabulary that is likely to help link the record with other instances of the person (VIAF, Wikidata, DBpedia, DNB), these should be included in the SNAP data.12
    
    	<http://db.pbw.kcl.ac.uk/pbw2011/entity/person/107658>
    	    skos:exactMatch
    		   <https://de.wikipedia.org/wiki/Leon_Diabatenos> .
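
Because a relationship is modelled as an intermediate “bond” node rather than as a direct link between two persons, reading it back takes two steps. The following Python sketch is one possible illustration, using only the standard library and the triples from the Relationships example above written out as plain tuples. The helper names objects and relationships are hypothetical, and the prefixed names are kept as strings because the pbw: namespace is not given in the text.

```python
# Triples transcribed from the Relationships example; prefixed names are
# kept as plain strings (assumption: no namespace expansion is needed here).
PERSON = "http://db.pbw.kcl.ac.uk/pbw2011/entity/person/"
TRIPLES = [
    (PERSON + "107658", "snap:hasBond", "pbw:cousin-107466"),
    ("pbw:cousin-107466", "rdf:type", "snap:CousinOf"),
    ("pbw:cousin-107466", "snap:bond-with", PERSON + "107466"),
    ("pbw:cousin-107466", "rdfs:label", "Cousin of Kale 102"),
]


def objects(triples, subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def relationships(triples, person):
    """List (bond type, other person) pairs reachable from a person.

    Two hops: person -> bond, then bond -> its type(s) and linked person.
    A multi-classed bond yields one pair per type.
    """
    found = []
    for bond in objects(triples, person, "snap:hasBond"):
        for bond_type in objects(triples, bond, "rdf:type"):
            for other in objects(triples, bond, "snap:bond-with"):
                found.append((bond_type, other))
    return found
```

Calling relationships(TRIPLES, PERSON + "107658") returns a single pair: snap:CousinOf and the URI of person 107466. A bond carrying a second class would simply add a second pair for the same person.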
    

Conclusions

The SNAP ontology is not suitable for encoding the full richness of prosopographical and onomastic data, nor was it ever designed to be. Nonetheless these recommendations may serve as a useful example of the use of RDF to represent certain elements of person and name data as Linked Open Data, using a mix of standard ontologies and new terms to capture the concepts as required. One can begin to imagine the extensions that would be required for representing other prosopographical, biographical or demographic data in similar ways.

Much work remains to be done, both to build the specifications of LOD for person-data, and to populate datasets such as SNAP:DRGN with enough person records to make cross-project querying and comparison truly useful. Once a sizable body of data exists in such a compatible format, vast possibilities begin to open for new discovery and research across the dataset, detection and proposal of matching across records, new identifications, alignment, and annotation by scholarly and crowdsourcing communities. The linked onomastic data will also usefully serve as “gold standard” data for machine-assisted processes such as named entity extraction, spell-checking, optical character recognition and the like.

The integration of person-data will lead to a new aggregating dataset containing at least hundreds of thousands of new records, counting only those already in traditional prosopographies and databases recording people from the ancient world. Alongside already huge databases of person and place references, such as those collected by the Pelagios Project, events, dates and time periods, text and archaeological object records, the scale of the contribution to digital humanities research of a field of Linked Open Person-Data is almost unimaginable.

Abbreviations and Databases

CPNRB (Celtic Personal Names of Roman Britain) – http://www.asnc.cam.ac.uk/personalnames/

DNB (Oxford Dictionary of National Biography) – http://www.oxforddnb.com/

HBTIN (Hellenistic Babylonia: Texts, Images and Names, University of California, Berkeley) – http://oracc.museum.upenn.edu/hbtin/

LAWD (Linking Ancient World Data) – http://lawd.info/

LGPN (Lexicon of Greek Personal Names, Oxford University) – http://www.lgpn.ox.ac.uk/

Kerameikos.org: an ontology for the intellectual concepts of pottery – http://kerameikos.org/

Nomisma.org: stable digital representations of numismatic concepts – http://nomisma.org/

Perseus Catalog – http://catalog.perseus.org/

PIR (Prosopographia Imperii Romani second edition, Berlin, 1933-2015) – Information and name search: http://pir.bbaw.de/

PBW (Prosopography of the Byzantine World) – http://www.pbw.kcl.ac.uk/

RDF (Resource Description Framework) – https://www.w3.org/RDF/

TEI (Text Encoding Initiative) – http://www.tei-c.org/index.xml

TM (Trismegistos) – http://www.trismegistos.org/ref/index.php

VIAF (Virtual International Authority File) – http://viaf.org/

Zenon (Library catalogue of the Deutsches Archäologisches Institut) – https://zenon.dainst.org/

References

Bodard, G., Cayless, H., Depauw, M., Isaksen, L., Lawrence, K.F., Rahtz, S.P.Q. (2014). SNAP:DRGN Cookbook. Available: http://snapdrgn.net/cookbook/.

Bodard, G., Cayless, H., Depauw, M., Isaksen, L., Lawrence, K.F., Rahtz, S.P.Q. (2017). “Standards for Networking Ancient Person-data: Digital approaches to problems in prosopographical space.” Digital Classics Online 3.2. Available: http://dx.doi.org/10.11588/dco.2017.0.37975.

Bradley, John & Short, Harold (2005). “Texts into databases: the evolving field of new-style prosopography.” Literary and Linguistic Computing 20, Supplement. Pp. 3-24.

Bradley, John (2016). Factoids: A site that introduces Factoid Prosopography. King’s College London. Available: http://factoid-dighum.kcl.ac.uk/.

Cameron, Averil, ed. (2003). Fifty Years of Prosopography: The Later Roman Empire, Byzantium and Beyond. Oxford: Oxford University Press.

Depauw, Mark & Van Beek, B. (2009). “People in Greek Documentary Papyri. First Results of a Research Project.” Journal of Juristic Papyrology 39. Pp. 31-47. Available: http://www.trismegistos.org/ref/depauw_vanbeek.pdf.

Destephen, Sylvain (2008). Prosopographie chrétienne du Bas-Empire: 3. Diocèse d’Asie (325-641). Paris: Association des amis du Centre d’histoire et civilisation de Byzance.

Eck, W., Matthäus Heil & Johannes Heinrichs (2009). Prosopographia Imperii Romani Saec. I. II. III. Pars viii, Fasciculus 1. Berlin/New York: Walter de Gruyter.

Groag, Edmund & Arturus Stein (1933). Prosopographia Imperii Romani Saec. I. II. III. Pars i. Berlin: Walter de Gruyter.

Lawrence, K. Faith, Jewell, M. O., Rissen, P. (2010). “OntoMedia: Telling Stories to Your Computer.” In Proceedings of the First International AMICUS Workshop on Automated Motif Discovery in Cultural Heritage and Scientific Communication Texts, Vienna, Austria. Available: https://ilk.uvt.nl/amicus/amicus_ws2010_proceedings.html.

Lawrence, K. Faith (2007). The Web of Community Trust – Amateur Fiction Online: A Case Study in Community Focused Design for the Semantic Web. Doctoral Thesis, University of Southampton. Available: https://eprints.soton.ac.uk/264704/.

Keats-Rohan, K.S.B. (2007). “Biography, Identity and Names: Understanding the Pursuit of the Individual in Prosopography.” In Keats-Rohan (ed.) Prosopography Approaches and Applications: A Handbook. Oxford: Oxford University Press. Pp. 139–181.

TEI-C (Text Encoding Initiative Consortium) (2016). “13.3 Biographical and Prosopographical Data.” TEI Guidelines, Chapter 13, Names, Dates, People and Places. Available: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ND.html#NDPERS.

Varga, Rada (2017). “Romans 1 by 1 v.1.1. New developments in the study of Roman population.” Digital Classics Online 3.2. Available: http://dx.doi.org/10.11588/dco.2017.0.35822.

Verboven, K., Carlier, M. & Dumolyn, J. (2007). “A Short Manual to the Art of Prosopography.” In Keats-Rohan (ed.) Prosopography Approaches and Applications: A Handbook. Oxford: Oxford University Press. Pp. 35–69.

Vitale, V. (2016). “Transparent, Multivocal, Cross-disciplinary: The Use of Linked Open Data and a Community-developed RDF Ontology to Document and Enrich 3D Visualisation for Cultural Heritage.” In Bodard & Romanello (eds.), Digital Classics Outside the Echo-Chamber: Teaching, Knowledge Exchange & Public Engagement. London: Ubiquity Press. Pp. 147–168. Available: http://dx.doi.org/10.5334/bat.i.

Notes

1 The author would like to thank Hugh Cayless and K. Faith Lawrence for discussion at an early stage of this chapter, and Paula Granados García, Rada Varga and Valeria Vitale for invaluable feedback on the final text. All shortcomings of course remain my own.

2 On this argument applied to a different application of LOD, cf. Vitale 2016.

3 In this example of RDF notation, the LGPN: prefix expands to http://www.lgpn.ox.ac.uk/id/, the SNAP: prefix to http://data.snapdrgn.net/ontology/snap#, the Pleiades: prefix to http://pleiades.stoa.org/places/, the PleiadOnt: prefix to https://pleiades.stoa.org/places/vocab#, and the PleiadType: prefix to https://pleiades.stoa.org/vocabularies/place-types/.

4 This example is loosely based on Inscriptions of Roman Tripolitania 720 (ed. J.M. Reynolds, 2009 [1952]). Available: http://inslib.kcl.ac.uk/irt2009/IRT720.html.

5 An example of a complete SNAP person record, imperfectly mocked-up for the Prosopography of the Byzantine World project, can be found at https://goo.gl/jG8Ly7. The RDF examples given in this section are expressed in TTL (“turtle”) – the Terse RDF Triple Language; see https://www.w3.org/TR/turtle/.

6 In the example below the rdf: prefix expands to http://www.w3.org/1999/02/22-rdf-syntax-ns#.

7 In this example and elsewhere the dct: prefix expands to http://purl.org/dc/terms/.

8 In this example the foaf: prefix expands to http://xmlns.com/foaf/0.1/.

9 In this example the lawd: prefix expands to http://lawd.info/ontology/, and the cnt: prefix to http://www.w3.org/2011/content#.

10 The snap: prefix expands to http://data.snapdrgn.net/ontology/snap#, and rdfs: to http://www.w3.org/2000/01/rdf-schema#.

11 The SNAP ontology is described at https://snapdrgn.net/ontology.

12 In this example the skos: prefix expands to http://www.w3.org/2004/02/skos/core#.