This article is available at the URI as part of the NYU Library's Ancient World Digital Library in partnership with the Institute for the Study of the Ancient World (ISAW). More information about ISAW Papers is available on the ISAW website.

Except where noted, ©2014 Daniel Pett; distributed under the terms of the Creative Commons Attribution License

This article can be downloaded as a single file

ISAW Papers 7.20 (2014)

Linking Portable Antiquities to a Wider Web

Daniel E.J. Pett

Abstract: This paper will discuss the impact that the two LAWDI events had on the digital work and output of the United Kingdom's Portable Antiquities Scheme, based at the British Museum, London. It discusses the progress of the author's work in developing the Scheme's online presence towards Berners-Lee's 5 Stars of Linked data following the two iterations of the LAWDI programme in 2012 and 2013. This article also gives examples of Linked Data principles being utilised by the Portable Antiquities Scheme website.

Subjects: Humanities--Study and teaching

Keywords: archaeology, antiquities, semantic web, linked data, classics, numismatics, portable antiquities scheme

The Portable Antiquities Scheme (hereafter the PAS) was one of the few projects attending the LAWDI events that did not truly fall under the Classical World umbrella, which neatly encompassed many of the other attending projects. The PAS is a government-funded project in the United Kingdom that promotes the voluntary recording of archaeological objects found by members of the public in England and Wales, and administers the Treasure Act process. The objects recorded on the PAS database range from the Prehistoric to the Post-Medieval periods (as defined in the UK by English Heritage), with a large corpus of material that can be attributed to the Roman Empire, the majority of which are coins. Indeed, numismatic material provides the greatest opportunity for the Scheme to link to other resources (or URIs) through regular attributes such as the mint (geography), era (time), issuer or moneyer (people), place of discovery (geography) and so on.

Since 2003 (Pett 2010a) these data have been placed online within a dynamic database that is updated in real-time. Over 900,000 objects have been recorded, each with images and extensive metadata (over 200 possible fields, many using controlled and agreed vocabularies). These metadata present an exciting and very practical opportunity for implementing linked data techniques and for collaborating with many of the attending projects, and this paper will briefly touch on both.

The majority of attending projects were beginning their adventure into the world of Linked and Open Data (LOD), and many could provide data that the PAS software could, and can, consume and use to enrich its own website (Pett 2010b, Gruber et al. 2012). In turn, these linkages can be tied to external resources such as the Virtual International Authority File (VIAF) or dbPedia or, as Kansa (2013) showed in his excellent presentations, to projects such as the Encyclopaedia of Life.

Pre-LAWDI: where did the PAS website stand?

The PAS software is written entirely by the paper's author (all source code available at GitHub), and builds on the original content management system that was provided by Oxford ArchDigital before their liquidation in 2007 (Pett 2010a). The choices of technology and implementation are very similar to those of Kansa's superb OpenContext: PHP, Solr and so on. This software provides a platform for the 'real-time' capture and publication of artefacts discovered by members of the public in England and Wales whilst pursuing their hobbies. From this period onwards, the author began to explore best practices for web implementation (Pett 2012), and early in the development of the site a decision was made to begin the journey towards what was to become Berners-Lee’s 5 stars of linked data (see Berners-Lee 2006 and Hausenblas 2010). The first step was the implementation of cool URIs (W3C 2008), which attempt to describe the resource that the consumer will find; for example, one such URI leads to the page within the Roman coin guide describing the depiction of the personification of Apollo. Data-driven pages within the site could be obtained in various representations (for example as JSON, XML, CSV or KML), but content negotiation was lacking and remains so at the time of writing, as the author has not yet found time to implement it. The lack of content negotiation when serving data has been highlighted by Light (2011) and does need resolving.
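As an illustration of what the missing content negotiation involves, the following minimal Python sketch chooses a representation from an HTTP Accept header. It is not the PAS implementation (the site is written in PHP); the function names, the media type table and the suffix names are assumptions made for this example, and the matching is deliberately simplified (wildcards such as text/* and specificity ordering are ignored).

```python
# Media types mapped to the representations mentioned in the text.
# The suffix names are illustrative assumptions, not PAS identifiers.
REPRESENTATIONS = {
    "text/html": "html",
    "application/json": "json",
    "application/xml": "xml",
    "text/csv": "csv",
    "application/vnd.google-earth.kml+xml": "kml",
}


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q value."""
    ranges = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        ranges.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def negotiate(header, default="html"):
    """Pick a representation suffix for an Accept header value."""
    for media_type, q in parse_accept(header or ""):
        if q <= 0:
            continue
        if media_type in REPRESENTATIONS:
            return REPRESENTATIONS[media_type]
        if media_type == "*/*":
            return default
    return default
```

A server could call negotiate() with the incoming Accept header and then serve or redirect to the matching representation of the same resource, so that one URI answers both browsers and data consumers.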

Structured data has been used extensively within the site templates, for example microformats (now seven years old as a concept) and RDFa (Herman et al. 2013). Within the HTML contact templates, for instance, FOAF standards were implemented (Brickley and Miller 2010). The result can be seen in the structured data section of Google’s webmaster tools, shown in figure 1 below:

Figure 1: An example of structured data from the author’s profile page as seen by Google’s webmaster tools.

Post-LAWDI 2012

After the 2012 event at ISAW (Elliot et al. 2012), efforts were made to bring the PAS into the Pelagios family. This was quite a simple task to expedite, with expert advice available online (Barker, Isaksen and Simon 2012) and the PAS database already using Pleiades identifiers within its schema. A weekly compiled dump of the ever-changing PAS data that had Pleiades IDs (Pett 2012b) was produced and integrated into the Pelagios ecosystem. At present, over 80,000 Roman coin records on the PAS database have attributions to Roman mints that have been aligned with Pleiades identifiers, leaving around 90,000 unattributed. These data were used to great effect in Cayless’ data explorer visualisation, unveiled to the world at large at LAWDI-2013 (Cayless 2013), in which Rome is our most frequently attributed mint.
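The kind of dump described above can be sketched as follows. This Python example filters records that carry a Pleiades ID and writes one annotation per record as N-Triples, using the W3C Web Annotation vocabulary. The record fields, the example.org base URI and the choice of vocabulary are assumptions for illustration; the exact format Pelagios required at the time may well have differed.

```python
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
PLEIADES = "https://pleiades.stoa.org/places/"
RECORD_BASE = "https://example.org/record/"  # placeholder, not the PAS URI scheme


def pelagios_triples(records):
    """Return N-Triples lines for the records that carry a Pleiades ID."""
    lines = []
    linked = [r for r in records if r.get("pleiades_id")]
    for number, record in enumerate(linked, start=1):
        annotation = "<%sannotation/%d>" % (RECORD_BASE, number)
        target = "<%s%s>" % (RECORD_BASE, record["id"])        # the coin record
        body = "<%s%s>" % (PLEIADES, record["pleiades_id"])    # the ancient place
        lines.append("%s <%s> <%sAnnotation> ." % (annotation, RDF_TYPE, OA))
        lines.append("%s <%shasBody> %s ." % (annotation, OA, body))
        lines.append("%s <%shasTarget> %s ." % (annotation, OA, target))
    return lines


if __name__ == "__main__":
    sample = [
        {"id": "1", "pleiades_id": "423025"},  # 423025 is the Pleiades ID for Rome
        {"id": "2", "pleiades_id": None},      # unattributed record, skipped
    ]
    print("\n".join(pelagios_triples(sample)))
```

Records without an identifier are simply left out of the dump, which mirrors the split between attributed and unattributed coins noted above.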

In the spirit of collaboration and for mutual benefit, the PAS also hosted a mirror tile store of the maps that Johan Åhfeldt (2012) created. These are available for anyone to use, with the PAS picking up the associated bandwidth costs (Pett 2012d). Further integration with LAWDI resources included the use of the Pelagios widget, Nomisma and Pleiades identifiers and the ISAW JavaScript library (Rabinowitz & Heath 2012).

Attempts were also made to implement more structured data within HTML templates, for example the use of and Facebook’s OpenGraph metadata tags. A good example of structured data in action can be seen in the author’s implementation of Twitter cards (Pett 2012). This is a fairly simple process: meta tags are added to the head section of an HTML document, allowing the Twitter user interfaces to parse a concise preview of the content. Figure 2 below demonstrates this for the Roman cavalry helmet found with an Iron Age hoard in Leicestershire (PAS record PAS-984616; Leins & Hill 2012).
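To make the meta tag approach concrete, here is a minimal Python sketch that builds Twitter card and OpenGraph tags for the head section of a page. The function name and the example URLs are hypothetical, and the real PAS templates are written in PHP; only the tag names (twitter:card, twitter:title, og:title and so on) come from the published Twitter and OpenGraph conventions.

```python
from html import escape


def card_meta_tags(title, description, image_url, page_url):
    """Build <meta> elements for a Twitter summary card and OpenGraph."""
    twitter = {
        "twitter:card": "summary",
        "twitter:title": title,
        "twitter:description": description,
        "twitter:image": image_url,
    }
    open_graph = {
        "og:title": title,
        "og:description": description,
        "og:image": image_url,
        "og:url": page_url,
    }
    lines = []
    # Twitter cards use the name attribute; OpenGraph uses property.
    for name, content in twitter.items():
        lines.append('<meta name="%s" content="%s">' % (name, escape(content, quote=True)))
    for prop, content in open_graph.items():
        lines.append('<meta property="%s" content="%s">' % (prop, escape(content, quote=True)))
    return "\n".join(lines)


if __name__ == "__main__":
    print(card_meta_tags(
        "Roman cavalry helmet",
        "Helmet found with an Iron Age hoard in Leicestershire.",
        "https://example.org/images/helmet.jpg",   # placeholder image URL
        "https://example.org/record/PAS-984616",   # placeholder page URL
    ))
```

Escaping the content values matters here, because titles and descriptions drawn from a database may contain quotation marks or ampersands that would otherwise break the mark-up.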

An example of a twitter card for the Hallaton Helmet
Figure 2: An example of a Twitter card, produced via the parsing of metadata tags applied within mark-up on templates.

Concerted efforts were also made to align the authority lists used for recording coins with external authority lists. To achieve this, the author used OpenRefine to reconcile terms against VIAF (see Page 2013 for details on how to do this) and dbPedia. Some of the results can be seen in Elliot’s "About Roman Emperors" project (2013) and within the templates used in the PAS website numismatic guides; this process of enhancement is ongoing.

Post-LAWDI 2013 – production and consumption of linked data

Following the 2013 event at Drew University, the author embarked on further development of the LOD capabilities of the PAS website (Pett 2013). A decision was made to integrate, in a more regimented fashion, with resources provided by English Heritage, the Ordnance Survey, Nomisma, VIAF, dbPedia and the British Museum throughout the records of objects held within the database.

Production of linked data via the PAS website

RDF is now produced by applying XSLT to the XML returned from the Solr indexes that drive the search engine powering the PAS pages. The aim is for this RDF to be modelled in the same manner as the British Museum's representation of the CIDOC-CRM model, but with links out to the resources described above; it also builds on previous work conducted by the University of Vienna as part of the European-funded BRICKS project (Nussbaumer and Haslhofer 2007: 7–18). Through personal correspondence and face-to-face meetings with the Research Space project team, and via detailed consultation of their draft mapping document (Oldman et al. 2013), an attempted modelling has been produced (yet to be documented). The author has changed or ignored some aspects, adopted the nested RDF/XML (Klyne 2010) used by the Claros project at Oxford University, and linked to external resources (something that the British Museum implementation does not yet do).
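The general shape of this transformation can be sketched in Python. Note that this example uses the standard library's ElementTree rather than XSLT, so it illustrates the mapping idea and not the PAS stylesheet. The Solr field names (id, objecttype, mint_uri), the CRM namespace URI and the example.org record base are all assumptions; the CRM terms are used only to show the nested RDF/XML pattern, not the undocumented PAS modelling.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
CRM = "http://www.cidoc-crm.org/cidoc-crm/"   # assumed namespace for CRM terms
RECORD_BASE = "https://example.org/record/"   # placeholder, not the PAS URI scheme

for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("crm", CRM)):
    ET.register_namespace(prefix, uri)


def solr_to_rdf(solr_xml):
    """Turn a Solr XML response into nested RDF/XML, one object per <doc>."""
    response = ET.fromstring(solr_xml)
    graph = ET.Element("{%s}RDF" % RDF)
    for doc in response.iter("doc"):
        # Solr wraps each field in a typed element carrying a name attribute.
        fields = {f.get("name"): (f.text or "") for f in doc if f.get("name")}
        if not fields.get("id"):
            continue
        obj = ET.SubElement(graph, "{%s}E22_Man-Made_Object" % CRM,
                            {"{%s}about" % RDF: RECORD_BASE + fields["id"]})
        label = ET.SubElement(obj, "{%s}label" % RDFS)
        label.text = fields.get("objecttype", "")
        if fields.get("mint_uri"):
            # Nested pattern: object -> production event -> place (external URI)
            produced = ET.SubElement(obj, "{%s}P108i_was_produced_by" % CRM)
            production = ET.SubElement(produced, "{%s}E12_Production" % CRM)
            ET.SubElement(production, "{%s}P7_took_place_at" % CRM,
                          {"{%s}resource" % RDF: fields["mint_uri"]})
    return ET.tostring(graph, encoding="unicode")


SAMPLE = (
    '<response><result name="response" numFound="1" start="0"><doc>'
    '<str name="id">1</str><str name="objecttype">COIN</str>'
    '<str name="mint_uri">http://nomisma.org/id/rome</str>'
    '</doc></result></response>'
)

if __name__ == "__main__":
    print(solr_to_rdf(SAMPLE))
```

The link out to an external URI in the innermost element is the part that distinguishes this approach from a closed, self-referential export.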

Integration with the British Museum thesauri was reasonably complicated, but was managed by querying their development endpoint and exporting the results as CSV. The resulting tables are now available on GitHub, and were imported into our MySQL database for querying and joining with our existing thesauri. The same process was applied to the AHRC-funded Seneschal endpoint to obtain their URI structures, and these were also imported and linked to the PAS schema. These identifiers can then be compiled within the RDF that is ultimately produced from the PAS site, alongside the already integrated Nomisma and Pleiades identifiers. Linking to these resources provides a rich foundation on which to build, either within the confines of the PAS website or on third-party sites.
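The import-and-join step might look something like the following Python sketch. It uses an in-memory SQLite database as a stand-in for MySQL, and the table names, column names and CSV layout (a label column and a uri column) are assumptions for illustration, not the actual PAS schema or the British Museum export format.

```python
import csv
import io
import sqlite3


def link_thesaurus(conn, csv_text):
    """Load (label, uri) rows from a CSV export and join them to local terms."""
    conn.execute("CREATE TABLE IF NOT EXISTS external_terms (label TEXT, uri TEXT)")
    rows = [
        (row["label"].strip().lower(), row["uri"])
        for row in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO external_terms VALUES (?, ?)", rows)
    # LEFT JOIN keeps local terms that have no external match (uri is NULL).
    return conn.execute(
        "SELECT t.id, t.term, e.uri FROM local_terms t "
        "LEFT JOIN external_terms e ON lower(t.term) = e.label "
        "ORDER BY t.id"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE local_terms (id INTEGER, term TEXT)")
    conn.executemany("INSERT INTO local_terms VALUES (?, ?)",
                     [(1, "Coin"), (2, "Brooch")])
    export = "label,uri\ncoin,https://example.org/thesaurus/1\n"  # placeholder URI
    for row in link_thesaurus(conn, export):
        print(row)
```

Matching on a normalised label is the simplest possible join; terms left without a URI are the ones that would need manual review.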

As the PAS database is updated in real-time, the dataset changes many times daily. Our RDF is therefore regenerated nightly by a scheduled cron job, which calls a script that transforms the Solr XML and saves the result both to our server (where a listing of the available files is provided, named {date}.rdf or pelagios-{date}.rdf) and to Amazon S3 for archiving purposes (15 days saved on an incremental basis).
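The naming and retention logic of such a nightly job can be sketched in Python as below. The ISO date format for {date} and the reading of "15 days saved" as a rolling 15-day window are assumptions, as are the function names; the actual script, and the upload to Amazon S3, are not shown.

```python
from datetime import date, timedelta

PREFIX = "pelagios-"
SUFFIX = ".rdf"


def dump_names(day):
    """Return the two file names produced for a given day."""
    stamp = day.isoformat()  # assumed format for {date}: YYYY-MM-DD
    return [stamp + SUFFIX, PREFIX + stamp + SUFFIX]


def expired(names, today, keep_days=15):
    """Return the names whose embedded date falls outside the retention window."""
    cutoff = today - timedelta(days=keep_days)
    old = []
    for name in names:
        if not name.endswith(SUFFIX):
            continue  # not one of our dumps
        stamp = name[:-len(SUFFIX)]
        if stamp.startswith(PREFIX):
            stamp = stamp[len(PREFIX):]
        try:
            day = date.fromisoformat(stamp)
        except ValueError:
            continue  # name does not carry a recognisable date
        if day < cutoff:
            old.append(name)
    return old


if __name__ == "__main__":
    today = date(2014, 1, 31)
    print(dump_names(today))
    print(expired(["2014-01-01.rdf", "pelagios-2014-01-30.rdf"], today))
```

Deriving the age of a file from its name rather than from filesystem timestamps keeps the pruning step independent of where the archive is stored.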

Consumption of linked data within the PAS website

With the integration of external identifiers into the PAS database schema, the enrichment of resources can be much improved. Prior to LAWDI-2012, the author had integrated data from various resources, usually via Application Programming Interfaces (APIs), but sometimes via consumption of RDF data. This was achieved with the ARC2 library (Nowak 2011), which has now been superseded by the EasyRDF PHP library (Humfrey 2012), leading to a wider consumption of RDF throughout the site. It is now possible to extract more data for the enrichment of our issuer and ruler biographical pages. For Augustus, for example, identifiers drawn from Nomisma, the British Museum and dbPedia allow an aggregated biographical page to be produced and presented alongside dynamic data drawn directly from the PAS database. This principle has also been applied to the coin guides for other periods of British history, with the same enriching effect. Extra information can be gleaned from the structured data returned from dbPedia, with information relating to parents, titles, battle commands and wives readily available. The return on the time invested in tying these identifiers to our vocabulary and authority lists is therefore apparent! Other examples are easy to find within the PAS website. For instance, by combining Pleiades and Nomisma identifiers, enriched pages relating to Roman mints can be produced (for example, see Rome), with images obtained from Flickr when they have been machine tagged appropriately (see Gillies 2012). Data can also be consumed from the excellent Pelagios project, as shown below in figure 3: