Introduction to the MoTIF Thesaurus

The purpose of the thesaurus

MoTIF, the pilot thesaurus of Irish folklore, is intended to accompany the Thesaurus Construction Guidelines: An Introduction to Thesauri and Guidelines on Their Construction and act as a sample thesaurus and demonstration of the international principles and best practices outlined in that document. The pilot thesaurus should not be considered a tool for indexing and retrieval of content. Rather, it can act as a base from which such a tool can be constructed.

The scope of the thesaurus

The core topics of the pilot thesaurus of Irish folklore include nature, animals, people, and their occupations and activities, primarily as they relate to the sustenance and support of the home and community. The thesaurus also covers the genres of folklore. As this is a pilot thesaurus, the coverage of these topics should not be considered exhaustive but rather a sampling of the subject matter.

The total number of indexing terms

Just over 350 terms were selected from vocabulary resources at the beginning of the project. Following facet analysis, additional terms were added to aid in the organisation of concepts, bringing the number of preferred terms in the thesaurus to 522. The pilot thesaurus also contains over 50 non-preferred terms.

Vocabulary control – choice and form of preferred terms

Terms were selected following the guidelines laid down by ISO 25964-1: Information and Documentation. Part 1: Thesauri for Information Retrieval. Vocabulary resources consulted included reference materials, journals and seminal writings on the subject of folklore. These included Seán Ó Súilleabháin’s A Handbook of Irish Folklore, as well as some article titles from Béaloideas, the Journal of the Folklore of Ireland Society. Other sources included Alan Dundes’s The Study of Folklore (1965), which provided a list of many forms of folklore. A comprehensive thesaurus would be extended to cover additional vocabulary resources and would include consultation with experts.

The pilot thesaurus of Irish folklore follows the guidelines set out by ISO 25964-1. This includes the recommendations on forms of entry for the English language:

  • Nouns and noun phrases. These are the most important terms to consider when constructing a thesaurus. In English, count nouns (dogs, cats) will appear in the plural in the thesaurus. Non-count nouns (livestock) appear in the singular.
  • Verbs took the gerund or verbal noun form, e.g. fishing, hunting. In a thesaurus, verbs should never appear in the infinitive (‘to hunt’) or participle (‘hunted’) form.
  • Adjectives were avoided in the pilot thesaurus unless they were deemed significant to the subject.
  • Adverbs were avoided in the pilot thesaurus.
  • Articles (a, the) should be avoided unless they are an integral part of the term. If articles are used, equivalence relationships should be set up between the preferred term (which uses the article) and the non-preferred term (which does not use the article).
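
Two of these recommendations, avoiding leading articles and avoiding infinitive verb forms, are mechanical enough to check automatically. The following is a minimal sketch in Python and is not part of the MoTIF project or of ISO 25964-1; the function name `form_warnings` and the sample terms are assumptions made for illustration. The other recommendations (singular versus plural, gerund versus participle) depend on linguistic judgement and are left to the editor.

```python
# Hypothetical helper: flags candidate terms that begin with an article
# or look like an infinitive. It only raises warnings; an editor decides
# whether an article is an integral part of the term.

ARTICLES = ("a ", "an ", "the ")


def form_warnings(term: str) -> list[str]:
    """Return a list of warnings about the form of a candidate term."""
    warnings = []
    lowered = term.strip().lower()
    if lowered.startswith(ARTICLES):
        warnings.append("leading article")
    if lowered.startswith("to "):
        warnings.append("possible infinitive")
    return warnings


for candidate in ["hunting", "to hunt", "the harvest"]:
    print(candidate, form_warnings(candidate))
```

A term that is flagged is not necessarily wrong: where an article is integral to the term, it is kept and an equivalence relationship is recorded instead.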

Where synonyms existed, a preferred term was chosen and the non-preferred terms were also recorded. Where both appeared, non-scientific terms were chosen over scientific terms, as this matched the vocabulary found in most of the resources consulted. Definitions were included where terms were ambiguous, and qualifiers were used where necessary.

In line with trends in thesaurus construction and best practice, and on the recommendation of international standards, the guidelines and the pilot thesaurus employed the method of ‘splitting’ compound terms where practical and in such a way as to avoid inconsistency, i.e. no more than two split components per compound concept. Where compound concepts could be expressed without ambiguity by combining simpler terms, they were split. Compound concepts were also split where they were not in the core scope of the thesaurus, where they were too specific for its scope, where they were rarely used in the collection, and where they contained more than one difference.

Structure and inter-relationships – the standards and rules adopted

The pilot thesaurus of Irish folklore was constructed using fundamental facets as the main divisions and the standard thesaural relationships as outlined in ISO 25964-1. Facet analysis, with facets as top concepts (TTs) in the hierarchies, was chosen because it gives a more flexible structure that can be updated more easily.

Following initial groupings, review and analysis, the fundamental facets decided on for the pilot thesaurus were:

  • Time
  • Place/Space/Environment
  • Products
  • Activities
  • Processes and Phenomena
  • Events
  • Agents
  • Objects
  • Materials
  • Attributes and Properties
  • Parts

These fundamental facets will be complemented by the facets of Genre and Form as well as Abstract Concepts.

Broader term and narrower term relationships were employed as well as associative relationships. Associative relationships are found primarily across the different hierarchies. The relationships in the pilot thesaurus are intended as demonstrations only and should not be considered exhaustive. The option to add additional associative relationships, including between concepts in the same array, may be considered for a complete thesaurus.
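
The hierarchical and associative relationships described here are reciprocal: recording a broader term implies the matching narrower term, and a related term link holds in both directions. The following is a minimal sketch of that idea under stated assumptions. The `Thesaurus` class and its method names are hypothetical, and the placement of ‘hunting’ under Activities and ‘dogs’ under Agents is invented for illustration rather than copied from the pilot hierarchies.

```python
from collections import defaultdict


class Thesaurus:
    """Stores BT/NT and RT links and keeps them reciprocal."""

    def __init__(self) -> None:
        self.broader: dict[str, set[str]] = defaultdict(set)
        self.narrower: dict[str, set[str]] = defaultdict(set)
        self.related: dict[str, set[str]] = defaultdict(set)

    def add_hierarchical(self, narrower: str, broader: str) -> None:
        """Record 'narrower BT broader' and the inverse NT link together."""
        self.broader[narrower].add(broader)
        self.narrower[broader].add(narrower)

    def add_associative(self, first: str, second: str) -> None:
        """Record an RT link in both directions."""
        self.related[first].add(second)
        self.related[second].add(first)

    def top_terms(self, term: str) -> set[str]:
        """Follow BT links upwards until terms with no broader term remain."""
        tops: set[str] = set()
        seen: set[str] = set()
        stack = [term]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            parents = self.broader.get(current, set())
            if parents:
                stack.extend(parents)
            else:
                tops.add(current)
        return tops


# Hypothetical sample data, for illustration only.
thesaurus = Thesaurus()
thesaurus.add_hierarchical("hunting", "Activities")
thesaurus.add_hierarchical("dogs", "Agents")
thesaurus.add_associative("hunting", "dogs")  # RT across two hierarchies
print(thesaurus.top_terms("hunting"))  # {'Activities'}
```

Storing both directions at the point of entry means a display for either term can list its BT, NT and RT links without searching the whole vocabulary.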

Operational use of the pilot thesaurus

The pilot thesaurus is meant as a stand-alone project and sample thesaurus to accompany the guidelines on thesaurus construction and demonstrate their application. It is not intended as a fully functional tool for indexing and retrieval and should not be considered in this light.

About the MoTIF Project

MoTIF is a collaborative project undertaken by the Digital Repository of Ireland (DRI) and the National Library of Ireland (NLI).

The project aim was to produce guidelines on the construction of thesauri for librarians, archivists, museum professionals and other information professionals. These guidelines will act as a comprehensive introduction to thesauri and provide guidance on the construction of thesauri using facet analysis. The guidelines are illustrated by MoTIF, the pilot thesaurus of Irish folklore.

Thesauri are vital and valuable tools in content discovery, information organisation and retrieval, activities common to all fields including cultural heritage and higher education as well as business and enterprise. Thesauri allow information professionals to represent content in a consistent manner and enable researchers and the public to find this content easily and quickly. These guidelines will give professionals the advice they need to improve their own data practices by adhering to international standards and best practice.

The pilot thesaurus acts as an illustrative guide, providing examples throughout the guidelines, as well as a sample thesaurus and demonstration of the international principles and best practices outlined in the guidelines. It also acts as a demonstration of TemaTres, the open source thesaurus management software used in the project, and as a core body of work which can be expanded further into a complete thesaurus.

About the project partners

Digital Repository of Ireland

The Digital Repository of Ireland (DRI) is an interactive, trusted digital repository for social and cultural content held by Irish institutions. By providing a central internet access point and interactive multimedia tools, DRI facilitates engagement with contemporary and historical data, allowing the public, students, and scholars to research Ireland’s cultural heritage and social life in ways never before possible. As a national digital infrastructure, DRI is working with a wide range of institutional stakeholders to link together and preserve Ireland’s rich and varied humanities and social science data.

DRI was launched in 2011 with funding from the Irish Government's PRTLI cycle 5. The Royal Irish Academy is the lead institution in the DRI consortium, which is also composed of the following partners: National University of Ireland Maynooth (NUIM), Trinity College Dublin (TCD), Dublin Institute of Technology (DIT), National University of Ireland Galway (NUIG), and National College of Art and Design (NCAD). The DRI Research Consortium are currently collaborating with a network of cultural, social, academic and industry partners, including the National Library of Ireland (NLI), the National Archives of Ireland (NAI) and RTÉ.

National Library of Ireland

The mission of the National Library of Ireland (NLI) is to collect, preserve, promote and make accessible the documentary and intellectual record of the life of Ireland and to contribute to the provision of access to the larger universe of recorded knowledge. The National Library offers an exciting programme of exhibitions, events and learning opportunities for people of all ages and interests. The NLI is driving a forward-looking programme of digitisation, digital preservation, and innovative access, visualisation and engagement tools for Ireland’s rich cultural collections. For example, the NLI is one of the main contributors to VuFind, an open-source discovery interface which is used by hundreds of libraries around the world to enhance access to research materials. The NLI also has considerable experience in converting and enhancing metadata for cultural heritage items, including working with Linked Data resources like Freebase, VIAF and DBpedia.

Thesaurus Key

Meaning                           Abbreviation
Broader Term                      BT
Narrower Term                     NT
Related Term                      RT
Use (a preferred term)            USE
Use For (a non-preferred term)    UF
Broader Term Generic              BTG
Broader Term Instantial           BTI
Broader Term Partitive            BTP
Narrower Term Generic             NTG
Narrower Term Instantial          NTI
Narrower Term Partitive           NTP
Top Term                          TT

Updating and maintenance

As the pilot thesaurus was created as a stand-alone project to demonstrate the application of international standards and best practice in thesaurus construction, it will not be maintained and updated in its present form. A more complete thesaurus would require the establishment of a working group for continuing review into the future.

Export files

This pilot thesaurus acts as an illustrative example of the principles outlined in the guidelines. It is not intended as a fully functional thesaurus. It is possible to export the pilot records but these exports should be used for informational purposes only and the URIs within should not be considered stable.