Search (2 results, page 1 of 1)

  • author_ss:"Willis, C."
  1. White, H.; Willis, C.; Greenberg, J.: HIVEing : the effect of a semantic web technology on inter-indexer consistency (2014) 0.06
    0.055012353 = product of:
      0.22004941 = sum of:
        0.22004941 = weight(_text_:judge in 2781) [ClassicSimilarity], result of:
          0.22004941 = score(doc=2781,freq=2.0), product of:
            0.5152282 = queryWeight, product of:
              7.731176 = idf(docFreq=52, maxDocs=44421)
              0.06664293 = queryNorm
            0.42709115 = fieldWeight in 2781, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.731176 = idf(docFreq=52, maxDocs=44421)
              0.0390625 = fieldNorm(doc=2781)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - The purpose of this paper is to examine the effect of the Helping Interdisciplinary Vocabulary Engineering (HIVE) system on the inter-indexer consistency of information professionals when assigning keywords to a scientific abstract. This study examined first, the inter-indexer consistency of potential HIVE users; second, the impact HIVE had on consistency; and third, challenges associated with using HIVE. Design/methodology/approach - A within-subjects quasi-experimental research design was used for this study. Data were collected using a task-scenario-based questionnaire. Analysis was performed on consistency results using Hooper's and Rolling's inter-indexer consistency measures. A series of t-tests was used to judge the significance of differences between consistency measure results. Findings - Results suggest that HIVE improves inter-indexer consistency. Working with HIVE increased consistency rates by 22 percent (Rolling's) and 25 percent (Hooper's) when selecting relevant terms from all vocabularies. A statistically significant difference exists between the assignment of free-text keywords and machine-aided keywords. Issues with homographs, disambiguation, vocabulary choice, and document structure were all identified as potential challenges. Research limitations/implications - Research limitations for this study can be found in the small number of vocabularies used for the study. Future research will include implementing HIVE into the Dryad Repository and studying its application in a repository system. Originality/value - This paper showcases several features used in the HIVE system. By using traditional consistency measures to evaluate a semantic web technology, this paper emphasizes the link between traditional indexing and next-generation machine-aided indexing (MAI) tools.
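    The two consistency measures named in this abstract can be sketched as follows. This is a minimal illustration using the commonly cited definitions (Hooper's: terms in common divided by all distinct terms used by either indexer; Rolling's: twice the terms in common divided by the total number of terms both indexers assigned). The keyword sets are invented for illustration and are not data from the study, and returning 0.0 when neither indexer assigned a term is an assumed convention.

```python
def hooper(terms_a, terms_b):
    """Hooper's measure: A / (A + M + N), where A is the number of terms both
    indexers assigned and M, N are the terms only one of them assigned."""
    a, b = set(terms_a), set(terms_b)
    distinct = len(a | b)
    # Assumed convention: two empty term sets yield 0.0 rather than an error.
    return len(a & b) / distinct if distinct else 0.0


def rolling(terms_a, terms_b):
    """Rolling's measure: 2A / (B + C), where B and C are the numbers of
    terms assigned by each indexer."""
    a, b = set(terms_a), set(terms_b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


# Hypothetical keyword assignments by two indexers for the same abstract.
indexer_1 = {"thesauri", "indexing", "semantic web", "metadata"}
indexer_2 = {"indexing", "semantic web", "ontologies"}

print(f"Hooper:  {hooper(indexer_1, indexer_2):.3f}")
print(f"Rolling: {rolling(indexer_1, indexer_2):.3f}")
```

    For the same pair of term sets, Rolling's measure is never lower than Hooper's, so rates reported under the two measures are not directly comparable.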
  2. Willis, C.; Losee, R.M.: A random walk on an ontology : using thesaurus structure for automatic subject indexing (2013) 0.04
    0.04191779 = product of:
      0.08383558 = sum of:
        0.014487808 = weight(_text_:und in 2016) [ClassicSimilarity], result of:
          0.014487808 = score(doc=2016,freq=2.0), product of:
            0.1478073 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.06664293 = queryNorm
            0.098018214 = fieldWeight in 2016, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.03125 = fieldNorm(doc=2016)
        0.06934777 = weight(_text_:headings in 2016) [ClassicSimilarity], result of:
          0.06934777 = score(doc=2016,freq=2.0), product of:
            0.32337824 = queryWeight, product of:
              4.8524013 = idf(docFreq=942, maxDocs=44421)
              0.06664293 = queryNorm
            0.21444786 = fieldWeight in 2016, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.8524013 = idf(docFreq=942, maxDocs=44421)
              0.03125 = fieldNorm(doc=2016)
      0.5 = coord(2/4)
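    The score trees in this listing follow the Lucene ClassicSimilarity (TF-IDF) formulas, so they can be recomputed from the printed inputs. The sketch below is a plain-Python reconstruction rather than a call into Lucene; the function names are illustrative, and the numeric inputs (freq, docFreq, maxDocs, queryNorm, fieldNorm, coord) are copied from the two explain trees.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    # ClassicSimilarity term frequency: square root of the raw frequency
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain output
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


QUERY_NORM = 0.06664293
MAX_DOCS = 44421

# Result 1 (doc 2781): one of four query clauses matched, so coord = 1/4.
judge = term_score(2.0, 52, MAX_DOCS, QUERY_NORM, 0.0390625)
score_2781 = judge * (1 / 4)

# Result 2 (doc 2016): two of four query clauses matched, so coord = 2/4.
und = term_score(2.0, 13141, MAX_DOCS, QUERY_NORM, 0.03125)
headings = term_score(2.0, 942, MAX_DOCS, QUERY_NORM, 0.03125)
score_2016 = (und + headings) * (2 / 4)

print(f"doc 2781: {score_2781:.6f}")
print(f"doc 2016: {score_2016:.6f}")
```

    Lucene computes these values in 32-bit floats, so the recomputed scores can differ from the listed ones in the last digits.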
    
    Abstract
    Relationships between terms and features are an essential component of thesauri, ontologies, and a range of controlled vocabularies. In this article, we describe ways to identify important concepts in documents using the relationships in a thesaurus or other vocabulary structures. We introduce a methodology for the analysis and modeling of the indexing process based on a weighted random walk algorithm. The primary goal of this research is the analysis of the contribution of thesaurus structure to the indexing process. The resulting models are evaluated in the context of automatic subject indexing using four collections of documents pre-indexed with four different thesauri (AGROVOC [UN Food and Agriculture Organization], high-energy physics taxonomy [HEP], National Agricultural Library Thesaurus [NALT], and Medical Subject Headings [MeSH]). We also introduce a thesaurus-centric matching algorithm intended to improve the quality of candidate concepts. In all cases, the weighted random walk improves automatic indexing performance over matching alone with an increase in average precision (AP) of 9% for HEP, 11% for MeSH, 35% for NALT, and 37% for AGROVOC. The results of the analysis support our hypothesis that subject indexing is in part a browsing process, and that using the vocabulary and its structure in a thesaurus contributes to the indexing process. The amount that the vocabulary structure contributes was found to differ among the four thesauri, possibly due to the vocabulary used in the corresponding thesauri and the structural relationships between the terms. Each of the thesauri and the manual indexing associated with it is characterized using the methods developed here.
    Theme
    Concept and application of the thesaurus principle (Konzeption und Anwendung des Prinzips Thesaurus)
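    The abstract of result 2 does not spell out its weighted random walk, so the following is only a generic sketch of the idea: a random walk with restart over a weighted thesaurus graph, seeded with the concepts matched in a document. It should not be read as the authors' algorithm. The graph fragment, edge weights, seed concepts, and the restart and iteration parameters are all hypothetical.

```python
def random_walk_scores(graph, seeds, restart=0.15, iterations=50):
    """Random walk with restart on a weighted graph.

    graph: dict mapping concept -> {neighbour: edge weight}
    seeds: dict mapping matched concept -> initial weight
    Returns a dict of concept -> visit probability (sums to 1).
    """
    nodes = set(graph) | {n for nbrs in graph.values() for n in nbrs} | set(seeds)
    seed_total = sum(seeds.values())
    start = {n: seeds.get(n, 0.0) / seed_total for n in nodes}
    scores = dict(start)
    for _ in range(iterations):
        # With probability `restart` the walker jumps back to a seed concept.
        nxt = {n: restart * start[n] for n in nodes}
        for node in nodes:
            nbrs = graph.get(node, {})
            out_weight = sum(nbrs.values())
            mass = (1.0 - restart) * scores[node]
            if out_weight == 0:
                # Concept without outgoing links: send its mass back to the seeds.
                for n in nodes:
                    nxt[n] += mass * start[n]
                continue
            # Otherwise follow a link, chosen in proportion to its weight.
            for nbr, weight in nbrs.items():
                nxt[nbr] += mass * weight / out_weight
        scores = nxt
    return scores


# Hypothetical thesaurus fragment (broader/narrower/related links as weighted edges).
thesaurus = {
    "agriculture": {"crops": 1.0},
    "crops": {"agriculture": 1.0, "cereals": 1.0},
    "cereals": {"crops": 1.0, "maize": 1.0, "wheat": 1.0},
    "maize": {"cereals": 1.0},
    "wheat": {"cereals": 1.0},
}
# Concepts assumed to have been matched in a document's text.
matched = {"maize": 1.0, "wheat": 1.0}

ranking = random_walk_scores(thesaurus, matched)
for concept, score in sorted(ranking.items(), key=lambda kv: -kv[1]):
    print(f"{concept:12s} {score:.3f}")
```

    In this toy graph the walk gives "cereals" more weight than "agriculture" even though neither was matched directly, which illustrates how vocabulary structure can surface candidate concepts beyond matching alone.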