-
Pal, S.; Mitra, M.; Kamps, J.: Evaluation effort, reliability and reusability in XML retrieval (2011)
0.08
0.077799216 = product of:
0.31119686 = sum of:
0.31119686 = weight(_text_:judge in 197) [ClassicSimilarity], result of:
0.31119686 = score(doc=197,freq=4.0), product of:
0.5152282 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.06664293 = queryNorm
0.6039981 = fieldWeight in 197, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=197)
0.25 = coord(1/4)
- Abstract
- The Initiative for the Evaluation of XML retrieval (INEX) provides a TREC-like platform for evaluating content-oriented XML retrieval systems. Since 2007, INEX has been using a set of precision-recall based metrics for its ad hoc tasks. The authors investigate the reliability and robustness of these focused retrieval measures, and of the INEX pooling method. They explore four specific questions: How reliable are the metrics when assessments are incomplete, or when query sets are small? What is the minimum pool/query-set size that can be used to reliably evaluate systems? Can the INEX collections be used to fairly evaluate "new" systems that did not participate in the pooling process? And, for a fixed amount of assessment effort, would this effort be better spent in thoroughly judging a few queries, or in judging many queries relatively superficially? The authors' findings validate properties of precision-recall-based metrics observed in document retrieval settings. Early precision measures are found to be more error-prone and less stable under incomplete judgments and small topic-set sizes. They also find that system rankings remain largely unaffected even when assessment effort is substantially (but systematically) reduced, and confirm that the INEX collections remain usable when evaluating nonparticipating systems. Finally, they observe that for a fixed amount of effort, judging shallow pools for many queries is better than judging deep pools for a smaller set of queries. However, when judging only a random sample of a pool, it is better to completely judge fewer topics than to partially judge many topics. This result confirms the effectiveness of pooling methods.
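The indented numeric blocks throughout this listing are Lucene "explain" trees for its classic TF-IDF similarity. As a sketch of how the leaf values combine (assuming Lucene's ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs/(docFreq+1)); the function name is ours):

```python
import math

def explain_score(freq, doc_freq, max_docs, field_norm, query_norm, coord):
    """Recombine the leaf values of a ClassicSimilarity explain tree.
    Parameter names mirror the labels in the explain output above."""
    tf = math.sqrt(freq)                             # tf(freq=4.0) = 2.0
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # idf(docFreq=52, maxDocs=44421)
    query_weight = idf * query_norm                  # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm             # fieldWeight = tf * idf * fieldNorm
    return coord * query_weight * field_weight       # coord(1/4) scales the summed weight
```

Plugging in the leaf values of the record above, explain_score(4.0, 52, 44421, 0.0390625, 0.06664293, 0.25) reproduces its 0.0778 score; the coord(1/4) factor signals that only one of four query terms matched the document.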
-
Voorbij, H.: ¬Een goede titel behoeft geen trefwoord, of toch wel? : een vergelijkend onderzoek titelwoorden - trefwoorden (1997)
0.07
0.07431666 = product of:
0.29726663 = sum of:
0.29726663 = weight(_text_:headings in 2446) [ClassicSimilarity], result of:
0.29726663 = score(doc=2446,freq=12.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.91925365 = fieldWeight in 2446, product of:
3.4641016 = tf(freq=12.0), with freq of:
12.0 = termFreq=12.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.0546875 = fieldNorm(doc=2446)
0.25 = coord(1/4)
- Abstract
- A recent survey at the Royal Library in the Netherlands showed that subject headings are more efficient than title keywords for retrieval purposes. 475 Dutch publications were selected at random and assigned subject headings. The study showed that subject headings provided additional useful information in 56% of titles. Subsequent searching of the library's online catalogue showed that 88% of titles were retrieved via subject headings against 57% through title keywords. Further precision may be achieved with the help of indexing staff, but at considerable cost
- Footnote
- Translation of the title: A good title has no need of subject headings, or does it?: a comparative study of title keywords against subject headings
-
Maglaughlin, K.L.; Sonnenwald, D.H.: User perspectives on relevance criteria : a comparison among relevant, partially relevant, and not-relevant judgements (2002)
0.07
0.06601483 = product of:
0.2640593 = sum of:
0.2640593 = weight(_text_:judge in 201) [ClassicSimilarity], result of:
0.2640593 = score(doc=201,freq=2.0), product of:
0.5152282 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.06664293 = queryNorm
0.5125094 = fieldWeight in 201, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.046875 = fieldNorm(doc=201)
0.25 = coord(1/4)
- Abstract
- In this issue Maglaughlin and Sonnenwald provided 12 graduate students with searches related to each student's work and asked them to judge the twenty most recent retrieved representations by highlighting passages thought to contribute to relevance, marking out passages detracting from relevance, and providing a relevant, partially relevant, or not-relevant judgement on each. In recorded interviews they were asked how these decisions were made and to describe the three classes of judgement. Since the union of criteria identified in past studies did not seem to fully capture the information supplied, a new set was produced, and coding agreement was found to be adequate. Twenty-nine criteria were identified and grouped into six categories based upon the focus of the criterion. Multiple criteria are used for most judgements, and most criteria may have either a positive or a negative effect. Content was the most frequently mentioned criterion.
-
Voorbij, H.: Titelwoorden - trefwoorden : een vergelijkend onderzoek (1997)
0.06
0.060679298 = product of:
0.24271719 = sum of:
0.24271719 = weight(_text_:headings in 4175) [ClassicSimilarity], result of:
0.24271719 = score(doc=4175,freq=2.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.7505675 = fieldWeight in 4175, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.109375 = fieldNorm(doc=4175)
0.25 = coord(1/4)
- Footnote
- Translation of the title: Title words - subject headings: a comparative study
-
Schabas, A.H.: ¬A comparative evaluation of the retrieval effectiveness of titles, Library of Congress Subject Headings and PRECIS strings for computer searching of UK MARC data (1979)
0.05
0.052010827 = product of:
0.2080433 = sum of:
0.2080433 = weight(_text_:headings in 5276) [ClassicSimilarity], result of:
0.2080433 = score(doc=5276,freq=2.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.64334357 = fieldWeight in 5276, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.09375 = fieldNorm(doc=5276)
0.25 = coord(1/4)
-
Drabenstott, K.M.; Vizine-Goetz, D.: Using subject headings for online retrieval : theory, practice and potential (1994)
0.05
0.0450427 = product of:
0.1801708 = sum of:
0.1801708 = weight(_text_:headings in 386) [ClassicSimilarity], result of:
0.1801708 = score(doc=386,freq=6.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.5571519 = fieldWeight in 386, product of:
2.4494898 = tf(freq=6.0), with freq of:
6.0 = termFreq=6.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.046875 = fieldNorm(doc=386)
0.25 = coord(1/4)
- Abstract
- Using Subject Headings for Online Retrieval is an indispensable tool for online system designers who are developing new systems or refining existing ones. The book describes subject analysis and subject searching in online catalogs, including the limitations of retrieval, and demonstrates how such limitations can be overcome through system design and programming. The book describes the Library of Congress Subject Headings system and its characteristics, shows how information is stored in machine-readable files, and offers examples of and recommendations for successful methods. Tables are included to support these recommendations, and diagrams, graphs, and bar charts are used to present the results of data analyses.
-
Byrne, J.R.: Relative effectiveness of titles, abstracts, and subject headings for machine retrieval from the COMPENDEX services (1975)
0.04
0.042906743 = product of:
0.17162697 = sum of:
0.17162697 = weight(_text_:headings in 1603) [ClassicSimilarity], result of:
0.17162697 = score(doc=1603,freq=4.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.5307314 = fieldWeight in 1603, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.0546875 = fieldNorm(doc=1603)
0.25 = coord(1/4)
- Abstract
- We have investigated the relative merits of searching on titles, subject headings, abstracts, free-language terms, and combinations of these elements. The COMPENDEX data base was used for this study since it combined all of these data elements of interest. In general, the results obtained from the experiments indicate that, as expected, titles alone are not satisfactory for efficient retrieval. The combination of titles and abstracts came the closest to 100% retrieval, with searching of abstracts alone doing almost as well. Indexer input, although necessary for 100% retrieval in almost all cases, was found to be relatively unimportant
-
Brown, M.E.: By any other name : accounting for failure in the naming of subject categories (1995)
0.03
0.030339649 = product of:
0.121358596 = sum of:
0.121358596 = weight(_text_:headings in 5666) [ClassicSimilarity], result of:
0.121358596 = score(doc=5666,freq=2.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.37528375 = fieldWeight in 5666, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.0546875 = fieldNorm(doc=5666)
0.25 = coord(1/4)
- Abstract
- Research shows that 65-80% of subject search terms fail to match the appropriate subject heading and that one third to one half of subject searches result in no references being retrieved. Examines the subject search terms generated by 82 school and college students in Princeton, NJ, evaluates the match between the named terms and the expected subject headings, and proposes an explanation for match failures in relation to 3 invariant properties common to all search terms: concreteness, complexity, and syndeticity. Suggests that match failure is a consequence of developmental naming patterns and that these patterns can be overcome through the use of metacognitive naming skills
-
Schultz Jr., W.N.; Braddy, L.: ¬A librarian-centered study of perceptions of subject terms and controlled vocabulary (2017)
0.03
0.030339649 = product of:
0.121358596 = sum of:
0.121358596 = weight(_text_:headings in 156) [ClassicSimilarity], result of:
0.121358596 = score(doc=156,freq=2.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.37528375 = fieldWeight in 156, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.0546875 = fieldNorm(doc=156)
0.25 = coord(1/4)
- Abstract
- Controlled vocabulary and subject headings in OPAC records have proven to be useful in improving search results. The authors used a survey to gather information about librarian opinions and professional use of controlled vocabulary. Data from a range of backgrounds and expertise were examined, including academic and public libraries, and technical services as well as public services professionals. Responses overall demonstrated positive opinions of the value of controlled vocabulary, including in reference interactions as well as during bibliographic instruction sessions. Results are also examined based upon factors such as age and type of librarian.
-
Tibbo, H.R.: ¬The epic struggle : subject retrieval from large bibliographic databases (1994)
0.03
0.026005413 = product of:
0.10402165 = sum of:
0.10402165 = weight(_text_:headings in 2247) [ClassicSimilarity], result of:
0.10402165 = score(doc=2247,freq=2.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.32167178 = fieldWeight in 2247, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.046875 = fieldNorm(doc=2247)
0.25 = coord(1/4)
- Abstract
- Discusses a retrieval study that focused on collection-level archival records in the OCLC OLUC, made accessible through the EPIC online search system. Data were also collected from the local OPAC at the University of North Carolina at Chapel Hill (UNC-CH), into which UNC-CH-produced OCLC records are loaded. The chief objective was to explore the retrieval environments in which a random sample of USMARC AMC records produced at UNC-CH were found: specifically, to obtain a picture of the density of these databases with regard to each subject heading applied and, more generally, for each record. Key questions were: how many records would be retrieved for each subject heading attached to each of the records, and what was the nature of these subject headings vis-à-vis the number of hits associated with them? Results show that large retrieval sets are a potential problem with national bibliographic utilities and that local and national retrieval environments can vary greatly. The need for specificity in indexing is emphasized
-
McJunkin, M.C.: Precision and recall in title keyword searching (1995)
0.03
0.026005413 = product of:
0.10402165 = sum of:
0.10402165 = weight(_text_:headings in 3419) [ClassicSimilarity], result of:
0.10402165 = score(doc=3419,freq=2.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.32167178 = fieldWeight in 3419, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.046875 = fieldNorm(doc=3419)
0.25 = coord(1/4)
- Abstract
- Investigates the extent to which title keywords convey subject content and compares the relative effectiveness of searching title keywords using 2 search strategies to examine whether adjacency operators in title keyword searches are effective in improving recall and precision of online searching. Title keywords from a random sample of titles in the field of economics were searched on FirstSearch, using the WorldCat database, which is equivalent in coverage to the OCLC OLUC, with and without adjacency of the keywords specified. The LCSH of the items retrieved were compared with the sample title subject headings to determine the degree of match or relevance and the values for precision and recall were calculated. Results indicated that, when keywords were discipline specific, adjacency operators improved precision with little degradation of recall. Systems that allow positional operators or rank output by proximity of terms may increase search success
-
Abdou, S.; Savoy, J.: Searching in Medline : query expansion and manual indexing evaluation (2008)
0.03
0.026005413 = product of:
0.10402165 = sum of:
0.10402165 = weight(_text_:headings in 3062) [ClassicSimilarity], result of:
0.10402165 = score(doc=3062,freq=2.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.32167178 = fieldWeight in 3062, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.046875 = fieldNorm(doc=3062)
0.25 = coord(1/4)
- Abstract
- Based on a relatively large subset representing one third of the Medline collection, this paper evaluates ten different IR models, including recent developments in both probabilistic and language models. We show that the best performing IR model is a probabilistic model developed within the Divergence from Randomness framework [Amati, G., & van Rijsbergen, C.J. (2002). Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM Transactions on Information Systems 20(4), 357-389], which results in a 170% enhancement in mean average precision when compared to the classical tf-idf vector-space model. This paper also reports on our evaluations of the impact that manually assigned descriptors (MeSH, or Medical Subject Headings) have on retrieval effectiveness, showing that by including these terms retrieval performance can improve by 2.4% to 13.5%, depending on the underlying IR model. Finally, we design a new general blind-query expansion approach showing improved retrieval performance compared to that obtained using the Rocchio approach.
-
Hider, P.: ¬The search value added by professional indexing to a bibliographic database (2017)
0.02
0.021671178 = product of:
0.08668471 = sum of:
0.08668471 = weight(_text_:headings in 4868) [ClassicSimilarity], result of:
0.08668471 = score(doc=4868,freq=2.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.26805982 = fieldWeight in 4868, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.0390625 = fieldNorm(doc=4868)
0.25 = coord(1/4)
- Abstract
- Gross et al. (2015) have demonstrated that about a quarter of hits would typically be lost to keyword searchers if contemporary academic library catalogs dropped their controlled subject headings. This paper reports on an analysis of the loss levels that would result if a bibliographic database, namely the Australian Education Index (AEI), were missing the subject descriptors and identifiers assigned by its professional indexers, employing the methodology developed by Gross and Taylor (2005), and later by Gross et al. (2015). The results indicate that AEI users would lose a similar proportion of hits per query to that experienced by library catalog users: on average, 27% of the resources found by a sample of keyword queries on the AEI database would not have been found without the subject indexing, based on the Australian Thesaurus of Education Descriptors (ATED). The paper also discusses the methodological limitations of these studies, pointing out that real-life users might still find some of the resources missed by a particular query through follow-up searches, while additional resources might also be found through iterative searching on the subject vocabulary. The paper goes on to describe a new research design, based on a before-and-after experiment, which addresses some of these limitations. It is argued that this alternative design will provide a more realistic picture of the value that professionally assigned subject indexing and controlled subject vocabularies can add to literature searching of a more scholarly and thorough kind.
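The loss measure behind these percentages (per the Gross and Taylor style of analysis described above) can be sketched in a few lines; the function name and result-set IDs below are illustrative, not from the study:

```python
def proportion_lost(hits_with_indexing, hits_without_indexing):
    """Fraction of a query's hits that disappear when controlled subject
    terms are removed from the records (hypothetical result-set IDs)."""
    lost = hits_with_indexing - hits_without_indexing  # found only via subject terms
    return len(lost) / len(hits_with_indexing)
```

A query whose hits shrink from 8 to 6 once the subject descriptors are stripped has lost 25% of its results, the order of magnitude reported for both the AEI and the catalog studies.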
-
Hider, P.: ¬The search value added by professional indexing to a bibliographic database (2018)
0.02
0.021671178 = product of:
0.08668471 = sum of:
0.08668471 = weight(_text_:headings in 300) [ClassicSimilarity], result of:
0.08668471 = score(doc=300,freq=2.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.26805982 = fieldWeight in 300, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.0390625 = fieldNorm(doc=300)
0.25 = coord(1/4)
- Abstract
- Gross et al. (2015) have demonstrated that about a quarter of hits would typically be lost to keyword searchers if contemporary academic library catalogs dropped their controlled subject headings. This article reports on an investigation of the search value that subject descriptors and identifiers assigned by professional indexers add to a bibliographic database, namely the Australian Education Index (AEI). First, a similar methodology to that developed by Gross et al. (2015) was applied, with keyword searches representing a range of educational topics run on the AEI database with and without its subject indexing. The results indicated that AEI users would also lose, on average, about a quarter of hits per query. Second, an alternative research design was applied in which an experienced literature searcher was asked to find resources on a set of educational topics on an AEI database stripped of its subject indexing and then asked to search for additional resources on the same topics after the subject indexing had been reinserted. In this study, the proportion of additional resources that would have been lost had it not been for the subject indexing was again found to be about a quarter of the total resources found for each topic, on average.
-
Mandl, T.: Neue Entwicklungen bei den Evaluierungsinitiativen im Information Retrieval (2006)
0.02
0.019165568 = product of:
0.07666227 = sum of:
0.07666227 = weight(_text_:und in 975) [ClassicSimilarity], result of:
0.07666227 = score(doc=975,freq=14.0), product of:
0.1478073 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06664293 = queryNorm
0.51866364 = fieldWeight in 975, product of:
3.7416575 = tf(freq=14.0), with freq of:
14.0 = termFreq=14.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.0625 = fieldNorm(doc=975)
0.25 = coord(1/4)
- Abstract
- In information retrieval, evaluation initiatives contribute substantially to empirically grounded research. With extensive collections and tasks they support standardization and thus system development. Growing demands regarding corpora and application scenarios have led to a strong diversification among the evaluation initiatives. This article gives an overview of the current state of the most important evaluation initiatives and of new trends.
- Source
- Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
-
Lohmann, H.: Verbesserung der Literatursuche durch Dokumentanreicherung und automatische Inhaltserschließung : Das Projekt 'KASCADE' an der Universitäts- und Landesbibliothek Düsseldorf (1999)
0.02
0.018820217 = product of:
0.07528087 = sum of:
0.07528087 = weight(_text_:und in 2221) [ClassicSimilarity], result of:
0.07528087 = score(doc=2221,freq=6.0), product of:
0.1478073 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06664293 = queryNorm
0.50931764 = fieldWeight in 2221, product of:
2.4494898 = tf(freq=6.0), with freq of:
6.0 = termFreq=6.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.09375 = fieldNorm(doc=2221)
0.25 = coord(1/4)
- Imprint
- Köln : Fachhochschule, Fachbereich Bibliotheks- und Informationswesen
-
Mandl, T.: Web- und Multimedia-Dokumente : Neuere Entwicklungen bei der Evaluierung von Information Retrieval Systemen (2003)
0.02
0.017743869 = product of:
0.070975475 = sum of:
0.070975475 = weight(_text_:und in 2734) [ClassicSimilarity], result of:
0.070975475 = score(doc=2734,freq=12.0), product of:
0.1478073 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06664293 = queryNorm
0.48018923 = fieldWeight in 2734, product of:
3.4641016 = tf(freq=12.0), with freq of:
12.0 = termFreq=12.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.0625 = fieldNorm(doc=2734)
0.25 = coord(1/4)
- Abstract
- The amount of data on the Internet continues to grow rapidly. With it grows the demand for high-quality information retrieval services for orientation and problem-oriented searching. Deciding whether to use or procure information retrieval software requires meaningful evaluation results. This contribution presents recent developments in the evaluation of information retrieval systems and shows the trend towards specialization and diversification of evaluation studies, which increase the realism of the results. The focus is on the retrieval of specialist texts, Internet pages and multimedia objects.
- Source
- Information - Wissenschaft und Praxis. 54(2003) H.4, S.203-210
-
Kluck, M.; Winter, M.: Topic-Entwicklung und Relevanzbewertung bei GIRT : ein Werkstattbericht (2006)
0.02
0.017743869 = product of:
0.070975475 = sum of:
0.070975475 = weight(_text_:und in 967) [ClassicSimilarity], result of:
0.070975475 = score(doc=967,freq=12.0), product of:
0.1478073 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06664293 = queryNorm
0.48018923 = fieldWeight in 967, product of:
3.4641016 = tf(freq=12.0), with freq of:
12.0 = termFreq=12.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.0625 = fieldNorm(doc=967)
0.25 = coord(1/4)
- Abstract
- The relationship between topic development and relevance assessment is discussed on the basis of several case studies from the CLEF evaluation campaign 2005. In the domain-specific retrieval test for multilingual systems, the topics were developed against the GIRT document collection. The interplay between topic formulation and the room for interpretation in relevance assessment is examined.
- Source
- Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
-
Cleverdon, C.W.; Mills, J.: ¬The testing of index language devices (1985)
0.02
0.017336942 = product of:
0.06934777 = sum of:
0.06934777 = weight(_text_:headings in 4643) [ClassicSimilarity], result of:
0.06934777 = score(doc=4643,freq=2.0), product of:
0.32337824 = queryWeight, product of:
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.06664293 = queryNorm
0.21444786 = fieldWeight in 4643, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.8524013 = idf(docFreq=942, maxDocs=44421)
0.03125 = fieldNorm(doc=4643)
0.25 = coord(1/4)
- Abstract
- A landmark event in the twentieth-century development of subject analysis theory was a retrieval experiment, begun in 1957, by Cyril Cleverdon, Librarian of the Cranfield Institute of Technology. For this work he received the Professional Award of the Special Libraries Association in 1962 and the Award of Merit of the American Society for Information Science in 1970. The objective of the experiment, called Cranfield I, was to test the ability of four indexing systems (UDC, Facet, Uniterm, and Alphabetic Subject Headings) to retrieve material responsive to questions addressed to a collection of documents. The experiment was ambitious in scale, consisting of eighteen thousand documents and twelve hundred questions. Prior to Cranfield I, the question of what constitutes good indexing was approached subjectively, and reference was made to assumptions in the form of principles that should be observed or user needs that should be met. Cranfield I was the first large-scale effort to use objective criteria for determining the parameters of good indexing. Its creative impetus was the definition of user satisfaction in terms of precision and recall. Out of the experiment emerged the definition of recall as the percentage of relevant documents retrieved and precision as the percentage of retrieved documents that were relevant. Operationalizing the concept of user satisfaction, that is, making it measurable, meant that it could be studied empirically and manipulated as a variable in mathematical equations. Much has been made of the fact that the experimental methodology of Cranfield I was seriously flawed. This is unfortunate, as it tends to diminish Cleverdon's contribution, which was not methodological (such contributions can be left to benchmark researchers) but rather creative: the introduction of a new paradigm, one that proved to be eminently productive.
The criticism leveled at the methodological shortcomings of Cranfield I underscored the need for more precise definitions of the variables involved in information retrieval. Particularly important was the need for a definition of the dependent variable index language. Like the definitions of precision and recall, that of index language provided a new way of looking at the indexing process. It was a re-visioning that stimulated research activity and led not only to a better understanding of indexing but also to the design of better retrieval systems. Cranfield I was followed by Cranfield II. While Cranfield I was a wholesale comparison of four indexing "systems," Cranfield II aimed to single out various individual factors in index languages, called "indexing devices," and to measure how variations in these affected retrieval performance. The following selection represents the thinking at Cranfield midway between these two notable retrieval experiments.
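The definitions of recall and precision that emerged from Cranfield I map directly onto set arithmetic; a minimal sketch (document IDs are illustrative):

```python
def precision_recall(retrieved, relevant):
    """Cranfield-style measures: precision = share of retrieved documents
    that are relevant; recall = share of relevant documents retrieved."""
    hits = retrieved & relevant
    return len(hits) / len(retrieved), len(hits) / len(relevant)
```

Retrieving four documents of which two are among five relevant ones, for example, gives precision 0.5 and recall 0.4.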
-
Wolff, C.: Leistungsvergleich der Retrievaloberflächen zwischen Web und klassischen Expertensystemen (2001)
0.02
0.016769873 = product of:
0.06707949 = sum of:
0.06707949 = weight(_text_:und in 6870) [ClassicSimilarity], result of:
0.06707949 = score(doc=6870,freq=14.0), product of:
0.1478073 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06664293 = queryNorm
0.4538307 = fieldWeight in 6870, product of:
3.7416575 = tf(freq=14.0), with freq of:
14.0 = termFreq=14.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.0546875 = fieldNorm(doc=6870)
0.25 = coord(1/4)
- Abstract
- Most hosts' web interfaces have so far been designed for the retrieval layperson, the underlying goal being more usage through simpler retrieval. This approach, however, conflicts with the growing volume of data and document size, which actually call for ever more sophisticated retrieval. Information professionals frequently voice the criticism that the web applications bring a loss of relevance. How far the user actually has to accept a compromise between relevance and completeness is quantified in this contribution using various host computers.
- Series
- Tagungen der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis; 4
- Source
- Information Research & Content Management: Orientierung, Ordnung und Organisation im Wissensmarkt; 23. DGI-Online-Tagung der DGI und 53. Jahrestagung der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V. DGI, Frankfurt am Main, 8.-10.5.2001. Proceedings. Hrsg.: R. Schmidt