-
Evans, J.E.: Some external and internal factors affecting users of interactive information systems (1996)
0.05
0.05479776 = product of:
  0.10959552 = sum of:
    0.022471312 = weight(_text_:und in 6330) [ClassicSimilarity], result of:
      0.022471312 = score(doc=6330,freq=2.0), product of:
        0.15283768 = queryWeight, product of:
          2.217899 = idf(docFreq=13141, maxDocs=44421)
          0.068911016 = queryNorm
        0.14702731 = fieldWeight in 6330, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.217899 = idf(docFreq=13141, maxDocs=44421)
          0.046875 = fieldNorm(doc=6330)
    0.087124206 = weight(_text_:human in 6330) [ClassicSimilarity], result of:
      0.087124206 = score(doc=6330,freq=2.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.2895031 = fieldWeight in 6330, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.046875 = fieldNorm(doc=6330)
  0.5 = coord(2/4)
- Abstract
- This contribution reports the results of continuing research in human-information system interactions. Following training and experience with an electronic information retrieval system, novice and experienced subject groups responded to questions ranking their value assessments of 7 attributes of information sources in relation to 15 factors describing the search process. In general, novice users were more heavily influenced by the process factors (negative influences) than by the positive attributes of information qualities. Experienced users, while still concerned with process factors, were more strongly influenced by the qualitative information attributes. The specific advantages and contributions of this research are several: higher dimensionality of measured factors and attributes (15 x 7); higher granularity of analysis using a 7-value metric in a closed-end Likert scale; development of bi-directional, forced-choice influence vectors; and a larger sample size (N=186) than previously reported in the literature
- Source
- Herausforderungen an die Informationswirtschaft: Informationsverdichtung, Informationsbewertung und Datenvisualisierung. Proceedings des 5. Internationalen Symposiums für Informationswissenschaft (ISI'96), Humboldt-Universität zu Berlin, 17.-19. Oktober 1996. Hrsg.: J. Krause u.a
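Every score explanation in this result list instantiates Lucene's ClassicSimilarity formula: tf = sqrt(freq), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the displayed score is the sum of the matching term scores times the coordination factor coord(matching terms / query terms). As a sanity check, here is a minimal Python sketch (not part of the original result page; it plugs in the idf, queryNorm and fieldNorm values printed in the tree) that reproduces the 0.05479776 score of the Evans record above:

from math import sqrt

def term_score(freq, idf, query_norm, field_norm):
    # One weight(...) clause of a ClassicSimilarity explain tree:
    # score = queryWeight * fieldWeight, where queryWeight = idf * queryNorm
    # and fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    tf = sqrt(freq)  # 1.4142135 for freq=2.0
    return (idf * query_norm) * (tf * idf * field_norm)

# idf, queryNorm and fieldNorm copied from the explain tree for doc 6330;
# the results match the tree's term scores up to float32 rounding
und = term_score(2.0, 2.217899, 0.068911016, 0.046875)     # ~0.022471312
human = term_score(2.0, 4.3671384, 0.068911016, 0.046875)  # ~0.087124206
coord = 2 / 4  # coord(2/4): 2 of the 4 query terms matched this document
print(round((und + human) * coord, 8))                     # -> 0.05479776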
-
Chen, H.; Martinez, J.; Kirchhoff, A.; Ng, T.D.; Schatz, B.R.: Alleviating search uncertainty through concept associations : automatic indexing, co-occurrence analysis, and parallel computing (1998)
0.05
0.05479776 = product of:
  0.10959552 = sum of:
    0.022471312 = weight(_text_:und in 6202) [ClassicSimilarity], result of:
      0.022471312 = score(doc=6202,freq=2.0), product of:
        0.15283768 = queryWeight, product of:
          2.217899 = idf(docFreq=13141, maxDocs=44421)
          0.068911016 = queryNorm
        0.14702731 = fieldWeight in 6202, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.217899 = idf(docFreq=13141, maxDocs=44421)
          0.046875 = fieldNorm(doc=6202)
    0.087124206 = weight(_text_:human in 6202) [ClassicSimilarity], result of:
      0.087124206 = score(doc=6202,freq=2.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.2895031 = fieldWeight in 6202, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.046875 = fieldNorm(doc=6202)
  0.5 = coord(2/4)
- Abstract
- In this article, we report research on an algorithmic approach to alleviating search uncertainty in a large information space. Grounded in object filtering, automatic indexing, and co-occurrence analysis, we performed a large-scale experiment using a parallel supercomputer (SGI Power Challenge) to analyze 400,000+ abstracts in an INSPEC computer engineering collection. Two system-generated thesauri, one based on a combined object filtering and automatic indexing method, and the other based on automatic indexing only, were compared with the human-generated INSPEC subject thesaurus. Our user evaluation revealed that the system-generated thesauri were better than the INSPEC thesaurus in 'concept recall', but in 'concept precision' the 3 thesauri were comparable. Our analysis also revealed that the terms suggested by the 3 thesauri were complementary and could be used to significantly increase 'variety' in search terms and thereby reduce search uncertainty (a toy sketch of the co-occurrence step appears after this entry)
- Theme
- Konzeption und Anwendung des Prinzips Thesaurus
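The co-occurrence step behind the system-generated thesauri, referenced in the abstract above, can be illustrated with a toy Python sketch (hypothetical data and helper names; the actual system additionally applies object filtering, a weighted co-occurrence function, and parallel computation):

from collections import Counter
from itertools import combinations

# Toy stand-in for automatically indexed INSPEC abstracts (illustrative only)
docs = [
    {"parallel", "computing", "supercomputer"},
    {"parallel", "computing", "indexing"},
    {"automatic", "indexing", "thesaurus"},
    {"indexing", "thesaurus", "retrieval"},
]

# Count how often each pair of index terms co-occurs within a document
pair_counts = Counter()
for terms in docs:
    pair_counts.update(combinations(sorted(terms), 2))

def suggest(term, k=3):
    # Thesaurus candidates: the k terms co-occurring most often with `term`
    scored = [(b if a == term else a, n)
              for (a, b), n in pair_counts.items() if term in (a, b)]
    return sorted(scored, key=lambda x: x[1], reverse=True)[:k]

print(suggest("indexing"))  # -> [('thesaurus', 2), ('computing', 1), ('parallel', 1)]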
-
Wilbur, W.J.: Human subjectivity and performance limits in document retrieval (1999)
0.05
0.05082245 = product of:
  0.2032898 = sum of:
    0.2032898 = weight(_text_:human in 5539) [ClassicSimilarity], result of:
      0.2032898 = score(doc=5539,freq=2.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.67550725 = fieldWeight in 5539, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.109375 = fieldNorm(doc=5539)
  0.25 = coord(1/4)
-
Wilbur, W.J.: Human subjectivity and performance limits in document retrieval (1996)
0.05
0.050301187 = product of:
  0.20120475 = sum of:
    0.20120475 = weight(_text_:human in 6675) [ClassicSimilarity], result of:
      0.20120475 = score(doc=6675,freq=6.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.6685788 = fieldWeight in 6675, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.0625 = fieldNorm(doc=6675)
  0.25 = coord(1/4)
- Abstract
- Test sets for the document retrieval task composed of human relevance judgments have been constructed that allow one to compare human performance directly with that of automatic methods and that place absolute limits on performance by any method. Current retrieval systems are found to generate only about half of the information allowed by these absolute limits. The data suggests that most of the improvement that could be achieved consistent with these limits can only be achieved by incorporating specific subject information into retrieval systems
-
Spink, A.; Goodrum, A.: ¬A study of search intermediary working notes : implications for IR system design (1996)
0.04
0.035936903 = product of:
  0.14374761 = sum of:
    0.14374761 = weight(_text_:human in 50) [ClassicSimilarity], result of:
      0.14374761 = score(doc=50,freq=4.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.47765577 = fieldWeight in 50, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.0546875 = fieldNorm(doc=50)
  0.25 = coord(1/4)
- Abstract
- Reports findings from an exploratory study investigating working notes created during encoding and external storage (EES) processes by human search intermediaries using a Boolean information retrieval system. Analysis of 221 sets of working notes created by human search intermediaries revealed extensive use of EES processes and the creation of working notes of textual, numerical and graphical entities. Nearly 70% of recorded working notes were textual/numerical entities, nearly 30% were graphical entities and 0.73% were indiscernible. Segmentation devices were also used in 48% of the working notes. The creation of working notes during the EES processes was a fundamental element within the mediated, interactive information retrieval process. Discusses implications for the design of interfaces to support users' EES processes and further research
-
Dunlop, M.D.; Johnson, C.W.; Reid, J.: Exploring the layers of information retrieval evaluation (1998)
0.04
0.035936903 = product of:
  0.14374761 = sum of:
    0.14374761 = weight(_text_:human in 4762) [ClassicSimilarity], result of:
      0.14374761 = score(doc=4762,freq=4.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.47765577 = fieldWeight in 4762, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.0546875 = fieldNorm(doc=4762)
  0.25 = coord(1/4)
- Abstract
- Presents current work on modelling interactive information retrieval systems and users' interactions with them. Analyzes the papers in this special issue in the context of evaluation in information retrieval (IR) by examining the different layers at which IR use could be evaluated. IR poses the double evaluation problem of evaluating both the underlying system effectiveness and the overall ability of the system to aid users. The papers look at different issues in combining human-computer interaction (HCI) research with IR research and provide insights into the problem of evaluating the information seeking process
- Footnote
- Contribution to a special section of articles related to human-computer interaction and information retrieval
-
Hou, Y.; Pascale, A.; Carnerero-Cano, J.; Sattigeri, P.; Tchrakian, T.; Marinescu, R.; Daly, E.; Padhi, I.: WikiContradict : a benchmark for evaluating LLMs on real-world knowledge conflicts from Wikipedia (2024)
0.03
0.031438243 = product of:
  0.12575297 = sum of:
    0.12575297 = weight(_text_:human in 2368) [ClassicSimilarity], result of:
      0.12575297 = score(doc=2368,freq=6.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.41786176 = fieldWeight in 2368, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.0390625 = fieldNorm(doc=2368)
  0.25 = coord(1/4)
- Abstract
- Retrieval-augmented generation (RAG) has emerged as a promising solution to mitigate the limitations of large language models (LLMs), such as hallucinations and outdated information. However, it remains unclear how LLMs handle knowledge conflicts arising from different augmented retrieved passages, especially when these passages originate from the same source and have equal trustworthiness. In this work, we conduct a comprehensive evaluation of LLM-generated answers to questions that have varying answers based on contradictory passages from Wikipedia, a dataset widely regarded as a high-quality pre-training resource for most LLMs. Specifically, we introduce WikiContradict, a benchmark consisting of 253 high-quality, human-annotated instances designed to assess LLM performance when augmented with retrieved passages containing real-world knowledge conflicts. We benchmark a diverse range of both closed and open-source LLMs under different QA scenarios, including RAG with a single passage, and RAG with 2 contradictory passages. Through rigorous human evaluations on a subset of WikiContradict instances involving 5 LLMs and over 3,500 judgements, we shed light on the behaviour and limitations of these models. For instance, when provided with two passages containing contradictory facts, all models struggle to generate answers that accurately reflect the conflicting nature of the context, especially for implicit conflicts requiring reasoning. Since human evaluation is costly, we also introduce an automated model that estimates LLM performance using a strong open-source language model, achieving an F-score of 0.8. Using this automated metric, we evaluate more than 1,500 answers from seven LLMs across all WikiContradict instances. To facilitate future work, we release WikiContradict on: https://ibm.biz/wikicontradict.
-
Spink, A.: Term relevance feedback and mediated database searching : implications for information retrieval practice and systems design (1995)
0.03
0.03080306 = product of:
  0.12321224 = sum of:
    0.12321224 = weight(_text_:human in 1824) [ClassicSimilarity], result of:
      0.12321224 = score(doc=1824,freq=4.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.40941924 = fieldWeight in 1824, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.046875 = fieldNorm(doc=1824)
  0.25 = coord(1/4)
- Abstract
- Research into both the algorithmic and human approaches to information retrieval is required to improve information retrieval system design and database searching effectiveness. Uses the human approach to examine the sources and effectiveness of search terms selected during mediated interactive information retrieval. Focuses on determining the retrieval effectiveness of search terms identified by users and intermediaries from retrieved items during term relevance feedback. Results show that terms selected from particular database fields of retrieved items during term relevance feedback (TRF) were more effective than search terms from the intermediary, database thesauri or users' domain knowledge during the interaction, but not as effective as terms from the users' written question statements. Implications for the design and testing of automatic relevance feedback techniques that place greater emphasis on these sources and the practice of database searching are also discussed
-
Newby, G.B.: Cognitive space and information space (2001)
0.03
0.03080306 = product of:
  0.12321224 = sum of:
    0.12321224 = weight(_text_:human in 977) [ClassicSimilarity], result of:
      0.12321224 = score(doc=977,freq=4.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.40941924 = fieldWeight in 977, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.046875 = fieldNorm(doc=977)
  0.25 = coord(1/4)
- Abstract
- This article works towards realization of exosomatic memory for information systems. In exosomatic memory systems, the information spaces of systems will be consistent with the cognitive spaces of their human users. A method for measuring concept relations in human cognitive space is presented: the paired comparison survey with Principal Components Analysis. A study to measure the cognitive spaces of 16 research participants is presented. Items measured include relations among seven TREC topic statements as well as 17 concepts from the topic statements. A method for automatically generating information spaces from document collections is presented that uses term co-occurrence, eigensystems analysis, and Principal Components Analysis. The extent of similarity between the cognitive spaces and the information spaces, which were derived independently from each other, is measured. A strong similarity between the information spaces and the cognitive spaces is found, indicating that the methods described may have good utility for working towards information systems that operate as exosomatic memories
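A minimal sketch of the measurement idea, deriving a low-dimensional space from paired-comparison ratings via the eigensystem of their covariance matrix, might look as follows (the ratings matrix and its dimensions are invented for illustration; the study itself used 16 participants, 7 TREC topic statements and 17 concepts):

import numpy as np

# Rows: participants; columns: similarity ratings for concept pairs
# (hypothetical stand-in for the paired comparison survey data)
ratings = np.array([
    [6.0, 5.0, 1.0, 2.0],
    [7.0, 6.0, 2.0, 1.0],
    [5.0, 6.0, 1.0, 3.0],
    [6.0, 7.0, 2.0, 2.0],
])

# Principal Components Analysis via the covariance eigensystem
centered = ratings - ratings.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]       # strongest components first

# Loadings: coordinates of each rated concept pair on the two strongest
# components, a rough stand-in for positions in "cognitive space"
loadings = eigvecs[:, order[:2]]
print(loadings.round(2))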
-
Saracevic, T.: Individual differences in organizing, searching and retrieving information (1991)
0.03
0.029041402 = product of:
  0.11616561 = sum of:
    0.11616561 = weight(_text_:human in 3691) [ClassicSimilarity], result of:
      0.11616561 = score(doc=3691,freq=2.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.38600415 = fieldWeight in 3691, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.0625 = fieldNorm(doc=3691)
  0.25 = coord(1/4)
- Abstract
- Synthesises the major findings of several decades of research into the magnitude of individual differences in information retrieval related tasks and suggests implications for practice and design. The study is related to a series of studies of human aspects and cognitive decision making in information seeking, searching and retrieving
-
Draper, S.W.; Dunlop, M.D.: New IR - new evaluation : the impact of interaction and multimedia on information retrieval and its evaluation (1997)
0.03
0.025669215 = product of:
  0.10267686 = sum of:
    0.10267686 = weight(_text_:human in 3462) [ClassicSimilarity], result of:
      0.10267686 = score(doc=3462,freq=4.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.34118268 = fieldWeight in 3462, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.0390625 = fieldNorm(doc=3462)
  0.25 = coord(1/4)
- Abstract
- The field of information retrieval (IR) traditionally addressed the problem of retrieving text documents from large collections by full-text indexing of words. It has always been characterised by a strong focus on evaluation to compare the performance of alternative designs. The emergence into widespread use of both multimedia and interactive user interfaces has extensive implications for this field and the evaluation methods on which it depends. Discusses what we currently understand about those implications. The 'system' being measured must be expanded to include the human users, whose behaviour has a large effect on overall retrieval success, which now depends upon sessions of many retrieval cycles rather than a single transaction. Multimedia raise issues not only of how users might specify a query in the same medium (e.g. sketch the kind of picture they want), but of cross-medium retrieval. Current explorations in IR evaluation show diversity along at least 2 dimensions. One is that between comprehensive models that have a place for every possible relevant factor, and lightweight methods. The other is that between highly standardised workbench tests avoiding human users vs. workplace studies
-
Beaulieu, M.; Robertson, S.; Rasmussen, E.: Evaluating interactive systems in TREC (1996)
0.03
0.025411226 = product of:
  0.1016449 = sum of:
    0.1016449 = weight(_text_:human in 3066) [ClassicSimilarity], result of:
      0.1016449 = score(doc=3066,freq=2.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.33775362 = fieldWeight in 3066, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.0546875 = fieldNorm(doc=3066)
  0.25 = coord(1/4)
- Abstract
- The TREC experiments were designed to allow large-scale laboratory testing of information retrieval techniques. As the experiments have progressed, groups within TREC have become increasingly interested in finding ways to allow user interaction without invalidating the experimental design. The development of an 'interactive track' within TREC to accommodate user interaction has required some modifications in the way the retrieval task is designed. In particular there is a need to simulate a realistic interactive searching task within a laboratory environment. Through successive interactive studies in TREC, the Okapi team at City University London has identified methodological issues relevant to this process. A diagnostic experiment was conducted as a follow-up to TREC searches which attempted to isolate the human and automatic contributions to query formulation and retrieval performance
-
Kutlu, M.; Elsayed, T.; Lease, M.: Intelligent topic selection for low-cost information retrieval evaluation : a new perspective on deep vs. shallow judging (2018)
0.03
0.025150593 = product of:
  0.10060237 = sum of:
    0.10060237 = weight(_text_:human in 92) [ClassicSimilarity], result of:
      0.10060237 = score(doc=92,freq=6.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.3342894 = fieldWeight in 92, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.03125 = fieldNorm(doc=92)
  0.25 = coord(1/4)
- Abstract
- While test collections provide the cornerstone for Cranfield-based evaluation of information retrieval (IR) systems, it has become practically infeasible to rely on traditional pooling techniques to construct test collections at the scale of today's massive document collections (e.g., ClueWeb12's 700M+ webpages). This has motivated a flurry of studies proposing more cost-effective yet reliable IR evaluation methods. In this paper, we propose a new intelligent topic selection method which reduces the number of search topics (and thereby costly human relevance judgments) needed for reliable IR evaluation. To rigorously assess our method, we integrate previously disparate lines of research on intelligent topic selection and deep vs. shallow judging (i.e., whether it is more cost-effective to collect many relevance judgments for a few topics or a few judgments for many topics). While prior work on intelligent topic selection has never been evaluated against shallow judging baselines, prior work on deep vs. shallow judging has largely argued for shallow judging, but assuming random topic selection. We argue that for evaluating any topic selection method, ultimately one must ask whether it is actually useful to select topics, or should one simply perform shallow judging over many topics? In seeking a rigorous answer to this over-arching question, we conduct a comprehensive investigation over a set of relevant factors never previously studied together: 1) method of topic selection; 2) the effect of topic familiarity on human judging speed; and 3) how different topic generation processes (requiring varying human effort) impact (i) budget utilization and (ii) the resultant quality of judgments. Experiments on NIST TREC Robust 2003 and Robust 2004 test collections show that not only can we reliably evaluate IR systems with fewer topics, but also that: 1) when topics are intelligently selected, deep judging is often more cost-effective than shallow judging in evaluation reliability; and 2) topic familiarity and topic generation costs greatly impact the evaluation cost vs. reliability trade-off. Our findings challenge conventional wisdom in showing that deep judging is often preferable to shallow judging when topics are selected intelligently.
-
Hersh, W.; Pentecost, J.; Hickam, D.: ¬A task-oriented approach to information retrieval evaluation : overview and design for empirical testing (1996)
0.02
0.021781052 = product of:
  0.087124206 = sum of:
    0.087124206 = weight(_text_:human in 3069) [ClassicSimilarity], result of:
      0.087124206 = score(doc=3069,freq=2.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.2895031 = fieldWeight in 3069, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.046875 = fieldNorm(doc=3069)
  0.25 = coord(1/4)
- Abstract
- As retrieval systems become more oriented towards end-users, there is an increasing need for improved methods to evaluate their effectiveness. We performed a task-oriented assessment of 2 MEDLINE searching systems, one which promotes traditional Boolean searching on human-indexed thesaurus terms and the other natural language searching on words in the title, abstract, and indexing terms. Medical students were randomized to one of the 2 systems and given clinical questions to answer. The students were able to use each system successfully, with no significant differences in questions correctly answered, time taken, relevant articles retrieved, or user satisfaction between the systems. This approach to evaluation was successful in measuring effectiveness of system use and demonstrates that both types of systems can be used equally well with minimal training
-
Hersh, W.R.; Pentecost, J.; Hickam, D.H.: ¬A task-oriented approach to retrieval system evaluation (1995)
0.02
0.021781052 = product of:
  0.087124206 = sum of:
    0.087124206 = weight(_text_:human in 3935) [ClassicSimilarity], result of:
      0.087124206 = score(doc=3935,freq=2.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.2895031 = fieldWeight in 3935, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.046875 = fieldNorm(doc=3935)
  0.25 = coord(1/4)
- Abstract
- There is a need for improved methods to evaluate the effectiveness of end-user information retrieval systems. Performs a task-oriented assessment of 2 MEDLINE searching systems, one which promotes Boolean searching on human-indexed thesaurus terms and the other natural language searching on words in the title, abstract, and indexing terms. Each was used by medical students to answer clinical questions. Students were able to use each system successfully, with no significant differences in questions correctly answered, time taken, relevant articles retrieved, or user satisfaction between the systems. This approach to evaluation was successful in measuring effectiveness of system use and demonstrates that both types of systems can be used equally well with minimal training
-
Bodoff, D.; Kambil, A.: Partial coordination : II. A preliminary evaluation and failure analysis (1998)
0.02
0.021781052 = product of:
  0.087124206 = sum of:
    0.087124206 = weight(_text_:human in 3323) [ClassicSimilarity], result of:
      0.087124206 = score(doc=3323,freq=2.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.2895031 = fieldWeight in 3323, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.046875 = fieldNorm(doc=3323)
  0.25 = coord(1/4)
- Abstract
- Partial coordination is a new method for cataloging documents for subject access. It is especially designed to enhance the precision of document searches in online environments. This article reports a preliminary evaluation of partial coordination that shows promising results compared with full-text retrieval. We also report the difficulties in empirically evaluating the effectiveness of automatic full-text retrieval in contrast to mixed methods such as partial coordination, which combine human cataloging with computerized retrieval. Based on our study, we propose that research in this area will substantially benefit from a common framework for failure analysis and a common data set. This will allow information retrieval researchers adapting 'library style' cataloging to large electronic document collections, as well as those developing automated or mixed methods, to directly compare their proposals for indexing and retrieval. This article concludes by suggesting guidelines for constructing such a testbed
-
Yerbury, H.; Parker, J.: Novice searchers' use of familiar structures in searching bibliographic information retrieval systems (1998)
0.02
0.021781052 = product of:
  0.087124206 = sum of:
    0.087124206 = weight(_text_:human in 3874) [ClassicSimilarity], result of:
      0.087124206 = score(doc=3874,freq=2.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.2895031 = fieldWeight in 3874, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.046875 = fieldNorm(doc=3874)
  0.25 = coord(1/4)
- Abstract
- Reports results of a study of the use of metaphors as problem-solving mechanisms by novice searchers of bibliographic databases. Metaphors provide a framework or 'familiar structure' of credible associations within which relationships in other domains may be considered. 28 students taking an undergraduate course in information retrieval at Sydney University of Technology were recorded as they 'talked through' a search on a bibliographic retrieval system. The transcripts were analyzed using conventional methods and the NUDIST software package for qualitative research. A range of metaphors was apparent from the language used by students in the search process. Those which predominated were: a journey; human interaction; a building or matching process; a problem-solving process; and a search for a quantity. Many of the students who experienced the interaction as a problem-solving process or a search for quantity perceived the outcomes as successful. Concludes that, when memory for operating methods and procedures is incomplete, an unconscious approach through the use of a conceptual system which is consonant with the task at hand may also lead to success in bibliographic searching
-
McDonald, S.; Stevenson, R.J.: Navigation in hyperspace : an evaluation of the effects of navigational tools and subject matter expertise on browsing and information retrieval in hypertext (1998)
0.02
0.021781052 = product of:
  0.087124206 = sum of:
    0.087124206 = weight(_text_:human in 4760) [ClassicSimilarity], result of:
      0.087124206 = score(doc=4760,freq=2.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.2895031 = fieldWeight in 4760, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.046875 = fieldNorm(doc=4760)
  0.25 = coord(1/4)
- Footnote
- Contribution to a special section devoted to human-computer interaction and information retrieval
-
MacCall, S.L.; Cleveland, A.D.: ¬A relevance-based quantitative measure for Internet information retrieval evaluation (1999)
0.02
0.021781052 = product of:
  0.087124206 = sum of:
    0.087124206 = weight(_text_:human in 689) [ClassicSimilarity], result of:
      0.087124206 = score(doc=689,freq=2.0), product of:
        0.30094394 = queryWeight, product of:
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.068911016 = queryNorm
        0.2895031 = fieldWeight in 689, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3671384 = idf(docFreq=1531, maxDocs=44421)
          0.046875 = fieldNorm(doc=689)
  0.25 = coord(1/4)
- Abstract
- An important indicator of a maturing Internet is the development of metrics for its evaluation as a practical tool for end-user information retrieval. However, the Internet presents specific problems for traditional IR measures, such as the need to deal with the variety of classes of retrieval tools. This paper presents a metric for comparing the performance of common classes of Internet information retrieval tool, including human-indexed catalogs of web resources and automatically indexed databases of web pages. The metric uses a relevance-based quantitative measure to compare the performance of end-users using these Internet information retrieval tools. The benefit of the proposed metric is that it is relevance-based (using end-user relevance judgments), and it facilitates the comparison of the performance of different classes of IIR tools
-
Mandl, T.: Neue Entwicklungen bei den Evaluierungsinitiativen im Information Retrieval (2006)
0.02
0.019817837 = product of:
  0.07927135 = sum of:
    0.07927135 = weight(_text_:und in 975) [ClassicSimilarity], result of:
      0.07927135 = score(doc=975,freq=14.0), product of:
        0.15283768 = queryWeight, product of:
          2.217899 = idf(docFreq=13141, maxDocs=44421)
          0.068911016 = queryNorm
        0.51866364 = fieldWeight in 975, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          2.217899 = idf(docFreq=13141, maxDocs=44421)
          0.0625 = fieldNorm(doc=975)
  0.25 = coord(1/4)
- Abstract
- In information retrieval, evaluation initiatives contribute substantially to empirically grounded research. With their extensive collections and tasks they support standardization and thereby system development. Growing requirements concerning corpora and application scenarios have led to considerable diversification among the evaluation initiatives. This article gives an overview of the current state of the most important evaluation initiatives and of new trends.
- Source
- Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker