-
Information ethics : privacy, property, and power (2005)
0.05
0.054119907 = product of:
0.108239815 = sum of:
0.09695539 = weight(_text_:judge in 3392) [ClassicSimilarity], result of:
0.09695539 = score(doc=3392,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.21354558 = fieldWeight in 3392, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.01953125 = fieldNorm(doc=3392)
0.011284425 = weight(_text_:und in 3392) [ClassicSimilarity], result of:
0.011284425 = score(doc=3392,freq=4.0), product of:
0.13024996 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.058726728 = queryNorm
0.086636685 = fieldWeight in 3392, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.01953125 = fieldNorm(doc=3392)
0.5 = coord(2/4)
- BK
- 06.00 / Information und Dokumentation: Allgemeines
- Classification
- 06.00 / Information und Dokumentation: Allgemeines
- Footnote
- Rez. in: JASIST 58(2007) no.2, S.302 (L.A. Ennis):"This is an important and timely anthology of articles "on the normative issues surrounding information control" (p. 11). Using an interdisciplinary approach, Moore's work takes a broad look at the relatively new field of information ethics. Covering a variety of disciplines including applied ethics, intellectual property, privacy, free speech, and more, the book provides information professionals of all kinds with a valuable and thought-provoking resource. Information Ethics is divided into five parts and twenty chapters or articles. At the end of each of the five parts, the editor has included a few "discussion cases," which allows the users to apply what they just read to potential real life examples. Part I, "An Ethical Framework for Analysis," provides readers with an introduction to reasoning and ethics. This complex and philosophical section of the book contains five articles and four discussion cases. All five of the articles are really thought provoking and challenging writings on morality. For instance, in the first article, "Introduction to Moral Reasoning," Tom Regan examines how not to answer a moral question. For example, he thinks using what the majority believes as a means of determining what is and is not moral is flawed. "The Metaphysics of Morals" by Immanuel Kant looks at the reasons behind actions. According to Kant, to be moral one has to do the right thing for the right reasons. By including materials that force the reader to think more broadly and deeply about what is right and wrong, Moore has provided an important foundation and backdrop for the rest of the book. Part II, "Intellectual Property: Moral and Legal Concerns," contains five articles and three discussion cases for tackling issues like ownership, patents, copyright, and biopiracy. This section takes a probing look at intellectual and intangible property from a variety of viewpoints. 
For instance, in "Intellectual Property is Still Property," Judge Frank Easterbrook argues that intellectual property is no different than physical property and should not be treated any differently by law. Tom Palmer's article, "Are Patents and Copyrights Morally Justified," however, uses historical examples to show how intellectual and physical properties differ.
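The score explanation for this record can be checked by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas (idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq)); the function and variable names are illustrative and not part of the catalog system. The queryNorm and fieldNorm values are copied from the explanation rather than derived.

```python
import math

def idf(doc_freq, max_docs):
    # Classic TF-IDF inverse document frequency (assumed formula).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explanation tree.
    i = idf(doc_freq, max_docs)
    query_weight = i * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * i * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values taken from the explanation for doc 3392.
QUERY_NORM = 0.058726728
MAX_DOCS = 44421
FIELD_NORM = 0.01953125

judge = term_score(2.0, 52, MAX_DOCS, QUERY_NORM, FIELD_NORM)
und = term_score(4.0, 13141, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(2/4): two of the four query terms matched this document.
total = (judge + und) * (2 / 4)
print(round(judge, 6), round(und, 6), round(total, 6))
```

The results agree with the listed values to within rounding; small differences in the last digits are expected because the explanation reports single-precision numbers.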
-
Spink, A.; Greisdorf, H.: Regions and levels : Measuring and mapping users' relevance judgements (2001)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 6586) [ClassicSimilarity], result of:
0.19391078 = score(doc=6586,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 6586, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=6586)
0.25 = coord(1/4)
- Abstract
- The dichotomous bipolar approach to relevance has produced an abundance of information retrieval (IR) research. However, relevance studies that include consideration of users' partial relevance judgments are moving to a greater relevance clarity and congruity to impact the design of more effective IR systems. The study reported in this paper investigates the various regions across a distribution of users' relevance judgments, including how these regions may be categorized, measured, and evaluated. An instrument was designed using four scales for collecting, measuring, and describing end-user relevance judgments. The instrument was administered to 21 end-users who conducted searches on their own information problems and made relevance judgments on a total of 1059 retrieved items. Findings include: (1) overlapping regions of relevance were found to impact the usefulness of precision ratios as a measure of IR system effectiveness, (2) both positive and negative levels of relevance are important to users as they make relevance judgments, (3) topicality was used more to reject rather than accept items as highly relevant, (4) utility was more used to judge items highly relevant, and (5) the nature of relevance judgment distribution suggested a new IR evaluation measure: median effect. Findings suggest that the middle region of a distribution of relevance judgments, also called "partial relevance," represents a key avenue for ongoing study. The findings provide implications for relevance theory, and the evaluation of IR systems.
-
Pu, H.-T.; Chuang, S.-L.; Yang, C.: Subject categorization of query terms for exploring Web users' search interests (2002)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 1587) [ClassicSimilarity], result of:
0.19391078 = score(doc=1587,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 1587, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=1587)
0.25 = coord(1/4)
- Abstract
- Subject content analysis of Web query terms is essential to understand Web searching interests. Such analysis includes exploring search topics and observing changes in their frequency distributions with time. To provide a basis for in-depth analysis of users' search interests on a larger scale, this article presents a query categorization approach to automatically classifying Web query terms into broad subject categories. Because a query is short in length and simple in structure, its intended subject(s) of search is difficult to judge. Our approach, therefore, combines the search processes of real-world search engines to obtain highly ranked Web documents based on each unknown query term. These documents are used to extract cooccurring terms and to create a feature set. An effective ranking function has also been developed to find the most appropriate categories. Three search engine logs in Taiwan were collected and tested. They contained over 5 million queries from different periods of time. The achieved performance is quite encouraging compared with that of human categorization. The experimental results demonstrate that the approach is efficient in dealing with large numbers of queries and adaptable to the dynamic Web environment. Through good integration of human and machine efforts, the frequency distributions of subject categories in response to changes in users' search interests can be systematically observed in real time. The approach has also shown potential for use in various information retrieval applications, and provides a basis for further Web searching studies.
-
Nicholson, S.: Bibliomining for automated collection development in a digital library setting : using data mining to discover Web-based scholarly research works (2003)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 2867) [ClassicSimilarity], result of:
0.19391078 = score(doc=2867,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 2867, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=2867)
0.25 = coord(1/4)
- Abstract
- This research creates an intelligent agent for automated collection development in a digital library setting. It uses a predictive model based on facets of each Web page to select scholarly works. The criteria came from the academic library selection literature, and a Delphi study was used to refine the list to 41 criteria. A Perl program was designed to analyze a Web page for each criterion and applied to a large collection of scholarly and nonscholarly Web pages. Bibliomining, or data mining for libraries, was then used to create different classification models. Four techniques were used: logistic regression, nonparametric discriminant analysis, classification trees, and neural networks. Accuracy and return were used to judge the effectiveness of each model on test datasets. In addition, a set of problematic pages that were difficult to classify because of their similarity to scholarly research was gathered and classified using the models. The resulting models could be used in the selection process to automatically create a digital library of Web-based scholarly research works. In addition, the technique can be extended to create a digital library of any type of structured electronic information.
-
White, H.D.: Combining bibliometrics, information retrieval, and relevance theory : part 2: some implications for information science (2007)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 1437) [ClassicSimilarity], result of:
0.19391078 = score(doc=1437,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 1437, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=1437)
0.25 = coord(1/4)
- Abstract
- When bibliometric data are converted to term frequency (tf) and inverse document frequency (idf) values, plotted as pennant diagrams, and interpreted according to Sperber and Wilson's relevance theory (RT), the results evoke major variables of information science (IS). These include topicality, in the sense of intercohesion and intercoherence among texts; cognitive effects of texts in response to people's questions; people's levels of expertise as a precondition for cognitive effects; processing effort as textual or other messages are received; specificity of terms as it affects processing effort; relevance, defined in RT as the effects/effort ratio; and authority of texts and their authors. While such concerns figure automatically in dialogues between people, they become problematic when people create or use or judge literature-based information systems. The difficulty of achieving worthwhile cognitive effects and acceptable processing effort in human-system dialogues explains why relevance is the central concern of IS. Moreover, since relevant communication with both systems and unfamiliar people is uncertain, speakers tend to seek cognitive effects that cost them the least effort. Yet hearers need greater effort, often greater specificity, from speakers if their responses are to be highly relevant in their turn. This theme of mismatch manifests itself in vague reference questions, underdeveloped online searches, uncreative judging in retrieval evaluation trials, and perfunctory indexing. Another effect of least effort is a bias toward topical relevance over other kinds. RT can explain these outcomes as well as more adaptive ones. Pennant diagrams, applied here to a literature search and a Bradford-style journal analysis, can model them. Given RT and the right context, bibliometrics may predict psychometrics.
-
Hartley, J.; Betts, L.: ¬The effects of spacing and titles on judgments of the effectiveness of structured abstracts (2007)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 2325) [ClassicSimilarity], result of:
0.19391078 = score(doc=2325,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 2325, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=2325)
0.25 = coord(1/4)
- Abstract
- Previous research assessing the effectiveness of structured abstracts has been limited in two respects. First, when comparing structured abstracts with traditional ones, investigators usually have rewritten the original abstracts, and thus confounded changes in the layout with changes in both the wording and the content of the text. Second, investigators have not always included the title of the article together with the abstract when asking participants to judge the quality of the abstracts, yet titles alert readers to the meaning of the materials that follow. The aim of this research was to redress these limitations. Three studies were carried out. Four versions of each of four abstracts were prepared. These versions consisted of structured/traditional abstracts matched in content, with and without titles. In Study 1, 64 undergraduates each rated one of these abstracts on six separate rating scales. In Study 2, 225 academics and research workers rated the abstracts electronically, and in Study 3, 252 information scientists did likewise. In Studies 1 and 3, the respondents rated the structured abstracts significantly more favorably than they did the traditional ones, but the presence or absence of titles had no effect on their judgments. In Study 2, no main effects were observed for structure or for titles. The layout of the text, together with the subheadings, contributed to the higher ratings of effectiveness for structured abstracts, but the presence or absence of titles had no clear effects in these experimental studies. It is likely that this spatial organization, together with the greater amount of information normally provided in structured abstracts, explains why structured abstracts are generally judged to be superior to traditional ones.
-
Xu, Y.; Yin, H.: Novelty and topicality in interactive information retrieval (2008)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 2355) [ClassicSimilarity], result of:
0.19391078 = score(doc=2355,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 2355, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=2355)
0.25 = coord(1/4)
- Abstract
- The information science research community is characterized by a paradigm split, with a system-centered cluster working on information retrieval (IR) algorithms and a user-centered cluster working on user behavior. The two clusters rarely leverage each other's insight and strength. One major suggestion from user-centered studies is to treat the relevance judgment of documents as a subjective, multidimensional, and dynamic concept rather than treating it as objective and based on topicality only. This study explores the possibility to enhance users' topicality-based relevance judgment with subjective novelty judgment in interactive IR. A set of systems is developed which differs in the way the novelty judgment is incorporated. In particular, this study compares systems which assume that users' novelty judgment is directed to a certain subtopic area and those which assume that users' novelty judgment is undirected. This study also compares systems which assume that users judge a document based on topicality first and then novelty in a stepwise, noncompensatory fashion and those which assume that users consider topicality and novelty simultaneously and as compensatory to each other. The user study shows that systems assuming directed novelty in general have higher relevance precision, but systems assuming a stepwise judgment process and systems assuming a compensatory judgment process are not significantly different.
-
Hartley, J.; Betts, L.: Revising and polishing a structured abstract : is it worth the time and effort? (2008)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 3362) [ClassicSimilarity], result of:
0.19391078 = score(doc=3362,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 3362, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=3362)
0.25 = coord(1/4)
- Abstract
- Many writers of structured abstracts spend a good deal of time revising and polishing their texts - but is it worth it? Do readers notice the difference? In this paper we report three studies of readers using rating scales to judge (electronically) the clarity of an original and a revised abstract, both as a whole and in its constituent parts. In Study 1, with approximately 250 academics and research workers, we found some significant differences in favor of the revised abstract, but in Study 2, with approximately 210 information scientists, we found no significant effects. Pooling the data from Studies 1 and 2, however, in Study 3, led to significant differences at a higher probability level between the perception of the original and revised abstract as a whole and between the same components as found in Study 1. These results thus indicate that the revised abstract as a whole, as well as certain specific components of it, were judged significantly clearer than the original one. In short, the results of these experiments show that readers can and do perceive differences between original and revised texts - sometimes - and that therefore these efforts are worth the time and effort.
-
Lee, K.C.; Lee, N.; Li, H.: ¬A particle swarm optimization-driven cognitive map approach to analyzing information systems project risk (2009)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 3855) [ClassicSimilarity], result of:
0.19391078 = score(doc=3855,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 3855, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=3855)
0.25 = coord(1/4)
- Abstract
- Project risks encompass both internal and external factors that are interrelated, influencing others in a causal way. It is very important to identify those factors and their causal relationships to reduce the project risk. In the past, most IT companies evaluate project risk by roughly measuring the related factors, but ignoring the important fact that there are complicated causal relationships among them. There is a strong need to develop more effective mechanisms to systematically judge all factors related to project risk and identify the causal relationships among those factors. To accomplish this research objective, our study adopts a cognitive map (CM)-based mechanism called the MACOM (Multi-Agents COgnitive Map), where CM is represented by a set of multi-agents, each embedded with basic intelligence to determine its causal relationships with other agents. CM has proven especially useful in solving unstructured problems with many variables and causal relationships; however, simply applying CM to project risk management is not enough because most causal relationships are hard to identify and measure exactly. To overcome this problem, we have borrowed a multi-agent metaphor in which CM is represented by a set of multi-agents, and project risk is explained through the interaction of the multi-agents. Such an approach presents a new computational capability for resolving complicated decision problems. Using the MACOM framework, we have proved that the task of resolving the IS project risk management could be systematically and intelligently solved, and in this way, IS project managers can be given robust decision support.
-
MacFarlane, A.; Al-Wabil, A.; Marshall, C.R.; Albrair, A.; Jones, S.A.; Zaphiris, P.: ¬The effect of dyslexia on information retrieval : a pilot study (2010)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 611) [ClassicSimilarity], result of:
0.19391078 = score(doc=611,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 611, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=611)
0.25 = coord(1/4)
- Abstract
- Purpose - The purpose of this paper is to resolve a gap in the knowledge of how people with dyslexia interact with information retrieval (IR) systems, specifically an understanding of their information-searching behaviour. Design/methodology/approach - The dyslexia cognitive profile is used to design a logging system, recording the difference between two sets of participants: dyslexic and control users. A standard Okapi interface is used - together with two standard TREC topics - in order to record the information searching behaviour of these users. Findings - Using the log data, the differences in information-searching behaviour of control and dyslexic users, i.e. in the way the two groups interact with Okapi, are established, and it is also established that qualitative information collected (such as experience etc.) may not be able to account for these differences. Evidence from query variables was unable to distinguish between groups, but differences on topic for the same variables were recorded. Users who view more documents tended to judge more documents as being relevant, in terms of either the user group or topic. Session data indicated that there may be an important difference between the number of iterations used in a search between the user groups, as there may be little effect from the topic on this variable. Originality/value - This is the first study of the effect of dyslexia on information search behaviour, and it provides some evidence to take the field forward.
-
Nunes, S.; Ribeiro, C.; David, G.: Term weighting based on document revision history (2011)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 946) [ClassicSimilarity], result of:
0.19391078 = score(doc=946,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 946, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=946)
0.25 = coord(1/4)
- Abstract
- In real-world information retrieval systems, the underlying document collection is rarely stable or definitive. This work is focused on the study of signals extracted from the content of documents at different points in time for the purpose of weighting individual terms in a document. The basic idea behind our proposals is that terms that have existed for a longer time in a document should have a greater weight. We propose 4 term weighting functions that use each document's history to estimate a current term score. To evaluate this thesis, we conduct 3 independent experiments using a collection of documents sampled from Wikipedia. In the first experiment, we use data from Wikipedia to judge each set of terms. In a second experiment, we use an external collection of tags from a popular social bookmarking service as a gold standard. In the third experiment, we crowdsource user judgments to collect feedback on term preference. Across all experiments results consistently support our thesis. We show that temporally aware measures, specifically the proposed revision term frequency and revision term frequency span, outperform a term-weighting measure based on raw term frequency alone.
-
Berendsen, R.; Rijke, M. de; Balog, K.; Bogers, T.; Bosch, A. van den: On the assessment of expertise profiles (2013)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 2089) [ClassicSimilarity], result of:
0.19391078 = score(doc=2089,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 2089, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=2089)
0.25 = coord(1/4)
- Abstract
- Expertise retrieval has attracted significant interest in the field of information retrieval. Expert finding has been studied extensively, with less attention going to the complementary task of expert profiling, that is, automatically identifying topics about which a person is knowledgeable. We describe a test collection for expert profiling in which expert users have self-selected their knowledge areas. Motivated by the sparseness of this set of knowledge areas, we report on an assessment experiment in which academic experts judge a profile that has been automatically generated by state-of-the-art expert-profiling algorithms; optionally, experts can indicate a level of expertise for relevant areas. Experts may also give feedback on the quality of the system-generated knowledge areas. We report on a content analysis of these comments and gain insights into what aspects of profiles matter to experts. We provide an error analysis of the system-generated profiles, identifying factors that help explain why certain experts may be harder to profile than others. We also analyze the impact on evaluating expert-profiling systems of using self-selected versus judged system-generated knowledge areas as ground truth; they rank systems somewhat differently but detect about the same amount of pairwise significant differences despite the fact that the judged system-generated assessments are more sparse.
-
White, H.; Willis, C.; Greenberg, J.: HIVEing : the effect of a semantic web technology on inter-indexer consistency (2014)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 2781) [ClassicSimilarity], result of:
0.19391078 = score(doc=2781,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 2781, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=2781)
0.25 = coord(1/4)
- Abstract
- Purpose - The purpose of this paper is to examine the effect of the Helping Interdisciplinary Vocabulary Engineering (HIVE) system on the inter-indexer consistency of information professionals when assigning keywords to a scientific abstract. This study examined first, the inter-indexer consistency of potential HIVE users; second, the impact HIVE had on consistency; and third, challenges associated with using HIVE. Design/methodology/approach - A within-subjects quasi-experimental research design was used for this study. Data were collected using a task-scenario based questionnaire. Analysis was performed on consistency results using Hooper's and Rolling's inter-indexer consistency measures. A series of t-tests was used to judge the significance between consistency measure results. Findings - Results suggest that HIVE improves inter-indexing consistency. Working with HIVE increased consistency rates by 22 percent (Rolling's) and 25 percent (Hooper's) when selecting relevant terms from all vocabularies. A statistically significant difference exists between the assignment of free-text keywords and machine-aided keywords. Issues with homographs, disambiguation, vocabulary choice, and document structure were all identified as potential challenges. Research limitations/implications - Research limitations for this study can be found in the small number of vocabularies used for the study. Future research will include implementing HIVE into the Dryad Repository and studying its application in a repository system. Originality/value - This paper showcases several features used in HIVE system. By using traditional consistency measures to evaluate a semantic web technology, this paper emphasizes the link between traditional indexing and next generation machine-aided indexing (MAI) tools.
-
Zhao, Y.W.; Chi, C.-H.; Heuvel, W.J. van den: Imperfect referees : reducing the impact of multiple biases in peer review (2015)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 3271) [ClassicSimilarity], result of:
0.19391078 = score(doc=3271,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 3271, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=3271)
0.25 = coord(1/4)
- Abstract
- Bias in peer review entails systematic prejudice that prevents accurate and objective assessment of scientific studies. The disparity between referees' opinions on the same paper typically makes it difficult to judge the paper's quality. This article presents a comprehensive study of peer review biases with regard to 2 aspects of referees: the static profiles (factual authority and self-reported confidence) and the dynamic behavioral context (the temporal ordering of reviews by a single reviewer), exploiting anonymized, real-world review reports of 2 different international conferences in information systems / computer science. Our work extends conventional bias research by considering multiple biases occurring simultaneously. Our findings show that the referees' static profiles are more dominant in peer review bias when compared to their dynamic behavioral context. Of the static profiles, self-reported confidence improved both conference fitness and impact-based bias reductions, while factual authority could only contribute to conference fitness-based bias reduction. Our results also clearly show that the reliability of referees' judgments varies along their static profiles and is contingent on the temporal interval between 2 consecutive reviews.
-
Zhang, Y.; Trace, C.B.: ¬The quality of health and wellness self-tracking data : a consumer perspective (2022)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 1460) [ClassicSimilarity], result of:
0.19391078 = score(doc=1460,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 1460, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=1460)
0.25 = coord(1/4)
- Abstract
- Information quality (IQ) is key to users' satisfaction with information systems. Understanding what IQ means to users can effectively inform system improvement. Existing inquiries into self-tracking data quality primarily focus on accuracy. Interviewing 20 consumers who had self-tracked health indicators for at least 6 months, we identified eight dimensions that consumers apply to evaluate self-tracking data quality: value-added, accuracy, completeness, accessibility, ease of understanding, trustworthiness, aesthetics, and invasiveness. These dimensions fell into four categories (intrinsic, contextual, representational, and accessibility), suggesting that consumers judge self-tracking data quality not only based on the data's inherent quality but also considering tasks at hand, the clarity of data representation, and data accessibility. We also found that consumers' self-tracking data quality judgments are shaped primarily by their goals or motivations, subjective experience with tracked activities, mental models of how systems work, self-tracking tools' reputation, cost, and design, and domain knowledge and intuition, but less by more objective criteria such as scientific research results, validated devices, or consultation with experts. Future studies should develop and validate a scale for measuring consumers' perceptions of self-tracking data quality and commit efforts to develop technologies and training materials to enhance consumers' ability to evaluate data quality.
-
Tang, M.-C.; Liao, I.-H.: Preference diversity and openness to novelty : scales construction from the perspective of movie recommendation (2022)
0.05
0.048477694 = product of:
0.19391078 = sum of:
0.19391078 = weight(_text_:judge in 1649) [ClassicSimilarity], result of:
0.19391078 = score(doc=1649,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.42709115 = fieldWeight in 1649, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0390625 = fieldNorm(doc=1649)
0.25 = coord(1/4)
- Abstract
- In response to calls for recommender systems to balance accuracy and alternative measures such as diversity and novelty, we propose that recommendation strategies should be applied adaptively according to users' preference traits. Psychological scales for "preference diversity" and "openness to novelty" were developed to measure users' willingness to accept diverse and novel recommendations, respectively. To validate the scales empirically, a user study was conducted in which 293 regular moviegoers were asked to judge a set of 220 movies representing both mainstream and "long-tail" appeals. The judgment task involved indicating and rating movies they had seen, heard of but not seen, and not known previously. Correlation analyses were then conducted between the participants' preference diversity and openness to novelty scores and the diversity and novelty of their past movie viewing profile and of movies they had not seen before but had shown interest in. Preference diversity scores were shown to be significantly related to the diversity of the movies they had seen. Higher preference diversity scores were also associated with greater diversity in favored unknown movies. Similarly, participants who scored high on the openness to novelty scale had viewed more little-known movies and were generally interested in less popular movies as well as movies that differed from those they had seen before. Implications of these psychological traits for recommendation strategies are also discussed.
-
Adler, R.; Ewing, J.; Taylor, P.: Citation statistics : A report from the International Mathematical Union (IMU) in cooperation with the International Council of Industrial and Applied Mathematics (ICIAM) and the Institute of Mathematical Statistics (IMS) (2008)
0.04
0.04113469 = product of:
0.16453876 = sum of:
0.16453876 = weight(_text_:judge in 3417) [ClassicSimilarity], result of:
0.16453876 = score(doc=3417,freq=4.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.36239886 = fieldWeight in 3417, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.0234375 = fieldNorm(doc=3417)
0.25 = coord(1/4)
- Abstract
- Using citation data to assess research ultimately means using citation-based statistics to rank things: journals, papers, people, programs, and disciplines. The statistical tools used to rank these things are often misunderstood and misused. - For journals, the impact factor is most often used for ranking. This is a simple average derived from the distribution of citations for a collection of articles in the journal. The average captures only a small amount of information about that distribution, and it is a rather crude statistic. In addition, there are many confounding factors when judging journals by citations, and any comparison of journals requires caution when using impact factors. Using the impact factor alone to judge a journal is like using weight alone to judge a person's health. - For papers, instead of relying on the actual count of citations to compare individual papers, people frequently substitute the impact factor of the journals in which the papers appear. They believe that higher impact factors must mean higher citation counts. But this is often not the case! This is a pervasive misuse of statistics that needs to be challenged whenever and wherever it occurs. - For individual scientists, complete citation records can be difficult to compare. As a consequence, there have been attempts to find simple statistics that capture the full complexity of a scientist's citation record with a single number. The most notable of these is the h-index, which seems to be gaining in popularity. But even a casual inspection of the h-index and its variants shows that these are naive attempts to understand complicated citation records. While they capture a small amount of information about the distribution of a scientist's citations, they lose crucial information that is essential for the assessment of research.
-
Kim, W.; Wilbur, W.J.: Corpus-based statistical screening for content-bearing terms (2001)
0.04
0.038782157 = product of:
0.15512863 = sum of:
0.15512863 = weight(_text_:judge in 188) [ClassicSimilarity], result of:
0.15512863 = score(doc=188,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.34167293 = fieldWeight in 188, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.03125 = fieldNorm(doc=188)
0.25 = coord(1/4)
- Abstract
- Kim and Wilbur present three techniques for the algorithmic identification in text of content-bearing terms and phrases intended for human use as entry points or hyperlinks. Using a set of 1,075 terms from MEDLINE evaluated on a zero to four, stop word to definite content word scale, they evaluate the ranked lists of their three methods based on their placement of content words in the top ranks. Data consist of the natural language elements of 304,057 MEDLINE records from 1996, and 173,252 Wall Street Journal records from the TIPSTER collection. Phrases are extracted by breaking at punctuation marks and stop words, normalized by lower casing, replacement of nonalphanumerics with spaces, and the reduction of multiple spaces. In the "strength of context" approach each document is a vector of binary values for each word or word pair. The words or word pairs are removed from all documents, and the Robertson-Sparck Jones relevance weight for each term computed, negative weights replaced with zero, those below a randomness threshold ignored, and the remainder summed for each document, to yield a score for the document and finally to assign to the term the average document score for documents in which it occurred. The average of these word scores is assigned to the original phrase. The "frequency clumping" approach defines a random phrase as one whose distribution among documents is Poisson in character. A p-value, the probability that a phrase frequency of occurrence would be equal to, or less than, Poisson expectations is computed, and a score assigned which is the negative log of that value. In the "database comparison" approach if a phrase occurring in a document allows prediction that the document is in MEDLINE rather than in the Wall Street Journal, it is considered to be content bearing for MEDLINE. The score is computed by dividing the number of occurrences of the term in MEDLINE by occurrences in the Journal, and taking the product of all these values. 
The one hundred top and bottom ranked phrases that occurred in at least 500 documents were collected for each method. The union set had 476 phrases. A second selection was made of two word phrases occurring each in only three documents with a union of 599 phrases. A judge then ranked the two sets of terms as to subject specificity on a 0 to 4 scale. Precision was the average subject specificity of the first r ranks and recall the fraction of the subject specific phrases in the first r ranks and eleven point average precision was used as a summary measure. The three methods all move content bearing terms forward in the lists as does the use of the sum of the logs of the three methods.
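The phrase extraction step described in this abstract (break at punctuation marks and stop words, lower-case, replace nonalphanumerics with spaces, collapse multiple spaces) might look roughly like the sketch below. The stop word list and the set of punctuation marks are assumptions made for illustration. The abstract does not specify either, and this is not the authors' implementation.

```python
import re

# Hypothetical stop word list; the one used in the study is not given here.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to", "with"}


def normalize(phrase):
    # Lower-case, replace nonalphanumerics with spaces, collapse multiple spaces.
    phrase = phrase.lower()
    phrase = re.sub(r"[^a-z0-9]", " ", phrase)
    return re.sub(r" +", " ", phrase).strip()


def extract_phrases(text):
    phrases = []
    # Break at punctuation marks first (assumed set of marks).
    for segment in re.split(r"[.,;:!?()\[\]\"]+", text):
        current = []
        for token in segment.split():
            if token.lower() in STOP_WORDS:
                # A stop word ends the phrase collected so far.
                if current:
                    phrases.append(normalize(" ".join(current)))
                current = []
            else:
                current.append(token)
        if current:
            phrases.append(normalize(" ".join(current)))
    return [p for p in phrases if p]


sample = "Treatment of beta-blocker overdose in the elderly, with case reports."
print(extract_phrases(sample))
```

Hyphenated terms such as "beta-blocker" become two space-separated words under this normalization, which follows from replacing nonalphanumerics with spaces.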
-
Spero, S.: LCSH is to thesaurus as doorbell is to mammal : visualizing structural problems in the Library of Congress Subject Headings (2008)
0.04
0.038782157 = product of:
0.15512863 = sum of:
0.15512863 = weight(_text_:judge in 3659) [ClassicSimilarity], result of:
0.15512863 = score(doc=3659,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.34167293 = fieldWeight in 3659, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.03125 = fieldNorm(doc=3659)
0.25 = coord(1/4)
- Abstract
- The Library of Congress Subject Headings (LCSH) has been developed over the course of more than a century, predating the semantic web by some time. Until 1986, the only concept-to-concept relationship available was an undifferentiated "See Also" reference, which was used for both associative (RT) and hierarchical (BT/NT) connections. In that year, in preparation for the first release of the headings in machine readable MARC Authorities form, an attempt was made to automatically convert these "See Also" links into the standardized thesaural relations. Unfortunately, the rule used to determine the type of reference to generate relied on the presence of symmetric links to detect associatively related terms; "See Also" references that were only present in one of the related terms were assumed to be hierarchical. This left the process vulnerable to inconsistent use of references in the pre-conversion data, with a marked bias towards promoting relationships to hierarchical status. The Library of Congress was aware that the results of the conversion contained many inconsistencies, and intended to validate and correct the results over the course of time. Unfortunately, twenty years later, fewer than 40% of the converted records have been evaluated. The converted records, being the earliest encountered during the Library's cataloging activities, represent the most basic concepts within LCSH; errors in the syndetic structure for these records affect far more subordinate concepts than those nearer the periphery. Worse, a policy of patterning new headings after pre-existing ones leads to structural errors arising from the conversion process being replicated in these newer headings, perpetuating and exacerbating the errors. As the LCSH prepares for its second great conversion, from MARC to SKOS, it is critical to address these structural problems. 
As part of the work on converting the headings into SKOS, I have experimented with different visualizations of the tangled web of broader terms embedded in LCSH. This poster illustrates several of these renderings, shows how they can help users to judge which relationships might not be correct, and shows just exactly how Doorbells and Mammals are related.
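The conversion rule described in this abstract (symmetric "See Also" links become associative, one-sided links are assumed hierarchical) can be sketched as follows. The data structure, function name, and headings are hypothetical. The sketch only illustrates the rule as summarized here and does not reproduce the actual 1986 conversion procedure.

```python
def classify_see_also(see_also):
    # see_also maps each heading to the set of headings it points to.
    relations = {}
    for source, targets in see_also.items():
        for target in targets:
            # A link is symmetric if the target also points back to the source.
            symmetric = source in see_also.get(target, set())
            relations[(source, target)] = (
                "associative (RT)" if symmetric else "hierarchical (BT/NT)"
            )
    return relations


# Hypothetical pre-conversion references (illustrative only).
see_also = {
    "Heading A": {"Heading B", "Heading C"},
    "Heading B": {"Heading A"},
    "Heading C": set(),
}

for link, kind in sorted(classify_see_also(see_also).items()):
    print(link, "->", kind)
```

Under this rule a single missing back-reference is enough to turn an intended associative link into a hierarchical one, which is the bias towards hierarchical status that the abstract describes.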
-
White, H.D.: Relevance in theory (2009)
0.04
0.038782157 = product of:
0.15512863 = sum of:
0.15512863 = weight(_text_:judge in 859) [ClassicSimilarity], result of:
0.15512863 = score(doc=859,freq=2.0), product of:
0.45402667 = queryWeight, product of:
7.731176 = idf(docFreq=52, maxDocs=44421)
0.058726728 = queryNorm
0.34167293 = fieldWeight in 859, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.731176 = idf(docFreq=52, maxDocs=44421)
0.03125 = fieldNorm(doc=859)
0.25 = coord(1/4)
- Abstract
- Relevance is the central concept in information science because of its salience in designing and evaluating literature-based answering systems. It is salient when users seek information through human intermediaries, such as reference librarians, but becomes even more so when systems are automated and users must navigate them on their own. Designers of classic precomputer systems of the nineteenth and twentieth centuries appear to have been no less concerned with relevance than the information scientists of today. The concept has, however, proved difficult to define and operationalize. A common belief is that it is a relation between a user's request for information and the documents the system retrieves in response. Documents might be considered retrieval-worthy because they: 1) constitute evidence for or against a claim; 2) answer a question; or 3) simply match the request in topic. In practice, literature-based answering makes use of term-matching technology, and most evaluation of relevance has involved topical match as the primary criterion for acceptability. The standard table for evaluating the relation of retrieved documents to a request has only the values "relevant" and "not relevant," yet many analysts hold that relevance admits of degrees. Moreover, many analysts hold that users decide relevance on more dimensions than topical match. Who then can validly judge relevance? Is it only the person who put the request and who can evaluate a document on multiple dimensions? Or can surrogate judges perform this function on the basis of topicality? Such questions arise in a longstanding debate on whether relevance is objective or subjective. One proposal has been to reframe the debate in terms of relevance theory (imported from linguistic pragmatics), which makes relevance increase with a document's valuable cognitive effects and decrease with the effort needed to process it. 
This notion allows degree of topical match to contribute to relevance but allows other considerations to contribute as well. Since both cognitive effects and processing effort will differ across users, they can be taken as subjective, but users' decisions can also be objectively evaluated if the logic behind them is made explicit. Relevance seems problematical because the considerations that lead people to accept documents in literature searches, or to use them later in contexts such as citation, are seldom fully revealed. Once they are revealed, relevance may be seen as not only multidimensional and dynamic, but also understandable.