Search (112 results, page 1 of 6)

  • theme_ss:"Retrievalalgorithmen"
  1. Pfeifer, U.; Pennekamp, S.: Incremental processing of vague queries in interactive retrieval systems (1997) 0.06
    0.059097588 = product of:
      0.118195176 = sum of:
        0.030633135 = weight(_text_:und in 1735) [ClassicSimilarity], result of:
          0.030633135 = score(doc=1735,freq=2.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.19603643 = fieldWeight in 1735, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.0625 = fieldNorm(doc=1735)
        0.08756204 = weight(_text_:have in 1735) [ClassicSimilarity], result of:
          0.08756204 = score(doc=1735,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.39414543 = fieldWeight in 1735, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.0625 = fieldNorm(doc=1735)
      0.5 = coord(2/4)
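
    For readers unfamiliar with these explain trees: each term's weight is queryWeight (idf x queryNorm) times fieldWeight (sqrt(tf) x idf x fieldNorm), the per-term weights are summed, and the sum is scaled by the coordination factor. A minimal Python sketch that reproduces the numbers reported above for entry 1 (all constants are copied from the tree; nothing else is assumed):

      import math

      def term_weight(tf, idf, query_norm, field_norm):
          # ClassicSimilarity-style weight: queryWeight = idf * queryNorm,
          # fieldWeight = sqrt(tf) * idf * fieldNorm
          return (idf * query_norm) * (math.sqrt(tf) * idf * field_norm)

      QUERY_NORM = 0.07045517
      w_und  = term_weight(tf=2.0, idf=2.217899,  query_norm=QUERY_NORM, field_norm=0.0625)
      w_have = term_weight(tf=4.0, idf=3.1531634, query_norm=QUERY_NORM, field_norm=0.0625)
      score = (2 / 4) * (w_und + w_have)   # coord(2/4): 2 of 4 query terms matched
      print(round(w_und, 9), round(w_have, 9), round(score, 9))   # ~0.030633, ~0.087562, ~0.059098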
    
    Abstract
    The application of information retrieval techniques in interactive environments requires systems capable of efficiently processing vague queries. To reach reasonable response times, new data structures and algorithms have to be developed. In this paper we describe an approach that takes advantage of the conditions of interactive usage and of special access paths. To have a reference, we investigated text queries and compared our algorithms to the well-known 'Buckley/Lewit' algorithm, achieving significant improvements in response times.
    Source
    Hypertext - Information Retrieval - Multimedia '97: Theorien, Modelle und Implementierungen integrierter elektronischer Informationssysteme. Proceedings HIM '97. Hrsg.: N. Fuhr u.a
  2. Kleinberg, J.M.: Authoritative sources in a hyperlinked environment (1998) 0.03
    0.034705818 = product of:
      0.069411635 = sum of:
        0.022974849 = weight(_text_:und in 1005) [ClassicSimilarity], result of:
          0.022974849 = score(doc=1005,freq=2.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.14702731 = fieldWeight in 1005, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.046875 = fieldNorm(doc=1005)
        0.046436783 = weight(_text_:have in 1005) [ClassicSimilarity], result of:
          0.046436783 = score(doc=1005,freq=2.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.20902719 = fieldWeight in 1005, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=1005)
      0.5 = coord(2/4)
    
    Abstract
    The network structure of a hyperlinked environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. We develop a set of algorithmic tools for extracting information from the link structures of such environments, and report on experiments that demonstrate their effectiveness in a variety of contexts on the World Wide Web. The central issue we address within our framework is the distillation of broad search topics, through the discovery of "authoritative" information sources on such topics. We propose and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of "hub pages" that join them together in the link structure. Our formulation has connections to the eigenvectors of certain matrices associated with the link graph; these connections in turn motivate additional heuristics for link-based analysis.
    Content
    Earlier versions also published in: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, 1998, and as IBM Research Report RJ 10076, May 1997.
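
    The hub/authority formulation in the abstract admits a compact illustration: a page's authority score is the sum of the hub scores of pages linking to it, its hub score is the sum of the authority scores of pages it links to, and the two are iterated with normalisation. A minimal sketch of that iteration (the toy graph and iteration count are invented for illustration):

      def hits(graph, iterations=50):
          """graph: dict mapping a page to the list of pages it links to."""
          pages = set(graph) | {q for targets in graph.values() for q in targets}
          hub = {p: 1.0 for p in pages}
          auth = {p: 1.0 for p in pages}
          for _ in range(iterations):
              # authority: sum of hub scores of pages that link to p
              auth = {p: sum(hub[q] for q in graph if p in graph[q]) for p in pages}
              # hub: sum of authority scores of the pages that p links to
              hub = {p: sum(auth[q] for q in graph.get(p, [])) for p in pages}
              for scores in (auth, hub):            # normalise so values stay bounded
                  norm = sum(v * v for v in scores.values()) ** 0.5 or 1.0
                  for p in scores:
                      scores[p] /= norm
          return hub, auth

      hub, auth = hits({"a": ["b", "c"], "b": ["c"], "d": ["c"]})   # "c" gets the top authority score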
  3. Bidoki, A.M.Z.; Yazdani, N.: ¬An intelligent ranking algorithm for web pages : DistanceRank (2008) 0.02
    0.023459004 = product of:
      0.09383602 = sum of:
        0.09383602 = weight(_text_:have in 3068) [ClassicSimilarity], result of:
          0.09383602 = score(doc=3068,freq=6.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.42238668 = fieldWeight in 3068, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.0546875 = fieldNorm(doc=3068)
      0.25 = coord(1/4)
    
    Abstract
    A fast and efficient page ranking mechanism for web crawling and retrieval remains a challenging issue. Recently, several link-based ranking algorithms such as PageRank, HITS and OPIC have been proposed. In this paper, we propose a novel recursive method based on reinforcement learning, called "DistanceRank", which treats the distance between pages as punishment in order to compute the ranks of web pages. The distance is defined as the number of "average clicks" between two pages. The objective is to minimize punishment, or distance, so that a page with a smaller distance has a higher rank. Experimental results indicate that DistanceRank outperforms other ranking algorithms in page ranking and crawl scheduling. Furthermore, the complexity of DistanceRank is low. We have used the University of California at Berkeley's web for our experiments.
  4. Jindal, V.; Bawa, S.; Batra, S.: ¬A review of ranking approaches for semantic search on Web (2014) 0.02
    0.020107718 = product of:
      0.08043087 = sum of:
        0.08043087 = weight(_text_:have in 3799) [ClassicSimilarity], result of:
          0.08043087 = score(doc=3799,freq=6.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.3620457 = fieldWeight in 3799, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=3799)
      0.25 = coord(1/4)
    
    Abstract
    With ever-increasing information available to end users, search engines have become the most powerful tools for obtaining useful information scattered across the Web. However, even the most renowned search engines commonly return result sets containing pages of little use to the user. Research on semantic search aims to improve traditional information search and retrieval methods, where the basic relevance criteria rely primarily on the presence of query keywords within the returned pages. This work explores different semantics-based relevance ranking approaches considered appropriate for the retrieval of relevant information. Various pilot projects and their outcomes are examined with respect to the methodologies they adopt and their most distinctive ranking characteristics. An overview of selected approaches and a comparison based on classification criteria are presented; this comparison identifies some common concepts and outstanding features.
  5. Robertson, S.E.; Sparck Jones, K.: Simple, proven approaches to text retrieval (1997) 0.02
    0.01934866 = product of:
      0.07739464 = sum of:
        0.07739464 = weight(_text_:have in 5532) [ClassicSimilarity], result of:
          0.07739464 = score(doc=5532,freq=8.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.34837866 = fieldWeight in 5532, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.0390625 = fieldNorm(doc=5532)
      0.25 = coord(1/4)
    
    Abstract
    This technical note describes straightforward techniques for document indexing and retrieval that have been solidly established through extensive testing and are easy to apply. They are useful for many different types of text material, are viable for very large files, and have the advantage that they do not require special skills or training for searching, but are easy for end users. The document and text retrieval methods described here have a sound theoretical basis, are well established by extensive testing, and the ideas involved are now implemented in some commercial retrieval systems. Testing in the last few years has, in particular, shown that the methods presented here work very well with full texts, not only titles and abstracts, and with large files of texts containing three quarters of a million documents. These tests, the TREC tests (see Harman 1993-1997; IP&M 1995), have been rigorous comparative evaluations involving many different approaches to information retrieval. These techniques depend on the use of simple terms for indexing both request and document texts; on term weighting exploiting statistical information about term occurrences; on scoring for request-document matching, using these weights, to obtain a ranked search output; and on relevance feedback to modify request weights or term sets in iterative searching. The normal implementation is via an inverted file organisation using a term list with linked document identifiers, plus counting data, and pointers to the actual texts. The user's request can be a word list, phrases, sentences or extended text.
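
    The core recipe of the note (simple terms, statistical term weighting, request-document scoring over an inverted file) can be sketched in a few lines. The following is a generic tf-idf ranker over a toy inverted file, assuming the simplest weighting the note alludes to rather than the specific weights used in the TREC experiments:

      import math
      from collections import Counter, defaultdict

      docs = {1: "simple proven text retrieval",
              2: "text retrieval with term weighting",
              3: "relevance feedback in iterative searching"}

      # inverted file: term -> {document identifier: term count}
      index = defaultdict(dict)
      for doc_id, text in docs.items():
          for term, tf in Counter(text.split()).items():
              index[term][doc_id] = tf

      def search(query):
          scores = defaultdict(float)
          for term in query.split():
              postings = index.get(term, {})
              if postings:
                  idf = math.log(len(docs) / len(postings))   # collection-frequency weight
                  for doc_id, tf in postings.items():
                      scores[doc_id] += tf * idf              # request-document match score
          return sorted(scores.items(), key=lambda item: -item[1])

      print(search("text retrieval weighting"))               # ranked output, best first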
  6. García Cumbreras, M.A.; Perea-Ortega, J.M.; García Vega, M.; Ureña López, L.A.: Information retrieval with geographical references : relevant documents filtering vs. query expansion (2009) 0.02
    0.019154195 = product of:
      0.07661678 = sum of:
        0.07661678 = weight(_text_:have in 222) [ClassicSimilarity], result of:
          0.07661678 = score(doc=222,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.34487724 = fieldWeight in 222, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.0546875 = fieldNorm(doc=222)
      0.25 = coord(1/4)
    
    Abstract
    This is a thorough analysis of two techniques applied to Geographic Information Retrieval (GIR). Previous studies have investigated query expansion as a means of improving the selection process of information retrieval systems. This paper instead emphasizes the effectiveness of filtering relevant documents in a GIR system. Several experiments have been run within the available CLEF (Cross Language Evaluation Forum) framework, some based on query expansion and some on the filtering of relevant documents. The results show that filtering works better in a GIR environment, because relevant documents are not reordered in the final list.
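
    The filtering idea contrasted with query expansion can be illustrated very simply: run a baseline retrieval, then keep only the documents whose recognised place names fall within the query's geographic scope. The sketch below is only an illustration of that filtering step with invented data; place-name recognition and the rest of the authors' GIR system are out of scope here:

      def geo_filter(ranked_docs, query_places, doc_places):
          """ranked_docs: baseline result list (best first).
          doc_places: dict mapping a doc id to the set of place names found in it.
          Keep, in their original order, only documents mentioning a query place."""
          return [d for d in ranked_docs if doc_places.get(d, set()) & query_places]

      baseline = ["d3", "d1", "d7"]
      kept = geo_filter(baseline, {"madrid", "spain"},
                        {"d3": {"madrid"}, "d1": {"paris"}, "d7": {"spain", "jaen"}})
      print(kept)   # -> ['d3', 'd7']; surviving documents keep their relative order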
  7. Mandl, T.: Web- und Multimedia-Dokumente : Neuere Entwicklungen bei der Evaluierung von Information Retrieval Systemen (2003) 0.02
    0.018758887 = product of:
      0.07503555 = sum of:
        0.07503555 = weight(_text_:und in 2734) [ClassicSimilarity], result of:
          0.07503555 = score(doc=2734,freq=12.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.48018923 = fieldWeight in 2734, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.0625 = fieldNorm(doc=2734)
      0.25 = coord(1/4)
    
    Abstract
    The amount of data on the Internet continues to grow rapidly, and with it the need for high-quality information retrieval services for orientation and problem-oriented searching. Deciding whether to use or procure information retrieval software requires meaningful evaluation results. This paper presents recent developments in the evaluation of information retrieval systems and shows a trend towards the specialisation and diversification of evaluation studies, which increases the realism of their results. The focus is on the retrieval of specialist texts, web pages and multimedia objects.
    Source
    Information - Wissenschaft und Praxis. 54(2003) H.4, S.203-210
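
    Evaluation studies of the kind surveyed here ultimately rest on effectiveness measures computed against relevance judgments. A minimal sketch of precision and recall at a cutoff, with invented judgments, just to make the underlying arithmetic concrete:

      def precision_recall_at_k(ranked, relevant, k):
          """ranked: result list, best first; relevant: set of relevant doc ids."""
          hits = sum(1 for d in ranked[:k] if d in relevant)
          precision = hits / k
          recall = hits / len(relevant) if relevant else 0.0
          return precision, recall

      print(precision_recall_at_k(["d2", "d9", "d4", "d1"], {"d2", "d1", "d7"}, k=3))
      # -> (0.333..., 0.333...): one of the top three is relevant; one of three relevant documents found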
  8. Nagelschmidt, M.: Verfahren zur Anfragemodifikation im Information Retrieval (2008) 0.02
    0.018163214 = product of:
      0.072652854 = sum of:
        0.072652854 = weight(_text_:und in 3774) [ClassicSimilarity], result of:
          0.072652854 = score(doc=3774,freq=20.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.4649412 = fieldWeight in 3774, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.046875 = fieldNorm(doc=3774)
      0.25 = coord(1/4)
    
    Abstract
    Information retrieval offers a wide variety of options for modifying search queries. After an introductory account of the interplay between information need and query, a conceptual and typological approach to query modification methods is given. Following a brief characterisation of fact retrieval and information retrieval, and of the vector space and probabilistic models, intellectual, automatic and interactive modification methods are presented. Besides classical intellectual methods such as the building-block strategy and the 'citation pearl growing' strategy, the treatment of automatic and interactive methods covers modification options at the level of the morphology, syntax and semantics of search terms. In addition, relevance feedback, the usefulness of informetric analyses, and the idea of associative retrieval based on clustering and terminological techniques as well as citation-analytic methods are pursued. Finally, five application examples are intended to convey an impression of the practical possibilities of the methods discussed.
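
    Of the interactive methods mentioned, relevance feedback is the easiest to make concrete. A minimal sketch of Rocchio-style query modification over term-weight vectors; the alpha/beta/gamma values are conventional textbook defaults, not taken from the paper:

      from collections import defaultdict

      def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
          """query and each document are dicts mapping a term to a weight."""
          modified = defaultdict(float)
          for term, w in query.items():
              modified[term] += alpha * w
          for doc in relevant:                      # move the query towards judged-relevant documents
              for term, w in doc.items():
                  modified[term] += beta * w / len(relevant)
          for doc in nonrelevant:                   # and away from judged-nonrelevant ones
              for term, w in doc.items():
                  modified[term] -= gamma * w / len(nonrelevant)
          return {t: w for t, w in modified.items() if w > 0}

      print(rocchio({"retrieval": 1.0},
                    relevant=[{"retrieval": 0.8, "feedback": 0.6}],
                    nonrelevant=[{"boolean": 0.9}]))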
  9. Fuhr, N.: Zur Überwindung der Diskrepanz zwischen Retrievalforschung und -praxis (1990) 0.02
    0.017124442 = product of:
      0.06849777 = sum of:
        0.06849777 = weight(_text_:und in 6624) [ClassicSimilarity], result of:
          0.06849777 = score(doc=6624,freq=10.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.4383508 = fieldWeight in 6624, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.0625 = fieldNorm(doc=6624)
      0.25 = coord(1/4)
    
    Abstract
    This paper presents some research results from information retrieval that can be applied directly to improve retrieval quality for existing databases: linguistic algorithms for reduction to base and stem forms support searching for inflected and derived forms of search terms; ranking algorithms that weight query and document terms lead to significantly better retrieval results than Boolean retrieval; and relevance feedback can further increase retrieval quality while supporting the user in successively modifying the query formulation. A user-friendly interface for a system based on these concepts is presented.
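
    The first technique listed, reduction to base and stem forms, can be illustrated with a crude suffix-stripping stemmer. Real German stemmers (e.g. the Snowball family) are far more careful; this toy merely shows how inflected forms of a search term collapse onto a single index term:

      def crude_stem(word, suffixes=("ungen", "ung", "en", "er", "es", "e", "n", "s")):
          """Strip the longest matching suffix; a toy stand-in for a real stemmer."""
          w = word.lower()
          for suffix in sorted(suffixes, key=len, reverse=True):
              if w.endswith(suffix) and len(w) - len(suffix) >= 3:
                  return w[: -len(suffix)]
          return w

      # "Suche", "suchen" and "Suchen" all collapse onto the stem "such"
      print({crude_stem(w) for w in ["Suche", "suchen", "Suchen"]})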
  10. Tober, M.; Hennig, L.; Furch, D.: SEO Ranking-Faktoren und Rang-Korrelationen 2014 : Google Deutschland (2014) 0.02
    0.017124442 = product of:
      0.06849777 = sum of:
        0.06849777 = weight(_text_:und in 2484) [ClassicSimilarity], result of:
          0.06849777 = score(doc=2484,freq=10.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.4383508 = fieldWeight in 2484, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.0625 = fieldNorm(doc=2484)
      0.25 = coord(1/4)
    
    Abstract
    This whitepaper is concerned with defining and assessing factors that show a high rank-correlation coefficient with organic search results, and serves the purpose of a deeper analysis of search engine algorithms. The data collection and its analysis refer to ranking factors for Google Germany in 2014. In addition, the correlations and factors were interpreted with regard to their relevance for top search result positions, drawing among other things on mean and median values as well as on trends relative to previous years.
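
    The rank correlations reported in such studies are typically Spearman coefficients between a factor's values and the pages' result positions. A minimal sketch of that computation for a single factor; the data are invented and the whitepaper's own aggregation across keywords is more involved:

      def spearman(xs, ys):
          """Spearman's rho for two equally long value sequences without ties."""
          def ranks(values):
              order = sorted(range(len(values)), key=lambda i: values[i])
              r = [0] * len(values)
              for rank, i in enumerate(order, start=1):
                  r[i] = rank
              return r
          rx, ry = ranks(xs), ranks(ys)
          n = len(xs)
          d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
          return 1 - 6 * d2 / (n * (n ** 2 - 1))

      positions = [1, 2, 3, 4, 5]                        # result positions, 1 = best
      factor = [120, 90, 95, 40, 10]                     # hypothetical factor values per page
      print(spearman(factor, [-p for p in positions]))   # positions negated so that higher = better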
  11. Behnert, C.; Borst, T.: Neue Formen der Relevanz-Sortierung in bibliothekarischen Informationssystemen : das DFG-Projekt LibRank (2015) 0.02
    0.017124442 = product of:
      0.06849777 = sum of:
        0.06849777 = weight(_text_:und in 392) [ClassicSimilarity], result of:
          0.06849777 = score(doc=392,freq=10.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.4383508 = fieldWeight in 392, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.0625 = fieldNorm(doc=392)
      0.25 = coord(1/4)
    
    Abstract
    The DFG-funded project LibRank investigates new ranking methods for library information systems which, building on findings from web search, take into account quality-inducing factors such as the recency, popularity and availability of individual media. The methods devised are being developed in the context of a search portal widely used in economics (EconBiz) and are systematically evaluated in a test system. Ranking factors of particular interest to the library sector are presented, and problems and challenges are illustrated by example.
    Source
    Bibliothek: Forschung und Praxis. 39(2015) H.3, S.384-393
  12. Picard, J.; Savoy, J.: Enhancing retrieval with hyperlinks : a general model based on propositional argumentation systems (2003) 0.02
    0.016756432 = product of:
      0.06702573 = sum of:
        0.06702573 = weight(_text_:have in 2427) [ClassicSimilarity], result of:
          0.06702573 = score(doc=2427,freq=6.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.30170476 = fieldWeight in 2427, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.0390625 = fieldNorm(doc=2427)
      0.25 = coord(1/4)
    
    Abstract
    Fast, effective, and adaptable techniques are needed to automatically organize and retrieve information on the ever-increasing World Wide Web. In that respect, different strategies have been suggested to take hypertext links into account. For example, hyperlinks have been used to (1) enhance document representation, (2) improve document ranking by propagating document scores, (3) provide an indicator of popularity, and (4) find hubs and authorities for a given topic. Although the TREC experiments have not demonstrated the usefulness of hyperlinks for retrieval, the hypertext structure is nevertheless an essential aspect of the Web and, as such, should not be ignored. The development of abstract models of the IR task was a key factor in the improvement of search engines. However, conceptual tools for modeling the hypertext retrieval task are currently lacking, making it difficult to compare, improve, and reason about the existing techniques. This article proposes a general model for using hyperlinks based on Probabilistic Argumentation Systems, in which each of the above-mentioned techniques can be stated. The model makes it possible to uncover some inconsistencies in those techniques and to take a higher-level, systematic approach to using hyperlinks for retrieval.
  13. He, J.; Meij, E.; Rijke, M. de: Result diversification based on query-specific cluster ranking (2011) 0.02
    0.016756432 = product of:
      0.06702573 = sum of:
        0.06702573 = weight(_text_:have in 355) [ClassicSimilarity], result of:
          0.06702573 = score(doc=355,freq=6.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.30170476 = fieldWeight in 355, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.0390625 = fieldNorm(doc=355)
      0.25 = coord(1/4)
    
    Abstract
    Result diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking, in which diversification is restricted to documents belonging to clusters that potentially contain a high percentage of relevant documents. Empirical results show that the proposed framework improves the performance of several existing diversification methods. The framework also gives rise to a simple yet effective cluster-based approach to result diversification that selects documents from different clusters to be included in a ranked list in a round robin fashion. We describe a set of experiments aimed at thoroughly analyzing the behavior of the two main components of the proposed diversification framework, ranking and selecting clusters for diversification. Both components have a crucial impact on the overall performance of our framework, but ranking clusters plays a more important role than selecting clusters. We also examine properties that clusters should have in order for our diversification framework to be effective. Most relevant documents should be contained in a small number of high-quality clusters, while there should be no dominantly large clusters. Also, documents from these high-quality clusters should have a diverse content. These properties are strongly correlated with the overall performance of the proposed diversification framework.
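
    The round-robin selection described at the end of the abstract can be sketched directly: given clusters already ranked by their estimated share of relevant documents, take one document from each cluster in turn until the result list is full. The cluster ranking itself, which is the paper's main contribution, is assumed as given here:

      from itertools import zip_longest

      def round_robin_diversify(ranked_clusters, k=10):
          """ranked_clusters: list of document lists, best cluster first,
          each list ordered by within-cluster relevance."""
          result = []
          for layer in zip_longest(*ranked_clusters):    # one layer = one document per cluster
              for doc in layer:
                  if doc is not None:
                      result.append(doc)
                  if len(result) == k:
                      return result
          return result

      clusters = [["d1", "d4", "d7"], ["d2", "d5"], ["d3"]]
      print(round_robin_diversify(clusters, k=5))        # -> ['d1', 'd2', 'd3', 'd4', 'd5']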
  14. Dreßler, H.: Fuzzy Information Retrieval (2008) 0.02
    0.016580671 = product of:
      0.066322684 = sum of:
        0.066322684 = weight(_text_:und in 3300) [ClassicSimilarity], result of:
          0.066322684 = score(doc=3300,freq=6.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.42443132 = fieldWeight in 3300, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.078125 = fieldNorm(doc=3300)
      0.25 = coord(1/4)
    
    Abstract
    After an explanation of the fundamentals of fuzzy logic, the principle of fuzzy search is presented and the differences from conventional information retrieval are described. Using the example of a search for bricks for a masonry wall, it is shown how a fuzzy search can be carried out successfully in the D&W fuzzy database and leads to unambiguous results.
    Source
    Information - Wissenschaft und Praxis. 59(2008) H.6/7, S.351-352
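
    The principle of fuzzy search can be shown with a single graded criterion: each record receives a membership degree for how well it satisfies a soft condition, and conjunction is taken as the minimum. The brick attributes and tolerances below are invented purely to illustrate the idea:

      def around(value, target, tolerance):
          """Triangular membership: 1 at the target, falling to 0 at +/- tolerance."""
          return max(0.0, 1.0 - abs(value - target) / tolerance)

      bricks = {"A": {"length_mm": 240, "strength": 12},
                "B": {"length_mm": 250, "strength": 20},
                "C": {"length_mm": 300, "strength": 28}}

      # fuzzy query: length about 240 mm AND strength about 20; fuzzy AND = minimum
      def degree(brick):
          return min(around(brick["length_mm"], 240, 30), around(brick["strength"], 20, 12))

      ranking = sorted(bricks, key=lambda name: degree(bricks[name]), reverse=True)
      print(ranking)   # -> ['B', 'A', 'C'], a graded rather than yes/no result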
  15. Schamber, L.; Bateman, J.: Relevance criteria uses and importance : progress in development of a measurement scale (1999) 0.02
    0.016417881 = product of:
      0.065671526 = sum of:
        0.065671526 = weight(_text_:have in 691) [ClassicSimilarity], result of:
          0.065671526 = score(doc=691,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.29560906 = fieldWeight in 691, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=691)
      0.25 = coord(1/4)
    
    Abstract
    The criteria employed by end-users in making relevance judgments can be powerful and useful indicators of the values users ascribe to a variety of factors in their information seeking and use situations. This paper describes intermediate results in a long-term project intended to develop a measurement scale based on users' relevance criteria. The five tests that are reported here have involved 350 users in an effort to progressively refine and validate the scale content. The range of research questions and types of users and information environments have gradually been expanded to assess the adaptability and transferability of the instrument. The instrument provides quantitative data, notably criterion importance ratings that can be analyzed using several techniques. The substantive findings confirm those of previous studies on relevance evaluation behavior
  16. Ding, Y.; Chowdhury, G.; Foo, S.: Organising keywords in a Web search environment : a methodology based on co-word analysis (2000) 0.02
    0.016417881 = product of:
      0.065671526 = sum of:
        0.065671526 = weight(_text_:have in 1105) [ClassicSimilarity], result of:
          0.065671526 = score(doc=1105,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.29560906 = fieldWeight in 1105, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=1105)
      0.25 = coord(1/4)
    
    Abstract
    The rapid development of the Internet and the World Wide Web has caused some critical problems for information retrieval, and researchers have made several attempts to solve them. Thesauri and subject heading lists, as traditional information retrieval tools, have been criticised for their limited ability to tackle these newly emerging problems. This paper proposes an information retrieval tool generated by co-word analysis, comprising keyword clusters whose relationships are based on the co-occurrences of keywords in the literature. Such a tool can play the role of an associative thesaurus, providing information about the keywords in a domain that may be useful for information searching and query expansion.
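
    The co-occurrence backbone of such a tool is straightforward to sketch: count how often keyword pairs appear together in document keyword lists and link pairs above a threshold into clusters. The data and the naive single-link grouping below are only an illustration of the co-word idea, not the authors' methodology:

      from collections import Counter
      from itertools import combinations

      doc_keywords = [["web", "search", "ranking"],
                      ["web", "crawling", "ranking"],
                      ["thesaurus", "query expansion"]]

      # co-occurrence counts over unordered keyword pairs
      cooc = Counter()
      for keywords in doc_keywords:
          for a, b in combinations(sorted(set(keywords)), 2):
              cooc[(a, b)] += 1

      def keyword_clusters(cooc, threshold=2):
          """Merge keywords that co-occur at least `threshold` times into groups."""
          groups = []
          for (a, b), n in cooc.items():
              if n < threshold:
                  continue
              touching = [g for g in groups if a in g or b in g]
              merged = set((a, b)).union(*touching) if touching else {a, b}
              groups = [g for g in groups if g not in touching] + [merged]
          return groups

      print(keyword_clusters(cooc))   # -> [{'ranking', 'web'}]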
  17. Beitzel, S.M.; Jensen, E.C.; Chowdhury, A.; Grossman, D.; Frieder, O; Goharian, N.: Fusion of effective retrieval strategies in the same information retrieval system (2004) 0.02
    0.016417881 = product of:
      0.065671526 = sum of:
        0.065671526 = weight(_text_:have in 3502) [ClassicSimilarity], result of:
          0.065671526 = score(doc=3502,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.29560906 = fieldWeight in 3502, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=3502)
      0.25 = coord(1/4)
    
    Abstract
    Prior efforts have shown that under certain situations retrieval effectiveness may be improved via the use of data fusion techniques. Although these improvements have been observed from the fusion of result sets from several distinct information retrieval systems, it has often been thought that fusing different document retrieval strategies in a single information retrieval system would lead to similar improvements. In this study, we show that this is not the case. We hold constant systemic differences such as parsing, stemming, phrase processing, and relevance feedback, and fuse result sets generated from highly effective retrieval strategies in the same information retrieval system. From this, we show that data fusion of highly effective retrieval strategies alone shows little or no improvement in retrieval effectiveness. Furthermore, we present a detailed analysis of the performance of modern data fusion approaches, and demonstrate the reasons why they do not perform well when applied to this problem. Detailed results and analyses are included to support our conclusions.
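
    Fusing result sets usually means combining per-system scores with a simple rule. The sketch below shows CombSUM and CombMNZ over min-max normalised scores; these are standard fusion baselines and not necessarily the exact approaches analysed in the paper:

      def normalise(run):
          """run: dict doc id -> raw score; min-max normalise into [0, 1]."""
          lo, hi = min(run.values()), max(run.values())
          return {d: (s - lo) / (hi - lo) if hi > lo else 0.0 for d, s in run.items()}

      def fuse(runs, mnz=False):
          """CombSUM sums normalised scores; CombMNZ additionally multiplies by
          the number of runs that retrieved the document."""
          fused = {}
          for run in map(normalise, runs):
              for doc, score in run.items():
                  total, hits = fused.get(doc, (0.0, 0))
                  fused[doc] = (total + score, hits + 1)
          scored = ((doc, s * (n if mnz else 1)) for doc, (s, n) in fused.items())
          return sorted(scored, key=lambda item: -item[1])

      run_a = {"d1": 12.0, "d2": 7.5, "d3": 3.0}
      run_b = {"d2": 0.9, "d4": 0.4}
      print(fuse([run_a, run_b], mnz=True))   # d2 is boosted because both runs retrieved it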
  18. Radev, D.; Fan, W.; Qu, H.; Wu, H.; Grewal, A.: Probabilistic question answering on the Web (2005) 0.02
    0.016417881 = product of:
      0.065671526 = sum of:
        0.065671526 = weight(_text_:have in 4455) [ClassicSimilarity], result of:
          0.065671526 = score(doc=4455,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.29560906 = fieldWeight in 4455, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=4455)
      0.25 = coord(1/4)
    
    Abstract
    Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of .20 on the TREC8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
  19. Lin, J.; Katz, B.: Building a reusable test collection for question answering (2006) 0.02
    0.016417881 = product of:
      0.065671526 = sum of:
        0.065671526 = weight(_text_:have in 45) [ClassicSimilarity], result of:
          0.065671526 = score(doc=45,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.29560906 = fieldWeight in 45, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=45)
      0.25 = coord(1/4)
    
    Abstract
    In contrast to traditional information retrieval systems, which return ranked lists of documents that users must manually browse through, a question answering system attempts to directly answer natural language questions posed by the user. Although such systems possess language-processing capabilities, they still rely on traditional document retrieval techniques to generate an initial candidate set of documents. In this article, the authors argue that document retrieval for question answering represents a task different from retrieving documents in response to more general retrospective information needs. Thus, to guide future system development, specialized question answering test collections must be constructed. They show that the current evaluation resources have major shortcomings; to remedy the situation, they have manually created a small, reusable question answering test collection for research purposes. In this article they describe their methodology for building this test collection and discuss issues they encountered regarding the notion of "answer correctness."
  20. Singh, S.; Dey, L.: ¬A rough-fuzzy document grading system for customized text information retrieval (2005) 0.02
    0.016417881 = product of:
      0.065671526 = sum of:
        0.065671526 = weight(_text_:have in 2007) [ClassicSimilarity], result of:
          0.065671526 = score(doc=2007,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.29560906 = fieldWeight in 2007, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=2007)
      0.25 = coord(1/4)
    
    Abstract
    Due to the large repository of documents available on the web, users are usually inundated by a large volume of information, most of which is found to be irrelevant. Since user perspectives vary, a client-side text filtering system that learns the user's perspective can reduce the problem of irrelevant retrieval. In this paper, we have provided the design of a customized text information filtering system which learns user preferences and modifies the initial query to fetch better documents. It uses a rough-fuzzy reasoning scheme. The rough-set based reasoning takes care of natural language nuances, like synonym handling, very elegantly. The fuzzy decider provides qualitative grading to the documents for the user's perusal. We have provided the detailed design of the various modules and some results related to the performance analysis of the system.

Languages

  • e 75
  • d 36
  • m 1

Types

  • a 93
  • m 7
  • x 7
  • el 2
  • r 2
  • s 2
  • p 1