-
Bidoki, A.M.Z.; Yazdani, N.: an intelligent ranking algorithm for web pages : DistanceRank (2008)
- Abstract
- A fast and efficient page ranking mechanism for web crawling and retrieval remains a challenging issue. Recently, several link-based ranking algorithms like PageRank, HITS and OPIC have been proposed. In this paper, we propose a novel recursive method based on reinforcement learning, called "DistanceRank", which treats the distance between pages as a punishment and uses it to compute the ranks of web pages. The distance is defined as the number of "average clicks" between two pages. The objective is to minimize punishment or distance, so that a page with a smaller distance obtains a higher rank. Experimental results indicate that DistanceRank outperforms other ranking algorithms in page ranking and crawling scheduling. Furthermore, the complexity of DistanceRank is low. We used the University of California at Berkeley's web for our experiments.
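- A minimal sketch of the distance-based ranking idea described above, in Python. The cost of following a link is assumed here to be the log of the source page's out-degree (one reading of "average clicks"); the cost floor for single-link pages, the function names, and the toy graph are illustrative, and the authors' reinforcement-learning recursion is not reproduced. Pages with a smaller accumulated distance rank higher.

    import math

    def distance_rank(graph, seeds, iterations=50):
        """Estimate an 'average clicks' distance for every page and rank by it.

        graph: dict page -> list of pages it links to
        seeds: pages assumed reachable at distance 0 (e.g. crawl entry points)
        Following a link from a page with out-degree d is assumed to cost
        log10(d) clicks; a page ranks higher the smaller its distance.
        """
        dist = {p: 0.0 if p in seeds else math.inf for p in graph}
        for _ in range(iterations):
            changed = False
            for page, outlinks in graph.items():
                if not outlinks or math.isinf(dist[page]):
                    continue
                # arbitrary small floor so single-link pages still add some cost
                cost = math.log10(len(outlinks)) if len(outlinks) > 1 else 0.1
                for target in outlinks:
                    if dist[page] + cost < dist.get(target, math.inf):
                        dist[target] = dist[page] + cost
                        changed = True
            if not changed:
                break
        # smaller distance = less "punishment" = higher rank
        return sorted(dist, key=dist.get)

    web = {"a": ["b", "c"], "b": ["c"], "c": ["a", "d"], "d": []}
    print(distance_rank(web, seeds={"a"}))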
-
Abdelkareem, M.A.A.: In terms of publication index, what indicator is the best for researchers indexing, Google Scholar, Scopus, Clarivate or others? (2018)
- Abstract
- I believe that Google Scholar is the most popular academic indexing service for researchers and citations. However, some other indexing institutions may be more rigorous than Google Scholar, yet not as popular. Other indexing websites such as Scopus and Clarivate provide more statistical figures for scholars, institutions, or even journals. With respect to publication citations, Google Scholar almost always shows higher citation counts for a paper than other indexing websites, since it covers most publication platforms and can therefore count citations easily, whereas other databases only count citations coming from journals that are already indexed in their own databases.
-
Käki, M.: fKWIC: frequency-based Keyword-in-Context Index for filtering Web search results (2006)
- Abstract
- Enormous Web search engine databases combined with short search queries result in large result sets that are often difficult to access. Result ranking works fairly well, but users need help when it fails. For these situations, we propose a filtering interface that is inspired by keyword-in-context (KWIC) indices. The user interface lists the most frequent keyword contexts (fKWIC). When a context is selected, the corresponding results are displayed in the result list, allowing users to concentrate on the specific context. We compared the keyword context index user interface to the rank order result listing in an experiment with 36 participants. The results show that the proposed user interface was 29% faster in finding relevant results, and the precision of the selected results was 19% higher. In addition, participants showed positive attitudes toward the system.
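- A small sketch of the frequency-based keyword-in-context idea in Python, assuming result snippets are available as plain text (the window size, names, and toy snippets are illustrative, not taken from the paper): it collects the words surrounding each occurrence of the query term and lists the most frequent contexts, which an interface could then offer as filters.

    import re
    from collections import Counter

    def frequent_contexts(snippets, query_term, window=2, top_k=5):
        """Count the most frequent keyword-in-context patterns around query_term."""
        contexts = Counter()
        for snippet in snippets:
            tokens = re.findall(r"\w+", snippet.lower())
            for i, tok in enumerate(tokens):
                if tok == query_term:
                    left = tokens[max(0, i - window):i]
                    right = tokens[i + 1:i + 1 + window]
                    contexts[" ".join(left + ["<" + query_term + ">"] + right)] += 1
        return contexts.most_common(top_k)

    snippets = [
        "java virtual machine performance tuning",
        "the java programming language tutorial",
        "java island travel guide",
        "learn the java programming language fast",
    ]
    for context, freq in frequent_contexts(snippets, "java"):
        print(freq, context)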
-
Deerwester, S.C.; Dumais, S.T.; Landauer, T.K.; Furnas, G.W.; Harshman, R.A.: Indexing by latent semantic analysis (1990)
- Abstract
- A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents ("semantic structure") in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term-by-document matrix is decomposed into a set of about 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by vectors of about 100 factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.
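- A compact sketch of the procedure described above, using NumPy's singular-value decomposition on a toy term-by-document matrix; the number of retained factors (2 here, about 100 in the article), the toy data, and the fold-in of the query as a pseudo-document follow the standard latent-semantic-indexing recipe rather than the article's exact experiments.

    import numpy as np

    # toy term-by-document matrix: rows = terms, columns = documents
    terms = ["human", "interface", "computer", "user", "system", "survey"]
    X = np.array([
        [1, 0, 0, 1, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 1, 1, 0, 1, 0],
        [0, 1, 1, 2, 0, 0],
        [0, 0, 0, 0, 0, 1],
    ], dtype=float)

    k = 2                                         # latent factors to keep
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uk, sk, Dk = U[:, :k], s[:k], Vt[:k, :].T     # rows of Dk are document vectors

    # fold the query in as a pseudo-document: q_hat = q U_k S_k^{-1}
    q = np.zeros(len(terms))
    q[terms.index("human")] = 1
    q[terms.index("computer")] = 1
    q_hat = q @ Uk @ np.diag(1.0 / sk)

    # rank documents by cosine similarity in the latent space
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    scores = [cosine(q_hat, d) for d in Dk]
    print(sorted(range(len(scores)), key=lambda i: -scores[i]))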
-
Efron, M.: Linear time series models for term weighting in information retrieval (2010)
- Abstract
- Common measures of term importance in information retrieval (IR) rely on counts of term frequency; rare terms receive higher weight in document ranking than common terms receive. However, realistic scenarios yield additional information about terms in a collection. Of interest in this article is the temporal behavior of terms as a collection changes over time. We propose capturing each term's collection frequency at discrete time intervals over the lifespan of a corpus and analyzing the resulting time series. We hypothesize that the collection frequency of a weakly discriminative term x at time t is predictable by a linear model of the term's prior observations. On the other hand, a linear time series model for a strong discriminator's collection frequency will yield a poor fit to the data. Operationalizing this hypothesis, we induce three time-based measures of term importance and test these against state-of-the-art term weighting models.
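- A minimal sketch of this hypothesis in Python (NumPy only; the autoregressive order, function name, and toy series are illustrative, and the article's three induced measures are not reproduced): fit a linear autoregressive model to a term's collection-frequency series by least squares and use the size of the fit error as a signal of term importance.

    import numpy as np

    def ar_fit_error(series, order=2):
        """Least-squares AR(order) fit; returns the root-mean-squared residual.

        A weakly discriminative term's collection frequency should be well
        predicted by its own past (small error); a strong discriminator
        should fit poorly (large error) and so receive a higher weight.
        """
        y = np.asarray(series, dtype=float)
        lags = np.column_stack([y[i:len(y) - order + i] for i in range(order)])
        X = np.column_stack([np.ones(len(lags)), lags])   # intercept + lagged values
        target = y[order:]
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        residuals = target - X @ coef
        return float(np.sqrt(np.mean(residuals ** 2)))

    # collection frequency of two terms at discrete time intervals (toy data)
    common_term = [40, 42, 41, 43, 42, 44, 43, 45, 44, 46]   # smooth, predictable
    bursty_term = [1, 0, 2, 30, 1, 0, 25, 2, 1, 28]          # bursty, poorly fit

    for name, series in [("common", common_term), ("bursty", bursty_term)]:
        print(name, round(ar_fit_error(series), 2))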
-
Keen, E.M.: Designing and testing an interactive ranked retrieval system for professional searchers (1994)
- Abstract
- Reports 3 explorations of ranked system design. 2 tests used a 'cystic fibrosis' test collection with 100 queries. Experiment 1 compared a Boolean with a ranked interactive system using a subject-qualified trained searcher, and reports recall and precision results. Experiment 2 compared 15 different ranked match algorithms in a batch mode using 2 test collections, and included some new proximate-pair and term weighting approaches. Experiment 3 is a design plan for an interactive ranked prototype offering mid-search algorithm choices plus other manual search devices (such as obligatory and unwanted terms), as influenced by think-aloud comments from experiment 1. Concludes that, in Boolean versus ranked retrieval using inverse collection frequency, the searcher inspected more records on the ranked system than on the Boolean one and so achieved higher recall but lower precision; however, the presentation order of the relevant records was, on average, very similar in both systems. Concludes also that: query reformulation was quite strongly practised in ranked searching but does not appear to have been effective; the proximate term-pair weighting methods in experiment 2 enhanced precision on both test collections when used with inverse collection frequency weighting (ICF); and the design plan for an interactive prototype adds to a selection of match algorithms other devices, such as obligatory and unwanted term marking, evidence for this being found in think-aloud comments.
-
Picard, J.; Savoy, J.: Enhancing retrieval with hyperlinks : a general model based on propositional argumentation systems (2003)
- Abstract
- Fast, effective, and adaptable techniques are needed to automatically organize and retrieve information on the ever-increasing World Wide Web. In that respect, different strategies have been suggested to take hypertext links into account. For example, hyperlinks have been used to (1) enhance document representation, (2) improve document ranking by propagating document scores, (3) provide an indicator of popularity, and (4) find hubs and authorities for a given topic. Although the TREC experiments have not demonstrated the usefulness of hyperlinks for retrieval, the hypertext structure is nevertheless an essential aspect of the Web, and as such, should not be ignored. The development of abstract models of the IR task was a key factor in the improvement of search engines. However, at this time conceptual tools for modeling the hypertext retrieval task are lacking, making it difficult to compare, improve, and reason about the existing techniques. This article proposes a general model for using hyperlinks based on Probabilistic Argumentation Systems, in which each of the above-mentioned techniques can be stated. This model makes it possible to discover some inconsistencies in the mentioned techniques, and to take a higher-level and systematic approach to using hyperlinks for retrieval.
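- As a concrete illustration of technique (2) above, a minimal sketch in Python that propagates a fraction of each document's content score to the documents it links to; this is our simplification, not the article's propositional-argumentation model, and the damping factor and iteration count are arbitrary.

    def propagate_scores(content_score, links, damping=0.5, iterations=3):
        """Blend each document's own score with score flowing in over hyperlinks.

        content_score: dict doc -> score from a content-only ranker
        links: dict doc -> list of docs it points to
        """
        score = dict(content_score)
        for _ in range(iterations):
            incoming = {d: 0.0 for d in score}
            for src, targets in links.items():
                if not targets:
                    continue
                share = score.get(src, 0.0) / len(targets)
                for tgt in targets:
                    incoming[tgt] = incoming.get(tgt, 0.0) + share
            score = {d: content_score.get(d, 0.0) + damping * incoming.get(d, 0.0)
                     for d in score}
        return sorted(score, key=score.get, reverse=True)

    scores = {"d1": 0.9, "d2": 0.4, "d3": 0.2}
    links = {"d1": ["d3"], "d2": ["d3"], "d3": []}
    print(propagate_scores(scores, links))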
-
Crouch, C.J.; Crouch, D.B.; Chen, Q.; Holtz, S.J.: Improving the retrieval effectiveness of very short queries (2002)
- Abstract
- This paper describes an automatic approach designed to improve the retrieval effectiveness of very short queries such as those used in web searching. The method is based on the observation that stemming, which is designed to maximize recall, often results in depressed precision. Our approach is based on pseudo-feedback and attempts to increase the number of relevant documents in the pseudo-relevant set by reranking those documents based on the presence of unstemmed query terms in the document text. The original experiments underlying this work were carried out using Smart 11.0 and the lnc.ltc weighting scheme on three sets of documents from the TREC collection with corresponding TREC (title only) topics as queries. (The average length of these queries after stoplisting ranges from 2.4 to 4.5 terms.) Results, evaluated in terms of P@20 and non-interpolated average precision, showed clearly that pseudo-feedback (PF) based on this approach was effective in increasing the number of relevant documents in the top ranks. Subsequent experiments, performed on the same data sets using Smart 13.0 and the improved Lnu.ltu weighting scheme, indicate that these results hold up even over the much higher baseline provided by the new weights. Query drift analysis presents a more detailed picture of the improvements produced by this process.
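- A small sketch of the reranking step described above, under simplifying assumptions (whole-word matching on raw text; the depth parameter and toy documents are illustrative): documents in the pseudo-relevant set are promoted according to how many of the original, unstemmed query terms they literally contain, before expansion terms are drawn from the reranked set.

    import re

    def rerank_pseudo_relevant(query_terms, ranked_docs, texts, depth=20):
        """Rerank the top `depth` documents by their count of unstemmed query terms.

        query_terms: the original, unstemmed query words
        ranked_docs: doc ids in the initial (stemmed) retrieval order
        texts: dict doc id -> raw document text
        The sort is stable, so ties keep the original retrieval order and the
        reranking only promotes documents containing the user's surface forms.
        """
        def unstemmed_hits(doc):
            words = set(re.findall(r"\w+", texts[doc].lower()))
            return sum(1 for t in query_terms if t.lower() in words)

        head = sorted(ranked_docs[:depth], key=lambda d: -unstemmed_hits(d))
        return head + ranked_docs[depth:]

    docs = {"d1": "running shoes for marathon runners",
            "d2": "the runner community organises runs",
            "d3": "marathon running tips"}
    print(rerank_pseudo_relevant(["running", "marathon"], ["d2", "d1", "d3"], docs))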
-
López-Pujalte, C.; Guerrero-Bote, V.P.; Moya-Anegón, F. de: Order-based fitness functions for genetic algorithms applied to relevance feedback (2003)
- Abstract
- Lopez-Pujalte and Guerrero-Bote test a relevance feedback genetic algorithm while varying its order-based fitness functions, using a function based upon the Ide dec-hi method as a baseline. Using the non-zero weighted term types assigned to the query, and to the initially retrieved set of documents, as genes, a chromosome of equal length is created for each. The algorithm is provided with the chromosomes for judged relevant documents, for judged irrelevant documents, and for the irrelevant documents with their terms negated. The algorithm uses random selection of all possible genes, but gives greater likelihood to those with higher fitness values. When the fittest chromosome of a previous population is eliminated it is restored, while the least fit chromosome of the new population is eliminated in its stead. A crossover probability of .8 and a mutation probability of .2 were used with 20 generations. Three fitness functions were utilized: the Horng and Yeh function, which takes into account the position of relevant documents, and two new functions, one based on accumulating the cosine similarity for retrieved documents, the other on stored fixed-recall-interval precisions. The Cranfield collection was used with the first 15 documents retrieved from 33 queries chosen to have at least 3 relevant documents in the first 15 and at least 5 relevant documents not initially retrieved. Precision was calculated at fixed recall levels using the residual collection method, which removes viewed documents. One of the three functions improved the original retrieval by 127 percent, while the Ide dec-hi method provided a 120 percent improvement.
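- A compressed sketch of this setup in Python, with many simplifications (tiny vocabulary, cosine ranking, a single order-based fitness that rewards placing judged-relevant documents early); the crossover probability of .8, mutation probability of .2, and 20 generations are taken from the abstract, everything else is illustrative.

    import random

    random.seed(0)

    # term-weight vectors for documents, plus relevance judgments for a few of them
    docs = {"d1": [1, 0, 1, 0], "d2": [0, 1, 1, 0], "d3": [1, 1, 0, 0], "d4": [0, 0, 0, 1]}
    relevant = {"d1", "d3"}

    def cosine(a, b):
        num = sum(x * y for x, y in zip(a, b))
        den = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5 or 1.0
        return num / den

    def order_fitness(query):
        """Order-based fitness: relevant documents ranked earlier score more."""
        ranking = sorted(docs, key=lambda d: -cosine(query, docs[d]))
        return sum(1.0 / (rank + 1) for rank, d in enumerate(ranking) if d in relevant)

    def evolve(pop_size=10, generations=20, p_cross=0.8, p_mut=0.2):
        pop = [[random.random() for _ in range(4)] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=order_fitness, reverse=True)
            new_pop = [pop[0][:]]                    # keep the fittest chromosome
            while len(new_pop) < pop_size:
                a, b = random.sample(pop[:5], 2)     # bias selection toward fitter members
                child = a[:]
                if random.random() < p_cross:        # one-point crossover
                    cut = random.randrange(1, 4)
                    child = a[:cut] + b[cut:]
                if random.random() < p_mut:          # mutate one gene
                    child[random.randrange(4)] = random.random()
                new_pop.append(child)
            pop = new_pop
        return max(pop, key=order_fitness)

    best_query = evolve()
    print([round(w, 2) for w in best_query], round(order_fitness(best_query), 3))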
-
Ruthven, I.; Lalmas, M.; Rijsbergen, K. van: Incorporating user search behavior into relevance feedback (2003)
- Abstract
- Ruthven, Lalmas, and van Rijsbergen rank and select terms for query expansion using information gathered on searcher evaluation behavior. Using the TREC Financial Times and Los Angeles Times collections and search topics from TREC-6 placed in simulated work situations, six student subjects each performed three searches on an experimental system and three on a control system, with instructions to search by natural language expression in any way they found comfortable. Searching was analyzed for behavioral differences between experimental and control situations, and for effectiveness and perceptions. In three experiments paired t-tests were the analysis tool, with the controls being a no-relevance-feedback system, a standard ranking for automatic expansion, and a standard ranking for interactive expansion, while the experimental systems based ranking upon user information on temporal relevance and partial relevance. Two further experiments compare using user behavior (number assessed relevant and similarity of relevant documents) to choose a query expansion technique against a non-selective technique, and finally the effect of providing the user with knowledge of the process. When partial relevance data and time-of-assessment data are incorporated in term ranking, more relevant documents were recovered in fewer iterations; however, retrieval effectiveness overall was not improved. The subjects nonetheless rated the suggested terms as more useful and used them more heavily. Explanations of what the feedback techniques were doing led to higher use of the techniques.
-
Liu, R.-L.; Huang, Y.-C.: Ranker enhancement for proximity-based ranking of biomedical texts (2011)
- Abstract
- Biomedical decision making often requires relevant evidence from the biomedical literature. Retrieval of the evidence calls for a system that receives a natural language query for a biomedical information need and, among the huge amount of texts retrieved for the query, ranks relevant texts higher for further processing. However, state-of-the-art text rankers have weaknesses in dealing with biomedical queries, which often consist of several correlating concepts and prefer those texts that completely talk about the concepts. In this article, we present a technique, Proximity-Based Ranker Enhancer (PRE), to enhance text rankers by term-proximity information. PRE assesses the term frequency (TF) of each term in the text by integrating three types of term proximity to measure the contextual completeness of query terms appearing in nearby areas in the text being ranked. Therefore, PRE may serve as a preprocessor for (or supplement to) those rankers that consider TF in ranking, without the need to change the algorithms and development processes of the rankers. Empirical evaluation shows that PRE significantly improves various kinds of text rankers, and when compared with several state-of-the-art techniques that enhance rankers by term-proximity information, PRE may more stably and significantly enhance the rankers.
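- A toy sketch of the contextual-completeness idea in Python (the window size, bonus, and example text are illustrative; the article's three proximity types and their integration are not reproduced): each occurrence of a query term counts more when the other query terms appear within a small window around it, so texts in which the query concepts co-occur are rewarded over scattered mentions.

    import re

    def proximity_tf(text, query_terms, window=5, bonus=0.5):
        """Proximity-adjusted term frequencies for query_terms in text.

        Each occurrence contributes 1, plus `bonus` for every other query
        term found within `window` token positions of it.
        """
        tokens = re.findall(r"\w+", text.lower())
        positions = {t: [i for i, tok in enumerate(tokens) if tok == t]
                     for t in query_terms}
        adjusted = {}
        for term in query_terms:
            score = 0.0
            for pos in positions[term]:
                score += 1.0
                for other in query_terms:
                    if other != term and any(abs(pos - p) <= window for p in positions[other]):
                        score += bonus
            adjusted[term] = score
        return adjusted

    text = "insulin resistance was measured; insulin levels rose while resistance to treatment fell"
    print(proximity_tf(text, ["insulin", "resistance"]))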
-
Pan, M.; Huang, J.X.; He, T.; Mao, Z.; Ying, Z.; Tu, X.: ¬A simple kernel co-occurrence-based enhancement for pseudo-relevance feedback (2020)
- Abstract
- Pseudo-relevance feedback is a well-studied query expansion technique in which it is assumed that the top-ranked documents in an initial set of retrieval results are relevant and expansion terms are then extracted from those documents. When selecting expansion terms, most traditional models do not simultaneously consider term frequency and the co-occurrence relationships between candidate terms and query terms. Intuitively, however, a term that has a higher co-occurrence with a query term is more likely to be related to the query topic. In this article, we propose a kernel co-occurrence-based framework to enhance retrieval performance by integrating term co-occurrence information into the Rocchio model and a relevance language model (RM3). Specifically, a kernel co-occurrence-based Rocchio method (KRoc) and a kernel co-occurrence-based RM3 method (KRM3) are proposed. In our framework, co-occurrence information is incorporated into both the factor of the term discrimination power and the factor of the within-document term weight to boost retrieval performance. The results of a series of experiments show that our proposed methods significantly outperform the corresponding strong baselines over all data sets in terms of the mean average precision and over most data sets in terms of P@10. A direct comparison of standard Text Retrieval Conference data sets indicates that our proposed methods are at least comparable to state-of-the-art approaches.
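- A rough sketch of the kernel co-occurrence idea in Python, using a Gaussian kernel over the token distance between a candidate expansion term and the nearest query term in the pseudo-relevant documents (the kernel width, names, and toy texts are illustrative, and the article's exact KRoc/KRM3 factors are not reproduced); the resulting scores would then be folded into the expansion-term weights of a Rocchio or RM3 model.

    import math
    import re
    from collections import defaultdict

    def kernel_cooccurrence(pseudo_relevant_texts, query_terms, sigma=5.0):
        """Score candidate expansion terms by kernel-weighted proximity to query terms."""
        scores = defaultdict(float)
        for text in pseudo_relevant_texts:
            tokens = re.findall(r"\w+", text.lower())
            q_positions = [i for i, t in enumerate(tokens) if t in query_terms]
            if not q_positions:
                continue
            for i, tok in enumerate(tokens):
                if tok in query_terms:
                    continue
                d = min(abs(i - q) for q in q_positions)    # distance to nearest query term
                scores[tok] += math.exp(-(d * d) / (2 * sigma * sigma))   # Gaussian kernel
        return sorted(scores.items(), key=lambda kv: -kv[1])

    texts = [
        "solar panel efficiency depends on panel temperature and irradiance",
        "the efficiency of a solar cell degrades at high temperature",
    ]
    for term, score in kernel_cooccurrence(texts, {"solar", "efficiency"})[:5]:
        print(round(score, 3), term)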
-
Purpura, A.; Silvello, G.; Susto, G.A.: Learning to rank from relevance judgments distributions (2022)
- Abstract
- LEarning TO Rank (LETOR) algorithms are usually trained on annotated corpora where a single relevance label is assigned to each available document-topic pair. Within the Cranfield framework, relevance labels result from merging either multiple expertly curated or crowdsourced human assessments. In this paper, we explore how to train LETOR models with relevance judgments distributions (either real or synthetically generated) assigned to document-topic pairs instead of single-valued relevance labels. We propose five new probabilistic loss functions to deal with the higher expressive power provided by relevance judgments distributions and show how they can be applied both to neural and gradient boosting machine (GBM) architectures. Moreover, we show how training a LETOR model on a sampled version of the relevance judgments from certain probability distributions can improve its performance when relying either on traditional or probabilistic loss functions. Finally, we validate our hypothesis on real-world crowdsourced relevance judgments distributions. Overall, we observe that relying on relevance judgments distributions to train different LETOR models can boost their performance and even outperform strong baselines such as LambdaMART on several test collections.
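- A minimal sketch of one of the strategies mentioned above, in Python: instead of a single label per document-topic pair, a distribution over graded relevance labels is available, and at each training step a label is sampled from it before an ordinary pointwise loss is applied. The tiny gradient-descent scorer, feature values, and distributions are illustrative; the paper's neural and GBM rankers and its five probabilistic loss functions are not reproduced.

    import random

    random.seed(1)

    # each example: a feature vector and a distribution over relevance grades {0, 1, 2}
    examples = [
        ([0.9, 0.7], [0.05, 0.15, 0.80]),   # most assessors said "highly relevant"
        ([0.4, 0.5], [0.30, 0.50, 0.20]),
        ([0.1, 0.2], [0.85, 0.10, 0.05]),
    ]

    w = [0.0, 0.0]                           # linear scorer weights
    lr = 0.05
    for _ in range(200):
        for features, label_dist in examples:
            # sample a relevance grade from the judgment distribution for this step
            label = random.choices([0, 1, 2], weights=label_dist)[0]
            pred = sum(wi * xi for wi, xi in zip(w, features))
            err = pred - label               # gradient of a squared-error pointwise loss
            w = [wi - lr * err * xi for wi, xi in zip(w, features)]

    def score(features):
        return sum(wi * xi for wi, xi in zip(w, features))

    print([round(score(f), 2) for f, _ in sorted(examples, key=lambda ex: -score(ex[0]))])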
-
Mandl, T.: Web- und Multimedia-Dokumente : Neuere Entwicklungen bei der Evaluierung von Information Retrieval Systemen (2003)
- Abstract
- The amount of data on the Internet continues to grow rapidly. With it grows the need for high-quality information retrieval services for orientation and problem-oriented searching. Deciding whether to use or procure information retrieval software requires meaningful evaluation results. This contribution presents recent developments in the evaluation of information retrieval systems and shows the trend toward specialization and diversification of evaluation studies, which increases the realism of the results. The focus is on the retrieval of specialist texts, web pages, and multimedia objects.
- Source
- Information - Wissenschaft und Praxis. 54(2003) H.4, S.203-210
-
Nagelschmidt, M.: Verfahren zur Anfragemodifikation im Information Retrieval (2008)
- Abstract
- Information retrieval offers a wide variety of options for modifying search queries. After an introductory account of the interplay between information need and search query, a conceptual and typological approach to query modification methods is given. Following a brief characterization of fact retrieval and information retrieval, as well as of the vector space and probabilistic models, intellectual, automatic, and interactive modification methods are presented. Besides classical intellectual methods such as the building-block strategy and the "citation pearl growing" strategy, the treatment of automatic and interactive methods covers modification options at the levels of the morphology, syntax, and semantics of search terms. In addition, relevance feedback, the use of informetric analyses, and the idea of associative retrieval based on clustering and terminological techniques as well as citation-analytic methods are pursued. Finally, five application examples are intended to convey an impression of the practical design possibilities of the methods discussed.
-
Fuhr, N.: Zur Überwindung der Diskrepanz zwischen Retrievalforschung und -praxis (1990)
- Abstract
- This contribution presents some research results from information retrieval that can be applied directly to improve retrieval quality for already existing databases: linguistic algorithms for reduction to base and stem forms support searching for inflected and derived forms of search terms. Ranking algorithms that weight query and document terms lead to significantly better retrieval results than Boolean retrieval. Relevance feedback can further increase retrieval quality and, in addition, support users in successively modifying their query formulation. A user-friendly interface for a system based on these concepts is presented.
-
Tober, M.; Hennig, L.; Furch, D.: SEO Ranking-Faktoren und Rang-Korrelationen 2014 : Google Deutschland (2014)
- Abstract
- This white paper deals with the definition and assessment of factors that show a high rank correlation coefficient with organic search results and serves the purpose of a deeper analysis of search engine algorithms. The data collection and its evaluation refer to ranking factors for Google Germany in 2014. In addition, the correlations and factors were interpreted with regard to their relevance for top search result positions, using among other things mean and median values as well as development trends compared with previous years.
-
Behnert, C.; Borst, T.: Neue Formen der Relevanz-Sortierung in bibliothekarischen Informationssystemen : das DFG-Projekt LibRank (2015)
- Abstract
- The DFG-funded project LibRank investigates new ranking methods for library information systems which, building on insights from web search, take into account quality-inducing factors such as recency, popularity, and availability of individual media. The methods are developed in the context of a search portal widely used in economics (EconBiz) and systematically evaluated in a test system. Ranking factors of particular interest to the library domain are presented, and problems and challenges are illustrated by example.
- Source
- Bibliothek: Forschung und Praxis. 39(2015) H.3, S.384-393
-
Dreßler, H.: Fuzzy Information Retrieval (2008)
- Abstract
- After an explanation of the foundations of fuzzy logic, the principle of fuzzy searching is presented and the differences from conventional information retrieval are described. Using the example of a search for stones for a piece of masonry, it is shown how a fuzzy search can be carried out successfully in the D&W fuzzy database and leads to unambiguous results.
- Source
- Information - Wissenschaft und Praxis. 59(2008) H.6/7, S.351-352
-
Elsweiler, D.; Kruschwitz, U.: Interaktives Information Retrieval (2023)
- Abstract
- Interactive information retrieval (IIR) aims to understand the complex interactions between users and systems in IR. There is an extensive body of literature on topics such as the formal modeling of search behavior, the simulation of interaction, interactive features for supporting the search process, and the evaluation of interactive search systems. Interactive support is not limited to the search itself, but also aims to assist navigation and exploration.
- Source
- Grundlagen der Informationswissenschaft. Hrsg.: Rainer Kuhlen, Dirk Lewandowski, Wolfgang Semar und Christa Womser-Hacker. 7., völlig neu gefasste Ausg