Search (121 results, page 1 of 7)

  • theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  1. Zhang, J.; Mostafa, J.; Tripathy, H.: Information retrieval by semantic analysis and visualization of the concept space of D-Lib® magazine (2002) 0.07
    0.07392389 = product of:
      0.14784779 = sum of:
        0.09665604 = weight(_text_:java in 2211) [ClassicSimilarity], result of:
          0.09665604 = score(doc=2211,freq=2.0), product of:
            0.49653333 = queryWeight, product of:
              7.0475073 = idf(docFreq=104, maxDocs=44421)
              0.07045517 = queryNorm
            0.19466174 = fieldWeight in 2211, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.0475073 = idf(docFreq=104, maxDocs=44421)
              0.01953125 = fieldNorm(doc=2211)
        0.051191743 = weight(_text_:have in 2211) [ClassicSimilarity], result of:
          0.051191743 = score(doc=2211,freq=14.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.23043081 = fieldWeight in 2211, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.01953125 = fieldNorm(doc=2211)
      0.5 = coord(2/4)
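    The explanation tree above follows Lucene's ClassicSimilarity (TF-IDF) formula: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf x queryNorm, fieldWeight = tf x idf x fieldNorm, and the per-term weight is queryWeight x fieldWeight; coord then scales the sum by the fraction of query clauses matched. A minimal Java sketch (not Lucene itself) that reproduces the "java" contribution and the final score from the numbers shown:

```java
// Minimal sketch of ClassicSimilarity, recomputing the "java" term's
// contribution to doc 2211 from the explain tree above.
public class ClassicSimilaritySketch {

    static double tf(double freq) { return Math.sqrt(freq); }

    static double idf(long docFreq, long maxDocs) {
        return 1.0 + Math.log((double) maxDocs / (docFreq + 1));
    }

    public static void main(String[] args) {
        double queryNorm = 0.07045517;   // global query normalization, from the tree
        double fieldNorm = 0.01953125;   // encoded length norm of doc 2211

        double idf         = idf(104, 44421);            // 7.0475073
        double queryWeight = idf * queryNorm;            // 0.49653333
        double fieldWeight = tf(2.0) * idf * fieldNorm;  // 0.19466174
        double javaWeight  = queryWeight * fieldWeight;  // 0.09665604

        // coord(2/4): only 2 of the 4 query clauses matched this document.
        double score = (javaWeight + 0.051191743) * (2.0 / 4.0); // 0.07392389
        System.out.printf("java = %.8f, score = %.8f%n", javaWeight, score);
    }
}
```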
    
    Abstract
    In this article we present a method for retrieving documents from a digital library through a visual interface based on automatically generated concepts. We used a vocabulary generation algorithm to generate a set of concepts for the digital library and a technique called the max-min distance technique to cluster them. Additionally, the concepts were visualized in a spring embedding graph layout to depict the semantic relationship among them. The resulting graph layout serves as an aid to users for retrieving documents. An online archive containing the contents of D-Lib Magazine from July 1995 to May 2002 was used to test the utility of an implemented retrieval and visualization system. We believe that the method developed and tested can be applied to many different domains to help users get a better understanding of online document collections and to minimize users' cognitive load during execution of search tasks. Over the past few years, the volume of information available through the World Wide Web has been expanding exponentially. Never has so much information been so readily available and shared among so many people. Unfortunately, the unstructured nature and huge volume of information accessible over networks have made it hard for users to sift through and find relevant information. To deal with this problem, information retrieval (IR) techniques have attracted intensive attention from both industrial and academic researchers. Numerous IR techniques have been developed to help deal with the information overload problem. These techniques concentrate on mathematical models and algorithms for retrieval. Popular IR models such as the Boolean model, the vector-space model, the probabilistic model and their variants are well established.
    From the user's perspective, however, it is still difficult to use current information retrieval systems. Users frequently have problems expressing their information needs and translating those needs into queries. This is partly due to the fact that information needs cannot be expressed appropriately in system terms. It is not unusual for users to input search terms that are different from the index terms information systems use. Various methods have been proposed to help users choose search terms and articulate queries. One widely used approach is to incorporate into the information system a thesaurus-like component that represents both the important concepts in a particular subject area and the semantic relationships among those concepts. Unfortunately, the development and use of thesauri is not without its own problems. The thesaurus employed in a specific information system has often been developed for a general subject area and needs significant enhancement to be tailored to the information system where it is to be used. This thesaurus development process, if done manually, is both time consuming and labor intensive. Usage of a thesaurus in searching is complex and may raise barriers for the user. For illustration purposes, let us consider two scenarios of thesaurus usage. In the first scenario the user inputs a search term and the thesaurus then displays a matching set of related terms. Without an overview of the thesaurus - and without the ability to see the matching terms in the context of other terms - it may be difficult to assess the quality of the related terms in order to select the correct term. In the second scenario the user browses the whole thesaurus, which is organized as an alphabetically ordered list. The problem with this approach is that the list may be long, and it does not show users the global semantic relationships among all the listed terms.
    Nevertheless, because thesaurus use has been shown to improve retrieval, for our method we integrate functions in the search interface that permit users to explore built-in search vocabularies to improve retrieval from digital libraries. Our method automatically generates the terms and their semantic relationships representing relevant topics covered in a digital library. We call these generated terms the "concepts", and the generated terms and their semantic relationships we call the "concept space". Additionally, we used a visualization technique to display the concept space and allow users to interact with this space. The automatically generated term set is considered to be more representative of the subject area of a corpus than an "externally" imposed thesaurus, and our method has the potential of saving a significant amount of time and labor for those who have been manually creating thesauri as well. Information visualization is an emerging discipline that has developed very quickly over the last decade. With growing volumes of documents and associated complexities, information visualization has become increasingly important. Researchers have found information visualization to be an effective way to use and understand information while minimizing a user's cognitive load. Our work was based on an algorithmic approach of concept discovery and association. Concepts are discovered using an algorithm based on an automated thesaurus generation procedure. Subsequently, similarities among terms are computed using the cosine measure, and the associations among terms are established using a method known as max-min distance clustering. The concept space is then visualized in a spring embedding graph, which roughly shows the semantic relationships among concepts in a 2-D visual representation. The semantic space of the visualization is used as a medium for users to retrieve the desired documents. In the remainder of this article, we present our algorithmic approach to concept generation and clustering, followed by a description of the visualization technique and interactive interface. The paper ends with key conclusions and a discussion of future work.
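    The abstract names two concrete building blocks: cosine similarity between term vectors and max-min distance clustering. A rough Java sketch of both follows; the vectors and the threshold are invented for illustration and are not the authors' exact procedure or parameters.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of cosine similarity plus max-min selection of cluster centres.
public class ConceptClusteringSketch {

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
        return dot / (Math.sqrt(na) * Math.sqrt(nb) + 1e-12);
    }

    // Max-min selection: repeatedly promote the term whose nearest centre is
    // farthest away, until every term is within `threshold` of some centre.
    static List<Integer> maxMinCentres(double[][] terms, double threshold) {
        List<Integer> centres = new ArrayList<>();
        centres.add(0);                                // seed with an arbitrary term
        while (true) {
            int best = -1; double bestDist = -1;
            for (int t = 0; t < terms.length; t++) {
                double nearest = Double.MAX_VALUE;     // distance to closest centre
                for (int c : centres)
                    nearest = Math.min(nearest, 1.0 - cosine(terms[t], terms[c]));
                if (nearest > bestDist) { bestDist = nearest; best = t; }
            }
            if (bestDist <= threshold) return centres; // everything is near a centre
            centres.add(best);
        }
    }

    public static void main(String[] args) {
        double[][] terms = { {1, 0, 0}, {0.9, 0.1, 0}, {0, 1, 0}, {0, 0.2, 1} };
        System.out.println("centres: " + maxMinCentres(terms, 0.3)); // e.g. [0, 2, 3]
    }
}
```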
    Content
    A prototype of this interface, including the Java applet, is available at <http://ella.slis.indiana.edu/~junzhang/dlib/IV.html>. The D-Lib search interface is available at <http://www.dlib.org/Architext/AT-dlib2query.html>.
  2. Shiri, A.A.; Revie, C.; Chowdhury, G.: Thesaurus-assisted search term selection and query expansion : a review of user-centred studies (2002) 0.04
    0.044323187 = product of:
      0.088646375 = sum of:
        0.022974849 = weight(_text_:und in 2330) [ClassicSimilarity], result of:
          0.022974849 = score(doc=2330,freq=2.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.14702731 = fieldWeight in 2330, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.046875 = fieldNorm(doc=2330)
        0.065671526 = weight(_text_:have in 2330) [ClassicSimilarity], result of:
          0.065671526 = score(doc=2330,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.29560906 = fieldWeight in 2330, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=2330)
      0.5 = coord(2/4)
    
    Abstract
    This paper provides a review of the literature related to the application of domain-specific thesauri in the search and retrieval process. Focusing on studies that adopt a user-centred approach, the review presents a survey of the methodologies and results from empirical studies undertaken on the use of thesauri as sources of term selection for query formulation and expansion during the search process. It summarises the ways in which domain-specific thesauri from different disciplines have been used by various types of users and how these tools aid users in the selection of search terms. The review consists of two main sections: first, studies on thesaurus-aided search term selection; and second, studies dealing with query expansion using thesauri. Both sections are illustrated with case studies that have adopted a user-centred approach.
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
  3. Shiri, A.A.; Revie, C.: End-user interaction with thesauri : an evaluation of cognitive overlap in search term selection (2004) 0.03
    0.034705818 = product of:
      0.069411635 = sum of:
        0.022974849 = weight(_text_:und in 3658) [ClassicSimilarity], result of:
          0.022974849 = score(doc=3658,freq=2.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.14702731 = fieldWeight in 3658, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.046875 = fieldNorm(doc=3658)
        0.046436783 = weight(_text_:have in 3658) [ClassicSimilarity], result of:
          0.046436783 = score(doc=3658,freq=2.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.20902719 = fieldWeight in 3658, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=3658)
      0.5 = coord(2/4)
    
    Abstract
    The use of thesaurus-enhanced search tools is on the increase. This paper provides an insight into end-users' interaction with, and perceptions of, such tools. In particular, the overlap between users' initial query formulations and thesaurus structures is investigated. This investigation involved the performance of genuine search tasks on the CAB Abstracts database by academic users in the domain of veterinary medicine. The perceptions of these users regarding the nature and usefulness of the terms suggested from the thesaurus during the search interaction are reported. The results indicated that around 80% of the terms entered matched thesaurus terms either exactly or partially. Users found over 90% of the suggested terms to be close to their search topics, and where terms were selected they indicated that around 50% were to support a 'narrowing down' activity. These findings have implications for the design of thesaurus-enhanced interfaces.
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
  4. Robertson, S.E.; Walker, S.; Hancock-Beaulieu, M.M.: Large test collection experiments of an operational, interactive system : OKAPI at TREC (1995) 0.02
    0.023459004 = product of:
      0.09383602 = sum of:
        0.09383602 = weight(_text_:have in 33) [ClassicSimilarity], result of:
          0.09383602 = score(doc=33,freq=6.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.42238668 = fieldWeight in 33, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.0546875 = fieldNorm(doc=33)
      0.25 = coord(1/4)
    
    Abstract
    The Okapi system has been used in a series of experiments on the TREC collections, investigating probabilistic methods, relevance feedback, query expansion, and interaction issues. Some new probabilistic models have been developed, resulting in simple weighting functions that take account of document length and within-document and within-query term frequency. All have been shown to be beneficial when based on large quantities of relevance data, as in the routing task. Interaction issues are much more difficult to evaluate in the TREC framework, and no benefits have yet been demonstrated from feedback based on small numbers of 'relevant' items identified by intermediary searchers.
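    The "simple weighting functions" referred to are the Okapi BM family of term weights, best known today as BM25. A hedged Java sketch with the conventional defaults k1 = 1.2 and b = 0.75 (values chosen for illustration, not taken from the TREC runs described above):

```java
// Sketch of the BM25 weighting function: idf times a saturating term-frequency
// component with document-length normalization.
public class Bm25Sketch {

    static double idf(long n, long df) {           // Robertson-Sparck Jones style idf
        return Math.log((n - df + 0.5) / (df + 0.5));
    }

    // BM25 weight of one term in one document.
    static double bm25(double tf, double docLen, double avgDocLen, long n, long df) {
        double k1 = 1.2, b = 0.75;                 // conventional defaults
        double norm = k1 * ((1 - b) + b * docLen / avgDocLen); // length normalization
        return idf(n, df) * tf * (k1 + 1) / (tf + norm);
    }

    public static void main(String[] args) {
        // Toy numbers: term occurs 3 times in a 90-word document, average
        // document length 100, collection of 500,000 documents, df = 1,200.
        System.out.printf("w = %.4f%n", bm25(3, 90, 100, 500_000, 1_200));
    }
}
```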
  5. Morato, J.; Llorens, J.; Genova, G.; Moreiro, J.A.: Experiments in discourse analysis impact on information classification and retrieval algorithms (2003) 0.02
    0.021632457 = product of:
      0.08652983 = sum of:
        0.08652983 = weight(_text_:have in 2083) [ClassicSimilarity], result of:
          0.08652983 = score(doc=2083,freq=10.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.38949913 = fieldWeight in 2083, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.0390625 = fieldNorm(doc=2083)
      0.25 = coord(1/4)
    
    Abstract
    Researchers in indexing and retrieval systems have been advocating the inclusion of more contextual information to improve results. The proliferation of full-text databases and advances in computer storage capacity have made it possible to carry out text analysis by means of linguistic and extra-linguistic knowledge. Since the mid-1980s, research has tended to pay more attention to context, giving discourse analysis a more central role. The research presented in this paper aims to check whether discourse variables have an impact on modern information retrieval and classification algorithms. In order to evaluate this hypothesis, a functional framework for information analysis in an automated environment has been proposed, where n-gram filtering and the k-means and Chen's classification algorithms have been tested against sub-collections of documents based on the following discourse variables: "Genre", "Register", "Domain terminology", and "Document structure". The results obtained with the algorithms for the different sub-collections were compared to the MeSH information structure. They demonstrate that n-gram filtering does not appear to have a clear dependence on discourse variables; the k-means classification algorithm does, but only on domain terminology and document structure; and Chen's algorithm has a clear dependence on all of the discourse variables. This information could be used to design better classification algorithms in which discourse variables are taken into account. Other minor conclusions drawn from these results are also presented.
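    For orientation, a compact Java sketch of the k-means step named above, applied to documents represented as term-frequency vectors; the data, k, and the iteration cap are illustrative assumptions, not the paper's experimental setup.

```java
import java.util.Arrays;

// Sketch of k-means clustering over term-frequency vectors.
public class KMeansSketch {

    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    static int[] kMeans(double[][] docs, int k, int iters) {
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) centroids[c] = docs[c].clone(); // seed with first k docs
        int[] assign = new int[docs.length];
        for (int it = 0; it < iters; it++) {
            // Assignment step: each document joins its nearest centroid.
            for (int d = 0; d < docs.length; d++) {
                int best = 0;
                for (int c = 1; c < k; c++)
                    if (dist(docs[d], centroids[c]) < dist(docs[d], centroids[best])) best = c;
                assign[d] = best;
            }
            // Update step: each centroid moves to the mean of its documents.
            double[][] sums = new double[k][docs[0].length];
            int[] counts = new int[k];
            for (int d = 0; d < docs.length; d++) {
                counts[assign[d]]++;
                for (int i = 0; i < docs[0].length; i++) sums[assign[d]][i] += docs[d][i];
            }
            for (int c = 0; c < k; c++)
                if (counts[c] > 0)
                    for (int i = 0; i < docs[0].length; i++) centroids[c][i] = sums[c][i] / counts[c];
        }
        return assign;
    }

    public static void main(String[] args) {
        double[][] docs = { {5, 0, 1}, {4, 1, 0}, {0, 5, 4}, {1, 4, 5} };
        System.out.println(Arrays.toString(kMeans(docs, 2, 10))); // e.g. [0, 0, 1, 1]
    }
}
```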
  6. Schek, M.: Automatische Klassifizierung und Visualisierung im Archiv der Süddeutschen Zeitung (2005) 0.02
    0.020653864 = product of:
      0.08261546 = sum of:
        0.08261546 = weight(_text_:und in 5884) [ClassicSimilarity], result of:
          0.08261546 = score(doc=5884,freq=76.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.5286968 = fieldWeight in 5884, product of:
              8.717798 = tf(freq=76.0), with freq of:
                76.0 = termFreq=76.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.02734375 = fieldNorm(doc=5884)
      0.25 = coord(1/4)
    
    Abstract
    Since its founding in 1945, the Süddeutsche Zeitung (SZ) has maintained a press archive that documents the texts of its own editors and of numerous national and international publications and makes them available on request for research purposes. Computerization began in the early 1990s with the digital storage of, initially, the SZ's own material. Technical development from the mid-1990s onward served two goals: (1) the complete switch from paper filing to digital storage, and (2) the transformation from an in-house documentation and reference desk into an information service provider also active on the open market. To spread the resulting costs and at the same time exploit synergies between archives with related content, the Süddeutscher Verlag and Bayerischer Rundfunk founded Dokumentations- und Informationszentrum (DIZ) München GmbH in 1998, merging the press archives of the two shareholders and the picture archive of the Süddeutscher Verlag. The jointly developed press database enabled cross-site indexing, browser-based searching for editors and external customers on the intranet and Internet, and customer-specific content feeds for publishers, broadcasters, and portals. The DIZ press database currently contains 6.9 million articles, each retrievable as HTML or PDF. Around 3,500 articles are added daily, of which around 1,000 are indexed. Indexing at DIZ is done not by assigning keywords to individual documents but by linking articles to "virtual folders", the dossiers. These are the electronic counterpart of a paper folder and the central object of subject access. In contrast to static classification systems, the dossier structure is dynamic and driven by the volume of coverage, i.e. new dossiers are created mainly in response to current reporting. In total, the DIZ press database contains around 90,000 dossiers, of which 68,000 are subject topics, persons, and institutions. The dossiers are interlinked to form the "DIZ knowledge network" (DIZ-Wissensnetz).
    DIZ defines the knowledge network as its unique selling point and devotes considerable staff resources to keeping the dossiers up to date and assuring their quality. After the switch to a fully digitized workflow in April 2001, DIZ identified four starting points for optimizing the effort on the input side (indexing) while marketing the knowledge network better on the output side (retrieval): 1. (semi-)automatic classification of press texts (a suggestion system); 2. visualization of the knowledge network (topic mapping); 3. (fully) automatic classification and optimization of the knowledge network; 4. new retrieval options (clustering, concept search). Projects 1 and 2, "automatic classification and visualization", started first and were accelerated by two developments: Bayerischer Rundfunk (BR), originally co-founder and 50% shareholder of DIZ München GmbH, decided for strategic reasons to withdraw from the cooperation at the end of 2003; and the media crisis, triggered by the massive drop in advertising revenue, forced substantial savings at the Süddeutscher Verlag as well, along with the search for new sources of revenue. As a result, press documentation staffing fell from originally around 20 (SZ only, excluding the BR share) to around 13 by 1 January 2004, while at the same time the effort of maintaining the knowledge network came under increased pressure of justification. For projects 1 and 2, this yielded three quantitative and qualitative goals: higher productivity in indexing, improved consistency in indexing, and better marketing and more intensive use of the dossiers in retrieval. All three goals were achieved, with productivity in indexing showing the largest gain. Projects 1 and 2, "automatic classification and visualization", were successfully completed at the beginning of 2004. The follow-up projects 3 and 4 have been running since mid-2004 and are due to be completed by mid-2005. In what follows, section 2 describes the choice of product and the operation of the automatic classification, section 3 describes the use of knowledge-network visualization in indexing and retrieval, and section 4 summarizes the results of projects 1 and 2 and gives an outlook on the goals of projects 3 and 4.
  7. Evens, M.: Thesaural relations in information retrieval (2002) 0.02
    0.020107718 = product of:
      0.08043087 = sum of:
        0.08043087 = weight(_text_:have in 2201) [ClassicSimilarity], result of:
          0.08043087 = score(doc=2201,freq=6.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.3620457 = fieldWeight in 2201, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=2201)
      0.25 = coord(1/4)
    
    Abstract
    Thesaural relations have long been used in information retrieval to enrich queries; they have sometimes been used to cluster documents as well. Sometimes the first query to an information retrieval system yields no results at all, or, what can be even more disconcerting, many thousands of hits. One solution is to rephrase the query, improving the choice of query terms by using related terms of different types. A collection of related terms is often called a thesaurus. This chapter describes the lexical-semantic relations that have been used in building thesauri and summarizes some of the effects of using these relational thesauri in information retrieval experiments
  8. Narock, T.; Zhou, L.; Yoon, V.: Semantic similarity of ontology instances using polarity mining (2013) 0.02
    0.020107718 = product of:
      0.08043087 = sum of:
        0.08043087 = weight(_text_:have in 1620) [ClassicSimilarity], result of:
          0.08043087 = score(doc=1620,freq=6.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.3620457 = fieldWeight in 1620, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=1620)
      0.25 = coord(1/4)
    
    Abstract
    Semantic similarity is vital to many areas, such as information retrieval. Various methods have been proposed with a focus on comparing unstructured text documents. Several of these have been enhanced with ontologies; however, they have not been applied to ontology instances. With the growth in ontology instance data published online through, for example, Linked Open Data, there is an increasing need to apply semantic similarity to ontology instances. Drawing on ontology-supported polarity mining (OSPM), we propose an algorithm that enhances the computation of semantic similarity with polarity mining techniques. The algorithm is evaluated with online customer review data. The experimental results show that the proposed algorithm outperforms the baseline algorithm in multiple settings.
  9. Jindal, V.; Bawa, S.; Batra, S.: ¬A review of ranking approaches for semantic search on Web (2014) 0.02
    0.020107718 = product of:
      0.08043087 = sum of:
        0.08043087 = weight(_text_:have in 3799) [ClassicSimilarity], result of:
          0.08043087 = score(doc=3799,freq=6.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.3620457 = fieldWeight in 3799, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=3799)
      0.25 = coord(1/4)
    
    Abstract
    With ever-increasing amounts of information available to end users, search engines have become the most powerful tools for obtaining useful information scattered on the Web. However, it is very common that even the most renowned search engines return result sets containing pages of little use to the user. Research on semantic search aims to improve traditional information search and retrieval methods, where the basic relevance criteria rely primarily on the presence of query keywords within the returned pages. This work is an attempt to explore different relevancy ranking approaches based on semantics which are considered appropriate for the retrieval of relevant information. In this paper, various pilot projects and their corresponding outcomes are investigated in terms of the methodologies adopted and their most distinctive characteristics with respect to ranking. An overview of selected approaches and a comparison by means of the classification criteria are presented. With the help of this comparison, some common concepts and outstanding features have been identified.
  10. Hauer, M.: Neue OPACs braucht das Land ... dandelon.com (2006) 0.02
    0.019896807 = product of:
      0.07958723 = sum of:
        0.07958723 = weight(_text_:und in 47) [ClassicSimilarity], result of:
          0.07958723 = score(doc=47,freq=24.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.50931764 = fieldWeight in 47, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.046875 = fieldNorm(doc=47)
      0.25 = coord(1/4)
    
    Abstract
    In dandelon.com, in contrast to previous federated-search portal approaches, the titles of media are newly indexed with intelligentCAPTURE, decentrally and collaboratively, and substantially enriched in content. So far, intelligentCAPTURE automatically indexes book tables of contents, books, blurbs, articles, and websites; it takes over bibliographic data from libraries (XML, Z39.50), publishers (ONIX + cover pages), serials agents (Swets), and the book trade (SOAP), and exports machine-generated index entries and prepared documents to library catalogues (MAB, MARC, XML) or documentation systems, to dandelon.com, and in part also to subject portals. The data are obtained by scanning and OCR, by importing files and looking them up on servers, and by web spidering/crawling. The quality of searching in dandelon.com is clearly better than in previous library systems. The semantic, multilingual search, currently covering 1.2 million subject terms, contributes strongly to the good search results.
    Source
    Spezialbibliotheken zwischen Auftrag und Ressourcen: 6.-9. September 2005 in München, 30. Arbeits- und Fortbildungstagung der ASpB e.V. / Sektion 5 im Deutschen Bibliotheksverband. Red.: M. Brauer
  11. Faaborg, A.; Lagoze, C.: Semantic browsing (2003) 0.02
    0.019154195 = product of:
      0.07661678 = sum of:
        0.07661678 = weight(_text_:have in 2026) [ClassicSimilarity], result of:
          0.07661678 = score(doc=2026,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.34487724 = fieldWeight in 2026, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.0546875 = fieldNorm(doc=2026)
      0.25 = coord(1/4)
    
    Abstract
    We have created software applications that allow users to both author and use Semantic Web metadata. To create and use a layer of semantic content on top of the existing Web, we have (1) implemented a user interface that expedites the task of attributing metadata to resources on the Web, and (2) augmented a Web browser to leverage this semantic metadata to provide relevant information and tasks to the user. This project provides a framework for annotating and reorganizing existing files, pages, and sites on the Web that is similar to Vannevar Bush's original concepts of trail blazing and associative indexing.
  12. Schmitz-Esser, W.: EXPO-INFO 2000 : Visuelles Besucherinformationssystem für Weltausstellungen (2000) 0.02
    0.017257709 = product of:
      0.069030836 = sum of:
        0.069030836 = weight(_text_:und in 2404) [ClassicSimilarity], result of:
          0.069030836 = score(doc=2404,freq=26.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.44176215 = fieldWeight in 2404, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.0390625 = fieldNorm(doc=2404)
      0.25 = coord(1/4)
    
    Abstract
    The current knowledge of the world as mirrored in a world exposition: how does one present it, and how does one make it accessible to those interested, in the exhibition itself, in publications, on the air, and over the Internet? What can be seen and experienced at a world exposition on the threshold of the third millennium exceeds, in its abundance and variety, any frame an individual can grasp. In his book, Schmitz-Esser shows how visitors can experience the world exposition in a choice of four languages and take its quintessence away with them. This is made possible by the concept of virtual "knowledge in a capsule", prepared in such a way that it can be deployed in all common media forms and for the most diverse modes of appropriation. The solution is not only a matter of informatics and information technology but just as much a challenge for information science and computational linguistics. The book sets out the goal, the approach, the components, and the prerequisites for this.
    Content
    A welcome stimulus right at the entrance.- Deepening knowledge during the exhibition.- Everything for the visitor's well-being.- The system structure and its individual elements.- Where everything starts.- Structuring the material as topics and subtopics.- The nutshells.- The proxy text.- The thesaurus.- Journeys through conceptual space.- And back into the real world.- Follow-on products.- The EXPO information system at a glance.- Index.- Bibliography.
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
  13. Rahmstorf, G.: Integriertes Management inhaltlicher Datenarten (2001) 0.02
    0.017231137 = product of:
      0.068924546 = sum of:
        0.068924546 = weight(_text_:und in 6856) [ClassicSimilarity], result of:
          0.068924546 = score(doc=6856,freq=18.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.44108194 = fieldWeight in 6856, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.046875 = fieldNorm(doc=6856)
      0.25 = coord(1/4)
    
    Abstract
    Content data, in contrast to measurement data, numbers, analogue signals, and other information, are data that can also be interpreted linguistically. They carry contents that can be named. Content data include, for example, order data, advertising copy, product names, and patent classifications. Most of the data communicated on the Internet are content data. Content data can be arranged in four classes: * knowledge data - formatted data (facts and other data in structured form), - unformatted data (predominantly texts); * access data - naming data (vocabulary, terminology, topics, etc.), - concept data (ordering and meaning structures). Knowledge organization is chiefly concerned with ordering the unmanageable abundance of knowledge and making it findable again. The field therefore deals not only with knowledge itself but also with the means used to order knowledge and make it findable.
    Series
    Tagungen der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis; 4
    Source
    Information Research & Content Management: Orientierung, Ordnung und Organisation im Wissensmarkt; 23. DGI-Online-Tagung der DGI und 53. Jahrestagung der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V. DGI, Frankfurt am Main, 8.-10.5.2001. Proceedings. Hrsg.: R. Schmidt
  14. Schwartz, C.: Web search engines (1998) 0.02
    0.016417881 = product of:
      0.065671526 = sum of:
        0.065671526 = weight(_text_:have in 6700) [ClassicSimilarity], result of:
          0.065671526 = score(doc=6700,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.29560906 = fieldWeight in 6700, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=6700)
      0.25 = coord(1/4)
    
    Abstract
    This review looks briefly at the history of WWW search engine development, considers the current state of affairs, and reflects on the future. Networked discovery tools have evolved along with Internet resource availability. WWW search engines display some complexity in their variety, content, resource acquisition strategies, and in the array of tools they deploy to assist users. A small but growing body of evaluation literature, much of it not systematic in nature, indicates that performance effectiveness is difficult to assess in this setting. Significant improvements in general-content search engine retrieval and ranking performance may not be possible, and are probably not worth the effort, although search engine providers have introduced some rudimentary attempts at personalization, summarization, and query expansion. The shift to distributed search across multitype database systems could extend general networked discovery and retrieval to include smaller resource collections with rich metadata and navigation tools.
  15. Greenberg, J.: Optimal query expansion (QE) processing methods with semantically encoded structured thesaurus terminology (2001) 0.02
    0.016417881 = product of:
      0.065671526 = sum of:
        0.065671526 = weight(_text_:have in 6750) [ClassicSimilarity], result of:
          0.065671526 = score(doc=6750,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.29560906 = fieldWeight in 6750, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=6750)
      0.25 = coord(1/4)
    
    Abstract
    While researchers have explored the value of structured thesauri as controlled vocabularies for general information retrieval (IR) activities, they have not identified the optimal query expansion (QE) processing methods for taking advantage of the semantic encoding underlying the terminology in these tools. The study reported on in this article addresses this question, and examined whether QE via semantically encoded thesaurus terminology is more effective in the automatic or the interactive processing environment. The research found that, regardless of end-users' retrieval goals, synonyms and partial synonyms (SYNs) and narrower terms (NTs) are generally good candidates for automatic QE, and that related terms (RTs) are better candidates for interactive QE. The study also examined end-users' selection of semantically encoded thesaurus terms for interactive QE, and explored how retrieval goals and QE processes may be combined in future thesauri-supported IR systems.
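    A minimal Java sketch of the processing split the study recommends: synonyms (SYNs) and narrower terms (NTs) are folded automatically into the query, while related terms (RTs) are only suggested for user confirmation. The thesaurus entries below are invented for illustration.

```java
import java.util.*;

// Sketch of automatic vs. interactive thesaurus-based query expansion.
public class QueryExpansionSketch {

    enum Relation { SYN, NT, RT }

    record Entry(String term, Relation rel) {}

    static final Map<String, List<Entry>> THESAURUS = Map.of(
        "retrieval", List.of(
            new Entry("information retrieval", Relation.SYN),
            new Entry("query expansion",       Relation.NT),
            new Entry("indexing",              Relation.RT)));

    // Automatic QE: fold SYNs and NTs straight into the query.
    static Set<String> expandAutomatically(String term) {
        Set<String> query = new LinkedHashSet<>(List.of(term));
        for (Entry e : THESAURUS.getOrDefault(term, List.of()))
            if (e.rel() == Relation.SYN || e.rel() == Relation.NT) query.add(e.term());
        return query;
    }

    // Interactive QE: RTs are only offered as candidates for the user.
    static List<String> suggestInteractively(String term) {
        return THESAURUS.getOrDefault(term, List.of()).stream()
                .filter(e -> e.rel() == Relation.RT).map(Entry::term).toList();
    }

    public static void main(String[] args) {
        System.out.println("query:   " + expandAutomatically("retrieval"));
        System.out.println("suggest: " + suggestInteractively("retrieval"));
    }
}
```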
  16. Arenas, M.; Cuenca Grau, B.; Kharlamov, E.; Marciuska, S.; Zheleznyakov, D.: Faceted search over ontology-enhanced RDF data (2014) 0.02
    0.016417881 = product of:
      0.065671526 = sum of:
        0.065671526 = weight(_text_:have in 3207) [ClassicSimilarity], result of:
          0.065671526 = score(doc=3207,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.29560906 = fieldWeight in 3207, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=3207)
      0.25 = coord(1/4)
    
    Abstract
    An increasing number of applications rely on RDF, OWL2, and SPARQL for storing and querying data. SPARQL, however, is not targeted towards end-users, and suitable query interfaces are needed. Faceted search is a prominent approach for end-user data access, and several RDF-based faceted search systems have been developed. There is, however, a lack of rigorous theoretical underpinning for faceted search in the context of RDF and OWL2. In this paper, we provide such solid foundations. We formalise faceted interfaces for this context, identify a fragment of first-order logic capturing the underlying queries, and study the complexity of answering such queries for RDF and OWL2 profiles. We then study interface generation and update, and devise efficiently implementable algorithms. Finally, we have implemented and tested our faceted search algorithms for scalability, with encouraging results.
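    For illustration, a faceted interface of the kind described compiles each facet selection into a SPARQL triple pattern; the sketch below runs one such query with Apache Jena over a local RDF file. The ex: vocabulary and the data file are assumptions for the example, not from the paper.

```java
import org.apache.jena.query.*;
import org.apache.jena.rdf.model.*;

// Sketch: two selected facet values become two triple patterns.
public class FacetedSearchSketch {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        model.read("data.ttl");                      // assumed local RDF data

        String sparql = """
            PREFIX ex: <http://example.org/>
            SELECT ?item WHERE {
              ?item ex:category ex:Laptop ;          # facet 1: category
                    ex:brand    ex:Acme .            # facet 2: brand
            }""";

        try (QueryExecution qe = QueryExecutionFactory.create(sparql, model)) {
            ResultSet rs = qe.execSelect();
            while (rs.hasNext()) System.out.println(rs.next().get("item"));
        }
    }
}
```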
  17. Hoeber, O.: ¬A study of visually linked keywords to support exploratory browsing in academic search (2022) 0.02
    0.016417881 = product of:
      0.065671526 = sum of:
        0.065671526 = weight(_text_:have in 1645) [ClassicSimilarity], result of:
          0.065671526 = score(doc=1645,freq=4.0), product of:
            0.22215667 = queryWeight, product of:
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.07045517 = queryNorm
            0.29560906 = fieldWeight in 1645, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1531634 = idf(docFreq=5157, maxDocs=44421)
              0.046875 = fieldNorm(doc=1645)
      0.25 = coord(1/4)
    
    Abstract
    While the search interfaces used by common academic digital libraries provide easy access to a wealth of peer-reviewed literature, they provide little support for exploratory browsing. When faced with a complex search task (such as one that requires knowledge discovery), exploratory browsing is an important first step in an exploratory search process. To more effectively support exploratory browsing, we have designed and implemented a novel academic digital library search interface (KLink Search) with two new features: visually linked keywords and an interactive workspace. To study the potential value of these features, we have conducted a controlled laboratory study with 32 participants, comparing KLink Search to a baseline digital library search interface modeled after that used by IEEE Xplore. Based on subjective opinions, objective performance, and behavioral data, we show the value of adding lightweight visual and interactive features to academic digital library search interfaces to support exploratory browsing.
  18. Knorz, G.; Rein, B.: Semantische Suche in einer Hochschulontologie (2005) 0.02
    0.016414026 = product of:
      0.0656561 = sum of:
        0.0656561 = weight(_text_:und in 2852) [ClassicSimilarity], result of:
          0.0656561 = score(doc=2852,freq=12.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.42016557 = fieldWeight in 2852, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.0546875 = fieldNorm(doc=2852)
      0.25 = coord(1/4)
    
    Abstract
    Ontologies are employed in order to give document retrieval in particular a fundamentally better semantic foundation than the current state of the art offers. We present an ontology developed and deployed at the FH Darmstadt which is intended both to cover the higher-education domain broadly and, at the same time, to describe it semantically in a differentiated way. The problem of semantic search is that it must be as simple for information seekers to use as popular search engines while at the same time delivering high-quality results on the basis of the elaborate information model. We describe the capabilities the K-Infinity software provides and the concept by which these capabilities are employed for a semantic search for documents and other information units (persons, events, projects, etc.).
    Source
    Information - Wissenschaft und Praxis. 56(2005) H.5/6, S.281-290
  19. Knorz, G.; Rein, B.: Semantische Suche in einer Hochschulontologie : Ontologie-basiertes Information-Filtering und -Retrieval mit relationalen Datenbanken (2005) 0.02
    0.016414026 = product of:
      0.0656561 = sum of:
        0.0656561 = weight(_text_:und in 324) [ClassicSimilarity], result of:
          0.0656561 = score(doc=324,freq=12.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.42016557 = fieldWeight in 324, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.0546875 = fieldNorm(doc=324)
      0.25 = coord(1/4)
    
    Abstract
    Ontologies are employed in order to give document retrieval in particular a fundamentally better semantic foundation than the current state of the art offers. We present an ontology developed and deployed at the FH Darmstadt which is intended both to cover the higher-education domain broadly and, at the same time, to describe it semantically in a differentiated way. The problem of semantic search is that it must be as simple for information seekers to use as popular search engines while at the same time delivering high-quality results on the basis of the elaborate information model. We describe the capabilities the K-Infinity software provides and the concept by which these capabilities are employed for a semantic search for documents and other information units (persons, events, projects, etc.).
  20. Boteram, F.: Typisierung semantischer Relationen in integrierten Systemen der Wissensorganisation (2013) 0.02
    0.015874784 = product of:
      0.06349914 = sum of:
        0.06349914 = weight(_text_:und in 1919) [ClassicSimilarity], result of:
          0.06349914 = score(doc=1919,freq=22.0), product of:
            0.15626246 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.07045517 = queryNorm
            0.4063621 = fieldWeight in 1919, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.0390625 = fieldNorm(doc=1919)
      0.25 = coord(1/4)
    
    Abstract
    The differentiated typing of semantic relations in knowledge organization systems, with respect to their meaning-bearing content-related and formal-logical properties, is a prerequisite for powerful and user-friendly models of information retrieval and knowledge exploration. Systems that interlink several documentation languages and integrate them functionally require special approaches to typing the relations used or required. Building on earlier reflections on models of semantic interoperability in distributed systems, which are connected by a central core system and thus placed in the overarching functional context of knowledge organization, differentiated and functional strategies are developed for the typing and stratified definition of the various relations in this system. In order to support the functionality demanded by advanced retrieval paradigms in the context of networked knowledge organization systems, the formal-logical, typological, and structural properties as well as the actual semantic content of all relation types used to represent conceptual relationships are defined. To order the multitude of relation types, which differ yet relate to one another within the functional context of the overall system, precisely and efficiently, a multi-level structure is required that can present the intended inventories to the user in a clear and intuitively manageable form and thus keep them available for use in exploratory systems.

Languages

  • e 80
  • d 39
  • f 1

Types

  • a 97
  • el 20
  • m 12
  • r 4
  • x 4
  • s 1