-
Sjöbergh, J.: Older versions of the ROUGEeval summarization evaluation system were easier to fool (2007)
0.03
0.026376635 = product of:
0.10550654 = sum of:
0.10550654 = weight(_text_:however in 1940) [ClassicSimilarity], result of:
0.10550654 = score(doc=1940,freq=2.0), product of:
0.28742972 = queryWeight, product of:
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.06921162 = queryNorm
0.367069 = fieldWeight in 1940, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.0625 = fieldNorm(doc=1940)
0.25 = coord(1/4)
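The relevance figures above are Lucene ClassicSimilarity (TF-IDF) explanations: each clause score is queryWeight × fieldWeight, where queryWeight = idf × queryNorm and fieldWeight = tf × idf × fieldNorm, and the result is scaled by a coordination factor for the fraction of query clauses matched. A minimal Python sketch of that arithmetic, reusing the numbers from the entry above (the variable names are illustrative, not the search engine's actual code):

```python
import math

# Factors reported in the explanation tree for doc 1940 and the term "however"
freq       = 2.0          # term frequency in the matched field
idf        = 4.1529117    # idf(docFreq=1897, maxDocs=44421), i.e. 1 + ln(maxDocs / (docFreq + 1))
query_norm = 0.06921162
field_norm = 0.0625
coord      = 0.25         # 1 of 4 query clauses matched

tf           = math.sqrt(freq)               # 1.4142135
query_weight = idf * query_norm              # 0.28742972
field_weight = tf * idf * field_norm         # 0.367069
clause_score = query_weight * field_weight   # 0.10550654
final_score  = coord * clause_score          # 0.026376635, displayed rounded as 0.03

print(final_score)
```

The same pattern repeats for every entry below; only the matched term ("however" or "und"), its frequency, and the field normalization change.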
- Abstract
- We show some limitations of the ROUGE evaluation method for automatic summarization. We present a method for automatic summarization based on a Markov model of the source text. Using a simple greedy word selection strategy, it generates summaries with high ROUGE scores. However, these summaries would not be considered good by human readers. The method can be adapted to trick different settings of the ROUGEeval package.
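To illustrate the weakness being exploited, the toy sketch below greedily appends whichever word most increases a simplified ROUGE-N recall score; the result tends to be an unreadable bag of frequent n-grams that nonetheless scores well. This is only an illustration of the general idea: the paper's actual generator is a Markov model of the source text, and real ROUGEeval settings include stemming, stopword handling, and other options not modelled here.

```python
from collections import Counter

def rouge_n(candidate, references, n=2):
    """Simplified recall-oriented ROUGE-N: clipped n-gram overlap / reference n-gram count."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate)
    ref = Counter()
    for r in references:
        ref.update(ngrams(r))
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / sum(ref.values()) if ref else 0.0

def greedy_gaming_summary(vocabulary, references, length=50):
    """Greedily pick the next word that maximizes ROUGE-N, ignoring grammar entirely."""
    summary = []
    for _ in range(length):
        best = max(vocabulary, key=lambda w: rouge_n(summary + [w], references, n=2))
        summary.append(best)
    return " ".join(summary)
```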
-
Jones, S.; Paynter, G.W.: Automatic extraction of document keyphrases for use in digital libraries : evaluations and applications (2002)
0.02
0.023313873 = product of:
0.09325549 = sum of:
0.09325549 = weight(_text_:however in 1601) [ClassicSimilarity], result of:
0.09325549 = score(doc=1601,freq=4.0), product of:
0.28742972 = queryWeight, product of:
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.06921162 = queryNorm
0.32444623 = fieldWeight in 1601, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.0390625 = fieldNorm(doc=1601)
0.25 = coord(1/4)
- Abstract
- This article describes an evaluation of the Kea automatic keyphrase extraction algorithm. Document keyphrases are conventionally used as concise descriptors of document content, and are increasingly used in novel ways, including document clustering, searching and browsing interfaces, and retrieval engines. However, it is costly and time consuming to manually assign keyphrases to documents, motivating the development of tools that automatically perform this function. Previous studies have evaluated Kea's performance by measuring its ability to identify author keywords and keyphrases, but this methodology has a number of well-known limitations. The results presented in this article are based on evaluations by human assessors of the quality and appropriateness of Kea keyphrases. The results indicate that, in general, Kea produces keyphrases that are rated positively by human assessors. However, typical Kea settings can degrade performance, particularly those relating to keyphrase length and domain specificity. We found that for some settings, Kea's performance is better than that of similar systems, and that Kea's ranking of extracted keyphrases is effective. We also determined that author-specified keyphrases appear to exhibit an inherent ranking, and that they are rated highly and therefore suitable for use in training and evaluation of automatic keyphrasing systems.
-
Sweeney, S.; Crestani, F.; Losada, D.E.: 'Show me more' : incremental length summarisation using novelty detection (2008)
0.02
0.023313873 = product of:
0.09325549 = sum of:
0.09325549 = weight(_text_:however in 3054) [ClassicSimilarity], result of:
0.09325549 = score(doc=3054,freq=4.0), product of:
0.28742972 = queryWeight, product of:
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.06921162 = queryNorm
0.32444623 = fieldWeight in 3054, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.0390625 = fieldNorm(doc=3054)
0.25 = coord(1/4)
- Abstract
- The paper presents a study investigating the effects of incorporating novelty detection in automatic text summarisation. By condensing a textual document, automatic text summarisation can reduce the need to refer to the source document. It also offers a means to deliver device-friendly content when accessing information in non-traditional environments. An effective method of summarisation could be to produce a summary that includes only novel information. However, focusing exclusively on novel parts may result in a loss of context, which may affect the correct interpretation of the summary with respect to the source document. In this study we compare two strategies for producing summaries that incorporate novelty in different ways: a constant-length summary, which contains only novel sentences, and an incremental summary, which contains additional sentences that provide context. The aim is to establish whether a summary that contains only novel sentences provides a sufficient basis to determine the relevance of a document, or whether additional sentences are needed to provide context. Findings from the study suggest that there is only a minimal difference in performance for the tasks we set our users and that the presence of contextual information is not so important. However, for mobile information access, a summary that contains only novel information does offer benefits, given bandwidth constraints.
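As a rough illustration of the "only novel sentences" strategy (not the authors' actual system), the sketch below walks the document in order and keeps a sentence only if it is sufficiently dissimilar to everything already selected; the TF-IDF representation and threshold are assumptions made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def novelty_summary(sentences, max_sentences=5, novelty_threshold=0.3):
    """Keep sentences, in document order, that are novel w.r.t. the summary built so far."""
    vectors = TfidfVectorizer().fit_transform(sentences)
    chosen = []
    for i in range(len(sentences)):
        if len(chosen) >= max_sentences:
            break
        if not chosen:
            chosen.append(i)
            continue
        similarity_to_summary = cosine_similarity(vectors[i], vectors[chosen]).max()
        if similarity_to_summary < novelty_threshold:   # low similarity = novel content
            chosen.append(i)
    return [sentences[i] for i in chosen]
```

An incremental variant in the spirit of the abstract would additionally re-attach a neighbouring sentence for context whenever a novel sentence is selected.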
-
Lee, J.-H.; Park, S.; Ahn, C.-M.; Kim, D.: Automatic generic document summarization based on non-negative matrix factorization (2009)
0.02
0.023079555 = product of:
0.09231822 = sum of:
0.09231822 = weight(_text_:however in 3448) [ClassicSimilarity], result of:
0.09231822 = score(doc=3448,freq=2.0), product of:
0.28742972 = queryWeight, product of:
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.06921162 = queryNorm
0.32118538 = fieldWeight in 3448, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.0546875 = fieldNorm(doc=3448)
0.25 = coord(1/4)
- Abstract
- In existing unsupervised methods, Latent Semantic Analysis (LSA) is used for sentence selection. However, the obtained results are less meaningful, because singular vectors are used as the bases for sentence selection from given documents, and singular vector components can have negative values. We propose a new unsupervised method using Non-negative Matrix Factorization (NMF) to select sentences for automatic generic document summarization. The proposed method uses non-negative constraints, which are more similar to the human cognition process. As a result, the method selects more meaningful sentences for generic document summarization than those selected using LSA.
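A rough sketch of the non-negative factorization idea described above: the term-by-sentence matrix is factorized into non-negative topic factors, and sentences are ranked by their weighted topic loadings. The ranking formula and parameters here are illustrative simplifications, not the exact relevance measure defined in the paper.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import CountVectorizer

def nmf_summary(sentences, n_topics=3, n_select=3):
    # Non-negative term-by-sentence matrix A, factorized as A ~ W @ H
    A = CountVectorizer().fit_transform(sentences).T.toarray()   # terms x sentences
    model = NMF(n_components=n_topics, init="nndsvda", random_state=0, max_iter=500)
    model.fit(A)
    H = model.components_                     # topics x sentences, all entries non-negative

    topic_weight = H.sum(axis=1) / H.sum()    # relative importance of each topic
    sentence_score = topic_weight @ H         # weighted topic loading per sentence
    top = sorted(sentence_score.argsort()[::-1][:n_select])
    return [sentences[i] for i in top]
```

Because both factors are non-negative, each sentence's score is an additive combination of topic contributions, which is the interpretability advantage over singular vectors with negative components.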
-
Xu, D.; Cheng, G.; Qu, Y.: Preferences in Wikipedia abstracts : empirical findings and implications for automatic entity summarization (2014)
0.02
0.019782476 = product of:
0.079129905 = sum of:
0.079129905 = weight(_text_:however in 3700) [ClassicSimilarity], result of:
0.079129905 = score(doc=3700,freq=2.0), product of:
0.28742972 = queryWeight, product of:
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.06921162 = queryNorm
0.27530175 = fieldWeight in 3700, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.046875 = fieldNorm(doc=3700)
0.25 = coord(1/4)
- Abstract
- The volume of entity-centric structured data grows rapidly on the Web. The description of an entity, composed of property-value pairs (a.k.a. features), has become very large in many applications. To avoid information overload, efforts have been made to automatically select a limited number of features to be shown to the user based on certain criteria, which is called automatic entity summarization. However, to the best of our knowledge, there is a lack of extensive studies on how humans rank and select features in practice, which can provide empirical support and inspire future research. In this article, we present a large-scale statistical analysis of the descriptions of entities provided by DBpedia and the abstracts of their corresponding Wikipedia articles, to empirically study, along several different dimensions, which kinds of features are preferable when humans summarize. Implications for automatic entity summarization are drawn from the findings.
-
Kuhlen, R.: Abstracts, abstracting : intellektuelle und maschinelle Verfahren (1990)
0.02
0.018618755 = product of:
0.07447502 = sum of:
0.07447502 = weight(_text_:und in 2332) [ClassicSimilarity], result of:
0.07447502 = score(doc=2332,freq=4.0), product of:
0.15350439 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06921162 = queryNorm
0.48516542 = fieldWeight in 2332, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.109375 = fieldNorm(doc=2332)
0.25 = coord(1/4)
- Source
- Grundlagen der praktischen Information und Dokumentation. 3. Aufl. Hrsg.: M. Buder u.a. Bd.1
-
Yang, C.C.; Wang, F.L.: Hierarchical summarization of large documents (2008)
0.02
0.016485397 = product of:
0.06594159 = sum of:
0.06594159 = weight(_text_:however in 2719) [ClassicSimilarity], result of:
0.06594159 = score(doc=2719,freq=2.0), product of:
0.28742972 = queryWeight, product of:
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.06921162 = queryNorm
0.22941813 = fieldWeight in 2719, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.0390625 = fieldNorm(doc=2719)
0.25 = coord(1/4)
- Abstract
- Many automatic text summarization models have been developed in the last decades. Related research in information science has shown that human abstractors extract sentences for summaries based on the hierarchical structure of documents; however, the existing automatic summarization models do not take into account the human abstractor's behavior of sentence extraction and only consider the document as a sequence of sentences during the process of extraction of sentences as a summary. In general, a document exhibits a well-defined hierarchical structure that can be described as fractals - mathematical objects with a high degree of redundancy. In this article, we introduce the fractal summarization model based on the fractal theory. The important information is captured from the source document by exploring the hierarchical structure and salient features of the document. A condensed version of the document that is informatively close to the source document is produced iteratively using the contractive transformation in the fractal theory. The fractal summarization model is the first attempt to apply fractal theory to document summarization. It significantly improves the divergence of information coverage of summary and the precision of summary. User evaluations have been conducted. Results have indicated that fractal summarization is promising and outperforms current summarization techniques that do not consider the hierarchical structure of documents.
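The hierarchical intuition can be illustrated with a quota-allocation sketch: a sentence budget is split recursively over the document tree in proportion to each node's weight, so extraction follows the document structure rather than a flat sentence sequence. This is a deliberate simplification; the fractal summarization model defines its own weighting and iterative contractive transformation, which are not reproduced here.

```python
def allocate_quota(node, quota):
    """Split a sentence quota over a document tree in proportion to child weights."""
    children = node.get("children")
    if not children:
        return {node["id"]: quota}
    total_weight = sum(child["weight"] for child in children) or 1.0
    plan = {}
    for child in children:
        share = max(1, round(quota * child["weight"] / total_weight))
        plan.update(allocate_quota(child, share))
    return plan

# Hypothetical document tree: two chapters, the first with two sections
document = {"id": "doc", "children": [
    {"id": "ch1", "weight": 0.5, "children": [
        {"id": "ch1.1", "weight": 0.3},
        {"id": "ch1.2", "weight": 0.2},
    ]},
    {"id": "ch2", "weight": 0.5},
]}
print(allocate_quota(document, quota=8))   # {'ch1.1': 2, 'ch1.2': 2, 'ch2': 4}
```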
-
Wei, F.; Li, W.; Lu, Q.; He, Y.: Applying two-level reinforcement ranking in query-oriented multidocument summarization (2009)
0.02
0.016485397 = product of:
0.06594159 = sum of:
0.06594159 = weight(_text_:however in 107) [ClassicSimilarity], result of:
0.06594159 = score(doc=107,freq=2.0), product of:
0.28742972 = queryWeight, product of:
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.06921162 = queryNorm
0.22941813 = fieldWeight in 107, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.0390625 = fieldNorm(doc=107)
0.25 = coord(1/4)
- Abstract
- Sentence ranking is the issue of most concern in document summarization today. While traditional feature-based approaches evaluate sentence significance and rank the sentences relying on the features that are particularly designed to characterize the different aspects of the individual sentences, the newly emerging graph-based ranking algorithms (such as the PageRank-like algorithms) recursively compute sentence significance using the global information in a text graph that links sentences together. In general, the existing PageRank-like algorithms can model well the phenomena that a sentence is important if it is linked by many other important sentences. Or they are capable of modeling the mutual reinforcement among the sentences in the text graph. However, when dealing with multidocument summarization these algorithms often assemble a set of documents into one large file. The document dimension is totally ignored. In this article we present a framework to model the two-level mutual reinforcement among sentences as well as documents. Under this framework we design and develop a novel ranking algorithm such that the document reinforcement is taken into account in the process of sentence ranking. The convergence issue is examined. We also explore an interesting and important property of the proposed algorithm. When evaluated on the DUC 2005 and 2006 query-oriented multidocument summarization datasets, significant results are achieved.
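For reference, a minimal single-level PageRank-style sentence ranker over a similarity graph is sketched below. It represents the kind of baseline the article builds on; the proposed two-level document-and-sentence reinforcement and the query bias are not modelled, and the parameters are assumptions.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_sentences(sentences, damping=0.85, iterations=50):
    """Power-iteration PageRank over a cosine-similarity sentence graph."""
    X = TfidfVectorizer().fit_transform(sentences)
    W = cosine_similarity(X)
    np.fill_diagonal(W, 0.0)                 # no self-reinforcement
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0.0] = 1.0          # avoid division by zero for isolated sentences
    M = W / col_sums                         # column-stochastic transition matrix
    n = len(sentences)
    scores = np.full(n, 1.0 / n)
    for _ in range(iterations):
        scores = (1.0 - damping) / n + damping * (M @ scores)
    return scores                            # higher score = more central sentence
```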
-
Finegan-Dollak, C.; Radev, D.R.: Sentence simplification, compression, and disaggregation for summarization of sophisticated documents (2016)
0.02
0.016485397 = product of:
0.06594159 = sum of:
0.06594159 = weight(_text_:however in 4122) [ClassicSimilarity], result of:
0.06594159 = score(doc=4122,freq=2.0), product of:
0.28742972 = queryWeight, product of:
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.06921162 = queryNorm
0.22941813 = fieldWeight in 4122, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.0390625 = fieldNorm(doc=4122)
0.25 = coord(1/4)
- Abstract
- Sophisticated documents like legal cases and biomedical articles can contain unusually long sentences. Extractive summarizers can select such sentences-potentially adding hundreds of unnecessary words to the summary-or exclude them and lose important content. Sentence simplification or compression seems on the surface to be a promising solution. However, compression removes words before the selection algorithm can use them, and simplification generates sentences that may be ambiguous in an extractive summary. We therefore compare the performance of an extractive summarizer selecting from the sentences of the original document with that of the summarizer selecting from sentences shortened in three ways: simplification, compression, and disaggregation, which splits one sentence into several according to rules designed to keep all meaning. We find that on legal cases and biomedical articles, these shortening methods generate ungrammatical output. Human evaluators performed an extrinsic evaluation consisting of comprehension questions about the summaries. Evaluators given compressed, simplified, or disaggregated versions of the summaries answered fewer questions correctly than did those given summaries with unaltered sentences. Error analysis suggests 2 causes: Altered sentences sometimes interact with the sentence selection algorithm, and alterations to sentences sometimes obscure information in the summary. We discuss future work to alleviate these problems.
-
Oh, H.; Nam, S.; Zhu, Y.: Structured abstract summarization of scientific articles : summarization using full-text section information (2023)
0.02
0.016485397 = product of:
0.06594159 = sum of:
0.06594159 = weight(_text_:however in 1890) [ClassicSimilarity], result of:
0.06594159 = score(doc=1890,freq=2.0), product of:
0.28742972 = queryWeight, product of:
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.06921162 = queryNorm
0.22941813 = fieldWeight in 1890, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.0390625 = fieldNorm(doc=1890)
0.25 = coord(1/4)
- Abstract
- The automatic summarization of scientific articles differs from other text genres because of the structured format and longer text length. Previous approaches have focused on tackling the lengthy nature of scientific articles, aiming to improve the computational efficiency of summarizing long text using a flat, unstructured abstract. However, the structured format of scientific articles and characteristics of each section have not been fully explored, despite their importance. The lack of a sufficient investigation and discussion of various characteristics for each section and their influence on summarization results has hindered the practical use of automatic summarization for scientific articles. To provide a balanced abstract proportionally emphasizing each section of a scientific article, the community introduced the structured abstract, an abstract with distinct, labeled sections. Using this information, in this study, we aim to understand tasks ranging from data preparation to model evaluation from diverse viewpoints. Specifically, we provide a preprocessed large-scale dataset and propose a summarization method applying the introduction, methods, results, and discussion (IMRaD) format reflecting the characteristics of each section. We also discuss the objective benchmarks and perspectives of state-of-the-art algorithms and present the challenges and research directions in this area.
-
Jiang, Y.; Meng, R.; Huang, Y.; Lu, W.; Liu, J.: Generating keyphrases for readers : a controllable keyphrase generation framework (2023)
0.02
0.016485397 = product of:
0.06594159 = sum of:
0.06594159 = weight(_text_:however in 2014) [ClassicSimilarity], result of:
0.06594159 = score(doc=2014,freq=2.0), product of:
0.28742972 = queryWeight, product of:
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.06921162 = queryNorm
0.22941813 = fieldWeight in 2014, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.0390625 = fieldNorm(doc=2014)
0.25 = coord(1/4)
- Abstract
- With the wide application of keyphrases in many Information Retrieval (IR) and Natural Language Processing (NLP) tasks, automatic keyphrase prediction has been emerging. However, these statistically important phrases contribute increasingly less to the related tasks, because the end-to-end learning mechanism enables models to learn the important semantic information of the text directly. Similarly, keyphrases are of little help for readers to quickly grasp a paper's main idea, because the relationship between the keyphrase and the paper is not explicit to readers. Therefore, we propose to generate keyphrases with specific functions for readers, to bridge the semantic gap between them and the information producers, and we verify the effectiveness of the keyphrase function for assisting users' comprehension with a user experiment. A controllable keyphrase generation framework (the CKPG) that uses the keyphrase function as a control code to generate categorized keyphrases is proposed and implemented based on Transformer, BART, and T5, respectively. For the Computer Science domain, the macro-averaged scores on the Paper with Code dataset reach up to 0.680, 0.535, and 0.558, respectively. Our experimental results indicate the effectiveness of the CKPG models.
-
Ruda, S.: Abstracting: eine Auswahlbibliographie (1992)
0.02
0.016124314 = product of:
0.064497255 = sum of:
0.064497255 = weight(_text_:und in 6671) [ClassicSimilarity], result of:
0.064497255 = score(doc=6671,freq=12.0), product of:
0.15350439 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06921162 = queryNorm
0.42016557 = fieldWeight in 6671, product of:
3.4641016 = tf(freq=12.0), with freq of:
12.0 = termFreq=12.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.0546875 = fieldNorm(doc=6671)
0.25 = coord(1/4)
- Abstract
- This selective bibliography is divided into nine subject areas. The first section contains literature that addresses abstracts and abstracting methods in general and gives an overview of the state of research. The next section reviews papers that describe the historical development of abstracting. The third part lists the abstracting guidelines of various institutions. Lexical, syntactic, and semantic text condensation methods are the topic of the works presented in section 4. Text structures of abstracts are considered under point 5, and the works in the next subject area deal with the problem of writing abstracts. The seventh section lists so-called 'machine' and machine-assisted abstracting methods. Subsequently, 'machine' and machine-assisted abstracting procedures, abstracts in comparison with their source texts, and abstracts in general are evaluated. Bibliographies conclude the survey.
-
Kuhlen, R.: Abstracts, abstracting : intellektuelle und maschinelle Verfahren (1997)
0.02
0.015958933 = product of:
0.06383573 = sum of:
0.06383573 = weight(_text_:und in 869) [ClassicSimilarity], result of:
0.06383573 = score(doc=869,freq=4.0), product of:
0.15350439 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06921162 = queryNorm
0.41585606 = fieldWeight in 869, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.09375 = fieldNorm(doc=869)
0.25 = coord(1/4)
- Source
- Grundlagen der praktischen Information und Dokumentation: ein Handbuch zur Einführung in die fachliche Informationsarbeit. 4. Aufl. Hrsg.: M. Buder u.a
-
Endres-Niggemeyer, B.: Referierregeln und Referate : Abstracting als regelgesteuerter Textverarbeitungsprozeß (1985)
0.01
0.014719419 = product of:
0.058877677 = sum of:
0.058877677 = weight(_text_:und in 6670) [ClassicSimilarity], result of:
0.058877677 = score(doc=6670,freq=10.0), product of:
0.15350439 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06921162 = queryNorm
0.38355696 = fieldWeight in 6670, product of:
3.1622777 = tf(freq=10.0), with freq of:
10.0 = termFreq=10.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.0546875 = fieldNorm(doc=6670)
0.25 = coord(1/4)
- Abstract
- Abstracting rules govern abstracting processes. Content-related prescriptions from three sets of abstracting rules were compared with the corresponding abstracts. The result was unsatisfactory: abstracting rules are partly inconsistent, their specifications are not always appropriate to the subject matter, and they are often unsuitable as practical guidance. Abstracting appears to be an underdetermined thinking and text-processing activity with a considerable need for clarification and design. The rules contain too little knowledge about the matters they regulate, and they often prescribe content structures that are too simple and too remote from the subject. Ideas for more differentiated abstract structures are developed; they take stronger account of the dependence of the abstract's structure on the text structure of the original document. Clarifying the abstracting process up to a shared definition of its goal is important for the further development of both intellectual and automatic abstracting.
-
Kuhlen, R.: In Richtung Summarizing für Diskurse in K3 (2006)
0.01
0.014719419 = product of:
0.058877677 = sum of:
0.058877677 = weight(_text_:und in 67) [ClassicSimilarity], result of:
0.058877677 = score(doc=67,freq=10.0), product of:
0.15350439 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06921162 = queryNorm
0.38355696 = fieldWeight in 67, product of:
3.1622777 = tf(freq=10.0), with freq of:
10.0 = termFreq=10.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.0546875 = fieldNorm(doc=67)
0.25 = coord(1/4)
- Abstract
- The need for summarizing services is demonstrated, both in specialist information settings and in communicative environments (discourses). Summarizing is placed in the context of previous (including automatic) abstracting/extracting. The current state of research is presented, especially with regard to multi-document summarizing. Summarizing is an important function in the increasingly complex and extensive discussions held in electronic forums. This is illustrated using the example of the e-learning system K3. Rudimentary summarizing functions of K3 and of the associated K3VIS system are presented. A framework is outlined for a more elaborate, template-oriented form of summarizing that makes use of K3's rich markup functions (roles, discourse types, content types, etc.).
- Source
- Information und Sprache: Beiträge zu Informationswissenschaft, Computerlinguistik, Bibliothekswesen und verwandten Fächern. Festschrift für Harald H. Zimmermann. Herausgegeben von Ilse Harms, Heinz-Dirk Luckhardt und Hans W. Giessen
-
Endres-Niggemeyer, B.; Jauris-Heipke, S.; Pinsky, S.M.; Ulbricht, U.: Wissen gewinnen durch Wissen : Ontologiebasierte Informationsextraktion (2006)
0.01
0.014105836 = product of:
0.056423344 = sum of:
0.056423344 = weight(_text_:und in 16) [ClassicSimilarity], result of:
0.056423344 = score(doc=16,freq=18.0), product of:
0.15350439 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06921162 = queryNorm
0.36756828 = fieldWeight in 16, product of:
4.2426405 = tf(freq=18.0), with freq of:
18.0 = termFreq=18.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.0390625 = fieldNorm(doc=16)
0.25 = coord(1/4)
- Abstract
- The ontology-based information extraction reported on here is part of a system for automatic summarization that is modelled on the approach of competent humans. The underlying assumption is that people can adopt a system's results more easily if those results have been produced with procedures they themselves also use. The first application domain is bone marrow transplantation (BMT). At the core of the Summit-BMT system (Summarize It in Bone Marrow Transplantation) is an ontology of the field. It is implemented as a MySQL database and supplies human users and system components with knowledge. Summit-BMT supports query formulation with an empirically grounded scenario interface. The retrieval results are pre-selected by text-passage retrieval and then submitted to cognitively grounded agents, which use their knowledge base / ontology to check more precisely whether the propositions from the user's query are matched. The relevant text clips from the source document are entered into the scenario form and presented with a link to their occurrence in the original. This article focuses on the ontology and its use for knowledge-based information extraction. The ontology database keeps different types of knowledge available in a way that allows them to be combined easily: concepts, propositions and their syntactic-semantic schemata, unifiers, paraphrases, and definitions of query scenarios. The system agents, which execute summarization strategies adapted from humans, rely on this knowledge. Shortcomings in other processing steps lead to losses, but the actual quality of the results stands or falls with the quality of the ontology. First tests of the extraction performance are strikingly positive.
- Source
- Information - Wissenschaft und Praxis. 57(2006) H.6/7, S.301-308
-
Hahn, U.: Automatisches Abstracting (2013)
0.01
0.013299111 = product of:
0.053196445 = sum of:
0.053196445 = weight(_text_:und in 1721) [ClassicSimilarity], result of:
0.053196445 = score(doc=1721,freq=4.0), product of:
0.15350439 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06921162 = queryNorm
0.34654674 = fieldWeight in 1721, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.078125 = fieldNorm(doc=1721)
0.25 = coord(1/4)
- Source
- Grundlagen der praktischen Information und Dokumentation. Handbuch zur Einführung in die Informationswissenschaft und -praxis. 6., völlig neu gefaßte Ausgabe. Hrsg. von R. Kuhlen, W. Semar u. D. Strauch. Begründet von Klaus Laisiepen, Ernst Lutterbeck, Karl-Heinrich Meyer-Uhlenried
-
Kannan, R.; Ghinea, G.; Swaminathan, S.: What do you wish to see? : A summarization system for movies based on user preferences (2015)
0.01
0.013188317 = product of:
0.05275327 = sum of:
0.05275327 = weight(_text_:however in 3683) [ClassicSimilarity], result of:
0.05275327 = score(doc=3683,freq=2.0), product of:
0.28742972 = queryWeight, product of:
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.06921162 = queryNorm
0.1835345 = fieldWeight in 3683, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.03125 = fieldNorm(doc=3683)
0.25 = coord(1/4)
- Abstract
- Video summarization aims at producing a compact version of a full-length video while preserving the significant content of the original video. Movie summarization condenses a full-length movie into a summary that still retains the most significant and interesting content of the original movie. In the past, several movie summarization systems have been proposed to generate a movie summary based on low-level video features such as color, motion, texture, etc. However, a generic summary, which is common to everyone and is produced based only on low-level video features will not satisfy every user. As users' preferences for the summary differ vastly for the same movie, there is a need for a personalized movie summarization system nowadays. To address this demand, this paper proposes a novel system to generate semantically meaningful video summaries for the same movie, which are tailored to the preferences and interests of a user. For a given movie, shots and scenes are automatically detected and their high-level features are semi-automatically annotated. Preferences over high-level movie features are explicitly collected from the user using a query interface. The user preferences are generated by means of a stored-query. Movie summaries are generated at shot level and scene level, where shots or scenes are selected for summary skim based on the similarity measured between shots and scenes, and the user's preferences. The proposed movie summarization system is evaluated subjectively using a sample of 20 subjects with eight movies in the English language. The quality of the generated summaries is assessed by informativeness, enjoyability, relevance, and acceptance metrics and Quality of Perception measures. Further, the usability of the proposed summarization system is subjectively evaluated by conducting a questionnaire survey. The experimental results on the performance of the proposed movie summarization approach show the potential of the proposed system.
-
Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018)
0.01
0.013188317 = product of:
0.05275327 = sum of:
0.05275327 = weight(_text_:however in 89) [ClassicSimilarity], result of:
0.05275327 = score(doc=89,freq=2.0), product of:
0.28742972 = queryWeight, product of:
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.06921162 = queryNorm
0.1835345 = fieldWeight in 89, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.1529117 = idf(docFreq=1897, maxDocs=44421)
0.03125 = fieldNorm(doc=89)
0.25 = coord(1/4)
- Abstract
- Sentiment analysis concerns the study of opinions expressed in a text. This paper presents the QMOS method, which employs a combination of sentiment analysis and summarization approaches. It is a lexicon-based method for query-based multi-document summarization of opinions expressed in reviews. QMOS combines multiple sentiment dictionaries to overcome the word-coverage limit of any individual lexicon. A major problem for a dictionary-based approach is the semantic gap between the prior polarity of a word as given by a lexicon and the word's polarity in a specific context, since the polarity of a word depends on the context in which it is used. Furthermore, the type of a sentence can also affect the performance of a sentiment analysis approach. To tackle these challenges, QMOS integrates multiple strategies to adjust a word's prior sentiment orientation while also considering the type of sentence. QMOS also employs the Semantic Sentiment Approach to determine the sentiment score of a word if it is not included in a sentiment lexicon. Most existing methods fail to distinguish the meaning of a review sentence from that of the user's query when both share a similar bag of words; hence there is often a conflict between the extracted opinionated sentences and users' needs. The summarization phase of QMOS, however, is able to avoid extracting a review sentence whose similarity with the user's query is high but whose meaning is different. The method also employs a greedy algorithm and query expansion to reduce redundancy and to bridge the lexical gaps between similar contexts expressed in different wording, respectively. Our experiments show that QMOS significantly improves performance and is comparable to other existing methods.
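As a toy illustration of the lexicon-based scoring at the heart of such methods (not QMOS itself, and without its context adjustment, semantic-sentiment backoff, or query expansion), a sentence can be scored by summing prior polarities drawn from several merged dictionaries; the lexicons below are hypothetical.

```python
# Two tiny, hypothetical sentiment lexicons merged to widen word coverage
LEXICON_A = {"good": 1.0, "great": 1.5, "bad": -1.0}
LEXICON_B = {"poor": -1.2, "excellent": 1.8, "bad": -0.8}

def merged_polarity(word):
    scores = [lex[word] for lex in (LEXICON_A, LEXICON_B) if word in lex]
    return sum(scores) / len(scores) if scores else 0.0

def sentence_opinion_score(sentence):
    """Sum of averaged prior polarities; real systems also handle negation and context."""
    return sum(merged_polarity(w) for w in sentence.lower().split())

print(sentence_opinion_score("The battery life is excellent but the screen is bad"))
```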
-
Yusuff, A.: Automatisches Indexing and Abstracting : Grundlagen und Beispiele (2002)
0.01
0.013165447 = product of:
0.052661788 = sum of:
0.052661788 = weight(_text_:und in 2577) [ClassicSimilarity], result of:
0.052661788 = score(doc=2577,freq=2.0), product of:
0.15350439 = queryWeight, product of:
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.06921162 = queryNorm
0.34306374 = fieldWeight in 2577, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
2.217899 = idf(docFreq=13141, maxDocs=44421)
0.109375 = fieldNorm(doc=2577)
0.25 = coord(1/4)