Search (285 results, page 1 of 15)

  • theme_ss:"Computerlinguistik"
  1. Zimmermann, H.H.: Maschinelle und Computergestützte Übersetzung (2004) 0.08
    0.08407743 = product of:
      0.16815487 = sum of:
        0.044942625 = weight(_text_:und in 3943) [ClassicSimilarity], result of:
          0.044942625 = score(doc=3943,freq=8.0), product of:
            0.15283768 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.068911016 = queryNorm
            0.29405463 = fieldWeight in 3943, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.046875 = fieldNorm(doc=3943)
        0.12321224 = weight(_text_:human in 3943) [ClassicSimilarity], result of:
          0.12321224 = score(doc=3943,freq=4.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.40941924 = fieldWeight in 3943, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.046875 = fieldNorm(doc=3943)
      0.5 = coord(2/4)
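    The score tree above is Lucene's ClassicSimilarity explanation: each matching term contributes queryWeight * fieldWeight, where queryWeight = idf * queryNorm and fieldWeight = sqrt(termFreq) * idf * fieldNorm; the term contributions are summed and multiplied by the coordination factor coord(matching terms / query terms). A minimal Python sketch that reproduces this entry's score from the values shown above (the function name is ours, not Lucene's):

      import math

      def classic_similarity(terms, coord):
          # terms: (termFreq, idf, queryNorm, fieldNorm) tuples from the explain tree
          total = 0.0
          for freq, idf, query_norm, field_norm in terms:
              tf = math.sqrt(freq)                    # 2.828427 for freq=8, 2.0 for freq=4
              query_weight = idf * query_norm         # e.g. 2.217899 * 0.068911016 = 0.15283768
              field_weight = tf * idf * field_norm    # e.g. 2.828427 * 2.217899 * 0.046875
              total += query_weight * field_weight    # weight(_text_:und), weight(_text_:human)
          return total * coord

      score = classic_similarity(
          [(8.0, 2.217899, 0.068911016, 0.046875),    # _text_:und in doc 3943
           (4.0, 4.3671384, 0.068911016, 0.046875)],  # _text_:human in doc 3943
          coord=2 / 4,                                # coord(2/4) = 0.5
      )
      print(score)                                    # ~0.084077, matching the 0.08407743 above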
    
    Abstract
    In the following, machine translation (MT) means the fully automatic translation of a text in one natural language into another natural language. Human translation (HT) means the intellectual translation of a text, with or without machine lexical aids and with or without word processing. Computer-assisted (or computer-aided) translation (CAT) refers, on the one hand, to an intellectual translation that builds on a machine pre-translation or raw translation (MT) which is subsequently revised by a human (post-editing); on the other hand, it refers to an intellectual translation in which a translation memory and/or a terminology bank is used before or during the intellectual translation process. ICAT denotes a special variant of CAT in which a user without (sufficient) knowledge of the target language is supported in translating from his or her native language in such a way that the target-language equivalent is relatively error-free.
    Source
    Grundlagen der praktischen Information und Dokumentation. 5., völlig neu gefaßte Ausgabe. 2 Bde. Hrsg. von R. Kuhlen, Th. Seeger u. D. Strauch. Begründet von Klaus Laisiepen, Ernst Lutterbeck, Karl-Heinrich Meyer-Uhlenried. Bd.1: Handbuch zur Einführung in die Informationswissenschaft und -praxis
  2. New tools for human translators (1997) 0.07
    0.07187381 = product of:
      0.28749523 = sum of:
        0.28749523 = weight(_text_:human in 2179) [ClassicSimilarity], result of:
          0.28749523 = score(doc=2179,freq=4.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.95531154 = fieldWeight in 2179, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.109375 = fieldNorm(doc=2179)
      0.25 = coord(1/4)
    
    Abstract
    A special issue devoted to the theme of new tools for human translators
  3. Thomas, I.S.; Wang, J.; GPT-3: Was euch zu Menschen macht : Antworten einer künstlichen Intelligenz auf die großen Fragen des Lebens (2022) 0.06
    0.06302284 = product of:
      0.12604567 = sum of:
        0.038921464 = weight(_text_:und in 1879) [ClassicSimilarity], result of:
          0.038921464 = score(doc=1879,freq=6.0), product of:
            0.15283768 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.068911016 = queryNorm
            0.25465882 = fieldWeight in 1879, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.046875 = fieldNorm(doc=1879)
        0.087124206 = weight(_text_:human in 1879) [ClassicSimilarity], result of:
          0.087124206 = score(doc=1879,freq=2.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.2895031 = fieldWeight in 1879, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.046875 = fieldNorm(doc=1879)
      0.5 = coord(2/4)
    
    Abstract
    The first book of wisdom written by an AI. 'Artificial intelligence sees the human being as he is. For it there is no God, no rituals, no heaven, no hell, no angels. For it there are only sentient beings.' GPT-3. This book contains wisdom texts written by the most advanced AI in the field of speech recognition. It is GPT-3, guided by the technologist Jasmine Wang. The original texts by GPT-3 are curated by the internationally renowned poet Iain S. Thomas. GPT-3's source material ranges from humanity's books of wisdom to modern texts. GPT-3 answers questions such as: What makes a human being human? What does it mean to love? How do we lead a fulfilled life? etc., and is able to create sentences of its own. The result is a contemporary and unprecedented exploration of meaning and spirituality that inspires a new understanding of what makes us human.
    Footnote
    Original title: What makes us human.
  4. Harari, Y.N.: ¬[Yuval-Noah-Harari-argues-that] AI has hacked the operating system of human civilisation (2023) 0.06
    0.062876485 = product of:
      0.25150594 = sum of:
        0.25150594 = weight(_text_:human in 1954) [ClassicSimilarity], result of:
          0.25150594 = score(doc=1954,freq=6.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.8357235 = fieldWeight in 1954, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.078125 = fieldNorm(doc=1954)
      0.25 = coord(1/4)
    
    Abstract
    Storytelling computers will change the course of human history, says the historian and philosopher.
    Source
    https://www.economist.com/by-invitation/2023/04/28/yuval-noah-harari-argues-that-ai-has-hacked-the-operating-system-of-human-civilisation?giftId=6982bba3-94bc-441d-9153-6d42468817ad
  5. Jurafsky, D.; Martin, J.H.: Speech and language processing : an introduction to natural language processing, computational linguistics and speech recognition (2009) 0.05
    0.049543105 = product of:
      0.09908621 = sum of:
        0.0264827 = weight(_text_:und in 2081) [ClassicSimilarity], result of:
          0.0264827 = score(doc=2081,freq=4.0), product of:
            0.15283768 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.068911016 = queryNorm
            0.17327337 = fieldWeight in 2081, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.0390625 = fieldNorm(doc=2081)
        0.07260351 = weight(_text_:human in 2081) [ClassicSimilarity], result of:
          0.07260351 = score(doc=2081,freq=2.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.2412526 = fieldWeight in 2081, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.0390625 = fieldNorm(doc=2081)
      0.5 = coord(2/4)
    
    Abstract
    For undergraduate or advanced undergraduate courses in Classical Natural Language Processing, Statistical Natural Language Processing, Speech Recognition, Computational Linguistics, and Human Language Processing. An explosion of Web-based language techniques, merging of distinct fields, availability of phone-based dialogue systems, and much more make this an exciting time in speech and language processing. The first of its kind to thoroughly cover language technology at all levels and with all modern technologies, this text takes an empirical approach to the subject, based on applying statistical and other machine-learning algorithms to large corpora. The authors cover areas that traditionally are taught in different courses, to describe a unified vision of speech and language processing. Emphasis is on practical applications and scientific evaluation. An accompanying Website contains teaching materials for instructors, with pointers to language processing resources on the Web. The Second Edition offers a significant amount of new and extended material.
    BK
    18.00 Einzelne Sprachen und Literaturen allgemein
    Classification
    18.00 Einzelne Sprachen und Literaturen allgemein
  6. Rötzer, F.: Computer ergooglen die Bedeutung von Worten (2005) 0.05
    0.048723213 = product of:
      0.09744643 = sum of:
        0.053884324 = weight(_text_:und in 4385) [ClassicSimilarity], result of:
          0.053884324 = score(doc=4385,freq=46.0), product of:
            0.15283768 = queryWeight, product of:
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.068911016 = queryNorm
            0.35255915 = fieldWeight in 4385, product of:
              6.78233 = tf(freq=46.0), with freq of:
                46.0 = termFreq=46.0
              2.217899 = idf(docFreq=13141, maxDocs=44421)
              0.0234375 = fieldNorm(doc=4385)
        0.043562103 = weight(_text_:human in 4385) [ClassicSimilarity], result of:
          0.043562103 = score(doc=4385,freq=2.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.14475155 = fieldWeight in 4385, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.0234375 = fieldNorm(doc=4385)
      0.5 = coord(2/4)
    
    Content
    "Wie könnten Computer Sprache lernen und dabei auch die Bedeutung von Worten sowie die Beziehungen zwischen ihnen verstehen? Dieses Problem der Semantik stellt eine gewaltige, bislang nur ansatzweise bewältigte Aufgabe dar, da Worte und Wortverbindungen oft mehrere oder auch viele Bedeutungen haben, die zudem vom außersprachlichen Kontext abhängen. Die beiden holländischen (Ein künstliches Bewusstsein aus einfachen Aussagen (1)). Paul Vitanyi (2) und Rudi Cilibrasi vom Nationalen Institut für Mathematik und Informatik (3) in Amsterdam schlagen eine elegante Lösung vor: zum Nachschlagen im Internet, der größten Datenbank, die es gibt, wird einfach Google benutzt. Objekte wie eine Maus können mit ihren Namen "Maus" benannt werden, die Bedeutung allgemeiner Begriffe muss aus ihrem Kontext gelernt werden. Ein semantisches Web zur Repräsentation von Wissen besteht aus den möglichen Verbindungen, die Objekte und ihre Namen eingehen können. Natürlich können in der Wirklichkeit neue Namen, aber auch neue Bedeutungen und damit neue Verknüpfungen geschaffen werden. Sprache ist lebendig und flexibel. Um einer Künstlichen Intelligenz alle Wortbedeutungen beizubringen, müsste mit der Hilfe von menschlichen Experten oder auch vielen Mitarbeitern eine riesige Datenbank mit den möglichen semantischen Netzen aufgebaut und dazu noch ständig aktualisiert werden. Das aber müsste gar nicht notwendig sein, denn mit dem Web gibt es nicht nur die größte und weitgehend kostenlos benutzbare semantische Datenbank, sie wird auch ständig von zahllosen Internetnutzern aktualisiert. Zudem gibt es Suchmaschinen wie Google, die Verbindungen zwischen Worten und damit deren Bedeutungskontext in der Praxis in ihrer Wahrscheinlichkeit quantitativ mit der Angabe der Webseiten, auf denen sie gefunden wurden, messen.
    Using a method previously developed by Paul Vitanyi and others for measuring how related objects are (the normalized information distance, NID), the closeness between particular objects (images, words, patterns, intervals, genomes, programs, etc.) can be analyzed across all their properties and determined from the dominant shared property. In much the same way, the commonly used, not necessarily 'true' meanings of names can be derived with the Google search. 'At this moment one database stands out as the pinnacle of computer-accessible human knowledge and the most inclusive summary of statistical information: the Google search engine. There can be no doubt that Google has already enabled science to accelerate tremendously and revolutionized the research process. It has dominated the attention of internet users for years, and has recently attracted substantial attention of many Wall Street investors, even reshaping their ideas of company financing.' (Paul Vitanyi and Rudi Cilibrasi) If you enter a word such as 'Pferd' (horse), Google returns 4,310,000 indexed pages. For 'Reiter' (rider) it is 3,400,000 pages. Combining both terms still yields 315,000 pages. For the joint occurrence of, say, 'Pferd' and 'Bart' (beard), an astonishing 67,100 pages are still listed, but it is already apparent that 'Pferd' and 'Reiter' are more closely connected. This gives a certain probability for the joint occurrence of terms. From this frequency, taken relative to the maximum number (5,000,000,000) of indexed pages, the two scientists have derived a statistical quantity they call the 'normalised Google distance' (NGD), which normally lies between 0 and 1. The smaller the NGD, the more closely two terms are related. 'This is automatic meaning generation,' Vitanyi told the New Scientist (4). 'It could well be a way of getting a computer to understand things and act semi-intelligently.' If such searches are carried out over and over again, a map of the connections between words can be built up. And from this map, so the hope goes, a computer can in turn grasp the meaning of individual words in different natural languages and contexts. A few searches, for instance, showed that a computer could distinguish colors from numbers, tell seventeenth-century Dutch painters apart, separate emergencies from near-emergencies, or understand electrical or religious terms. A simple automatic English-Spanish translation could also be accomplished. In this way, the researchers hope, the meaning of words could be learned, speech recognition could be improved, a semantic web could be built, and of course a better automatic translation from one language into another could finally be achieved.
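    The normalised Google distance described here can be written as NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f(.) are page counts and N is the total number of indexed pages. A small Python illustration using the hit counts quoted above; the formula follows Cilibrasi and Vitanyi, and the counts are of course a snapshot from 2005:

      from math import log

      def ngd(fx, fy, fxy, n):
          # normalised Google distance: values near 0 mean the terms almost always
          # co-occur, values near 1 (or above) mean they are essentially unrelated
          return (max(log(fx), log(fy)) - log(fxy)) / (log(n) - min(log(fx), log(fy)))

      N = 5_000_000_000                                # indexed pages assumed in the article
      print(ngd(4_310_000, 3_400_000, 315_000, N))     # 'Pferd' vs. 'Reiter' -> ~0.36

      # The 'Pferd'/'Bart' comparison would additionally need the page count
      # for 'Bart' alone, which the article does not quote.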
  7. Melby, A.: Some notes on 'The proper place of men and machines in language translation' (1997) 0.04
    0.044013537 = product of:
      0.17605415 = sum of:
        0.17605415 = weight(_text_:human in 1330) [ClassicSimilarity], result of:
          0.17605415 = score(doc=1330,freq=6.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.5850065 = fieldWeight in 1330, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.0546875 = fieldNorm(doc=1330)
      0.25 = coord(1/4)
    
    Abstract
    Responds to Kay, M.: The proper place of men and machines in language translation. Examines the appropriateness of machine translation (MT) under the following special circumstances: controlled domain-specific text and high-quality output; controlled domain-specific text and indicative output; dynamic general text and indicative output; and dynamic general text and high-quality output. MT is appropriate in the first three cases, but the fourth case requires human translation. Examines how MT research could be more useful for aiding human translation.
    Footnote
    Contribution to a special issue devoted to the theme of new tools for human translators
  8. McKevitt, P.; Partridge, D.; Wilks, Y.: Why machines should analyse intention in natural language dialogue (1999) 0.04
    0.043562103 = product of:
      0.17424841 = sum of:
        0.17424841 = weight(_text_:human in 1366) [ClassicSimilarity], result of:
          0.17424841 = score(doc=1366,freq=2.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.5790062 = fieldWeight in 1366, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.09375 = fieldNorm(doc=1366)
      0.25 = coord(1/4)
    
    Source
    International journal of human-computer studies. 51(1999) no.5, S.947-989
  9. Mustafa El Hadi, W.: Evaluating human language technology : general applications to information access and management (2002) 0.04
    0.043562103 = product of:
      0.17424841 = sum of:
        0.17424841 = weight(_text_:human in 2840) [ClassicSimilarity], result of:
          0.17424841 = score(doc=2840,freq=2.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.5790062 = fieldWeight in 2840, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.09375 = fieldNorm(doc=2840)
      0.25 = coord(1/4)
    
  10. Farrús, M.; Costa-jussà, M.R.; Popović Morse, M.: Study and correlation analysis of linguistic, perceptual, and automatic machine translation evaluations (2012) 0.04
    0.043562103 = product of:
      0.17424841 = sum of:
        0.17424841 = weight(_text_:human in 975) [ClassicSimilarity], result of:
          0.17424841 = score(doc=975,freq=8.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.5790062 = fieldWeight in 975, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.046875 = fieldNorm(doc=975)
      0.25 = coord(1/4)
    
    Abstract
    Evaluation of machine translation output is an important task. Various human evaluation techniques as well as automatic metrics have been proposed and investigated in the last decade. However, very few evaluation methods take the linguistic aspect into account. In this article, we use an objective evaluation method for machine translation output that classifies all translation errors into one of the five following linguistic levels: orthographic, morphological, lexical, semantic, and syntactic. Linguistic guidelines for the target language are required, and human evaluators use them to classify the output errors. The experiments are performed on English-to-Catalan and Spanish-to-Catalan translation outputs generated by four different systems: 2 rule-based and 2 statistical. All translations are evaluated using the following three methods: a standard human perceptual evaluation method, several widely used automatic metrics, and the human linguistic evaluation. Pearson and Spearman correlation coefficients between the linguistic, perceptual, and automatic results are then calculated, showing that the semantic level correlates significantly with both perceptual evaluation and automatic metrics.
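    As a small illustration of the final step described here, Pearson and Spearman coefficients can be computed directly from per-system scores; the numbers below are invented placeholders, not the article's results:

      from scipy.stats import pearsonr, spearmanr

      # hypothetical overall scores for the four systems (2 rule-based, 2 statistical)
      linguistic = [0.71, 0.64, 0.58, 0.52]   # human linguistic error classification
      perceptual = [0.75, 0.61, 0.60, 0.49]   # standard human perceptual evaluation
      automatic  = [0.42, 0.38, 0.35, 0.30]   # an automatic metric

      r, _ = pearsonr(linguistic, perceptual)
      rho, _ = spearmanr(linguistic, automatic)
      print(f"Pearson r (linguistic vs. perceptual): {r:.2f}")
      print(f"Spearman rho (linguistic vs. automatic): {rho:.2f}")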
  11. Vlachidis, A.; Binding, C.; Tudhope, D.; May, K.: Excavating grey literature : a case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources (2010) 0.04
    0.037815057 = product of:
      0.15126023 = sum of:
        0.15126023 = weight(_text_:java in 935) [ClassicSimilarity], result of:
          0.15126023 = score(doc=935,freq=2.0), product of:
            0.4856509 = queryWeight, product of:
              7.0475073 = idf(docFreq=104, maxDocs=44421)
              0.068911016 = queryNorm
            0.31145877 = fieldWeight in 935, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.0475073 = idf(docFreq=104, maxDocs=44421)
              0.03125 = fieldNorm(doc=935)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique to assist "rich" semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware "rich" indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project. Design/methodology/approach - The paper proposes use of the English Heritage extension (CRM-EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesauri resources for driving and enhancing an Ontology-Oriented Information Extraction process. The process of semantic indexing is based on a rule-based Information Extraction technique, which is facilitated by the General Architecture of Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules. Findings - Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic-aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms. Originality/value - The value of the paper lies in the semantic indexing of 535 unpublished online documents often referred to as "Grey Literature", from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and P19.Physical Object.
  12. Melby, A.K.; Warner, C.T.: ¬The possibilities of language : a discussion of the nature of language, with implications for human and machine translation (1995) 0.04
    0.036301754 = product of:
      0.14520702 = sum of:
        0.14520702 = weight(_text_:human in 2121) [ClassicSimilarity], result of:
          0.14520702 = score(doc=2121,freq=2.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.4825052 = fieldWeight in 2121, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.078125 = fieldNorm(doc=2121)
      0.25 = coord(1/4)
    
  13. Mustafa el Hadi, W.: Human language technology and its role in information access and management (2003) 0.04
    0.036301754 = product of:
      0.14520702 = sum of:
        0.14520702 = weight(_text_:human in 524) [ClassicSimilarity], result of:
          0.14520702 = score(doc=524,freq=8.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.4825052 = fieldWeight in 524, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.0390625 = fieldNorm(doc=524)
      0.25 = coord(1/4)
    
    Abstract
    The role of linguistics in information access, extraction and dissemination is essential. Radical changes in the techniques of information and communication at the end of the twentieth century have had a significant effect on the function of the linguistic paradigm and its applications in all forms of communication. The introduction of new technical means has deeply changed the possibilities for the distribution of information. In this situation, what is the role of the linguistic paradigm and its practical applications, i.e., natural language processing (NLP) techniques when applied to information access? What solutions can linguistics offer in human-computer interaction, extraction and management? Many fields show the relevance of the linguistic paradigm through the various technologies that require NLP, such as document and message understanding, information detection, extraction, and retrieval, question and answer, cross-language information retrieval (CLIR), text summarization, filtering, and spoken document retrieval. This paper focuses on the central role of human language technologies in the information society, surveys the current situation, describes the benefits of the above-mentioned applications, outlines successes and challenges, and discusses solutions. It reviews the resources and means needed to advance information access and dissemination across language boundaries in the twenty-first century. Multilingualism, which is a natural result of globalization, requires more effort in the direction of language technology. The scope of human language technology (HLT) is large, so we limit our review to applications that involve multilinguality.
  14. Park, J.S.; O'Brien, J.C.; Cai, C.J.; Ringel Morris, M.; Liang, P.; Bernstein, M.S.: Generative agents : interactive simulacra of human behavior (2023) 0.04
    0.036301754 = product of:
      0.14520702 = sum of:
        0.14520702 = weight(_text_:human in 1974) [ClassicSimilarity], result of:
          0.14520702 = score(doc=1974,freq=8.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.4825052 = fieldWeight in 1974, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.0390625 = fieldNorm(doc=1974)
      0.25 = coord(1/4)
    
    Abstract
    Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools. In this paper, we introduce generative agents--computational software agents that simulate believable human behavior. Generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; they form opinions, notice each other, and initiate conversations; they remember and reflect on days past as they plan the next day. To enable generative agents, we describe an architecture that extends a large language model to store a complete record of the agent's experiences using natural language, synthesize those memories over time into higher-level reflections, and retrieve them dynamically to plan behavior. We instantiate generative agents to populate an interactive sandbox environment inspired by The Sims, where end users can interact with a small town of twenty-five agents using natural language. In an evaluation, these generative agents produce believable individual and emergent social behaviors: for example, starting with only a single user-specified notion that one agent wants to throw a Valentine's Day party, the agents autonomously spread invitations to the party over the next two days, make new acquaintances, ask each other out on dates to the party, and coordinate to show up for the party together at the right time. We demonstrate through ablation that the components of our agent architecture--observation, planning, and reflection--each contribute critically to the believability of agent behavior. By fusing large language models with computational, interactive agents, this work introduces architectural and interaction patterns for enabling believable simulations of human behavior.
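    A highly simplified sketch of the store-reflect-retrieve loop described in this abstract; all class and method names are our own shorthand, and llm stands for a call to any large language model, not the authors' implementation:

      from dataclasses import dataclass, field

      @dataclass
      class Memory:
          text: str
          importance: float = 1.0   # salience assigned when the memory is stored

      @dataclass
      class GenerativeAgent:
          name: str
          memories: list = field(default_factory=list)

          def observe(self, event: str, importance: float = 1.0):
              # store every experience as a natural-language memory record
              self.memories.append(Memory(event, importance))

          def reflect(self, llm):
              # periodically synthesize recent memories into a higher-level reflection
              recent = "\n".join(m.text for m in self.memories[-20:])
              self.observe(llm(f"What does {self.name} conclude from:\n{recent}"), importance=2.0)

          def act(self, llm, situation: str, k: int = 5):
              # retrieve the most salient memories and condition the next action on them
              relevant = sorted(self.memories, key=lambda m: m.importance, reverse=True)[:k]
              context = "\n".join(m.text for m in relevant)
              return llm(f"{self.name} remembers:\n{context}\nSituation: {situation}\nNext action:")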
  15. Williams, S.; Huckle, J.: Easy problems that LLMs get wrong (2024) 0.04
    0.035936903 = product of:
      0.14374761 = sum of:
        0.14374761 = weight(_text_:human in 2394) [ClassicSimilarity], result of:
          0.14374761 = score(doc=2394,freq=4.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.47765577 = fieldWeight in 2394, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.0546875 = fieldNorm(doc=2394)
      0.25 = coord(1/4)
    
    Abstract
    We introduce a comprehensive Linguistic Benchmark designed to evaluate the limitations of Large Language Models (LLMs) in domains such as logical reasoning, spatial intelligence, and linguistic understanding, among others. Through a series of straightforward questions, it uncovers the significant limitations of well-regarded models to perform tasks that humans manage with ease. It also highlights the potential of prompt engineering to mitigate some errors and underscores the necessity for better training methodologies. Our findings stress the importance of grounding LLMs with human reasoning and common sense, emphasising the need for human-in-the-loop for enterprise applications. We hope this work paves the way for future research to enhance the usefulness and reliability of new models.
  16. Aydin, Ö.; Karaarslan, E.: OpenAI ChatGPT generated literature review : digital twin in healthcare (2022) 0.04
    0.035568308 = product of:
      0.14227323 = sum of:
        0.14227323 = weight(_text_:human in 1852) [ClassicSimilarity], result of:
          0.14227323 = score(doc=1852,freq=12.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.4727566 = fieldWeight in 1852, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.03125 = fieldNorm(doc=1852)
      0.25 = coord(1/4)
    
    Abstract
    Literature review articles are essential to summarize the related work in the selected field. However, covering all related studies takes too much time and effort. This study questions how Artificial Intelligence can be used in this process. We used ChatGPT to create a literature review article to show the stage of the OpenAI ChatGPT artificial intelligence application. As the subject, the applications of Digital Twin in the health field were chosen. Abstracts of papers from the last three years (2020, 2021 and 2022) were obtained from the keyword "Digital twin in healthcare" search results on Google Scholar and paraphrased by ChatGPT. Later on, we asked ChatGPT questions. The results are promising; however, the paraphrased parts had significant matches when checked with the iThenticate tool. This article is the first attempt to show that the compilation and expression of knowledge will be accelerated with the help of artificial intelligence. We are still at the beginning of such advances. The future academic publishing process will require less human effort, which in turn will allow academics to focus on their studies. In future studies, we will monitor citations to this study to evaluate the academic validity of the content produced by ChatGPT.
    1. Introduction OpenAI ChatGPT (ChatGPT, 2022) is a chatbot based on the OpenAI GPT-3 language model. It is designed to generate human-like text responses to user input in a conversational context. OpenAI ChatGPT is trained on a large dataset of human conversations and can be used to create responses to a wide range of topics and prompts. The chatbot can be used for customer service, content creation, and language translation tasks, creating replies in multiple languages. OpenAI ChatGPT is available through the OpenAI API, which allows developers to access and integrate the chatbot into their applications and systems. OpenAI ChatGPT is a variant of the GPT (Generative Pre-trained Transformer) language model developed by OpenAI. It is designed to generate human-like text, allowing it to engage in conversation with users naturally and intuitively. OpenAI ChatGPT is trained on a large dataset of human conversations, allowing it to understand and respond to a wide range of topics and contexts. It can be used in various applications, such as chatbots, customer service agents, and language translation systems. OpenAI ChatGPT is a state-of-the-art language model able to generate coherent and natural text that can be indistinguishable from text written by a human. As an artificial intelligence, ChatGPT may need help to change academic writing practices. However, it can provide information and guidance on ways to improve people's academic writing skills.
  17. Gill, A.J.; Hinrichs-Krapels, S.; Blanke, T.; Grant, J.; Hedges, M.; Tanner, S.: Insight workflow : systematically combining human and computational methods to explore textual data (2017) 0.03
    0.031438243 = product of:
      0.12575297 = sum of:
        0.12575297 = weight(_text_:human in 4682) [ClassicSimilarity], result of:
          0.12575297 = score(doc=4682,freq=6.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.41786176 = fieldWeight in 4682, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.0390625 = fieldNorm(doc=4682)
      0.25 = coord(1/4)
    
    Abstract
    Analyzing large quantities of real-world textual data has the potential to provide new insights for researchers. However, such data present challenges for both human and computational methods, requiring a diverse range of specialist skills, often shared across a number of individuals. In this paper we use the analysis of a real-world data set as our case study, and use this exploration as a demonstration of our "insight workflow," which we present for use and adaptation by other researchers. The data we use are impact case study documents collected as part of the UK Research Excellence Framework (REF), consisting of 6,679 documents and 6.25 million words; the analysis was commissioned by the Higher Education Funding Council for England (published as report HEFCE 2015). In our exploration and analysis we used a variety of techniques, ranging from keyword in context and frequency information to more sophisticated methods (topic modeling), with these automated techniques providing an empirical point of entry for in-depth and intensive human analysis. We present the 60 topics to demonstrate the output of our methods, and illustrate how the variety of analysis techniques can be combined to provide insights. We note potential limitations and propose future work.
  18. Hou, Y.; Pascale, A.; Carnerero-Cano, J.; Sattigeri, P.; Tchrakian, T.; Marinescu, R.; Daly, E.; Padhi, I.: WikiContradict : a benchmark for evaluating LLMs on real-world knowledge conflicts from Wikipedia (2024) 0.03
    0.031438243 = product of:
      0.12575297 = sum of:
        0.12575297 = weight(_text_:human in 2368) [ClassicSimilarity], result of:
          0.12575297 = score(doc=2368,freq=6.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.41786176 = fieldWeight in 2368, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.0390625 = fieldNorm(doc=2368)
      0.25 = coord(1/4)
    
    Abstract
    Retrieval-augmented generation (RAG) has emerged as a promising solution to mitigate the limitations of large language models (LLMs), such as hallucinations and outdated information. However, it remains unclear how LLMs handle knowledge conflicts arising from different augmented retrieved passages, especially when these passages originate from the same source and have equal trustworthiness. In this work, we conduct a comprehensive evaluation of LLM-generated answers to questions that have varying answers based on contradictory passages from Wikipedia, a dataset widely regarded as a high-quality pre-training resource for most LLMs. Specifically, we introduce WikiContradict, a benchmark consisting of 253 high-quality, human-annotated instances designed to assess LLM performance when augmented with retrieved passages containing real-world knowledge conflicts. We benchmark a diverse range of both closed and open-source LLMs under different QA scenarios, including RAG with a single passage, and RAG with 2 contradictory passages. Through rigorous human evaluations on a subset of WikiContradict instances involving 5 LLMs and over 3,500 judgements, we shed light on the behaviour and limitations of these models. For instance, when provided with two passages containing contradictory facts, all models struggle to generate answers that accurately reflect the conflicting nature of the context, especially for implicit conflicts requiring reasoning. Since human evaluation is costly, we also introduce an automated model that estimates LLM performance using a strong open-source language model, achieving an F-score of 0.8. Using this automated metric, we evaluate more than 1,500 answers from seven LLMs across all WikiContradict instances. To facilitate future work, we release WikiContradict on: https://ibm.biz/wikicontradict.
  19. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.03
    0.03080306 = product of:
      0.12321224 = sum of:
        0.12321224 = weight(_text_:human in 1563) [ClassicSimilarity], result of:
          0.12321224 = score(doc=1563,freq=4.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.40941924 = fieldWeight in 1563, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.046875 = fieldNorm(doc=1563)
      0.25 = coord(1/4)
    
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with the LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language and domain independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human-written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the aligning process from a training set and focuses on selecting high-quality multi-word terms from human-written summaries to generate suitable results for web-page summarization.
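    A toy illustration of the general idea of scoring multi-word candidates with a word-association measure; the thesis's three measures and the LocalMaxs step are not reproduced here, and Dice's coefficient is used only as a stand-in:

      from collections import Counter

      def dice(pair_freq, f1, f2):
          # Dice association: how strongly two words co-occur, relative to how
          # often each occurs on its own
          return 2.0 * pair_freq / (f1 + f2)

      tokens = ("machine translation improves and machine translation helps "
                "web page summarization").split()
      unigrams = Counter(tokens)
      bigrams = Counter(zip(tokens, tokens[1:]))

      ranked = sorted(
          ((dice(c, unigrams[w1], unigrams[w2]), (w1, w2)) for (w1, w2), c in bigrams.items()),
          reverse=True,
      )
      for score, (w1, w2) in ranked[:3]:
          print(f"{w1} {w2}: {score:.2f}")   # 'machine translation' ranks among the top candidates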
  20. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.03
    0.03080306 = product of:
      0.12321224 = sum of:
        0.12321224 = weight(_text_:human in 1863) [ClassicSimilarity], result of:
          0.12321224 = score(doc=1863,freq=4.0), product of:
            0.30094394 = queryWeight, product of:
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.068911016 = queryNorm
            0.40941924 = fieldWeight in 1863, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3671384 = idf(docFreq=1531, maxDocs=44421)
              0.046875 = fieldNorm(doc=1863)
      0.25 = coord(1/4)
    
    Abstract
    This research revisits the classic Turing test and compares recent large language models such as ChatGPT for their abilities to reproduce human-level comprehension and compelling text generation. Two task challenges, summary and question answering, prompt ChatGPT to produce original content (98-99%) from a single text entry and sequential questions initially posed by Turing in 1950. We score the original and generated content against the OpenAI GPT-2 Output Detector from 2019, and establish multiple cases where the generated content proves original and undetectable (98%). The question of a machine fooling a human judge recedes in this work relative to the question of "how would one prove it?" The original contribution of the work presents a metric and simple grammatical set for understanding the writing mechanics of chatbots in evaluating their readability and statistical clarity, engagement, delivery, overall quality, and plagiarism risks. While Turing's original prose scores at least 14% below the machine-generated output, whether an algorithm displays hints of Turing's true initial thoughts (the "Lovelace 2.0" test) remains unanswerable.

Languages

  • d 194
  • e 86
  • m 4
  • chi 1
  • ru 1

Types

  • a 207
  • el 51
  • m 43
  • s 14
  • x 12
  • p 5
  • d 2
