-
Pan, B.; Gay, G.; Saylor, J.; Hembrooke, H.: One digital library, two undergraduate classes, and four learning modules : uses of a digital library in classrooms (2006)
0.05
0.05261735 = product of:
0.2104694 = sum of:
0.2104694 = weight(_text_:java in 907) [ClassicSimilarity], result of:
0.2104694 = score(doc=907,freq=2.0), product of:
0.45050243 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06392366 = queryNorm
0.46718815 = fieldWeight in 907, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.046875 = fieldNorm(doc=907)
0.25 = coord(1/4)
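The breakdown above follows Lucene's ClassicSimilarity (tf-idf) scoring. As a minimal sketch (not code from the retrieval system itself), the following Java fragment reproduces the arithmetic of this explanation: queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the document score is their product scaled by the coordination factor.
```java
// Minimal sketch reproducing the ClassicSimilarity arithmetic shown above.
// The constants are copied from the explanation; the underlying formula is
// Lucene's classic tf-idf, with idf = 1 + ln(maxDocs / (docFreq + 1)).
public class ClassicSimilarityCheck {
    public static void main(String[] args) {
        double idf = 7.0475073;      // idf(docFreq=104, maxDocs=44421)
        double queryNorm = 0.06392366;
        double tf = Math.sqrt(2.0);  // tf(freq=2.0) = 1.4142135
        double fieldNorm = 0.046875;
        double coord = 0.25;         // coord(1/4): one of four query terms matched

        double queryWeight = idf * queryNorm;           // 0.45050243
        double fieldWeight = tf * idf * fieldNorm;      // 0.46718815
        double termScore   = queryWeight * fieldWeight; // 0.2104694
        double docScore    = coord * termScore;         // 0.05261735 (rounded to 0.05 above)

        System.out.printf("termScore=%.7f docScore=%.8f%n", termScore, docScore);
    }
}
```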
- Abstract
- The KMODDL (kinematic models for design digital library) is a digital library based on a historical collection of kinematic models made of steel and bronze. The digital library contains four types of learning modules including textual materials, QuickTime virtual reality movies, Java simulations, and stereolithographic files of the physical models. The authors report an evaluation study on the uses of the KMODDL in two undergraduate classes. This research reveals that the users in different classes encountered different usability problems, and reported quantitatively different subjective experiences. Further, the results indicate that depending on the subject area, the two user groups preferred different types of learning modules, resulting in different uses of the available materials and different learning outcomes. These findings are discussed in terms of their implications for future digital library design.
-
Mongin, L.; Fu, Y.Y.; Mostafa, J.: Open Archives Data Service prototype and automated subject indexing using D-Lib archive content as a testbed (2003)
0.05
0.05261735 = product of:
0.2104694 = sum of:
0.2104694 = weight(_text_:java in 2167) [ClassicSimilarity], result of:
0.2104694 = score(doc=2167,freq=2.0), product of:
0.45050243 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06392366 = queryNorm
0.46718815 = fieldWeight in 2167, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.046875 = fieldNorm(doc=2167)
0.25 = coord(1/4)
- Abstract
- The Indiana University School of Library and Information Science opened a new research laboratory in January 2003: the Indiana University School of Library and Information Science Information Processing Laboratory [IU IP Lab]. The purpose of the new laboratory is to facilitate collaboration between scientists in the department in the areas of information retrieval (IR) and information visualization (IV) research. The lab has several areas of focus. These include grid and cluster computing and a standard Java-based software platform that supports plug-and-play research datasets, a selection of standard IR modules, and standard IV algorithms. Future development includes software to enable researchers to contribute datasets, IR algorithms, and visualization algorithms into the standard environment. We decided early on to use OAI-PMH as a resource discovery tool because it is consistent with our mission.
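Because the abstract names OAI-PMH as the lab's resource discovery tool, here is a minimal, hypothetical Java sketch of the ListRecords harvesting request that the protocol standardizes; the endpoint URL is a placeholder, not the IU IP Lab's actual service, and a real harvester would also page through resumptionToken values.
```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical OAI-PMH harvesting sketch (Java 11+); the base URL is a placeholder.
public class OaiHarvestSketch {
    public static void main(String[] args) throws Exception {
        String baseUrl = "http://example.org/oai";  // placeholder endpoint
        // ListRecords with Dublin Core metadata is the standard harvesting call.
        String url = baseUrl + "?verb=ListRecords&metadataPrefix=oai_dc";

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // The response is XML; a real harvester would parse <record> elements and
        // follow any resumptionToken to fetch the next page of the archive.
        System.out.println(response.body());
    }
}
```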
-
Song, R.; Luo, Z.; Nie, J.-Y.; Yu, Y.; Hon, H.-W.: Identification of ambiguous queries in web search (2009)
0.05
0.05261735 = product of:
0.2104694 = sum of:
0.2104694 = weight(_text_:java in 3441) [ClassicSimilarity], result of:
0.2104694 = score(doc=3441,freq=2.0), product of:
0.45050243 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06392366 = queryNorm
0.46718815 = fieldWeight in 3441, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.046875 = fieldNorm(doc=3441)
0.25 = coord(1/4)
- Abstract
- It is widely believed that many queries submitted to search engines are inherently ambiguous (e.g., java and apple). However, few studies have tried to classify queries based on ambiguity and to answer "what the proportion of ambiguous queries is". This paper deals with these issues. First, we clarify the definition of ambiguous queries by constructing the taxonomy of queries from being ambiguous to specific. Second, we ask human annotators to manually classify queries. From manually labeled results, we observe that query ambiguity is to some extent predictable. Third, we propose a supervised learning approach to automatically identify ambiguous queries. Experimental results show that we can correctly identify 87% of labeled queries with the approach. Finally, by using our approach, we estimate that about 16% of queries in a real search log are ambiguous.
-
Croft, W.B.; Metzler, D.; Strohman, T.: Search engines : information retrieval in practice (2010)
0.05
0.05261735 = product of:
0.2104694 = sum of:
0.2104694 = weight(_text_:java in 3605) [ClassicSimilarity], result of:
0.2104694 = score(doc=3605,freq=2.0), product of:
0.45050243 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06392366 = queryNorm
0.46718815 = fieldWeight in 3605, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.046875 = fieldNorm(doc=3605)
0.25 = coord(1/4)
- Abstract
- For introductory information retrieval courses at the undergraduate and graduate level in computer science, information science and computer engineering departments. Written by a leader in the field of information retrieval, Search Engines: Information Retrieval in Practice is designed to give undergraduate students the understanding and tools they need to evaluate, compare and modify search engines. Coverage of the underlying IR and mathematical models reinforces key concepts. The book's numerous programming exercises make extensive use of Galago, a Java-based open source search engine. Supplements: extensive lecture slides (in PDF and PPT format); solutions to selected end-of-chapter problems (instructors only); test collections for exercises; the Galago search engine.
-
Tang, X.-B.; Wei Wei, G.-C.L.; Zhu, J.: ¬An inference model of medical insurance fraud detection : based on ontology and SWRL (2017)
0.05
0.05261735 = product of:
0.2104694 = sum of:
0.2104694 = weight(_text_:java in 4615) [ClassicSimilarity], result of:
0.2104694 = score(doc=4615,freq=2.0), product of:
0.45050243 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06392366 = queryNorm
0.46718815 = fieldWeight in 4615, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.046875 = fieldNorm(doc=4615)
0.25 = coord(1/4)
- Abstract
- Medical insurance fraud is common in many countries' medical insurance systems and represents a serious threat to the insurance funds and the benefits of patients. In this paper, we present an inference model of medical insurance fraud detection, based on a medical detection domain ontology that incorporates the knowledge base provided by the Medical Terminology, NKIMed, and Chinese Library Classification systems. Through analyzing the behaviors of irregular and fraudulent medical services, we defined the scope of the medical domain ontology relevant to the task and built the ontology about medical sciences and medical service behaviors. The ontology then utilizes Semantic Web Rule Language (SWRL) and Java Expert System Shell (JESS) to detect medical irregularities and mine implicit knowledge. The system can be used to improve the management of medical insurance risks.
-
Thelwall, M.; Kousha, K.: Online presentations as a source of scientific impact? : an analysis of PowerPoint files citing academic journals (2008)
0.05
0.046978295 = product of:
0.18791318 = sum of:
0.18791318 = weight(_text_:harvard in 2614) [ClassicSimilarity], result of:
0.18791318 = score(doc=2614,freq=2.0), product of:
0.46630698 = queryWeight, product of:
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.06392366 = queryNorm
0.4029817 = fieldWeight in 2614, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.0390625 = fieldNorm(doc=2614)
0.25 = coord(1/4)
- Abstract
- Open-access online publication has made available an increasingly wide range of document types for scientometric analysis. In this article, we focus on citations in online presentations, seeking evidence of their value as nontraditional indicators of research impact. For this purpose, we searched for online PowerPoint files mentioning any one of 1,807 ISI-indexed journals in ten science and ten social science disciplines. We also manually classified 1,378 online PowerPoint citations to journals in eight additional science and social science disciplines. The results showed that very few journals were cited frequently enough in online PowerPoint files to make impact assessment worthwhile, with the main exceptions being popular magazines like Scientific American and Harvard Business Review. Surprisingly, however, there was little difference overall in the number of PowerPoint citations to science and to the social sciences, and also in the proportion representing traditional impact (about 60%) and wider impact (about 15%). It seems that the main scientometric value for online presentations may be in tracking the popularization of research, or for comparing the impact of whole journals rather than individual articles.
-
Ho, Y.-S.; Kahn, M.: ¬A bibliometric study of highly cited reviews in the Science Citation Index Expanded(TM) (2014)
0.05
0.046978295 = product of:
0.18791318 = sum of:
0.18791318 = weight(_text_:harvard in 2203) [ClassicSimilarity], result of:
0.18791318 = score(doc=2203,freq=2.0), product of:
0.46630698 = queryWeight, product of:
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.06392366 = queryNorm
0.4029817 = fieldWeight in 2203, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.0390625 = fieldNorm(doc=2203)
0.25 = coord(1/4)
- Abstract
- Some 1,857 highly cited reviews, namely those cited at least 1,000 times since publication to 2011, were identified using the data hosted on the Science Citation Index Expanded(TM) database (Thomson Reuters, New York, NY) between 1899 and 2011. The data are disaggregated by publication date, citation counts, journals, Web of Science® (Thomson Reuters) subject areas, citation life cycles, and publications by Nobel Prize winners. Six indicators (total publications, independent publications, collaborative publications, first-author publications, corresponding-author publications, and single-author publications) were applied to evaluate the publication output of institutions and countries. Among the highly cited reviews, 33% were single-author, 61% were single-institution, and 83% were single-country reviews. The United States ranked top for all 6 indicators. The G7 (United States, United Kingdom, Germany, Canada, France, Japan, and Italy) countries were the site of almost all the highly cited reviews. The top 12 most productive institutions were all located in the United States, with Harvard University (Cambridge, MA) the leader. The top 3 most productive journals were Chemical Reviews, Nature, and the Annual Review of Biochemistry. In addition, the impact of the reviews was analyzed by total citations from publication to 2011, citations in 2011, and citations in the publication year.
-
Bauer, J.; Leydesdorff, L.; Bornmann, L.: Highly cited papers in Library and Information Science (LIS) : authors, institutions, and network structures (2016)
0.05
0.046978295 = product of:
0.18791318 = sum of:
0.18791318 = weight(_text_:harvard in 4231) [ClassicSimilarity], result of:
0.18791318 = score(doc=4231,freq=2.0), product of:
0.46630698 = queryWeight, product of:
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.06392366 = queryNorm
0.4029817 = fieldWeight in 4231, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.0390625 = fieldNorm(doc=4231)
0.25 = coord(1/4)
- Abstract
- As a follow-up to the highly cited authors list published by Thomson Reuters in June 2014, we analyzed the top 1% most frequently cited papers published between 2002 and 2012 included in the Web of Science (WoS) subject category "Information Science & Library Science." In all, 798 authors contributed to 305 top 1% publications; these authors were employed at 275 institutions. The authors at Harvard University contributed the largest number of papers, when the addresses are whole-number counted. However, Leiden University leads the ranking if fractional counting is used. Twenty-three of the 798 authors were also listed as most highly cited authors by Thomson Reuters in June 2014 (http://highlycited.com/). Twelve of these 23 authors were involved in publishing 4 or more of the 305 papers under study. Analysis of coauthorship relations among the 798 highly cited scientists shows that coauthorships are based on common interests in a specific topic. Three topics were important between 2002 and 2012: (a) collection and exploitation of information in clinical practices; (b) use of the Internet in public communication and commerce; and (c) scientometrics.
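The whole-number versus fractional counting contrast mentioned above can be made concrete with a toy example; the papers and institutions below are invented and are not data from the study.
```java
import java.util.*;

// Toy illustration of whole-number vs. fractional counting of institutional addresses.
public class CountingSketch {
    public static void main(String[] args) {
        // Each (hypothetical) paper lists the institutional addresses of its authors.
        List<List<String>> papers = List.of(
            List.of("Harvard", "Leiden"),
            List.of("Leiden"),
            List.of("Harvard", "Harvard", "Amsterdam"));

        Map<String, Double> whole = new TreeMap<>();
        Map<String, Double> fractional = new TreeMap<>();
        for (List<String> addresses : papers) {
            // Whole counting: every institution appearing on the paper gets a full count.
            for (String inst : new HashSet<>(addresses)) whole.merge(inst, 1.0, Double::sum);
            // Fractional counting: each address carries 1/n of the paper's credit.
            for (String inst : addresses) fractional.merge(inst, 1.0 / addresses.size(), Double::sum);
        }
        // Rankings can differ: here the two leaders tie under whole counting but not
        // under fractional counting, analogous to the Harvard/Leiden observation above.
        System.out.println("whole:      " + whole);
        System.out.println("fractional: " + fractional);
    }
}
```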
-
Standage, T.: Information overload is nothing new (2018)
0.05
0.046978295 = product of:
0.18791318 = sum of:
0.18791318 = weight(_text_:harvard in 473) [ClassicSimilarity], result of:
0.18791318 = score(doc=473,freq=2.0), product of:
0.46630698 = queryWeight, product of:
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.06392366 = queryNorm
0.4029817 = fieldWeight in 473, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.0390625 = fieldNorm(doc=473)
0.25 = coord(1/4)
- Content
- "Overflowing inboxes, endlessly topped up by incoming emails. Constant alerts, notifications and text messages on your smartphone and computer. Infinitely scrolling streams of social-media posts. Access to all the music ever recorded, whenever you want it. And a deluge of high-quality television, with new series released every day on Netflix, Amazon Prime and elsewhere. The bounty of the internet is a marvellous thing, but the ever-expanding array of material can leave you feeling overwhelmed, constantly interrupted, unable to concentrate or worried that you are missing out or falling behind. No wonder some people are quitting social media, observing "digital sabbaths" when they unplug from the internet for a day, or buying old-fashioned mobile phones in an effort to avoid being swamped. This phenomenon may seem quintessentially modern, but it dates back centuries, as Ann Blair of Harvard University observes in "Too Much to Know", a history of information overload. Half a millennium ago, the printing press was to blame. "Is there anywhere on Earth exempt from these swarms of new books?" moaned Erasmus in 1525. New titles were appearing in such abundance, thousands every year. How could anyone figure out which ones were worth reading? Overwhelmed scholars across Europe worried that good ideas were being lost amid the deluge. Francisco Sanchez, a Spanish philosopher, complained in 1581 that 10m years was not long enough to read all the books in existence. The German polymath Gottfried Wilhelm Leibniz grumbled in 1680 of "that horrible mass of books which keeps on growing"."
-
Chen, H.; Chung, Y.-M.; Ramsey, M.; Yang, C.C.: ¬A smart itsy bitsy spider for the Web (1998)
0.04
0.04384779 = product of:
0.17539117 = sum of:
0.17539117 = weight(_text_:java in 1871) [ClassicSimilarity], result of:
0.17539117 = score(doc=1871,freq=2.0), product of:
0.45050243 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06392366 = queryNorm
0.38932347 = fieldWeight in 1871, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0390625 = fieldNorm(doc=1871)
0.25 = coord(1/4)
- Abstract
- As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent agent approach to Web searching. In this experiment, we developed 2 Web personal spiders based on best first search and genetic algorithm techniques, respectively. These personal spiders can dynamically take a user's selected starting homepages and search for the most closely related homepages in the Web, based on the links and keyword indexing. A graphical, dynamic, Java-based interface was developed and is available for Web access. A system architecture for implementing such an agent-spider is presented, followed by detailed discussions of benchmark testing and user evaluation results. In benchmark testing, although the genetic algorithm spider did not outperform the best first search spider, we found both results to be comparable and complementary. In user evaluation, the genetic algorithm spider obtained a significantly higher recall value than that of the best first search spider. However, their precision values were not statistically different. The mutation process introduced in genetic algorithms allows users to find other potentially relevant homepages that cannot be explored via a conventional local search process. In addition, we found the Java-based interface to be a necessary component for design of a truly interactive and dynamic Web agent
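As a rough illustration of the best-first search strategy described in the abstract, the following Java sketch runs the same idea over a toy in-memory link graph; the pages, links and relevance scores are invented, whereas the actual spiders crawled live Web pages and scored them by link and keyword indexing.
```java
import java.util.*;

// Toy best-first search "spider": always expand the most relevant unvisited page next.
public class BestFirstSpiderSketch {
    public static void main(String[] args) {
        // Hypothetical link graph: page -> outgoing links.
        Map<String, List<String>> links = Map.of(
            "start", List.of("a", "b"),
            "a", List.of("c"),
            "b", List.of("d"),
            "c", List.of(),
            "d", List.of());
        // Hypothetical keyword-based relevance of each page to the user's topic.
        Map<String, Double> relevance = Map.of(
            "start", 0.5, "a", 0.9, "b", 0.2, "c", 0.8, "d", 0.1);

        PriorityQueue<String> frontier =
            new PriorityQueue<>(Comparator.comparingDouble((String p) -> -relevance.get(p)));
        Set<String> visited = new HashSet<>();
        frontier.add("start");

        while (!frontier.isEmpty()) {
            String page = frontier.poll();
            if (!visited.add(page)) continue;  // skip pages already visited
            System.out.println("visiting " + page + " (score " + relevance.get(page) + ")");
            for (String next : links.get(page))
                if (!visited.contains(next)) frontier.add(next);
        }
    }
}
```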
-
Chen, C.: CiteSpace II : detecting and visualizing emerging trends and transient patterns in scientific literature (2006)
0.04
0.04384779 = product of:
0.17539117 = sum of:
0.17539117 = weight(_text_:java in 272) [ClassicSimilarity], result of:
0.17539117 = score(doc=272,freq=2.0), product of:
0.45050243 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06392366 = queryNorm
0.38932347 = fieldWeight in 272, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0390625 = fieldNorm(doc=272)
0.25 = coord(1/4)
- Abstract
- This article describes the latest development of a generic approach to detecting and visualizing emerging trends and transient patterns in scientific literature. The work makes substantial theoretical and methodological contributions to progressive knowledge domain visualization. A specialty is conceptualized and visualized as a time-variant duality between two fundamental concepts in information science: research fronts and intellectual bases. A research front is defined as an emergent and transient grouping of concepts and underlying research issues. The intellectual base of a research front is its citation and co-citation footprint in scientific literature - an evolving network of scientific publications cited by research-front concepts. Kleinberg's (2002) burst-detection algorithm is adapted to identify emergent research-front concepts. Freeman's (1979) betweenness centrality metric is used to highlight potential pivotal points of paradigm shift over time. Two complementary visualization views are designed and implemented: cluster views and time-zone views. The contributions of the approach are that (a) the nature of an intellectual base is algorithmically and temporally identified by emergent research-front terms, (b) the value of a co-citation cluster is explicitly interpreted in terms of research-front concepts, and (c) visually prominent and algorithmically detected pivotal points substantially reduce the complexity of a visualized network. The modeling and visualization process is implemented in CiteSpace II, a Java application, and applied to the analysis of two research fields: mass extinction (1981-2004) and terrorism (1990-2003). Prominent trends and pivotal points in visualized networks were verified in collaboration with domain experts, who are the authors of pivotal-point articles. Practical implications of the work are discussed. A number of challenges and opportunities for future studies are identified.
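The intellectual base described above is a co-citation network. As a toy Java sketch with invented citation data, the fragment below shows how co-citation edge weights can be derived from the reference lists of citing papers; clustering these weighted edges is what yields the cluster views mentioned in the abstract.
```java
import java.util.*;

// Toy co-citation counting: two references are linked whenever a paper cites both.
public class CoCitationSketch {
    public static void main(String[] args) {
        // Hypothetical citing papers and the references each one cites.
        Map<String, List<String>> citations = Map.of(
            "P1", List.of("A", "B", "C"),
            "P2", List.of("A", "B"),
            "P3", List.of("B", "C"));

        Map<String, Integer> coCitation = new TreeMap<>();
        for (List<String> refs : citations.values())
            for (int i = 0; i < refs.size(); i++)
                for (int j = i + 1; j < refs.size(); j++)
                    coCitation.merge(refs.get(i) + "-" + refs.get(j), 1, Integer::sum);

        // Edge weights of the co-citation network, e.g. {A-B=2, A-C=1, B-C=2}.
        System.out.println(coCitation);
    }
}
```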
-
Eddings, J.: How the Internet works (1994)
0.04
0.04384779 = product of:
0.17539117 = sum of:
0.17539117 = weight(_text_:java in 2514) [ClassicSimilarity], result of:
0.17539117 = score(doc=2514,freq=2.0), product of:
0.45050243 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06392366 = queryNorm
0.38932347 = fieldWeight in 2514, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0390625 = fieldNorm(doc=2514)
0.25 = coord(1/4)
- Abstract
- How the Internet Works promises "an exciting visual journey down the highways and byways of the Internet," and it delivers. The book's high-quality graphics and simple, succinct text make it the ideal book for beginners; however, it still has much to offer for Net vets. This book is jam-packed with cool ways to visualize how the Net works. The first section visually explores how TCP/IP, Winsock, and other Net connectivity mysteries work. This section also helps you understand how e-mail addresses and domains work, what file types mean, and how information travels across the Net. Part 2 unravels the Net's underlying architecture, including good information on how routers work and what is meant by client/server architecture. The third section covers your own connection to the Net through an Internet Service Provider (ISP), and how ISDN, cable modems, and Web TV work. Part 4 discusses e-mail, spam, newsgroups, Internet Relay Chat (IRC), and Net phone calls. In part 5, you'll find out how other Net tools, such as gopher, telnet, WAIS, and FTP, can enhance your Net experience. The sixth section takes on the World Wide Web, including everything from how HTML works to image maps and forms. Part 7 looks at other Web features such as push technology, Java, ActiveX, and CGI scripting, while part 8 deals with multimedia on the Net. Part 9 shows you what intranets are and covers groupware, and shopping and searching the Net. The book wraps up with part 10, a chapter on Net security that covers firewalls, viruses, cookies, and other Web tracking devices, plus cryptography and parental controls.
-
Wu, D.; Shi, J.: Classical music recording ontology used in a library catalog (2016)
0.04
0.04384779 = product of:
0.17539117 = sum of:
0.17539117 = weight(_text_:java in 4179) [ClassicSimilarity], result of:
0.17539117 = score(doc=4179,freq=2.0), product of:
0.45050243 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06392366 = queryNorm
0.38932347 = fieldWeight in 4179, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0390625 = fieldNorm(doc=4179)
0.25 = coord(1/4)
- Abstract
- In order to improve the organization of classical music information resources, we constructed a classical music recording ontology, on top of which we then designed an online classical music catalog. Our construction of the classical music recording ontology consisted of three steps: identifying the purpose, analyzing the ontology, and encoding the ontology. We identified the main classes and properties of the domain by investigating classical music recording resources and users' information needs. We implemented the ontology in the Web Ontology Language (OWL) using five steps: transforming the properties, encoding the transformed properties, defining ranges of the properties, constructing individuals, and standardizing the ontology. In constructing the online catalog, we first designed the structure and functions of the catalog based on investigations into users' information needs and information-seeking behaviors. Then we extracted classes and properties of the ontology using the Apache Jena application programming interface (API), and constructed a catalog in the Java environment. The catalog provides a hierarchical main page (built using the Functional Requirements for Bibliographic Records (FRBR) model), a classical music information network and integrated information service; this combination of features greatly eases the task of finding classical music recordings and more information about classical music.
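As a rough sketch of the Jena-based extraction step mentioned above, the following fragment loads an OWL ontology and lists its named classes; it assumes Apache Jena on the classpath, and the file name is a placeholder rather than the authors' actual ontology.
```java
import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.util.iterator.ExtendedIterator;

// Illustrative Jena sketch: load an OWL file and print its named classes.
public class OntologyCatalogSketch {
    public static void main(String[] args) {
        OntModel model = ModelFactory.createOntologyModel();
        model.read("classical-music-recording.owl");  // hypothetical OWL file

        // A catalog front end would map these classes and their properties
        // to browse pages and search facets.
        ExtendedIterator<OntClass> classes = model.listClasses();
        while (classes.hasNext()) {
            OntClass cls = classes.next();
            if (cls.getLocalName() != null) {
                System.out.println(cls.getLocalName());
            }
        }
    }
}
```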
-
Salton, G.: Automatic processing of foreign language documents (1985)
0.04
0.037582636 = product of:
0.15033054 = sum of:
0.15033054 = weight(_text_:harvard in 4650) [ClassicSimilarity], result of:
0.15033054 = score(doc=4650,freq=2.0), product of:
0.46630698 = queryWeight, product of:
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.06392366 = queryNorm
0.32238537 = fieldWeight in 4650, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.03125 = fieldNorm(doc=4650)
0.25 = coord(1/4)
- Abstract
- The attempt to computerize a process, such as indexing, abstracting, classifying, or retrieving information, begins with an analysis of the process into its intellectual and nonintellectual components. That part of the process which is amenable to computerization is mechanical or algorithmic. What is not is intellectual or creative and requires human intervention. Gerard Salton has been an innovator, experimenter, and promoter in the area of mechanized information systems since the early 1960s. He has been particularly ingenious at analyzing the process of information retrieval into its algorithmic components. He received a doctorate in applied mathematics from Harvard University before moving to the computer science department at Cornell, where he developed a prototype automatic retrieval system called SMART. Working with this system he and his students contributed for over a decade to our theoretical understanding of the retrieval process. On a more practical level, they have contributed design criteria for operating retrieval systems. The following selection presents one of the early descriptions of the SMART system; it is valuable as it shows the direction automatic retrieval methods were to take beyond simple word-matching techniques. These include various word normalization techniques to improve recall, for instance, the separation of words into stems and affixes; the correlation and clustering, using statistical association measures, of related terms; and the identification, using a concept thesaurus, of synonymous, broader, narrower, and sibling terms. They include, as well, techniques, both linguistic and statistical, to deal with the thorny problem of how to automatically extract from texts index terms that consist of more than one word. They include weighting techniques and various document-request matching algorithms. Significant among the latter are those which produce a retrieval output of citations ranked in relevance order. During the 1970s, Salton and his students went on to further refine these various techniques, particularly the weighting and statistical association measures. Many of their early innovations seem commonplace today. Some of their later techniques are still ahead of their time and await technological developments for implementation. The particular focus of the selection that follows is on the evaluation of a particular component of the SMART system, a multilingual thesaurus. By mapping English language expressions and their German equivalents to a common concept number, the thesaurus permitted the automatic processing of German language documents against English language queries and vice versa. The results of the evaluation, as it turned out, were somewhat inconclusive. However, this SMART experiment suggested in a bold and optimistic way how one might proceed to answer such complex questions as: What is meant by retrieval language compatibility? How is it to be achieved, and how evaluated?
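The cross-language matching made possible by the multilingual thesaurus can be illustrated with a toy Java sketch: English and German terms map to shared concept numbers, so an English query and a German document can be compared in concept space. The term-to-concept table below is invented, not SMART's actual thesaurus, and SMART used weighted vector correlation rather than the simple set overlap shown here.
```java
import java.util.*;

// Toy multilingual thesaurus: terms in either language map to shared concept numbers.
public class ConceptMatchSketch {
    public static void main(String[] args) {
        Map<String, Integer> thesaurus = Map.of(
            "retrieval", 101, "wiederauffindung", 101,
            "document", 102, "dokument", 102,
            "library", 103, "bibliothek", 103);

        List<String> englishQuery = List.of("document", "retrieval");
        List<String> germanDocument = List.of("dokument", "wiederauffindung", "bibliothek");

        Set<Integer> queryConcepts = toConcepts(englishQuery, thesaurus);
        Set<Integer> documentConcepts = toConcepts(germanDocument, thesaurus);

        // Match query and document by the concepts they share, regardless of language.
        Set<Integer> shared = new HashSet<>(queryConcepts);
        shared.retainAll(documentConcepts);
        System.out.println("shared concepts: " + shared);  // e.g. [101, 102]
    }

    static Set<Integer> toConcepts(List<String> terms, Map<String, Integer> thesaurus) {
        Set<Integer> concepts = new HashSet<>();
        for (String term : terms)
            if (thesaurus.containsKey(term)) concepts.add(thesaurus.get(term));
        return concepts;
    }
}
```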
-
Tüür-Fröhlich, T.: ¬The non-trivial effects of trivial errors in scientific communication and evaluation (2016)
0.04
0.037582636 = product of:
0.15033054 = sum of:
0.15033054 = weight(_text_:harvard in 4137) [ClassicSimilarity], result of:
0.15033054 = score(doc=4137,freq=2.0), product of:
0.46630698 = queryWeight, product of:
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.06392366 = queryNorm
0.32238537 = fieldWeight in 4137, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.03125 = fieldNorm(doc=4137)
0.25 = coord(1/4)
- Abstract
- "Thomson Reuters' citation indexes i.e. SCI, SSCI and AHCI are said to be "authoritative". Due to the huge influence of these databases on global academic evaluation of productivity and impact, Terje Tüür-Fröhlich decided to conduct case studies on the data quality of Social Sciences Citation Index (SSCI) records. Tüür-Fröhlich investigated articles from social science and law. The main findings: SSCI records contain tremendous amounts of "trivial errors", not only misspellings and typos as previously mentioned in bibliometrics and scientometrics literature. But Tüür-Fröhlich's research documented fatal errors which have not been mentioned in the scientometrics literature yet at all. Tüür-Fröhlich found more than 80 fatal mutations and mutilations of Pierre Bourdieu (e.g. "Atkinson" or "Pierre, B. and "Pierri, B."). SSCI even generated zombie references (phantom authors and works) by data fields' confusion - a deadly sin for a database producer - as fragments of Patent Laws were indexed as fictional author surnames/initials. Additionally, horrific OCR-errors (e.g. "nuxure" instead of "Nature" as journal title) were identified. Tüür-Fröhlich´s extensive quantitative case study of an article of the Harvard Law Review resulted in a devastating finding: only 1% of all correct references from the original article were indexed by SSCI without any mistake or error. Many scientific communication experts and database providers' believe that errors in databanks are of less importance: There are many errors, yes - but they would counterbalance each other, errors would not result in citation losses and would not bear any effect on retrieval and evaluation outcomes. Terje Tüür-Fröhlich claims the contrary: errors and inconsistencies are not evenly distributed but linked with languages biases and publication cultures."
-
Somers, J.: Torching the modern-day library of Alexandria : somewhere at Google there is a database containing 25 million books and nobody is allowed to read them. (2017)
0.04
0.037582636 = product of:
0.15033054 = sum of:
0.15033054 = weight(_text_:harvard in 4608) [ClassicSimilarity], result of:
0.15033054 = score(doc=4608,freq=2.0), product of:
0.46630698 = queryWeight, product of:
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.06392366 = queryNorm
0.32238537 = fieldWeight in 4608, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.03125 = fieldNorm(doc=4608)
0.25 = coord(1/4)
- Abstract
- You were going to get one-click access to the full text of nearly every book that's ever been published. Books still in print you'd have to pay for, but everything else - a collection slated to grow larger than the holdings at the Library of Congress, Harvard, the University of Michigan, at any of the great national libraries of Europe - would have been available for free at terminals that were going to be placed in every local library that wanted one. At the terminal you were going to be able to search tens of millions of books and read every page of any book you found. You'd be able to highlight passages and make annotations and share them; for the first time, you'd be able to pinpoint an idea somewhere inside the vastness of the printed record, and send somebody straight to it with a link. Books would become as instantly available, searchable, copy-pasteable - as alive in the digital world - as web pages. It was to be the realization of a long-held dream. "The universal library has been talked about for millennia," Richard Ovenden, the head of Oxford's Bodleian Libraries, has said. "It was possible to think in the Renaissance that you might be able to amass the whole of published knowledge in a single room or a single institution." In the spring of 2011, it seemed we'd amassed it in a terminal small enough to fit on a desk. "This is a watershed event and can serve as a catalyst for the reinvention of education, research, and intellectual life," one eager observer wrote at the time. On March 22 of that year, however, the legal agreement that would have unlocked a century's worth of books and peppered the country with access terminals to a universal library was rejected under Rule 23(e)(2) of the Federal Rules of Civil Procedure by the U.S. District Court for the Southern District of New York. When the library at Alexandria burned it was said to be an "international catastrophe." When the most significant humanities project of our time was dismantled in court, the scholars, archivists, and librarians who'd had a hand in its undoing breathed a sigh of relief, for they believed, at the time, that they had narrowly averted disaster.
-
Torres-Salinas, D.; Gorraiz, J.; Robinson-Garcia, N.: ¬The insoluble problems of books : what does Altmetric.com have to offer? (2018)
0.04
0.037582636 = product of:
0.15033054 = sum of:
0.15033054 = weight(_text_:harvard in 633) [ClassicSimilarity], result of:
0.15033054 = score(doc=633,freq=2.0), product of:
0.46630698 = queryWeight, product of:
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.06392366 = queryNorm
0.32238537 = fieldWeight in 633, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.03125 = fieldNorm(doc=633)
0.25 = coord(1/4)
- Abstract
- Purpose: The purpose of this paper is to analyze the capabilities, functionalities and appropriateness of Altmetric.com as a data source for the bibliometric analysis of books in comparison to PlumX. Design/methodology/approach: The authors perform an exploratory analysis of the metrics the Altmetric Explorer for Institutions platform offers for books. The authors use two distinct data sets of books. On the one hand, the authors analyze the Book Collection included in Altmetric.com. On the other hand, the authors use Clarivate's Master Book List to analyze Altmetric.com's capabilities to download and merge data with external databases. Finally, the authors compare the findings with those obtained in a previous study performed in PlumX. Findings: Altmetric.com combines and orderly tracks a set of data sources linked by DOI identifiers to retrieve metadata from books, with Google Books as its main provider. It also retrieves information from commercial publishers and from some Open Access initiatives, including those led by university libraries, such as Harvard Library. We find issues with linkages between records and mentions or ISBN discrepancies. Furthermore, the authors find that automatic bots greatly affect Wikipedia mentions of books. The comparison with PlumX suggests that none of these tools provide a complete picture of the social attention generated by books and that they are complementary rather than comparable tools. Practical implications: This study targets different audiences which can benefit from the findings. First, bibliometricians and researchers who seek alternative sources to develop bibliometric analyses of books, with a special focus on the Social Sciences and Humanities fields. Second, librarians and research managers who are the main clients to which these tools are directed. Third, Altmetric.com itself as well as other altmetric providers who might get a better understanding of the limitations users encounter and improve this promising tool. Originality/value: This is the first study to analyze Altmetric.com's functionalities and capabilities for providing metric data for books and to compare results from this platform with those obtained via PlumX.
-
Larson, E.J.: ¬The myth of artificial intelligence : why computers can't think the way we do (2021)
0.04
0.037582636 = product of:
0.15033054 = sum of:
0.15033054 = weight(_text_:harvard in 2343) [ClassicSimilarity], result of:
0.15033054 = score(doc=2343,freq=2.0), product of:
0.46630698 = queryWeight, product of:
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.06392366 = queryNorm
0.32238537 = fieldWeight in 2343, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.2947483 = idf(docFreq=81, maxDocs=44421)
0.03125 = fieldNorm(doc=2343)
0.25 = coord(1/4)
- Imprint
- Cambridge, Mass. : The Belknap Press of Harvard University Press
-
Noerr, P.: ¬The Digital Library Tool Kit (2001)
0.04
0.03507823 = product of:
0.14031292 = sum of:
0.14031292 = weight(_text_:java in 774) [ClassicSimilarity], result of:
0.14031292 = score(doc=774,freq=2.0), product of:
0.45050243 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06392366 = queryNorm
0.31145877 = fieldWeight in 774, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.03125 = fieldNorm(doc=774)
0.25 = coord(1/4)
- Footnote
- This Digital Library Tool Kit was sponsored by Sun Microsystems, Inc. to address some of the leading questions that academic institutions, public libraries, government agencies, and museums face in trying to develop, manage, and distribute digital content. The evolution of Java programming, digital object standards, Internet access, electronic commerce, and digital media management models is causing educators, CIOs, and librarians to rethink many of their traditional goals and modes of operation. New audiences, continuous access to collections, and enhanced services to user communities are enabled. As one of the leading technology providers to education and library communities, Sun is pleased to present this comprehensive introduction to digital libraries
-
Herrero-Solana, V.; Moya Anegón, F. de: Graphical Table of Contents (GTOC) for library collections : the application of UDC codes for the subject maps (2003)
0.04
0.03507823 = product of:
0.14031292 = sum of:
0.14031292 = weight(_text_:java in 3758) [ClassicSimilarity], result of:
0.14031292 = score(doc=3758,freq=2.0), product of:
0.45050243 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06392366 = queryNorm
0.31145877 = fieldWeight in 3758, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.03125 = fieldNorm(doc=3758)
0.25 = coord(1/4)
- Abstract
- The representation of information contents by graphical maps is a widespread, ongoing research topic. In this paper we introduce the application of UDC codes for the development of subject maps. We use the following graphic representation methodologies: 1) Multidimensional scaling (MDS), 2) Cluster analysis, 3) Neural networks (Self-Organizing Map - SOM). Finally, we draw conclusions about the viability of applying each kind of map. 1. Introduction. Advanced techniques for Information Retrieval (IR) currently make up one of the most active areas for research in the field of library and information science. New models representing document content are replacing the classic systems in which the search terms supplied by the user were compared against the indexing terms existing in the inverted files of a database. One of the topics most often studied in recent years is bibliographic browsing, a good complement to querying strategies. Since the 80's, many authors have treated this topic. For example, Ellis establishes that browsing is based on three different types of tasks: identification, familiarization and differentiation (Ellis, 1989). On the other hand, Cove indicates three different browsing types: searching browsing, general purpose browsing and serendipity browsing (Cove, 1988). Marcia Bates presents six different types (Bates, 1989), although the classification of Bawden is the one that really interests us: 1) similarity comparison, 2) structure driven, 3) global vision (Bawden, 1993). Global vision browsing implies the use of graphic representations, which we will call map displays, that allow the user to get a global idea of the nature and structure of the information in the database. In the 90's, several authors worked on this research line, developing different types of maps. One of the most active was Xia Lin, who introduced the concept of the Graphical Table of Contents (GTOC), comparing the maps to true tables of contents based on graphic representations (Lin, 1996). Lin applied the SOM algorithm to his own personal bibliography, analyzed in terms of the words of the title and abstract fields, and represented it in a two-dimensional map (Lin, 1997). Later on, Lin applied this type of map to create GTOCs for websites, through a Java application.
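Since the abstract centers on SOM-based subject maps, the following toy Java sketch shows the core of SOM training on a handful of invented document term vectors; the grid size, learning schedule and data are illustrative assumptions, not parameters from Lin's work or from this paper.
```java
import java.util.Random;

// Toy Self-Organizing Map: document term vectors are mapped onto a small 2-D grid.
public class SomSketch {
    public static void main(String[] args) {
        int gridW = 4, gridH = 4, dims = 5, epochs = 200;
        double[][] docs = {                      // hypothetical document term vectors
            {1, 0, 0, 1, 0}, {1, 1, 0, 0, 0},
            {0, 0, 1, 1, 1}, {0, 1, 1, 0, 1}};

        Random rnd = new Random(42);
        double[][][] nodes = new double[gridW][gridH][dims];
        for (double[][] column : nodes)
            for (double[] weights : column)
                for (int d = 0; d < dims; d++) weights[d] = rnd.nextDouble();

        for (int e = 0; e < epochs; e++) {
            double lr = 0.5 * (1.0 - (double) e / epochs);            // decaying learning rate
            double radius = 2.0 * (1.0 - (double) e / epochs) + 0.5;  // shrinking neighbourhood
            for (double[] doc : docs) {
                // Find the best-matching unit (the node closest to this document).
                int bx = 0, by = 0;
                double best = Double.MAX_VALUE;
                for (int x = 0; x < gridW; x++)
                    for (int y = 0; y < gridH; y++) {
                        double dist = 0;
                        for (int d = 0; d < dims; d++)
                            dist += Math.pow(nodes[x][y][d] - doc[d], 2);
                        if (dist < best) { best = dist; bx = x; by = y; }
                    }
                // Pull the winner and its grid neighbours towards the document vector.
                for (int x = 0; x < gridW; x++)
                    for (int y = 0; y < gridH; y++) {
                        double gridDist = Math.hypot(x - bx, y - by);
                        if (gridDist > radius) continue;
                        double influence = Math.exp(-gridDist * gridDist / (2 * radius * radius));
                        for (int d = 0; d < dims; d++)
                            nodes[x][y][d] += lr * influence * (doc[d] - nodes[x][y][d]);
                    }
            }
        }
        // After training, each document's best-matching node gives its position on the map,
        // and nearby nodes can be labelled to form regions of the graphical table of contents.
        System.out.println("SOM training finished.");
    }
}
```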