-
Mills, T.; Moody, K.; Rodden, K.: Providing world wide access to historical sources (1997)
0.05
0.05121166 = product of:
0.20484664 = sum of:
0.20484664 = weight(_text_:java in 3697) [ClassicSimilarity], result of:
0.20484664 = score(doc=3697,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.46718815 = fieldWeight in 3697, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.046875 = fieldNorm(doc=3697)
0.25 = coord(1/4)
- Abstract
- A unique collection of historical material covering the lives and events of an English village between 1400 and 1750 has been made available via a WWW-enabled information retrieval system. Since the expected readership of the documents ranges from school children to experienced researchers, providing this information in an easily accessible form has offered many challenges requiring tools to aid searching and browsing. The file structure of the document collection was replaced by a database, enabling query results to be presented on the fly. A Java interface displays each user's context in a form that allows for easy and intuitive relevance feedback
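The score breakdown printed with each record follows Lucene's ClassicSimilarity (TF-IDF) scheme: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), and the final score is queryWeight * fieldWeight * coord. The following is a minimal sketch that recomputes the first record's score from the numbers in the listing. The function name and parameter names are hypothetical, and queryNorm and fieldNorm are simply copied from the output rather than derived.

```python
import math


def classic_similarity_score(freq, doc_freq, max_docs, query_norm, field_norm, coord):
    """Recompute one term's contribution as laid out in the explain tree.

    query_norm and field_norm are taken as given from the listing (assumption:
    we do not rederive them from the query or the field length).
    """
    tf = math.sqrt(freq)                               # tf(freq) = sqrt(termFreq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))    # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                    # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm               # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight * coord         # coord = matched clauses / total clauses


# Values copied from the first record (doc=3697, term "java").
java_score = classic_similarity_score(
    freq=2.0, doc_freq=104, max_docs=44421,
    query_norm=0.062215917, field_norm=0.046875, coord=0.25,
)
print(java_score)
```

The printed value should land close to the 0.05121166 shown for the record; small differences in the last digits are expected because Lucene computes these factors in single precision. Records further down the list share the same tf, idf, and queryNorm for "java" and differ only in fieldNorm, which is why their scores fall in steps.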
-
Maarek, Y.S.: WebCutter : a system for dynamic and tailorable site mapping (1997)
0.05
0.05121166 = product of:
0.20484664 = sum of:
0.20484664 = weight(_text_:java in 3739) [ClassicSimilarity], result of:
0.20484664 = score(doc=3739,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.46718815 = fieldWeight in 3739, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.046875 = fieldNorm(doc=3739)
0.25 = coord(1/4)
- Abstract
- Presents an approach that integrates searching and browsing in a manner that improves both paradigms. When browsing is the primary task, it enables semantic content-based tailoring of Web maps in both the generation and the visualization phases. When search is the primary task, it enables contextualization of the results by augmenting them with the documents' neighbourhoods. This approach is embodied in WebCutter, a client-server system fully integrated with Web software. WebCutter consists of a map generator running off a standard Web server and a map visualization client implemented as a Java applet runnable from any standard Web browser and requiring no installation or external plug-in application. WebCutter is in beta stage and is in the process of being integrated into the Lotus Domino application product line
-
Pan, B.; Gay, G.; Saylor, J.; Hembrooke, H.: One digital library, two undergraduate classes, and four learning modules : uses of a digital library in classrooms (2006)
0.05
0.05121166 = product of:
0.20484664 = sum of:
0.20484664 = weight(_text_:java in 907) [ClassicSimilarity], result of:
0.20484664 = score(doc=907,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.46718815 = fieldWeight in 907, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.046875 = fieldNorm(doc=907)
0.25 = coord(1/4)
- Abstract
- The KMODDL (kinematic models for design digital library) is a digital library based on a historical collection of kinematic models made of steel and bronze. The digital library contains four types of learning modules including textual materials, QuickTime virtual reality movies, Java simulations, and stereolithographic files of the physical models. The authors report an evaluation study on the uses of the KMODDL in two undergraduate classes. This research reveals that the users in different classes encountered different usability problems, and reported quantitatively different subjective experiences. Further, the results indicate that depending on the subject area, the two user groups preferred different types of learning modules, resulting in different uses of the available materials and different learning outcomes. These findings are discussed in terms of their implications for future digital library design.
-
Mongin, L.; Fu, Y.Y.; Mostafa, J.: Open Archives data Service prototype and automated subject indexing using D-Lib archive content as a testbed (2003)
0.05
0.05121166 = product of:
0.20484664 = sum of:
0.20484664 = weight(_text_:java in 2167) [ClassicSimilarity], result of:
0.20484664 = score(doc=2167,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.46718815 = fieldWeight in 2167, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.046875 = fieldNorm(doc=2167)
0.25 = coord(1/4)
- Abstract
- The Indiana University School of Library and Information Science opened a new research laboratory in January 2003: the Indiana University School of Library and Information Science Information Processing Laboratory [IU IP Lab]. The purpose of the new laboratory is to facilitate collaboration between scientists in the department in the areas of information retrieval (IR) and information visualization (IV) research. The lab has several areas of focus. These include grid and cluster computing, and a standard Java-based software platform to support plug and play research datasets, a selection of standard IR modules and standard IV algorithms. Future development includes software to enable researchers to contribute datasets, IR algorithms, and visualization algorithms into the standard environment. We decided early on to use OAI-PMH as a resource discovery tool because it is consistent with our mission.
-
Song, R.; Luo, Z.; Nie, J.-Y.; Yu, Y.; Hon, H.-W.: Identification of ambiguous queries in web search (2009)
0.05
0.05121166 = product of:
0.20484664 = sum of:
0.20484664 = weight(_text_:java in 3441) [ClassicSimilarity], result of:
0.20484664 = score(doc=3441,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.46718815 = fieldWeight in 3441, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.046875 = fieldNorm(doc=3441)
0.25 = coord(1/4)
- Abstract
- It is widely believed that many queries submitted to search engines are inherently ambiguous (e.g., java and apple). However, few studies have tried to classify queries based on ambiguity and to answer "what the proportion of ambiguous queries is". This paper deals with these issues. First, we clarify the definition of ambiguous queries by constructing the taxonomy of queries from being ambiguous to specific. Second, we ask human annotators to manually classify queries. From manually labeled results, we observe that query ambiguity is to some extent predictable. Third, we propose a supervised learning approach to automatically identify ambiguous queries. Experimental results show that we can correctly identify 87% of labeled queries with the approach. Finally, by using our approach, we estimate that about 16% of queries in a real search log are ambiguous.
-
Croft, W.B.; Metzler, D.; Strohman, T.: Search engines : information retrieval in practice (2010)
0.05
0.05121166 = product of:
0.20484664 = sum of:
0.20484664 = weight(_text_:java in 3605) [ClassicSimilarity], result of:
0.20484664 = score(doc=3605,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.46718815 = fieldWeight in 3605, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.046875 = fieldNorm(doc=3605)
0.25 = coord(1/4)
- Abstract
- For introductory information retrieval courses at the undergraduate and graduate level in computer science, information science and computer engineering departments. Written by a leader in the field of information retrieval, Search Engines: Information Retrieval in Practice is designed to give undergraduate students the understanding and tools they need to evaluate, compare and modify search engines. Coverage of the underlying IR and mathematical models reinforces key concepts. The book's numerous programming exercises make extensive use of Galago, a Java-based open source search engine. SUPPLEMENTS / Extensive lecture slides (in PDF and PPT format) / Solutions to selected end of chapter problems (Instructors only) / Test collections for exercises / Galago search engine
-
Tang, X.-B.; Wei Wei, G,-C.L.; Zhu, J.: ¬An inference model of medical insurance fraud detection : based on ontology and SWRL (2017)
0.05
0.05121166 = product of:
0.20484664 = sum of:
0.20484664 = weight(_text_:java in 4615) [ClassicSimilarity], result of:
0.20484664 = score(doc=4615,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.46718815 = fieldWeight in 4615, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.046875 = fieldNorm(doc=4615)
0.25 = coord(1/4)
- Abstract
- Medical insurance fraud is common in many countries' medical insurance systems and represents a serious threat to the insurance funds and the benefits of patients. In this paper, we present an inference model of medical insurance fraud detection, based on a medical detection domain ontology that incorporates the knowledge base provided by the Medical Terminology, NKIMed, and Chinese Library Classification systems. Through analyzing the behaviors of irregular and fraudulent medical services, we defined the scope of the medical domain ontology relevant to the task and built the ontology about medical sciences and medical service behaviors. The ontology then utilizes Semantic Web Rule Language (SWRL) and Java Expert System Shell (JESS) to detect medical irregularities and mine implicit knowledge. The system can be used to improve the management of medical insurance risks.
-
Khoo, S.G.; Na, J.-C.: Semantic relations in information science (2006)
0.05
0.048727002 = product of:
0.19490801 = sum of:
0.19490801 = weight(_text_:herrmann in 2978) [ClassicSimilarity], result of:
0.19490801 = score(doc=2978,freq=4.0), product of:
0.50862175 = queryWeight, product of:
8.175107 = idf(docFreq=33, maxDocs=44421)
0.062215917 = queryNorm
0.38320816 = fieldWeight in 2978, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
8.175107 = idf(docFreq=33, maxDocs=44421)
0.0234375 = fieldNorm(doc=2978)
0.25 = coord(1/4)
- Abstract
- This chapter examines the nature of semantic relations and their main applications in information science. The nature and types of semantic relations are discussed from the perspectives of linguistics and psychology. An overview of the semantic relations used in knowledge structures such as thesauri and ontologies is provided, as well as the main techniques used in the automatic extraction of semantic relations from text. The chapter then reviews the use of semantic relations in information extraction, information retrieval, question-answering, and automatic text summarization applications. Concepts and relations are the foundation of knowledge and thought. When we look at the world, we perceive not a mass of colors but objects to which we automatically assign category labels. Our perceptual system automatically segments the world into concepts and categories. Concepts are the building blocks of knowledge; relations act as the cement that links concepts into knowledge structures. We spend much of our lives identifying regular associations and relations between objects, events, and processes so that the world has an understandable structure and predictability. Our lives and work depend on the accuracy and richness of this knowledge structure and its web of relations. Relations are needed for reasoning and inferencing. Chaffin and Herrmann (1988b, p. 290) noted that "relations between ideas have long been viewed as basic to thought, language, comprehension, and memory." Aristotle's Metaphysics (Aristotle, 1961; McKeon, expounded on several types of relations. The majority of the 30 entries in a section of the Metaphysics known today as the Philosophical Lexicon referred to relations and attributes, including cause, part-whole, same and opposite, quality (i.e., attribute) and kind-of, and defined different types of each relation. 
Hume (1955) pointed out that there is a connection between successive ideas in our minds, even in our dreams, and that the introduction of an idea in our mind automatically recalls an associated idea. He argued that all the objects of human reasoning are divided into relations of ideas and matters of fact and that factual reasoning is founded on the cause-effect relation. His Treatise of Human Nature identified seven kinds of relations: resemblance, identity, relations of time and place, proportion in quantity or number, degrees in quality, contrariety, and causation. Mill (1974, pp. 989-1004) discoursed on several types of relations, claiming that all things are either feelings, substances, or attributes, and that attributes can be a quality (which belongs to one object) or a relation to other objects.
Linguists in the structuralist tradition (e.g., Lyons, 1977; Saussure, 1959) have asserted that concepts cannot be defined on their own but only in relation to other concepts. Semantic relations appear to reflect a logical structure in the fundamental nature of thought (Caplan & Herrmann, 1993). Green, Bean, and Myaeng (2002) noted that semantic relations play a critical role in how we represent knowledge psychologically, linguistically, and computationally, and that many systems of knowledge representation start with a basic distinction between entities and relations. Green (2001, p. 3) said that "relationships are involved as we combine simple entities to form more complex entities, as we compare entities, as we group entities, as one entity performs a process on another entity, and so forth. Indeed, many things that we might initially regard as basic and elemental are revealed upon further examination to involve internal structure, or in other words, internal relationships." Concepts and relations are often expressed in language and text. Language is used not just for communicating concepts and relations, but also for representing, storing, and reasoning with concepts and relations. We shall examine the nature of semantic relations from a linguistic and psychological perspective, with an emphasis on relations expressed in text. The usefulness of semantic relations in information science, especially in ontology construction, information extraction, information retrieval, question-answering, and text summarization is discussed. Research and development in information science have focused on concepts and terms, but the focus will increasingly shift to the identification, processing, and management of relations to achieve greater effectiveness and refinement in information science techniques. 
Previous chapters in ARIST on natural language processing (Chowdhury, 2003), text mining (Trybula, 1999), information retrieval and the philosophy of language (Blair, 2003), and query expansion (Efthimiadis, 1996) provide a background for this discussion, as semantic relations are an important part of these applications.
-
Chen, H.; Chung, Y.-M.; Ramsey, M.; Yang, C.C.: ¬A smart itsy bitsy spider for the Web (1998)
0.04
0.042676385 = product of:
0.17070554 = sum of:
0.17070554 = weight(_text_:java in 1871) [ClassicSimilarity], result of:
0.17070554 = score(doc=1871,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.38932347 = fieldWeight in 1871, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0390625 = fieldNorm(doc=1871)
0.25 = coord(1/4)
- Abstract
- As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent agent approach to Web searching. In this experiment, we developed 2 Web personal spiders based on best first search and genetic algorithm techniques, respectively. These personal spiders can dynamically take a user's selected starting homepages and search for the most closely related homepages in the Web, based on the links and keyword indexing. A graphical, dynamic, Java-based interface was developed and is available for Web access. A system architecture for implementing such an agent-spider is presented, followed by detailed discussions of benchmark testing and user evaluation results. In benchmark testing, although the genetic algorithm spider did not outperform the best first search spider, we found both results to be comparable and complementary. In user evaluation, the genetic algorithm spider obtained significantly higher recall value than that of the best first search spider. However, their precision values were not statistically different. The mutation process introduced in genetic algorithms allows users to find other potential relevant homepages that cannot be explored via a conventional local search process. In addition, we found the Java-based interface to be a necessary component for design of a truly interactive and dynamic Web agent
-
Chen, C.: CiteSpace II : detecting and visualizing emerging trends and transient patterns in scientific literature (2006)
0.04
0.042676385 = product of:
0.17070554 = sum of:
0.17070554 = weight(_text_:java in 272) [ClassicSimilarity], result of:
0.17070554 = score(doc=272,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.38932347 = fieldWeight in 272, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0390625 = fieldNorm(doc=272)
0.25 = coord(1/4)
- Abstract
- This article describes the latest development of a generic approach to detecting and visualizing emerging trends and transient patterns in scientific literature. The work makes substantial theoretical and methodological contributions to progressive knowledge domain visualization. A specialty is conceptualized and visualized as a time-variant duality between two fundamental concepts in information science: research fronts and intellectual bases. A research front is defined as an emergent and transient grouping of concepts and underlying research issues. The intellectual base of a research front is its citation and co-citation footprint in scientific literature - an evolving network of scientific publications cited by research-front concepts. Kleinberg's (2002) burst-detection algorithm is adapted to identify emergent research-front concepts. Freeman's (1979) betweenness centrality metric is used to highlight potential pivotal points of paradigm shift over time. Two complementary visualization views are designed and implemented: cluster views and time-zone views. The contributions of the approach are that (a) the nature of an intellectual base is algorithmically and temporally identified by emergent research-front terms, (b) the value of a co-citation cluster is explicitly interpreted in terms of research-front concepts, and (c) visually prominent and algorithmically detected pivotal points substantially reduce the complexity of a visualized network. The modeling and visualization process is implemented in CiteSpace II, a Java application, and applied to the analysis of two research fields: mass extinction (1981-2004) and terrorism (1990-2003). Prominent trends and pivotal points in visualized networks were verified in collaboration with domain experts, who are the authors of pivotal-point articles. Practical implications of the work are discussed. A number of challenges and opportunities for future studies are identified.
-
Eddings, J.: How the Internet works (1994)
0.04
0.042676385 = product of:
0.17070554 = sum of:
0.17070554 = weight(_text_:java in 2514) [ClassicSimilarity], result of:
0.17070554 = score(doc=2514,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.38932347 = fieldWeight in 2514, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0390625 = fieldNorm(doc=2514)
0.25 = coord(1/4)
- Abstract
- How the Internet Works promises "an exciting visual journey down the highways and byways of the Internet," and it delivers. The book's high quality graphics and simple, succinct text make it the ideal book for beginners; however it still has much to offer for Net vets. This book is jam-packed with cool ways to visualize how the Net works. The first section visually explores how TCP/IP, Winsock, and other Net connectivity mysteries work. This section also helps you understand how e-mail addresses and domains work, what file types mean, and how information travels across the Net. Part 2 unravels the Net's underlying architecture, including good information on how routers work and what is meant by client/server architecture. The third section covers your own connection to the Net through an Internet Service Provider (ISP), and how ISDN, cable modems, and Web TV work. Part 4 discusses e-mail, spam, newsgroups, Internet Relay Chat (IRC), and Net phone calls. In part 5, you'll find out how other Net tools, such as gopher, telnet, WAIS, and FTP, can enhance your Net experience. The sixth section takes on the World Wide Web, including everything from how HTML works to image maps and forms. Part 7 looks at other Web features such as push technology, Java, ActiveX, and CGI scripting, while part 8 deals with multimedia on the Net. Part 9 shows you what intranets are and covers groupware, and shopping and searching the Net. The book wraps up with part 10, a chapter on Net security that covers firewalls, viruses, cookies, and other Web tracking devices, plus cryptography and parental controls.
-
Wu, D.; Shi, J.: Classical music recording ontology used in a library catalog (2016)
0.04
0.042676385 = product of:
0.17070554 = sum of:
0.17070554 = weight(_text_:java in 4179) [ClassicSimilarity], result of:
0.17070554 = score(doc=4179,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.38932347 = fieldWeight in 4179, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0390625 = fieldNorm(doc=4179)
0.25 = coord(1/4)
- Abstract
- In order to improve the organization of classical music information resources, we constructed a classical music recording ontology, on top of which we then designed an online classical music catalog. Our construction of the classical music recording ontology consisted of three steps: identifying the purpose, analyzing the ontology, and encoding the ontology. We identified the main classes and properties of the domain by investigating classical music recording resources and users' information needs. We implemented the ontology in the Web Ontology Language (OWL) using five steps: transforming the properties, encoding the transformed properties, defining ranges of the properties, constructing individuals, and standardizing the ontology. In constructing the online catalog, we first designed the structure and functions of the catalog based on investigations into users' information needs and information-seeking behaviors. Then we extracted classes and properties of the ontology using the Apache Jena application programming interface (API), and constructed a catalog in the Java environment. The catalog provides a hierarchical main page (built using the Functional Requirements for Bibliographic Records (FRBR) model), a classical music information network and integrated information service; this combination of features greatly eases the task of finding classical music recordings and more information about classical music.
-
Noerr, P.: ¬The Digital Library Tool Kit (2001)
0.03
0.03414111 = product of:
0.13656443 = sum of:
0.13656443 = weight(_text_:java in 774) [ClassicSimilarity], result of:
0.13656443 = score(doc=774,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.31145877 = fieldWeight in 774, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.03125 = fieldNorm(doc=774)
0.25 = coord(1/4)
- Footnote
- This Digital Library Tool Kit was sponsored by Sun Microsystems, Inc. to address some of the leading questions that academic institutions, public libraries, government agencies, and museums face in trying to develop, manage, and distribute digital content. The evolution of Java programming, digital object standards, Internet access, electronic commerce, and digital media management models is causing educators, CIOs, and librarians to rethink many of their traditional goals and modes of operation. New audiences, continuous access to collections, and enhanced services to user communities are enabled. As one of the leading technology providers to education and library communities, Sun is pleased to present this comprehensive introduction to digital libraries
-
Herrero-Solana, V.; Moya Anegón, F. de: Graphical Table of Contents (GTOC) for library collections : the application of UDC codes for the subject maps (2003)
0.03
0.03414111 = product of:
0.13656443 = sum of:
0.13656443 = weight(_text_:java in 3758) [ClassicSimilarity], result of:
0.13656443 = score(doc=3758,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.31145877 = fieldWeight in 3758, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.03125 = fieldNorm(doc=3758)
0.25 = coord(1/4)
- Abstract
- The representation of information contents by graphical maps is an extended ongoing research topic. In this paper we introduce the application of UDC codes for the subject maps development. We use the following graphic representation methodologies: 1) Multidimensional scaling (MDS), 2) Cluster analysis, 3) Neural networks (Self Organizing Map - SOM). Finally, we conclude about the application viability of every kind of map. 1. Introduction Advanced techniques for Information Retrieval (IR) currently make up one of the most active areas for research in the field of library and information science. New models representing document content are replacing the classic systems in which the search terms supplied by the user were compared against the indexing terms existing in the inverted files of a database. One of the topics most often studied in the last years is bibliographic browsing, a good complement to querying strategies. Since the 80's, many authors have treated this topic. For example, Ellis establishes that browsing is based on three different types of tasks: identification, familiarization and differentiation (Ellis, 1989). On the other hand, Cove indicates three different browsing types: searching browsing, general purpose browsing and serendipity browsing (Cove, 1988). Marcia Bates presents six different types (Bates, 1989), although the classification of Bawden is the one that really interests us: 1) similarity comparison, 2) structure driven, 3) global vision (Bawden, 1993). The global vision browsing implies the use of graphic representations, which we will call map displays, that allow the user to get a global idea of the nature and structure of the information in the database. In the 90's, several authors worked on this research line, developing different types of maps.
One of the most active was Xia Lin, who introduced the concept of Graphical Table of Contents (GTOC), comparing the maps to true tables of contents based on graphic representations (Lin 1996). Lin applies the SOM algorithm to his own personal bibliography, analyzed as a function of the words of the title and abstract fields, and represented in a two-dimensional map (Lin 1997). Later on, Lin applied this type of map to create GTOCs for websites, through a Java application.
-
Vlachidis, A.; Binding, C.; Tudhope, D.; May, K.: Excavating grey literature : a case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources (2010)
0.03
0.03414111 = product of:
0.13656443 = sum of:
0.13656443 = weight(_text_:java in 935) [ClassicSimilarity], result of:
0.13656443 = score(doc=935,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.31145877 = fieldWeight in 935, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.03125 = fieldNorm(doc=935)
0.25 = coord(1/4)
- Abstract
- Purpose - This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique to assist "rich" semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware "rich" indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project. Design/methodology/approach - The paper proposes use of the English Heritage extension (CRM-EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesauri resources for driving and enhancing an Ontology-Oriented Information Extraction process. The process of semantic indexing is based on a rule-based Information Extraction technique, which is facilitated by the General Architecture for Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules. Findings - Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic-aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms. Originality/value - The value of the paper lies in the semantic indexing of 535 unpublished online documents often referred to as "Grey Literature", from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and P19.Physical Object.
-
Radhakrishnan, A.: Swoogle : an engine for the Semantic Web (2007)
0.03
0.03414111 = product of:
0.13656443 = sum of:
0.13656443 = weight(_text_:java in 709) [ClassicSimilarity], result of:
0.13656443 = score(doc=709,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.31145877 = fieldWeight in 709, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.03125 = fieldNorm(doc=709)
0.25 = coord(1/4)
- Content
- "Swoogle, the Semantic web search engine, is a research project carried out by the ebiquity research group in the Computer Science and Electrical Engineering Department at the University of Maryland. It's an engine tailored towards finding documents on the semantic web. The whole research paper is available here. Semantic web is touted as the next generation of online content representation where the web documents are represented in a language that is not only easy for humans but is machine readable (easing the integration of data as never thought possible) as well. And the main elements of the semantic web include data model description formats such as Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, Turtle, N-Triples), and notations such as RDF Schema (RDFS), the Web Ontology Language (OWL), all of which are intended to provide a formal description of concepts, terms, and relationships within a given knowledge domain (Wikipedia). And Swoogle is an attempt to mine and index this new set of web documents. The engine performs crawling of semantic documents like most web search engines and the search is available as web service too. The engine is primarily written in Java, with PHP used for the front-end and MySQL for the database. Swoogle is capable of searching over 10,000 ontologies and indexes more than 1.3 million web documents. It also computes the importance of a Semantic Web document. The techniques used for indexing are the more google-type page ranking and also mining the documents for inter-relationships that are the basis for the semantic web. For more information on how the RDF framework can be used to relate documents, read the link here. Being a research project, and with a non-commercial motive, there is not much hype around Swoogle. However, the approach to indexing of Semantic web documents is an approach that most engines will have to take at some point of time.
When the Internet debuted, there were no specific engines available for indexing or searching. The Search domain only picked up as more and more content became available. One fundamental question that I've always wondered about it is - provided that the search engines return very relevant results for a query - how to ascertain that the documents are indeed the most relevant ones available. There is always an inherent delay in indexing of document. Its here that the new semantic documents search engines can close delay. Experimenting with the concept of Search in the semantic web can only bore well for the future of search technology."
-
Piros, A.: Automatic interpretation of complex UDC numbers : towards support for library systems (2015)
0.03
0.03414111 = product of:
0.13656443 = sum of:
0.13656443 = weight(_text_:java in 3301) [ClassicSimilarity], result of:
0.13656443 = score(doc=3301,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.31145877 = fieldWeight in 3301, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.03125 = fieldNorm(doc=3301)
0.25 = coord(1/4)
- Abstract
- Analytico-synthetic and faceted classifications, such as Universal Decimal Classification (UDC), express the content of documents with complex, pre-combined classification codes. Without classification authority control that would help manage and access structured notations, the use of UDC codes in searching and browsing is limited. Existing UDC parsing solutions are usually created for a particular database system or a specific task and are not widely applicable. The approach described in this paper provides a solution by which the analysis and interpretation of UDC notations is stored in an intermediate format (in this case, XML) by automatic means without any data or information loss. Due to its richness, the output file can be converted into different formats, such as standard mark-up and data exchange formats or simple lists of the recommended entry points of a UDC number. The program can also be used to create authority records containing complex UDC numbers which can be comprehensively analysed in order to be retrieved effectively. The Java program, as well as the corresponding schema definition it employs, is under continuous development. The current version of the interpreter software is now available online for testing purposes at the following web site: http://interpreter-eto.rhcloud.com. The future plan is to implement conversion methods for standard formats and to create standard online interfaces in order to make it possible to use the features of the software as a service. This would allow the algorithm to be employed in both existing and future library systems to analyse UDC numbers without any significant programming effort.
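The idea of breaking a complex UDC number into parts and storing the result as XML can be illustrated with a minimal sketch. This is not the interpreter described in the abstract (which is a Java program with its own schema); the token grammar, the function name `udc_to_xml`, and the XML element names are all assumptions made for illustration, and only a small part of UDC syntax is covered.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical token grammar: covers only a small subset of UDC syntax.
TOKEN = re.compile(r'''
    (?P<time_auxiliary>"[^"]*")      # quoted auxiliary
  | (?P<paren_auxiliary>\([^)]*\))   # parenthesised auxiliary
  | (?P<connector>::|[+/:])          # connecting signs
  | (?P<number>\d+(?:\.\d+)*)        # main number with optional dots
''', re.VERBOSE)

def udc_to_xml(notation):
    """Split a UDC notation into tokens and wrap each one in an XML element."""
    root = ET.Element("udc", notation=notation)
    pos = 0
    while pos < len(notation):
        match = TOKEN.match(notation, pos)
        if match is None:
            # Anything outside the assumed subset is rejected, not guessed at.
            raise ValueError(f"unsupported UDC syntax at offset {pos}")
        # lastgroup is the name of the alternative that matched.
        ET.SubElement(root, match.lastgroup).text = match.group()
        pos = match.end()
    return root

if __name__ == "__main__":
    # Illustrative notation only; no meaning is claimed for the codes.
    tree = udc_to_xml('622+669(485)"19"')
    print(ET.tostring(tree, encoding="unicode"))
```

Because every token is kept verbatim in document order, the original notation can be rebuilt by concatenating the element texts, which is one simple way to satisfy the "no information loss" requirement.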
-
Rosenfeld, L.; Morville, P.: Information architecture for the World Wide Web : designing large-scale Web sites (1998)
0.03
0.029873468 = product of:
0.11949387 = sum of:
0.11949387 = weight(_text_:java in 1493) [ClassicSimilarity], result of:
0.11949387 = score(doc=1493,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.2725264 = fieldWeight in 1493, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.02734375 = fieldNorm(doc=1493)
0.25 = coord(1/4)
- Abstract
- Some web sites "work" and some don't. Good web site consultants know that you can't just jump in and start writing HTML, the same way you can't build a house by just pouring a foundation and putting up some walls. You need to know who will be using the site, and what they'll be using it for. You need some idea of what you'd like to draw their attention to during their visit. Overall, you need a strong, cohesive vision for the site that makes it both distinctive and usable. Information Architecture for the World Wide Web is about applying the principles of architecture and library science to web site design. Each web site is like a public building, available for tourists and regulars alike to breeze through at their leisure. The job of the architect is to set up the framework for the site to make it comfortable and inviting for people to visit, relax in, and perhaps even return to someday. Most books on web development concentrate either on the aesthetics or the mechanics of the site. This book is about the framework that holds the two together. With this book, you learn how to design web sites and intranets that support growth, management, and ease of use. Special attention is given to: * The process behind architecting a large, complex site * Web site hierarchy design and organization. Information Architecture for the World Wide Web is for webmasters, designers, and anyone else involved in building a web site. It's for novice web designers who, from the start, want to avoid the traps that result in poorly designed sites. It's for experienced web designers who have already created sites but realize that something "is missing" from their sites and want to improve them. It's for programmers and administrators who are comfortable with HTML, CGI, and Java but want to understand how to organize their web pages into a cohesive site. The authors are two of the principals of Argus Associates, a web consulting firm. 
At Argus, they have created information architectures for web sites and intranets of some of the largest companies in the United States, including Chrysler Corporation, Barron's, and Dow Chemical.
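The score breakdown shown for this record can be reproduced with a short calculation. The sketch below assumes the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function name and parameter names are illustrative, and the input values are copied from the listing above.

```python
import math

def classic_similarity_score(freq, doc_freq, max_docs, field_norm, query_norm, coord):
    """Recompute a single-term score from the factors shown in the explain tree."""
    tf = math.sqrt(freq)                             # 1.4142135 for freq=2.0
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 7.0475073 for docFreq=104
    query_weight = idf * query_norm                  # "queryWeight" line
    field_weight = tf * idf * field_norm             # "fieldWeight" line
    # coord scales the sum by the fraction of query clauses that matched.
    return query_weight * field_weight * coord

score = classic_similarity_score(
    freq=2.0,
    doc_freq=104,
    max_docs=44421,
    field_norm=0.02734375,   # fieldNorm(doc=1493)
    query_norm=0.062215917,
    coord=0.25,              # coord(1/4)
)

if __name__ == "__main__":
    # Should agree with the listed 0.029873468 up to floating-point rounding.
    print(f"{score:.6f}")
```

Since tf, idf, queryNorm, and coord are identical across these records, the differences in rank come entirely from fieldNorm, which is smaller for longer fields.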
-
Tennant, R.: Library catalogs : the wrong solution (2003)
0.03
0.02560583 = product of:
0.10242332 = sum of:
0.10242332 = weight(_text_:java in 2558) [ClassicSimilarity], result of:
0.10242332 = score(doc=2558,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.23359407 = fieldWeight in 2558, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0234375 = fieldNorm(doc=2558)
0.25 = coord(1/4)
- Content
- - User Interface hostility - Recently I used the Library catalogs of two public libraries, new products from two major library vendors. A link on one catalog said "Knowledge Portal," whatever that was supposed to mean. Clicking on it brought you to two choices: Z39.50 Bibliographic Sites and the World Wide Web. No public library user will have the faintest clue what Z39.50 is. The other catalog launched a Java applet that before long froze my web browser so badly I was forced to shut the program down. Pick a popular book and pretend you are a library patron. Choose three to five libraries at random from the lib web-cats site (pick catalogs that are not using your system) and attempt to find your book. Try as much as possible to see the system through the eyes of your patrons: a teenager, a retiree, or an older faculty member. You may not always like what you see. Now go back to your own system and try the same thing. - What should the public see? - Our users deserve an information system that helps them find all different kinds of resources (books, articles, web pages, working papers in institutional repositories) and gives them the tools to focus in on what they want. This is not, and should not be, the library catalog. It must communicate with the catalog, but it will also need to interface with other information systems, such as vendor databases and web search engines. What will such a tool look like? We are seeing the beginnings of such a tool in the current offerings of cross-database search tools from a few vendors (see "Cross-Database Search," LJ 10/15/01, p. 29ff). We are in the early stages of developing the kind of robust, user-friendly tool that will be required before we can pull our catalogs from public view. Meanwhile, we can begin by making what we have easier to understand and use."
-
OWLED 2009; OWL: Experiences and Directions, Sixth International Workshop, Chantilly, Virginia, USA, 23-24 October 2009, Co-located with ISWC 2009. (2009)
0.03
0.02560583 = product of:
0.10242332 = sum of:
0.10242332 = weight(_text_:java in 378) [ClassicSimilarity], result of:
0.10242332 = score(doc=378,freq=2.0), product of:
0.43846712 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.062215917 = queryNorm
0.23359407 = fieldWeight in 378, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0234375 = fieldNorm(doc=378)
0.25 = coord(1/4)
- Content
- Long Papers * Suggestions for OWL 3, Pascal Hitzler. * BestMap: Context-Aware SKOS Vocabulary Mappings in OWL 2, Rinke Hoekstra. * Mechanisms for Importing Modules, Bijan Parsia, Ulrike Sattler and Thomas Schneider. * A Syntax for Rules in OWL 2, Birte Glimm, Matthew Horridge, Bijan Parsia and Peter Patel-Schneider. * PelletSpatial: A Hybrid RCC-8 and RDF/OWL Reasoning and Query Engine, Markus Stocker and Evren Sirin. * The OWL API: A Java API for Working with OWL 2 Ontologies, Matthew Horridge and Sean Bechhofer. * From Justifications to Proofs for Entailments in OWL, Matthew Horridge, Bijan Parsia and Ulrike Sattler. * A Solution for the Man-Man Problem in the Family History Knowledge Base, Dmitry Tsarkov, Ulrike Sattler and Robert Stevens. * Towards Integrity Constraints in OWL, Evren Sirin and Jiao Tao. * Processing OWL2 ontologies using Thea: An application of logic programming, Vangelis Vassiliadis, Jan Wielemaker and Chris Mungall. * Reasoning in Metamodeling Enabled Ontologies, Nophadol Jekjantuk, Gerd Gröner and Jeff Z. Pan.