Search (1405 results, page 11 of 71)

  • Filter: language_ss:"e"
  1. McMillan, G.: Electronic theses and dissertations : merging perspectives (1996) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 736) [ClassicSimilarity], result of:
          0.1716406 = score(doc=736,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 736, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=736)
      0.25 = coord(1/4)
    
    Abstract
    Reports the work of the ad hoc task force, coordinated by the Scholarly Communications Project (SCP) at Virginia Polytechnic Institute and State University, to discuss the best means of cataloguing the theses planned to be produced directly in electronic form by postgraduate students. The main goals were to determine a process for handling electronic theses so that access would be at least as good as for hard copy, and to find a way to derive cataloguing information from the electronic text and avoid rekeying as much as possible. An important part of the study was the application of existing MARC format tagged record structures to the new system. Concludes with brief notes on the concerns of UMI regarding Internet access to electronic theses.
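
    The score breakdown above is Lucene's ClassicSimilarity "explain" output for the query term "handling". As a quick illustration, here is a minimal Python sketch reproducing its TF-IDF arithmetic from the values in entry 1 (doc 736); the formulas are the standard ClassicSimilarity ones visible in the tree, the variable names are ours:

      import math

      # Components for the term "handling" in doc 736, from the explain tree above.
      doc_freq, max_docs = 227, 44421
      freq = 2.0                    # termFreq within the field
      query_norm = 0.0658165        # queryNorm, taken from the output above
      field_norm = 0.046875         # fieldNorm encoded at index time

      idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 6.272122
      tf = math.sqrt(freq)                             # 1.4142135

      query_weight = idf * query_norm                  # 0.4128091 = queryWeight
      field_weight = tf * idf * field_norm             # 0.41578686 = fieldWeight
      term_score = query_weight * field_weight         # 0.1716406

      coord = 1 / 4                                    # 1 of 4 query clauses matched
      print(round(coord * term_score, 8))              # ~0.04291015
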
  2. Decurtins, C.; Norrie, M.C.; Signer, B.: Putting the gloss on paper : a framework for cross-media annotation (2003) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 933) [ClassicSimilarity], result of:
          0.1716406 = score(doc=933,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 933, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=933)
      0.25 = coord(1/4)
    
    Abstract
    We present a general framework for cross-media annotation that can be used to support the many different forms and uses of annotation. Specifically, we discuss the need for digital annotation of printed materials and describe how various technologies for digitally augmented paper can be used in support of work practices. The state of the art in terms of both commercial and research solutions is described in some detail, with an analysis of the extent to which they can support both the writing and reading activities associated with annotation. Our framework is based on an extension of the information server that was developed within the Paper++ project to support enhanced reading. It is capable of handling both formal and informal annotation across printed and digital media, exploiting a range of technologies for information capture and display. A prototype demonstrator application for mammography is presented to illustrate both the functionality of the framework and the status of existing technologies.
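
    As a hypothetical illustration of the cross-media annotation model entry 2 describes, here is a minimal Python sketch of annotations anchored either to a region on a printed page or to a span in a digital text; all names are ours for illustration, not the Paper++ API:

      from dataclasses import dataclass
      from typing import Union

      @dataclass
      class PrintAnchor:
          page: int
          x: float
          y: float
          w: float
          h: float              # region on the printed page

      @dataclass
      class DigitalAnchor:
          uri: str
          start: int
          end: int              # character span in the digital document

      @dataclass
      class Annotation:
          anchor: Union[PrintAnchor, DigitalAnchor]
          body: str             # formal code or informal free-text gloss
          formal: bool

      # An informal note captured on a printed mammography report.
      note = Annotation(PrintAnchor(page=3, x=50.0, y=120.0, w=200.0, h=40.0),
                        body="check calcification cluster", formal=False)
      print(note)
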
  3. Hoban, M.S.: Sound recording cataloging : a practical approach (1990) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 625) [ClassicSimilarity], result of:
          0.1716406 = score(doc=625,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 625, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=625)
      0.25 = coord(1/4)
    
    Abstract
    The cataloging of music sound recordings is both challenging and interesting. As the technologies used to produce sound recordings change, these changes must be reflected in both cataloging theory and practice. Three formats, analog disc, cassette tape, and compact disc, all readily available on the market, present special challenges to catalogers, who must consider the most effective way of handling these materials following the AACR2 cataloging rules and interpretations from the Library of Congress. This paper examines the actual cataloging of those formats as done by several institutions and raises questions such as how to handle these materials in ways that eliminate redundancy and increase efficiency in the practice of cataloging. Finally, an alternative approach, drawing on AACR2 practice in other areas, is suggested.
  4. Zhan, J.; Loh, H.T.: Using latent semantic indexing to improve the accuracy of document clustering (2007) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 1264) [ClassicSimilarity], result of:
          0.1716406 = score(doc=1264,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 1264, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=1264)
      0.25 = coord(1/4)
    
    Abstract
    Document clustering is a significant research issue in information retrieval and text mining. Traditionally, most clustering methods were based on the vector space model, which has limitations such as high dimensionality and weakness in handling synonymy and polysemy. Latent semantic indexing (LSI) is able to deal with such problems to some extent. Previous studies have shown that using LSI can reduce the time needed to cluster a large document set while having little effect on clustering accuracy. However, when clustering a small document set, accuracy is of greater concern than efficiency. In this paper, we demonstrate that LSI can improve the clustering accuracy of a small document set, and we also recommend the dimensions needed to achieve the best clustering performance.
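
    A minimal sketch of the LSI-then-cluster pipeline entry 4 describes: project a TF-IDF term-document matrix onto a few latent dimensions via a truncated SVD, then cluster in that space (numpy and scikit-learn assumed; corpus and parameters are illustrative):

      import numpy as np
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.cluster import KMeans

      docs = ["lsi handles synonymy", "latent semantic indexing and synonyms",
              "kmeans clusters documents", "document clustering with kmeans"]

      X = TfidfVectorizer().fit_transform(docs).toarray()  # documents as rows
      U, s, Vt = np.linalg.svd(X, full_matrices=False)     # latent structure
      k = 2                                                # dimensions to keep
      X_lsi = U[:, :k] * s[:k]                             # docs in LSI space

      labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_lsi)
      print(labels)                                        # e.g. [0 0 1 1]
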
  5. Nicholson, S.; Smith, C.A.: Using lessons from health care to protect the privacy of library users : guidelines for the de-identification of library data based on HIPAA (2007) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 1451) [ClassicSimilarity], result of:
          0.1716406 = score(doc=1451,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 1451, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=1451)
      0.25 = coord(1/4)
    
    Abstract
    Although libraries have employed policies to protect data about the use of their services, these policies are rarely specific or standardized. Since 1996, the U.S. health care system has been grappling with the Health Insurance Portability and Accountability Act (HIPAA; Health Insurance Portability and Accountability Act, 1996), which is designed to provide those handling personal health information with standardized, definitive instructions as to the protection of data. In this work, the authors briefly discuss the present situation of privacy policies about library use data, outline the HIPAA guidelines to understand parallels between the two, and finally propose methods to create a de-identified library data warehouse based on HIPAA for the protection of user privacy.
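
    A hypothetical sketch of the kind of HIPAA-style de-identification entry 5 proposes: drop direct identifiers, generalize quasi-identifiers, and keep only a salted one-way pseudonym for linking records. All field names and rules here are illustrative, not the authors' guidelines:

      import hashlib

      SECRET_SALT = b"rotate-me"  # illustrative; manage as a real secret in practice

      def deidentify(record: dict) -> dict:
          """Return a record suitable for a de-identified library data warehouse."""
          pseudonym = hashlib.sha256(SECRET_SALT + record["patron_id"].encode()).hexdigest()
          return {
              "patron_pseudonym": pseudonym,       # links a patron's events, not identity
              "zip3": record["zip"][:3],           # generalized geography
              "age_band": (int(record["age"]) // 10) * 10,
              "item_subject": record["item_subject"],
              "checkout_month": record["checkout_date"][:7],  # YYYY-MM only
          }

      print(deidentify({"patron_id": "P123", "zip": "53706", "age": "34",
                        "item_subject": "privacy", "checkout_date": "2007-05-14"}))
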
  6. Singh, S.; Dey, L.: A rough-fuzzy document grading system for customized text information retrieval (2005) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 2007) [ClassicSimilarity], result of:
          0.1716406 = score(doc=2007,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 2007, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=2007)
      0.25 = coord(1/4)
    
    Abstract
    Due to the large repository of documents available on the web, users are usually inundated by a large volume of information, most of which is found to be irrelevant. Since user perspectives vary, a client-side text filtering system that learns the user's perspective can reduce the problem of irrelevant retrieval. In this paper, we have provided the design of a customized text information filtering system which learns user preferences and modifies the initial query to fetch better documents. It uses a rough-fuzzy reasoning scheme. The rough-set based reasoning takes care of natural language nuances, like synonym handling, very elegantly. The fuzzy decider provides qualitative grading to the documents for the user's perusal. We have provided the detailed design of the various modules and some results related to the performance analysis of the system.
  7. Seetharama, S.: Knowledge organization system over time (2006) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 2466) [ClassicSimilarity], result of:
          0.1716406 = score(doc=2466,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 2466, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=2466)
      0.25 = coord(1/4)
    
    Abstract
    Presents an overview of the concepts, techniques, and tools of knowledge organization. Knowledge Organization Systems (KOS), such as authority files, glossaries, dictionaries, subject heading lists, classification schemes, taxonomies, categorization schemes, thesauri, semantic networks, and ontologies, are involved in the organization, retrieval, and dissemination of information. Specifically, KOS are useful in user needs assessment, database creation, online systems, and OPACs, and for the generation of information services and products. Practical experience suggests that, as new information systems provide access to a wider range, quantity, and variety of information and enable the provision of diverse information services and products, KOS can perform their functions in an electronic/digital environment as efficiently and effectively as in a traditional library environment. Hence, it is necessary that information professionals and computer scientists work in an integrated manner to enhance information handling operations in electronic/digital libraries.
  8. Twidale, M.B.; Gruzd, A.A.; Nichols, D.M.: Writing in the library : exploring tighter integration of digital library use with the writing process (2008) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 3045) [ClassicSimilarity], result of:
          0.1716406 = score(doc=3045,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 3045, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=3045)
      0.25 = coord(1/4)
    
    Abstract
    Information provision via digital libraries often separates the writing process from that of information searching. In this paper we investigate the potential of a tighter integration between searching for information in digital libraries and using those results in academic writing. We consider whether it may sometimes be advantageous to encourage searching while writing instead of the more conventional approach of searching first and then writing. The provision of ambient search is explored, taking the user's ongoing writing as a source for the generation of search terms used to provide possibly useful results. A rapid prototyping approach exploiting web services was used as a way to explore the design space and to have working demonstrations that can provoke reactions, design suggestions and discussions about desirable functionalities and interfaces. This design process and some preliminary user studies are described. The results of these studies lead to a consideration of issues arising in exploring this design space, including handling irrelevant results and the particular challenges of evaluation.
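
    A minimal sketch of the ambient-search idea in entry 8: derive query terms from the user's current writing buffer and hand them to the library's search service. The stopword list and frequency scoring are our simplification, not the authors' prototype:

      import re
      from collections import Counter

      STOPWORDS = {"the", "of", "and", "a", "to", "in", "is", "for", "by",
                   "that", "with", "could", "while"}

      def ambient_query(writing_buffer: str, n_terms: int = 4) -> str:
          """Turn the most frequent content words of the draft into a query."""
          words = re.findall(r"[a-z]+", writing_buffer.lower())
          counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 3)
          return " ".join(term for term, _ in counts.most_common(n_terms))

      draft = "Digital libraries could support writing by searching while the author writes"
      print(ambient_query(draft))   # e.g. "digital libraries support writing"
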
  9. Morrison, P.J.: Tagging and searching : search retrieval effectiveness of folksonomies on the World Wide Web (2008) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 3109) [ClassicSimilarity], result of:
          0.1716406 = score(doc=3109,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 3109, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=3109)
      0.25 = coord(1/4)
    
    Abstract
    Many Web sites have begun allowing users to submit items to a collection and tag them with keywords. The folksonomies built from these tags are an interesting topic that has seen little empirical research. This study compared the information retrieval (IR) performance of folksonomies from social bookmarking Web sites against search engines and subject directories. Thirty-four participants created 103 queries for various information needs. Results from each IR system were collected, and participants judged their relevance. Folksonomy search results overlapped with those from the other systems, and documents found by both search engines and folksonomies were significantly more likely to be judged relevant than those returned by any single IR system type. The search engines in the study had the highest precision and recall, but the folksonomies fared surprisingly well. Del.icio.us was statistically indistinguishable from the directories in many cases. Overall, the directories were more precise than the folksonomies, but the two had similar recall scores. Better query handling may enhance folksonomy IR performance further. The folksonomies studied were promising, and may be able to improve Web search performance.
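
    Entry 9 ranks IR system types by precision and recall; as a reminder of the arithmetic behind such comparisons, here is a minimal sketch over one query's judged results (the document sets are invented):

      def precision_recall(retrieved: set, relevant: set) -> tuple:
          hits = len(retrieved & relevant)
          return hits / len(retrieved), hits / len(relevant)

      relevant   = {"d1", "d2", "d3", "d4"}           # judged relevant for the query
      folksonomy = {"d1", "d2", "d9"}                 # results from a tagging site
      engine     = {"d1", "d2", "d3", "d7", "d8"}     # results from a search engine

      print(precision_recall(folksonomy, relevant))   # ~(0.667, 0.5)
      print(precision_recall(engine, relevant))       # (0.6, 0.75)
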
  10. Kruk, S.R.; McDaniel, B.: Goals of semantic digital libraries (2009) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 365) [ClassicSimilarity], result of:
          0.1716406 = score(doc=365,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 365, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=365)
      0.25 = coord(1/4)
    
    Abstract
    Digital libraries have become a commodity in today's Internet world. More and more information is produced, and more and more non-digital information is being made available. The new, more user-friendly, community-oriented technologies used throughout the Internet are raising the bar of expectations. Digital libraries cannot stand still with their technologies; if only to handle the rapidly growing amount and diversity of information, they must provide a better user experience, matching the ever-growing standards set by the industry. The next generation of digital libraries combines technological solutions, such as P2P, SOA, or Grid, with recent research on semantics and social networks. These solutions are put into practice to answer the variety of requirements imposed on digital libraries.
  11. Madalli, D.P.; Prasad, A.R.D.: Analytico-synthetic approach for handling knowledge diversity in media content analysis (2011) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 827) [ClassicSimilarity], result of:
          0.1716406 = score(doc=827,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 827, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=827)
      0.25 = coord(1/4)
    
  12. Jayroe, T.J.: A humble servant : the work of Helen L. Brownson and the early years of information science research (2012) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 1458) [ClassicSimilarity], result of:
          0.1716406 = score(doc=1458,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 1458, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=1458)
      0.25 = coord(1/4)
    
    Abstract
    Helen Brownson was a federal government employee from 1942 to 1970. At a time when scientific data were becoming exceedingly hard to manage, Brownson was instrumental in coordinating national and international efforts for more efficient, cost-effective, and universal information exchange. Her most significant contributions to documentation/information science were during her years at the National Science Foundation's Office of Scientific Information. From 1951 to 1966, Brownson played a key role in identifying and subsequently distributing government funds toward projects that sought to resolve information-handling problems of the time: information access, preservation, storage, classification, and retrieval. She is credited with communicating the need for information systems and indexing mechanisms to have stricter criteria, standards, and evaluation methods; laying the foundation for present-day NSF-funded computational linguistics projects; and founding several pertinent documentation/information science publications, including the Annual Review of Information Science and Technology.
  13. Padmavathi, T.; Krishnamurthy, M.: Ontological representation of knowledge for developing information services in food science and technology (2012) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 1839) [ClassicSimilarity], result of:
          0.1716406 = score(doc=1839,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 1839, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=1839)
      0.25 = coord(1/4)
    
    Abstract
    The knowledge explosion in various fields during recent years has resulted in the creation of vast amounts of online scientific literature. Food Science & Technology (FST) is an important subject domain where rapid developments are taking place due to diverse research and development activities. As a result, information storage and retrieval has become very complex, and current information retrieval systems are being challenged in terms of both adequate precision and response time. To overcome these limitations, as well as to provide effective natural language-based retrieval, a suitable knowledge engineering framework needs to be applied to represent, share, and discover information. Semantic web technologies provide mechanisms for creating knowledge bases, ontologies, and rules for handling data that promise to improve the quality of information retrieval. Ontologies are the backbone of such knowledge systems. This paper presents a framework for the semantic representation of a large repository of content in the domain of FST.
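
    A minimal sketch of the kind of domain triples entry 13 describes, using rdflib; the namespace, classes, and property are invented for illustration, not taken from the paper's FST ontology:

      from rdflib import Graph, Literal, Namespace
      from rdflib.namespace import RDF, RDFS

      FST = Namespace("http://example.org/fst#")
      g = Graph()
      g.bind("fst", FST)

      g.add((FST.FermentedFood, RDF.type, RDFS.Class))
      g.add((FST.Yogurt, RDF.type, FST.FermentedFood))
      g.add((FST.Yogurt, FST.producedBy, FST.LacticFermentation))
      g.add((FST.Yogurt, RDFS.label, Literal("yogurt")))

      print(g.serialize(format="turtle"))   # knowledge base ready for querying
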
  14. Jeffery, K.G.; Bailo, D.: EPOS: using metadata in geoscience (2014) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 2581) [ClassicSimilarity], result of:
          0.1716406 = score(doc=2581,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 2581, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=2581)
      0.25 = coord(1/4)
    
    Abstract
    One of the key aspects of the approaching data-intensive science era is the integration of data through the interoperability of systems providing data products or visualisation and processing services. Far from being simple, interoperability requires robust and scalable e-infrastructures capable of supporting it. In this work we present the case of EPOS, a project for data integration in the field of Earth Sciences. We describe the design of its e-infrastructure and show its main characteristics. One of the main elements enabling the system to integrate data, data products, and services is the metadata catalog based on the CERIF metadata model. This model, modified to fit into the general e-infrastructure design, is part of a three-layer metadata architecture. CERIF guarantees robust handling of metadata, which is in this case the key to interoperability and to one of the features of the EPOS system: the possibility of carrying out data-intensive science by orchestrating the distributed resources made available by EPOS data providers and stakeholders.
  15. Schöneberg, U.; Sperber, W.: POS tagging and its applications for mathematics (2014) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 2748) [ClassicSimilarity], result of:
          0.1716406 = score(doc=2748,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 2748, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=2748)
      0.25 = coord(1/4)
    
    Abstract
    Content analysis of scientific publications is a nontrivial task, but a useful and important one for scientific information services. In the Gutenberg era it was a domain of human experts; in the digital age many machine-based methods, e.g., graph analysis tools and machine-learning techniques, have been developed for it. Natural Language Processing (NLP) is a powerful machine-learning approach to semiautomatic speech and language processing, which is also applicable to mathematics. The well-established methods of NLP have to be adjusted for the special needs of mathematics, in particular for handling mathematical formulae. We demonstrate a mathematics-aware part-of-speech tagger and give a short overview of our adaptation of NLP methods for mathematical publications. We show the use of the tools developed for key phrase extraction and classification in the database zbMATH.
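
    A hypothetical sketch of the adjustment entry 15 points at: shield mathematical formulae from a general-purpose tagger by replacing them with a placeholder token before POS tagging. NLTK is assumed, with its tokenizer and tagger models downloaded; the placeholder scheme is our simplification:

      import re
      import nltk  # assumes NLTK's tokenizer and POS-tagger models are installed

      FORMULA = re.compile(r"\$[^$]+\$")   # naive: inline TeX between dollar signs

      def tag_math_text(sentence: str):
          shielded = FORMULA.sub("MATHFORMULA", sentence)  # protect formulae
          tokens = nltk.word_tokenize(shielded)
          return [(tok, "MATH" if tok == "MATHFORMULA" else tag)
                  for tok, tag in nltk.pos_tag(tokens)]

      print(tag_math_text("The equation $x^2 + y^2 = r^2$ describes a circle."))
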
  16. Atherton Cochrane, P.: Knowledge space revisited : challenges for twenty-first century library and information science researchers (2013) 0.04
    0.04291015 = product of:
      0.1716406 = sum of:
        0.1716406 = weight(_text_:handling in 553) [ClassicSimilarity], result of:
          0.1716406 = score(doc=553,freq=2.0), product of:
            0.4128091 = queryWeight, product of:
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.0658165 = queryNorm
            0.41578686 = fieldWeight in 553, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.272122 = idf(docFreq=227, maxDocs=44421)
              0.046875 = fieldNorm(doc=553)
      0.25 = coord(1/4)
    
    Abstract
    This paper suggests writing a companion work to the Bourne and Hahn book, History of Online Information Services, 1963-1976 (2003), which would feature milestone improvements in subject access mechanisms developed over time. To provide a background for such a work, a 1976 paper by Meincke and Atherton is revisited wherein the concept of Knowledge Space is defined as "online mechanisms used for handling a user's knowledge level while a search was being formulated and processed." Research that followed in the 1980s and 1990s is linked together for the first time. Seven projects are suggested for current researchers to undertake so they can assess the utility of earlier research ideas that did not get a proper chance for development. It is just possible that they may have value and be found useful in today's information environment.
  17. Noerr, P.: The Digital Library Tool Kit (2001) 0.04
    0.036116935 = product of:
      0.14446774 = sum of:
        0.14446774 = weight(_text_:java in 774) [ClassicSimilarity], result of:
          0.14446774 = score(doc=774,freq=2.0), product of:
            0.46384227 = queryWeight, product of:
              7.0475073 = idf(docFreq=104, maxDocs=44421)
              0.0658165 = queryNorm
            0.31145877 = fieldWeight in 774, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.0475073 = idf(docFreq=104, maxDocs=44421)
              0.03125 = fieldNorm(doc=774)
      0.25 = coord(1/4)
    
    Footnote
    This Digital Library Tool Kit was sponsored by Sun Microsystems, Inc. to address some of the leading questions that academic institutions, public libraries, government agencies, and museums face in trying to develop, manage, and distribute digital content. The evolution of Java programming, digital object standards, Internet access, electronic commerce, and digital media management models is causing educators, CIOs, and librarians to rethink many of their traditional goals and modes of operation. New audiences, continuous access to collections, and enhanced services to user communities are enabled. As one of the leading technology providers to education and library communities, Sun is pleased to present this comprehensive introduction to digital libraries
  18. Herrero-Solana, V.; Moya Anegón, F. de: Graphical Table of Contents (GTOC) for library collections : the application of UDC codes for the subject maps (2003) 0.04
    0.036116935 = product of:
      0.14446774 = sum of:
        0.14446774 = weight(_text_:java in 3758) [ClassicSimilarity], result of:
          0.14446774 = score(doc=3758,freq=2.0), product of:
            0.46384227 = queryWeight, product of:
              7.0475073 = idf(docFreq=104, maxDocs=44421)
              0.0658165 = queryNorm
            0.31145877 = fieldWeight in 3758, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.0475073 = idf(docFreq=104, maxDocs=44421)
              0.03125 = fieldNorm(doc=3758)
      0.25 = coord(1/4)
    
    Abstract
    The representation of information content by graphical maps is a long-standing, ongoing research topic. In this paper we introduce the application of UDC codes for the development of subject maps. We use the following graphic representation methodologies: 1) multidimensional scaling (MDS), 2) cluster analysis, and 3) neural networks (self-organizing map, SOM). Finally, we draw conclusions about the viability of applying each kind of map.
    1. Introduction. Advanced techniques for information retrieval (IR) currently make up one of the most active areas of research in the field of library and information science. New models representing document content are replacing the classic systems in which the search terms supplied by the user were compared against the indexing terms existing in the inverted files of a database. One of the topics most often studied in recent years is bibliographic browsing, a good complement to querying strategies. Since the 80's, many authors have treated this topic. For example, Ellis establishes that browsing is based on three different types of tasks: identification, familiarization, and differentiation (Ellis, 1989). On the other hand, Cove indicates three different browsing types: search browsing, general-purpose browsing, and serendipity browsing (Cove, 1988). Marcia Bates presents six different types (Bates, 1989), although the classification of Bawden is the one that really interests us: 1) similarity comparison, 2) structure driven, 3) global vision (Bawden, 1993). Global vision browsing implies the use of graphic representations, which we will call map displays, that allow the user to get a global idea of the nature and structure of the information in the database. In the 90's, several authors worked on this line of research, developing different types of maps. One of the most active was Xia Lin, who introduced the concept of the Graphical Table of Contents (GTOC), comparing the maps to true tables of contents based on graphic representations (Lin, 1996). Lin applied the SOM algorithm to his own personal bibliography, analyzed in terms of the words of the title and abstract fields, and represented in a two-dimensional map (Lin, 1997). Later on, Lin applied this type of map to create website GTOCs, through a Java application.
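
    A minimal sketch of the first mapping methodology named in entry 18: multidimensional scaling of pairwise dissimilarities between UDC classes into a 2-D subject map (scikit-learn assumed; the tiny dissimilarity matrix is invented):

      import numpy as np
      from sklearn.manifold import MDS

      labels = ["UDC 004", "UDC 02", "UDC 51"]   # illustrative UDC classes
      dissim = np.array([[0.0, 0.6, 0.4],        # invented pairwise dissimilarities
                         [0.6, 0.0, 0.8],
                         [0.4, 0.8, 0.0]])

      mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
      coords = mds.fit_transform(dissim)
      for label, (x, y) in zip(labels, coords):
          print(f"{label}: ({x:.2f}, {y:.2f})")  # positions on the subject map
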
  19. Vlachidis, A.; Binding, C.; Tudhope, D.; May, K.: Excavating grey literature : a case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources (2010) 0.04
    0.036116935 = product of:
      0.14446774 = sum of:
        0.14446774 = weight(_text_:java in 935) [ClassicSimilarity], result of:
          0.14446774 = score(doc=935,freq=2.0), product of:
            0.46384227 = queryWeight, product of:
              7.0475073 = idf(docFreq=104, maxDocs=44421)
              0.0658165 = queryNorm
            0.31145877 = fieldWeight in 935, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.0475073 = idf(docFreq=104, maxDocs=44421)
              0.03125 = fieldNorm(doc=935)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique, to assist "rich" semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware "rich" indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project.
    Design/methodology/approach - The paper proposes use of the English Heritage extension (CRM-EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesaurus resources for driving and enhancing an ontology-oriented information extraction process. The process of semantic indexing is based on a rule-based information extraction technique, which is facilitated by the General Architecture for Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules.
    Findings - Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic-aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms.
    Originality/value - The value of the paper lies in the semantic indexing of 535 unpublished online documents, often referred to as "grey literature", from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and P19.Physical Object.
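
    Entry 19's extraction rules are written in JAPE over GATE annotations; as a loose Python analogue of one such rule, here is a gazetteer lookup producing CRM-EH-style typed annotations. The gazetteer entries are invented; a real pipeline would draw them from domain thesauri:

      import re

      # Tiny stand-in gazetteer; a real pipeline would load this from a thesaurus.
      TIME_APPELLATIONS = ["Bronze Age", "Iron Age", "Roman", "medieval"]

      pattern = re.compile(r"\b(" + "|".join(map(re.escape, TIME_APPELLATIONS)) + r")\b",
                           re.IGNORECASE)

      def annotate(text: str):
          """Return (type, span, surface form) tuples for matched time appellations."""
          return [("E49.Time_Appellation", m.span(), m.group())
                  for m in pattern.finditer(text)]

      print(annotate("A ditch of Roman date cut an earlier Iron Age pit."))
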
  20. Radhakrishnan, A.: Swoogle : an engine for the Semantic Web (2007) 0.04
    0.036116935 = product of:
      0.14446774 = sum of:
        0.14446774 = weight(_text_:java in 709) [ClassicSimilarity], result of:
          0.14446774 = score(doc=709,freq=2.0), product of:
            0.46384227 = queryWeight, product of:
              7.0475073 = idf(docFreq=104, maxDocs=44421)
              0.0658165 = queryNorm
            0.31145877 = fieldWeight in 709, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.0475073 = idf(docFreq=104, maxDocs=44421)
              0.03125 = fieldNorm(doc=709)
      0.25 = coord(1/4)
    
    Content
    "Swoogle, the Semantic web search engine, is a research project carried out by the ebiquity research group in the Computer Science and Electrical Engineering Department at the University of Maryland. It's an engine tailored towards finding documents on the semantic web. The whole research paper is available here. Semantic web is touted as the next generation of online content representation where the web documents are represented in a language that is not only easy for humans but is machine readable (easing the integration of data as never thought possible) as well. And the main elements of the semantic web include data model description formats such as Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, Turtle, N-Triples), and notations such as RDF Schema (RDFS), the Web Ontology Language (OWL), all of which are intended to provide a formal description of concepts, terms, and relationships within a given knowledge domain (Wikipedia). And Swoogle is an attempt to mine and index this new set of web documents. The engine performs crawling of semantic documents like most web search engines and the search is available as web service too. The engine is primarily written in Java with the PHP used for the front-end and MySQL for database. Swoogle is capable of searching over 10,000 ontologies and indexes more that 1.3 million web documents. It also computes the importance of a Semantic Web document. The techniques used for indexing are the more google-type page ranking and also mining the documents for inter-relationships that are the basis for the semantic web. For more information on how the RDF framework can be used to relate documents, read the link here. Being a research project, and with a non-commercial motive, there is not much hype around Swoogle. However, the approach to indexing of Semantic web documents is an approach that most engines will have to take at some point of time. When the Internet debuted, there were no specific engines available for indexing or searching. The Search domain only picked up as more and more content became available. One fundamental question that I've always wondered about it is - provided that the search engines return very relevant results for a query - how to ascertain that the documents are indeed the most relevant ones available. There is always an inherent delay in indexing of document. Its here that the new semantic documents search engines can close delay. Experimenting with the concept of Search in the semantic web can only bore well for the future of search technology."

Languages

  • d 32
  • m 3
  • nl 1

Types

  • a 940
  • m 323
  • el 107
  • s 104
  • i 22
  • n 17
  • r 15
  • x 14
  • b 7
  • ? 1
  • v 1
