-
Tognoli, N.B.; Chaves Guimarães, J.A.: Challenges of knowledge representation in contemporary archival science (2012)
0.05
0.04634181 = product of:
0.18536724 = sum of:
0.18536724 = weight(_text_:handle in 1860) [ClassicSimilarity], result of:
0.18536724 = score(doc=1860,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.43370473 = fieldWeight in 1860, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.046875 = fieldNorm(doc=1860)
0.25 = coord(1/4)
- Abstract
- Since its emergence as a discipline in the nineteenth century (1889), the theory and practice of Archival Science have focused on the arrangement and description of archival materials as complementary and inseparable nuclear processes that aim to classify, order, describe and give access to records. These processes have their own specific goals but share one in common: the representation of archival knowledge. In the late 1980s a paradigm shift was announced in Archival Science, especially after the appearance of new forms of document production and information technologies. The discipline was then invited to rethink its theoretical and methodological bases, founded in the nineteenth century, so that it could handle contemporary archival knowledge production, organization and representation. In this sense, the present paper aims to discuss, from a theoretical perspective, archival representation - more specifically, archival description - in the face of these changes and proposals, in order to illustrate the challenges faced by Contemporary Archival Science in a new context of production, organization and representation of archival knowledge.
-
Gödert, W.; Hubrich, J.; Nagelschmidt, M.: Semantic knowledge representation for information retrieval (2014)
0.05
0.04634181 = product of:
0.18536724 = sum of:
0.18536724 = weight(_text_:handle in 1987) [ClassicSimilarity], result of:
0.18536724 = score(doc=1987,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.43370473 = fieldWeight in 1987, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.046875 = fieldNorm(doc=1987)
0.25 = coord(1/4)
- Content
- Introduction: envisioning semantic information spaces -- Indexing and knowledge organization -- Semantic technologies for knowledge representation -- Information retrieval and knowledge exploration -- Approaches to handle heterogeneity -- Problems with establishing semantic interoperability -- Formalization in indexing languages -- Typification of semantic relations -- Inferences in retrieval processes -- Semantic interoperability and inferences -- Remaining research questions.
-
Rehurek, R.; Sojka, P.: Software framework for topic modelling with large corpora (2010)
0.05
0.04634181 = product of:
0.18536724 = sum of:
0.18536724 = weight(_text_:handle in 2058) [ClassicSimilarity], result of:
0.18536724 = score(doc=2058,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.43370473 = fieldWeight in 2058, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.046875 = fieldNorm(doc=2058)
0.25 = coord(1/4)
- Content
- For the software, see: http://radimrehurek.com/gensim/index.html. For a demo, see: http://dml.cz/handle/10338.dmlcz/100785/SimilarArticles.
-
Serpa, F.G.; Graves, A.M.; Javier, A.: Statistical common author networks (2013)
0.05
0.04634181 = product of:
0.18536724 = sum of:
0.18536724 = weight(_text_:handle in 2133) [ClassicSimilarity], result of:
0.18536724 = score(doc=2133,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.43370473 = fieldWeight in 2133, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.046875 = fieldNorm(doc=2133)
0.25 = coord(1/4)
- Abstract
- A new method for visualizing the relatedness of scientific areas has been developed that is based on measuring the overlap of researchers between areas. It is found that closely related areas have a high propensity to share a larger number of common authors. A method for comparing areas of vastly different sizes and for handling name homonymy is constructed, allowing for the robust deployment of this method on real data sets. A statistical analysis of the probability distributions of the common author overlap that accounts for noise is carried out, along with the production of network maps with weighted links proportional to the overlap strength. This is demonstrated on 2 case studies, complexity science and neutrino physics, where the level of relatedness of areas within each area is expected to vary greatly. It is found that the results returned by this method closely match the intuitive expectation that the broad, multidisciplinary area of complexity science possesses areas that are weakly related to each other, whereas the much narrower area of neutrino physics shows very strongly related areas.
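The core measurement is simple to state: two areas are related to the extent that they share authors. The following minimal sketch illustrates that idea; the normalization by the smaller author set and the toy author lists are assumptions for illustration, not the statistic or data used in the article.

```python
# A minimal sketch (not the article's exact statistic): two areas are related
# to the extent that they share authors, normalized by the smaller author set
# so that areas of very different sizes remain comparable.

def author_overlap(area_a: set, area_b: set) -> float:
    """Fraction of the smaller area's authors who also publish in the other area."""
    if not area_a or not area_b:
        return 0.0
    return len(area_a & area_b) / min(len(area_a), len(area_b))

# Toy author sets, invented for illustration.
complexity_science = {"j. doe", "a. smith", "b. lee", "c. wu"}
neutrino_physics = {"a. smith", "d. khan", "e. rossi"}

print(author_overlap(complexity_science, neutrino_physics))  # 0.333... -> weak link
```

In a full network map, each pair of areas would get a link weighted by such an overlap value, with a statistical model separating genuine overlap from noise.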
-
Colace, F.; Santo, M. de; Greco, L.; Napoletano, P.: Improving relevance feedback-based query expansion by the use of a weighted word pairs approach (2015)
0.05
0.04634181 = product of:
0.18536724 = sum of:
0.18536724 = weight(_text_:handle in 3263) [ClassicSimilarity], result of:
0.18536724 = score(doc=3263,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.43370473 = fieldWeight in 3263, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.046875 = fieldNorm(doc=3263)
0.25 = coord(1/4)
- Abstract
- In this article, the use of a new term extraction method for query expansion (QE) in text retrieval is investigated. The new method expands the initial query with a structured representation made of weighted word pairs (WWP) extracted from a set of training documents (relevance feedback). Standard text retrieval systems can handle a WWP structure through custom Boolean weighted models. We experimented with both the explicit and pseudorelevance feedback schemas and compared the proposed term extraction method with others in the literature, such as KLD and RM3. Evaluations have been conducted on a number of test collections (Text REtrieval Conference [TREC]-6, -7, -8, -9, and -10). Results demonstrated that the QE method based on this new structure outperforms the baseline.
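As a rough illustration of how a WWP structure can drive a weighted Boolean model, the sketch below expands a query into word pairs with weights and scores documents by the weights of the pairs they contain. The pairs, weights, and scoring rule are hypothetical, not the formulation evaluated in the article.

```python
# Hypothetical sketch of scoring with a weighted word pairs (WWP) structure:
# a pair contributes its weight only when both words occur in a document.
# Pairs, weights, and the scoring rule are illustrative, not the article's model.

wwp = {("solar", "panel"): 0.9, ("solar", "energy"): 0.7, ("panel", "cost"): 0.4}

def wwp_score(doc_terms: set) -> float:
    return sum(weight for (t1, t2), weight in wwp.items()
               if t1 in doc_terms and t2 in doc_terms)

docs = {
    "d1": {"solar", "panel", "installation"},
    "d2": {"solar", "energy", "policy"},
    "d3": {"wind", "turbine"},
}
print(sorted(docs, key=lambda d: wwp_score(docs[d]), reverse=True))  # ['d1', 'd2', 'd3']
```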
-
Lubetzky, S.: Development of cataloging rules (1953)
0.05
0.04634181 = product of:
0.18536724 = sum of:
0.18536724 = weight(_text_:handle in 3626) [ClassicSimilarity], result of:
0.18536724 = score(doc=3626,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.43370473 = fieldWeight in 3626, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.046875 = fieldNorm(doc=3626)
0.25 = coord(1/4)
- Content
- See: https://www.ideals.illinois.edu/bitstream/handle/2142/5511/librarytrendsv2i2c_opt.pdf.
-
Thornton, K.: Powerful structure : inspecting infrastructures of information organization in Wikimedia Foundation projects (2016)
0.05
0.04634181 = product of:
0.18536724 = sum of:
0.18536724 = weight(_text_:handle in 4288) [ClassicSimilarity], result of:
0.18536724 = score(doc=4288,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.43370473 = fieldWeight in 4288, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.046875 = fieldNorm(doc=4288)
0.25 = coord(1/4)
- Content
- See also: https://digital.lib.washington.edu/researchworks/bitstream/handle/1773/38160/Thornton_washington_0250E_16572.pdf?sequence=1&isAllowed=y.
-
Chen, H.; Chung, Y.-M.; Ramsey, M.; Yang, C.C.: ¬A smart itsy bitsy spider for the Web (1998)
0.04
0.04481125 = product of:
0.179245 = sum of:
0.179245 = weight(_text_:java in 1871) [ClassicSimilarity], result of:
0.179245 = score(doc=1871,freq=2.0), product of:
0.4604012 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06532823 = queryNorm
0.38932347 = fieldWeight in 1871, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0390625 = fieldNorm(doc=1871)
0.25 = coord(1/4)
- Abstract
- As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent agent approach to Web searching. In this experiment, we developed 2 Web personal spiders based on best first search and genetic algorithm techniques, respectively. These personal spiders can dynamically take a user's selected starting homepages and search for the most closely related homepages in the Web, based on the links and keyword indexing. A graphical, dynamic, Java-based interface was developed and is available for Web access. A system architecture for implementing such an agent-spider is presented, followed by detailed discussions of benchmark testing and user evaluation results. In benchmark testing, although the genetic algorithm spider did not outperform the best first search spider, we found both results to be comparable and complementary. In user evaluation, the genetic algorithm spider obtained significantly higher recall value than that of the best first search spider. However, their precision values were not statistically different. The mutation process introduced in genetic algorithms allows users to find other potentially relevant homepages that cannot be explored via a conventional local search process. In addition, we found the Java-based interface to be a necessary component for design of a truly interactive and dynamic Web agent.
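The best first search strategy mentioned above can be pictured as a crawler whose frontier is a priority queue ordered by estimated relevance. This is only a schematic sketch: relevance() and fetch_links() are placeholder callables, and the paper's spiders score pages using link structure and keyword indexing rather than the stubs shown here.

```python
# Schematic sketch of a best first search spider: the frontier is a priority
# queue ordered by estimated relevance, and the most promising page is always
# expanded next. relevance() and fetch_links() are placeholder callables.

import heapq

def best_first_spider(start_urls, relevance, fetch_links, max_pages=50):
    frontier = [(-relevance(u), u) for u in start_urls]  # negate for a max-heap
    heapq.heapify(frontier)
    visited, results = set(), []
    while frontier and len(results) < max_pages:
        neg_score, url = heapq.heappop(frontier)
        if url in visited:
            continue
        visited.add(url)
        results.append((url, -neg_score))
        for link in fetch_links(url):        # expand only the best page so far
            if link not in visited:
                heapq.heappush(frontier, (-relevance(link), link))
    return results
```

A genetic algorithm spider would instead maintain a population of candidate pages and apply crossover and mutation to it, which is what lets it escape purely local link neighbourhoods.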
-
Chen, C.: CiteSpace II : detecting and visualizing emerging trends and transient patterns in scientific literature (2006)
0.04
0.04481125 = product of:
0.179245 = sum of:
0.179245 = weight(_text_:java in 272) [ClassicSimilarity], result of:
0.179245 = score(doc=272,freq=2.0), product of:
0.4604012 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06532823 = queryNorm
0.38932347 = fieldWeight in 272, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0390625 = fieldNorm(doc=272)
0.25 = coord(1/4)
- Abstract
- This article describes the latest development of a generic approach to detecting and visualizing emerging trends and transient patterns in scientific literature. The work makes substantial theoretical and methodological contributions to progressive knowledge domain visualization. A specialty is conceptualized and visualized as a time-variant duality between two fundamental concepts in information science: research fronts and intellectual bases. A research front is defined as an emergent and transient grouping of concepts and underlying research issues. The intellectual base of a research front is its citation and co-citation footprint in scientific literature - an evolving network of scientific publications cited by research-front concepts. Kleinberg's (2002) burst-detection algorithm is adapted to identify emergent research-front concepts. Freeman's (1979) betweenness centrality metric is used to highlight potential pivotal points of paradigm shift over time. Two complementary visualization views are designed and implemented: cluster views and time-zone views. The contributions of the approach are that (a) the nature of an intellectual base is algorithmically and temporally identified by emergent research-front terms, (b) the value of a co-citation cluster is explicitly interpreted in terms of research-front concepts, and (c) visually prominent and algorithmically detected pivotal points substantially reduce the complexity of a visualized network. The modeling and visualization process is implemented in CiteSpace II, a Java application, and applied to the analysis of two research fields: mass extinction (1981-2004) and terrorism (1990-2003). Prominent trends and pivotal points in visualized networks were verified in collaboration with domain experts, who are the authors of pivotal-point articles. Practical implications of the work are discussed. A number of challenges and opportunities for future studies are identified.
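To make the pivotal-point idea concrete, the sketch below computes betweenness centrality on a toy co-citation graph with networkx; nodes bridging two clusters stand out. CiteSpace II itself is a Java application, and the edges here are invented purely for demonstration.

```python
# Toy illustration of the pivotal-point idea with networkx (CiteSpace II itself
# is a Java application): in a co-citation network, nodes bridging two clusters
# receive high betweenness centrality. The edges below are invented.

import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("p1", "p2"), ("p2", "p3"), ("p1", "p3"),   # cluster A
    ("p4", "p5"), ("p5", "p6"), ("p4", "p6"),   # cluster B
    ("p3", "p4"),                               # bridge between the clusters
])

centrality = nx.betweenness_centrality(G)
pivot = max(centrality, key=centrality.get)
print(pivot, round(centrality[pivot], 3))        # p3 (or p4): the bridging paper
```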
-
Eddings, J.: How the Internet works (1994)
0.04
0.04481125 = product of:
0.179245 = sum of:
0.179245 = weight(_text_:java in 2514) [ClassicSimilarity], result of:
0.179245 = score(doc=2514,freq=2.0), product of:
0.4604012 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06532823 = queryNorm
0.38932347 = fieldWeight in 2514, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0390625 = fieldNorm(doc=2514)
0.25 = coord(1/4)
- Abstract
- How the Internet Works promises "an exciting visual journey down the highways and byways of the Internet," and it delivers. The book's high quality graphics and simple, succinct text make it the ideal book for beginners; however, it still has much to offer for Net vets. This book is jam-packed with cool ways to visualize how the Net works. The first section visually explores how TCP/IP, Winsock, and other Net connectivity mysteries work. This section also helps you understand how e-mail addresses and domains work, what file types mean, and how information travels across the Net. Part 2 unravels the Net's underlying architecture, including good information on how routers work and what is meant by client/server architecture. The third section covers your own connection to the Net through an Internet Service Provider (ISP), and how ISDN, cable modems, and Web TV work. Part 4 discusses e-mail, spam, newsgroups, Internet Relay Chat (IRC), and Net phone calls. In part 5, you'll find out how other Net tools, such as gopher, telnet, WAIS, and FTP, can enhance your Net experience. The sixth section takes on the World Wide Web, including everything from how HTML works to image maps and forms. Part 7 looks at other Web features such as push technology, Java, ActiveX, and CGI scripting, while part 8 deals with multimedia on the Net. Part 9 shows you what intranets are and covers groupware, and shopping and searching the Net. The book wraps up with part 10, a chapter on Net security that covers firewalls, viruses, cookies, and other Web tracking devices, plus cryptography and parental controls.
-
Wu, D.; Shi, J.: Classical music recording ontology used in a library catalog (2016)
0.04
0.04481125 = product of:
0.179245 = sum of:
0.179245 = weight(_text_:java in 4179) [ClassicSimilarity], result of:
0.179245 = score(doc=4179,freq=2.0), product of:
0.4604012 = queryWeight, product of:
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.06532823 = queryNorm
0.38932347 = fieldWeight in 4179, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
7.0475073 = idf(docFreq=104, maxDocs=44421)
0.0390625 = fieldNorm(doc=4179)
0.25 = coord(1/4)
- Abstract
- In order to improve the organization of classical music information resources, we constructed a classical music recording ontology, on top of which we then designed an online classical music catalog. Our construction of the classical music recording ontology consisted of three steps: identifying the purpose, analyzing the ontology, and encoding the ontology. We identified the main classes and properties of the domain by investigating classical music recording resources and users' information needs. We implemented the ontology in the Web Ontology Language (OWL) using five steps: transforming the properties, encoding the transformed properties, defining ranges of the properties, constructing individuals, and standardizing the ontology. In constructing the online catalog, we first designed the structure and functions of the catalog based on investigations into users' information needs and information-seeking behaviors. Then we extracted classes and properties of the ontology using the Apache Jena application programming interface (API), and constructed a catalog in the Java environment. The catalog provides a hierarchical main page (built using the Functional Requirements for Bibliographic Records (FRBR) model), a classical music information network and integrated information service; this combination of features greatly eases the task of finding classical music recordings and more information about classical music.
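The catalog described above extracts ontology classes and properties through the Apache Jena API in Java. As a rough Python analogue, the sketch below lists classes and properties from an OWL file with rdflib; the file name and the choice of rdflib are assumptions, not part of the authors' implementation.

```python
# Rough Python analogue of the extraction step (the authors used the Apache
# Jena API in Java): list OWL classes and properties with rdflib. The file
# name "classical_music.owl" is hypothetical; RDF/XML serialization is assumed.

from rdflib import Graph, OWL, RDF

g = Graph()
g.parse("classical_music.owl", format="xml")

classes = list(g.subjects(RDF.type, OWL.Class))
properties = list(g.subjects(RDF.type, OWL.ObjectProperty)) + \
             list(g.subjects(RDF.type, OWL.DatatypeProperty))

print(len(classes), "classes,", len(properties), "properties")
```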
-
Keane, D.: ¬The information behaviour of senior executives (1999)
0.04
0.04369148 = product of:
0.17476591 = sum of:
0.17476591 = weight(_text_:handle in 1278) [ClassicSimilarity], result of:
0.17476591 = score(doc=1278,freq=4.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.40890077 = fieldWeight in 1278, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.03125 = fieldNorm(doc=1278)
0.25 = coord(1/4)
- Abstract
- For senior executives, the ability to work with large quantities of information - sorting the wheat from the chaff - has long been recognised as a key determinant of achievement. What an executive believes to be important information can have a significant influence on what they think and how they think about it. Senior executives, because of their critical leadership role, are challenged in their daily lives to develop effective ways of acquiring, using and sharing important information. Some executives are undoubtedly better than others in how they handle such information and there is a high level of interest in identifying those information behavior characteristics that lead to executive excellence (Davenport & Prusak, 1998). Because of their position within organizations, CEOs - those senior executives who have overall responsibility for the management of the organization or business unit - are particularly concerned with enhancing their information behavior. CEOs have the task of managing the organization so that it achieves its strategic goals and objectives. And a critical part of this task is becoming highly effective in managing a wide range of information and in developing skills of influence and decision making. It is therefore important for us to understand how senior executives handle information on a day-to-day basis. What information do they consider important? And why? Several studies have sought to address these questions with varying degrees of success. Some have set out to better understand what type of information senior executives need (McLeod & Jones, 1987) while other studies have attempted to provide a comprehensive theoretical base for executive work (Mintzberg, 1968; 1973; 1975). Yet other work has tried to devise various tools and methodologies for eliciting the unique information requirements of individual executives (Rockart, 1979).
-
Taniguchi, S.: ¬A system for analyzing cataloguing rules : a feasibility study (1996)
0.04
0.038618177 = product of:
0.15447271 = sum of:
0.15447271 = weight(_text_:handle in 4266) [ClassicSimilarity], result of:
0.15447271 = score(doc=4266,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.36142063 = fieldWeight in 4266, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.0390625 = fieldNorm(doc=4266)
0.25 = coord(1/4)
- Abstract
- The quality control of cataloging standards is as important as the quality control of bibliographic records. In order to aid the quality control of cataloging standards, a prototype system to analyze the ambiguity and complexity of cataloging rules was developed. Before developing the system, a standard rule unit was defined and a simple, function-like format was devised to indicate the syntactic structure of each unit rule. The AACR2 chapter 1 rules were then manually transformed into this function-like, unit rule format. The system reads the manually transformed unit rules and puts them into their basic forms based on their syntactic components. The system then applies rule-templates, which are skeletal schemata for specific types of cataloging rules, to the converted rules. As a result of this rule-template application, the internal structure of each unit rule is determined. The system is also used to explore inter-rule relationships. That is, the system determines whether two rules have an exclusive, parallel, complementary, or non-relationship. These relationships are based on the analysis of the structural parts described above in terms of the given rule-template. To assist in this process, the system applies external knowledge represented in the same fashion as the rule units themselves. Although the prototype system can handle only a restricted range of rules, the proposed approach is positively validated and shown to be useful. However, it is possibly impractical to build a complete rule-analyzing system of this type at this stage.
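A hypothetical sketch of what a "function-like" unit-rule representation might look like is given below: each unit rule is reduced to a small predicate-argument structure and matched against a rule-template that describes its required parts. The field names and template are invented for illustration and are not those of the prototype system.

```python
# Invented sketch of a "function-like" unit-rule representation: each unit rule
# is reduced to a small predicate-argument structure, and a rule-template checks
# which parts are present. Field names and the template are not the prototype's.

from dataclasses import dataclass
from typing import Optional

@dataclass
class UnitRule:
    action: str                      # e.g. "record", "omit", "transcribe"
    element: str                     # the bibliographic element governed
    condition: Optional[str] = None  # optional applicability condition

CONDITIONAL_TEMPLATE = {"action", "element", "condition"}

def matches(rule: UnitRule, template: set) -> bool:
    present = {name for name in ("action", "element", "condition")
               if getattr(rule, name) is not None}
    return template <= present

rule = UnitRule("record", "title proper", condition="if there is a collective title")
print(matches(rule, CONDITIONAL_TEMPLATE))  # True -> classified as a conditional rule
```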
-
Rijsbergen, C.J. van; Lalmas, M.: Information calculus for information retrieval (1996)
0.04
0.038618177 = product of:
0.15447271 = sum of:
0.15447271 = weight(_text_:handle in 4269) [ClassicSimilarity], result of:
0.15447271 = score(doc=4269,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.36142063 = fieldWeight in 4269, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.0390625 = fieldNorm(doc=4269)
0.25 = coord(1/4)
- Abstract
- Information is and always has been an elusive concept; nevertheless many philosophers, mathematicians, logicians and computer scientists have felt that it is fundamental. Many attempts have been made to come up with some sensible and intuitively acceptable definition of information; up to now, none of these have succeeded. This work is based on the approach followed by Dretske, Barwise, and Devlin, who claimed that the notion of information starts from the position that, given an ontology of objects individuated by a cognitive agent, it makes sense to speak of the information an object (e.g., a text, an image, a video) contains about another object (e.g., the query). This phenomenon is captured by the flow of information between objects. Its exploitation is the task of an information retrieval system. These authors propose a theory of information that provides an analysis of the concept of information (any type, from any media) and the manner in which intelligent organisms (referred to as cognitive agents) handle and respond to the information picked up from their environment. They defined the nature of information flow and the mechanisms that give rise to such a flow. The theory, which is based on Situation Theory, is expressed with a calculus defined on channels. The calculus was defined so that it satisfies properties that are attributed to information and its flows. This paper demonstrates the connection between this calculus and information retrieval, and proposes a model of an information retrieval system based on this calculus.
-
Carpineto, C.; Romano, G.: Order-theoretical ranking (2000)
0.04
0.038618177 = product of:
0.15447271 = sum of:
0.15447271 = weight(_text_:handle in 5766) [ClassicSimilarity], result of:
0.15447271 = score(doc=5766,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.36142063 = fieldWeight in 5766, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.0390625 = fieldNorm(doc=5766)
0.25 = coord(1/4)
- Abstract
- Current best-match ranking (BMR) systems perform well but cannot handle word mismatch between a query and a document. The best known alternative ranking method, hierarchical clustering-based ranking (HCR), seems to be more robust than BMR with respect to this problem, but it is hampered by theoretical and practical limitations. We present an approach to document ranking that explicitly addresses the word mismatch problem by exploiting interdocument similarity information in a novel way. Document ranking is seen as a query-document transformation driven by a conceptual representation of the whole document collection, into which the query is merged. Our approach is based on the theory of concept (or Galois) lattices, which, we argue, provides a powerful, well-founded, and computationally tractable framework to model the space in which documents and query are represented and to compute such a transformation. We compared information retrieval using concept lattice-based ranking (CLR) to BMR and HCR. The results showed that HCR was outperformed by CLR as well as BMR, and suggested that, of the two best methods, BMR achieved better performance than CLR on the whole document set, whereas CLR compared more favorably when only the first retrieved documents were used for evaluation. We also evaluated the three methods' specific ability to rank documents that did not match the query, in which case the superiority of CLR over BMR and HCR was apparent.
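The Galois machinery behind CLR can be illustrated with the two derivation operators over a term-document incidence, as in the sketch below; the closure of a query's terms yields the concept into which the query is merged. The toy collection is invented, and the actual ranking by lattice distance is only hinted at here.

```python
# Minimal sketch of the Galois (concept lattice) derivation operators over a
# term-document incidence; the closure of a query's terms is the concept into
# which the query is merged. Toy data; the lattice-distance ranking is omitted.

docs = {
    "d1": {"galois", "lattice", "ranking"},
    "d2": {"lattice", "clustering"},
    "d3": {"ranking", "retrieval"},
}

def extent(terms):      # documents containing all of the given terms
    return {d for d, ts in docs.items() if terms <= ts}

def intent(doc_set):    # terms shared by all of the given documents
    if not doc_set:
        return set().union(*docs.values())
    return set.intersection(*(docs[d] for d in doc_set))

query = {"ranking"}
print(extent(query), intent(extent(query)))   # {'d1', 'd3'} {'ranking'}
```

Ranking a document by its distance from the query's concept in the resulting lattice is what lets documents that share no query word still be retrieved through intermediate concepts.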
-
Gatzemeier, F.H.: Patterns, schemata, and types : author support through formalized experience (2000)
0.04
0.038618177 = product of:
0.15447271 = sum of:
0.15447271 = weight(_text_:handle in 6069) [ClassicSimilarity], result of:
0.15447271 = score(doc=6069,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.36142063 = fieldWeight in 6069, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.0390625 = fieldNorm(doc=6069)
0.25 = coord(1/4)
- Abstract
- Conceptual authoring support provides tools to help authors construct and organize their document on the conceptual level. As computer-based tools are purely formal entities, they cannot handle natural language itself. Instead, they provide the author with directions and examples that (if adopted) remain linked to the text. This paper discusses several levels of such directions: A Pattern describes a solution for a common problem, here a combination of audience and topic. It may point to several Schemata, which may be expanded in the document structure graph, leaving the author with more specific graph structures to expand and text gaps to fill in. A Type Definition is finally a restriction on the possible document structures the author is allowed to build. Several examples of such patterns, schemata and types are presented. These levels of support are being implemented in an authoring support environment called CHASID. It extends conventional authoring applications, currently ToolBook. The graph transformation aspects are implemented as an executable PROGRES specification
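As a hypothetical illustration of schema expansion, the sketch below expands a schema node in a document structure graph into child nodes, each carrying a text gap still to be written. The schema and graph representation are invented for illustration; CHASID expresses such expansions as PROGRES graph transformations over ToolBook documents.

```python
# Invented illustration of schema expansion in a document structure graph:
# expanding a schema node adds its prescribed children, each with an empty text
# gap to fill. CHASID itself expresses this as PROGRES graph transformations.

SCHEMATA = {
    "problem-solution": ["problem statement", "proposed solution", "evaluation"],
}

def expand(graph: dict, node: str, schema: str) -> None:
    children = [f"{node}/{part}" for part in SCHEMATA[schema]]
    graph[node] = children
    for child in children:
        graph.setdefault(child, [])   # a leaf whose text gap is still to write

doc_graph = {"introduction": []}
expand(doc_graph, "introduction", "problem-solution")
print(doc_graph)
```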
-
Drabenstott, K.M.; Weller, M.S.: Handling spelling errors in online catalog searches (1996)
0.04
0.038618177 = product of:
0.15447271 = sum of:
0.15447271 = weight(_text_:handle in 6973) [ClassicSimilarity], result of:
0.15447271 = score(doc=6973,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.36142063 = fieldWeight in 6973, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.0390625 = fieldNorm(doc=6973)
0.25 = coord(1/4)
- Abstract
- Reports results of 2 separate but related projects to study the influence of spelling errors (misspellings), made by searchers, on the subject searching of online catalogues and to suggest ways of improving error detection systems to handle the errors that they detect. This involved the categorization of user queries for subjects that were extracted from the online catalogue transaction logs of 4 USA university libraries. The research questions considered: the prevalence of misspellings in user queries for subjects; and how users respond to online catalogues that detect possible spelling errors in their subject queries. Less than 6% of user queries that match the catalogue's controlled and free text terms were found to contain spelling errors. While the majority of users corrected misspelled query words, a sizable proportion took an action that was even more detrimental than the original misspelling. Concludes with 3 recommended improvements: online catalogues should be equipped with search trees to place the burden of selecting a subject on the system instead of the user; systems should be equipped with automatic spelling checking routines that inform users of possibly misspelled words; and online catalogues should be enhanced with tools and techniques to distinguish between queries that fail due to misspellings and correction failures. Cautions that, while spelling is not a pervasive problem, it can seriously hinder even the most routine subject search.
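A minimal sketch of the kind of automatic spelling-checking routine recommended above is shown below: query words that match no vocabulary term are flagged and close matches are suggested. It relies on difflib from the Python standard library and is illustrative only, not the detection logic studied in the article.

```python
# Illustrative spell-check routine of the kind recommended above (not the
# detection logic studied in the article): query words absent from the
# catalogue vocabulary are flagged and close matches are suggested via difflib.

import difflib

vocabulary = {"psychology", "philosophy", "physiology", "photography"}

def check_query(query: str) -> dict:
    report = {}
    for word in query.lower().split():
        if word not in vocabulary:
            report[word] = difflib.get_close_matches(word, vocabulary, n=3)
    return report

print(check_query("psycholgy photografy"))
# suggests 'psychology' for 'psycholgy' and 'photography' for 'photografy'
```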
-
Sharretts, C.W.; Shieh, J.; French, J.C.: Electronic theses and dissertations at the University of Virginia (1999)
0.04
0.038618177 = product of:
0.15447271 = sum of:
0.15447271 = weight(_text_:handle in 702) [ClassicSimilarity], result of:
0.15447271 = score(doc=702,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.36142063 = fieldWeight in 702, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.0390625 = fieldNorm(doc=702)
0.25 = coord(1/4)
- Abstract
- Although technology has made life easier in many ways, one constant complaint has been the time it takes to learn it. This is why simplicity was the main concern of the University of Virginia (UVa) when implementing the Electronic Theses and Dissertations (ETD). ETD are not a new concept. The uniqueness of the Virginia ETD lies in the fact that the whole process was assimilated through the technical skills and intellectual efforts of faculty and students. The ETD creates no extra network load and is fully automatic from the submission of data, to the conversion into MARC and subsequent loading into the Library's online catalog, VIRGO. This paper describes the trajectory of an ETD upon submission. The system is designed to be easy and self-explanatory. Submission instructions guide the student step by step. Screen messages, such as errors, are generated automatically when appropriate, while e-mail messages, regarding the status of the process, are automatically posted to students, advisors, catalogers, and school officials. The paradigms and methodologies will help to push forward the ETD project at the University. Planned enhancements are: Indexing the data for searching and retrieval using Dienst for Web interface, to synchronize the searching experience in both VIRGO and the Web; Securing the authorship of the data; Automating the upload and indexing bibliographic data in VIRGO; Employing Uniform Resource Names (URN) using the Corporation for National Research Initiatives (CNRI) Handle architecture scheme; Adding Standard Generalized Markup Language (SGML) to the list of formats acceptable for archiving ETD
-
McIlwaine, I.C.: ¬The Universal Decimal Classification : a guide to its use (2000)
0.04
0.038618177 = product of:
0.15447271 = sum of:
0.15447271 = weight(_text_:handle in 1161) [ClassicSimilarity], result of:
0.15447271 = score(doc=1161,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.36142063 = fieldWeight in 1161, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.0390625 = fieldNorm(doc=1161)
0.25 = coord(1/4)
- Abstract
- This book is an extension and total revision of the author's earlier Guide to the use of UDC. The original was written in 1993 and in the intervening years much has happened with the classification. In particular, a much more rigorous approach has been undertaken in revision to ensure that the scheme is able to handle the requirements of a networked world. The book outlines the history and development of the Universal Decimal Classification, provides practical hints on its application and works through all the auxiliary and main tables highlighting aspects that need to be noted in applying the scheme. It also provides guidance on the use of the Master Reference File and discusses the ways in which the classification is used in the 21st century and its suitability as an aid to subject description in tagging metadata and consequently for application on the Internet. It is intended as a source for information about the scheme, for practical usage by classifiers in their daily work and as a guide to the student learning how to apply the classification. It is amply provided with examples to illustrate the many ways in which the scheme can be applied and will be a useful source for a wide range of information workers
-
Chinenyanga, T.T.; Kushmerick, N.: ¬An expressive and efficient language for XML information retrieval (2002)
0.04
0.038618177 = product of:
0.15447271 = sum of:
0.15447271 = weight(_text_:handle in 1462) [ClassicSimilarity], result of:
0.15447271 = score(doc=1462,freq=2.0), product of:
0.42740422 = queryWeight, product of:
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.06532823 = queryNorm
0.36142063 = fieldWeight in 1462, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.5424123 = idf(docFreq=173, maxDocs=44421)
0.0390625 = fieldNorm(doc=1462)
0.25 = coord(1/4)
- Abstract
- Several languages for querying and transforming XML, including XML-QL, Quilt, and XQL, have been proposed. However, these languages do not support ranked queries based on textual similarity, in the spirit of traditional IR. Several extensions to these XML query languages to support keyword search have been made, but the resulting languages cannot express IR-style queries such as "find books and CDs with similar titles." In some of these languages keywords are used merely as boolean filters without support for true ranked retrieval; others permit similarity calculations only between a data value and a constant, and thus cannot express the above query. WHIRL avoids both problems, but assumes relational data. We propose ELIXIR, an expressive and efficient language for XML information retrieval that extends XML-QL with a textual similarity operator that can be used for similarity joins, so ELIXIR is sufficiently expressive to handle the sample query above. ELIXIR thus qualifies as a general-purpose XML IR query language. Our central contribution is an efficient algorithm for answering ELIXIR queries that rewrites the original ELIXIR query into a series of XML-QL queries to generate intermediate relational data, and uses WHIRL to efficiently evaluate the similarity operators on this intermediate data, yielding an XML document with nodes ranked by similarity. Our experiments demonstrate that our prototype scales well with the size of the query and the XML data.
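The sample query "find books and CDs with similar titles" is essentially a similarity join. The sketch below approximates that semantics by ranking book-CD title pairs with TF-IDF cosine similarity via scikit-learn; it illustrates the ranked-retrieval idea only, whereas ELIXIR rewrites such queries into XML-QL and evaluates the similarity operator with WHIRL. The titles are invented.

```python
# Sketch of the similarity-join semantics behind "find books and CDs with
# similar titles": every book-CD title pair is ranked by TF-IDF cosine
# similarity (via scikit-learn). ELIXIR itself rewrites such queries into
# XML-QL and evaluates the similarity operator with WHIRL; titles are invented.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

books = ["The Art of Fugue", "A Brief History of Time"]
cds = ["Art of Fugue, BWV 1080", "Time Out"]

vec = TfidfVectorizer().fit(books + cds)
sims = cosine_similarity(vec.transform(books), vec.transform(cds))

pairs = sorted(((sims[i, j], b, c) for i, b in enumerate(books)
                for j, c in enumerate(cds)), reverse=True)
for score, book, cd in pairs:
    print(f"{score:.2f}  {book}  <->  {cd}")
```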