COMPARING THE USE OF FULL TEXT SEARCH BETWEEN A CONVENTIONAL IR SYSTEM AND A DBMS
Last modified: 2018-04-27
Abstract
This project analyzes how full-text indexing behaves in use when implemented in a traditional Information Retrieval System (IRS), compared with a Database Management System (DBMS). The experiments used PostgreSQL 9.4 as the DBMS and, as the IRS, the Terrier IR Platform v. 3.6, developed and maintained by the School of Computing Science at the University of Glasgow. The corpus consists of 1,260 scientific articles in PDF, which were first converted to plain text with Apache Tika so that both tools worked with the same textual content. The objective is to compare the advantages and disadvantages of the two platforms, which share the goal of supporting searches for information contained in collections of text documents. The work is motivated by the scarcity of similar studies: a search of the CAPES Portal, using terms in both Portuguese and English, found no related work in any of its scientific databases. It is hoped that this work will help researchers and users of these tools decide which platform best meets their information retrieval needs.