The IUP Journal of Computer Sciences
Processing the Textual Information Using Open Natural Language Processing

Article Details
Pub. Date : Oct, 2019
Product Name : The IUP Journal of Computer Sciences
Product Type : Article
Product Code : IJCS11910
Author Name : A Muthusamy
Subject/Domain : Management
No. of Pages : 18



Natural Language Processing (NLP) is a component of Artificial Intelligence (AI) that applies computational techniques to the analysis and synthesis of standard English. It draws on disciplines such as computer science and computational linguistics to process and analyze large amounts of natural language data and to identify the phrases in a text that refer to specific types of entities and relations. The problem identified in the N-gram approach is that weight is balanced between infrequent and frequent grams, and that the approach is efficient only for small amounts of textual data. As a result, it is difficult to discover the named entities present in a corpus. The main aim of this paper is to overcome these limitations of the N-gram approach and to find the desired entities using the Open NLP toolkit. Because information extracted from the web is unstructured, the extracted information is stored in XML format, which provides an easy way of querying and processing it. This approach is a promising way to process text efficiently. To enhance the effectiveness of text mining, the researcher focuses on the NLP task of discovering knowledge from large numbers of unstructured text documents, which form the largest available source of knowledge. The experiments and results of this paper, with an accuracy of 0.95, indicate that the confidence level is better than that of the N-gram approach.
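The idea of storing extracted entities in XML so that they can be queried easily can be sketched as follows. This is only an illustration using Python's standard library, not the paper's Open NLP pipeline: the element names (`entities`, `entity`), the `type` attribute, the helper functions, and the sample entities are all hypothetical and chosen for the example.

```python
import xml.etree.ElementTree as ET


def entities_to_xml(entities):
    """Build an XML tree from (text, entity_type) pairs.

    The <entities>/<entity type="..."> layout is an assumption made
    for this sketch; the paper does not specify its XML schema.
    """
    root = ET.Element("entities")
    for text, entity_type in entities:
        node = ET.SubElement(root, "entity", attrib={"type": entity_type})
        node.text = text
    return root


def find_by_type(root, entity_type):
    """Return the text of every entity with the given type attribute."""
    return [node.text for node in root.findall(f"./entity[@type='{entity_type}']")]


# Hypothetical output of a named-entity extraction step.
sample_entities = [
    ("Alice Smith", "person"),
    ("Paris", "location"),
    ("Acme Corp", "organization"),
    ("Berlin", "location"),
]

root = entities_to_xml(sample_entities)
xml_text = ET.tostring(root, encoding="unicode")  # serialized XML string
locations = find_by_type(root, "location")
print(xml_text)
print(locations)
```

Once the entities are in a tree like this, selecting all entities of one type is a single path query, which is the kind of querying convenience the abstract attributes to the XML format.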


Text mining (Patel and Soni, 2012), also referred to as text data mining, is becoming an important research area. It is a method of obtaining high-quality information from text. Text mining usually involves structuring the input text (parsing), deriving patterns within the structured data, and finally evaluating and interpreting the output (Sagayam et al., 2012), as shown in Figure 1. The text mining flow takes documents as input and then extracts the words from each web document. Text mining can process unstructured documents (Patel and Soni, 2012), typically very large collections of thousands or millions of documents, deduce their meaning, repeatedly recognize and extract their concepts and the associations among those concepts, and respond directly to a query. Text mining is different from web search (Michael, 2004).
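The three stages described above (structuring the text, deriving patterns, interpreting the output) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method evaluated in the paper: the tokenizer is a simple regular expression, the "patterns" are plain N-gram counts, and the function names and sample documents are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Structuring step: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_patterns(documents, n=2, k=3):
    """Pattern step: count N-grams over all documents, keep the k most frequent."""
    counts = Counter()
    for doc in documents:
        counts.update(ngrams(tokenize(doc), n))
    return counts.most_common(k)


# Hypothetical input documents.
docs = [
    "Text mining extracts patterns from text.",
    "Text mining differs from web search.",
]

# Interpretation step: inspect the most frequent bigram.
best = top_patterns(docs, n=2, k=1)
print(best)
```

Raw counts like these treat every N-gram by frequency alone, which gives a feel for why a purely N-gram based approach has trouble singling out named entities in a larger corpus.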


Information extraction, Computing, Retrieval, Natural Language Processing (NLP), Open NLP
