Multisource keyword extraction and graph construction for privacy preservation



Nethravathi, N.P. and Desai, V.J. and Aishwarya, R. and Mahesh, R.B. and Venugopal, K.R. and Indiramma, M. (2017) Multisource keyword extraction and graph construction for privacy preservation. In: 5th International Conference on Information and Education Technology, ICIET 2017, 10 January 2017 through 12 January 2017, Tokyo.

nethravathi2017.pdf - Published Version
Restricted to Registered users only



Privacy preservation is an important branch of data mining that deals with hiding an individual's sensitive data without affecting the usability of the data. This paper proposes a new technique for privacy preservation of sensitive data based on semantic context. Multisource Keyword Extraction and Graph Construction for Privacy Preservation involves extracting keywords from various data formats and preserving privacy among the extracted keywords using Vector Marking techniques. Initially, data cleaning and preprocessing are performed on the document to extract keywords by applying techniques such as parsing, duplicate elimination, stemming and indexing. The document can be a PDF, SQL or Word file. After preprocessing, a context graph is generated from the extracted keywords with the help of context dictionaries such as WordNet and DBpedia. This context graph acts as the primary source of reference for all user queries. Privacy preservation of sensitive information is achieved using various Vector Marking techniques. The data input by the user can be classified as structured, unstructured or semi-structured, and an appropriate Vector Marking approach is used for the given input format. The keyword that the user marks as private in the input data is queried in the context graph to obtain its correlated words, and these words are hidden from other users, thus addressing some of the issues related to privacy leakage. © 2017 ACM.
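
The pipeline outlined in the abstract (keyword extraction, context graph construction, lookup of words correlated with a private keyword) can be illustrated with a minimal sketch. This is not the authors' implementation, and it does not reproduce the paper's Vector Marking techniques, whose details are not given here. The names `CONTEXT_DICTIONARY`, `naive_stem`, `extract_keywords`, `build_context_graph`, `hidden_terms` and `mask` are hypothetical. The small hand-written dictionary is an assumed stand-in for WordNet or DBpedia, and the suffix-stripping rule is an assumed placeholder for a real stemmer.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for a context dictionary such as WordNet or DBpedia.
CONTEXT_DICTIONARY = {
    "salary": {"income", "wage"},
}

STOPWORDS = {"and", "are", "the", "of"}


def naive_stem(word):
    """Strip a plural 's'; an assumed placeholder for a real stemmer."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def extract_keywords(text):
    """Parse, drop stopwords, stem, and eliminate duplicates (order kept)."""
    seen, keywords = set(), []
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in STOPWORDS:
            continue
        stem = naive_stem(token)
        if stem not in seen:
            seen.add(stem)
            keywords.append(stem)
    return keywords


def build_context_graph(keywords, dictionary):
    """Link extracted keywords that the dictionary lists as related."""
    present = set(keywords)
    graph = defaultdict(set)
    for head, related in dictionary.items():
        head = naive_stem(head)
        if head not in present:
            continue
        for word in related:
            word = naive_stem(word)
            if word in present:
                graph[head].add(word)
                graph[word].add(head)
    return dict(graph)


def hidden_terms(graph, private_keyword):
    """Return the private keyword together with its correlated words."""
    key = naive_stem(private_keyword.lower())
    return {key} | graph.get(key, set())


def mask(text, hidden):
    """Replace every word whose stem is in `hidden` with a placeholder."""
    def repl(match):
        word = match.group()
        return "****" if naive_stem(word.lower()) in hidden else word
    return re.sub(r"[A-Za-z]+", repl, text)


if __name__ == "__main__":
    text = "Salary records list wages and income. Football scores are public."
    graph = build_context_graph(extract_keywords(text), CONTEXT_DICTIONARY)
    print(mask(text, hidden_terms(graph, "salary")))
```

In this sketch, marking "salary" as private also hides "wages" and "income" because the assumed dictionary relates them, while unrelated words stay visible. A fuller version would query a real lexical resource and handle structured and semi-structured inputs separately, as the abstract describes.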

Item Type: Conference or Workshop Item (Paper)
Additional Information: Cited by 0
Uncontrolled Keywords: Data privacy; Education; Extraction; Indexing (of information); Input output programs; Semantics; Vectors, Context Graph; Duplicate elimination; Parsing; Privacy preservation; Privacy preserving data mining; Semi structured data; Sensitive informations; Stemming, Data mining
Subjects: Faculty of Science > Applied Sciences > Computer Applications
Divisions: Jnana Bharathi / Central College Campus > Department of Computer Applications
Depositing User: Dr. M Anjanappa
Date Deposited: 13 Jun 2018 09:50
Last Modified: 13 Jun 2018 09:50
