CONTEXT AWARE SENSITIVE INFORMATION DETECTION

Qiao Mu, Ong Yuya J, Routray Ramani, Raphael Roger C

Keywords: training set, threshold limit value, recurrent neural network, pattern recognition, information sensitivity, computer science, artificial neural network, artificial intelligence

Abstract
A method loads training samples and forms a training data set from them. The method trains a bidirectional LSTM recurrent neural network, which includes one or more input cells and one or more output cells, with the training data set. The method determines sensitive information and corresponding confidence values by analyzing a text with the trained neural network. Based on determining that the sensitive information accuracy has improved, the method selects one or more predicted samples from the text whose sensitive information confidence value is above a threshold value. The method forms a new training data set comprising the training samples and the one or more predicted samples that have been verified, and trains the previously trained neural network with the new training data set.
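
The selection and retraining steps described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation. To stay self-contained it replaces the bidirectional LSTM with a hypothetical token-count classifier (`TokenCountModel`), and it omits the accuracy-improvement check. The names `self_training_round` and `verify`, the threshold of 0.55, and the sample strings are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable

Sample = tuple[str, bool]  # (text segment, is_sensitive)


class TokenCountModel:
    """Hypothetical stand-in for the bidirectional LSTM (assumption)."""

    def __init__(self) -> None:
        self.sensitive: Counter = Counter()  # token counts in sensitive samples
        self.total: Counter = Counter()      # token counts in all samples

    def train(self, samples: list[Sample]) -> None:
        # Counts accumulate, so calling train again continues from the
        # previously trained state rather than starting over.
        for text, label in samples:
            for tok in text.lower().split():
                self.total[tok] += 1
                if label:
                    self.sensitive[tok] += 1

    def predict(self, text: str) -> tuple[bool, float]:
        """Return (predicted label, confidence in that label)."""
        toks = text.lower().split()
        if not toks:
            return False, 0.0
        # Laplace-smoothed per-token estimate, averaged over the segment.
        p = sum((self.sensitive[t] + 1) / (self.total[t] + 2) for t in toks) / len(toks)
        label = p >= 0.5
        return label, (p if label else 1.0 - p)


def self_training_round(
    model: TokenCountModel,
    training_samples: list[Sample],
    text_segments: list[str],
    threshold: float,
    verify: Callable[[Sample], bool],
) -> list[Sample]:
    """Select confident predictions, keep verified ones, retrain the model."""
    predicted: list[Sample] = []
    for segment in text_segments:
        label, confidence = model.predict(segment)
        if confidence > threshold:  # keep only predictions above the threshold
            predicted.append((segment, label))
    verified = [s for s in predicted if verify(s)]
    new_training_set = list(training_samples) + verified
    model.train(new_training_set)  # continue training the same model
    return new_training_set


# Illustrative data (assumption): two sensitive and two non-sensitive samples.
training_samples: list[Sample] = [
    ("ssn 123-45-6789", True),
    ("password hunter2", True),
    ("meeting at noon", False),
    ("lunch at noon", False),
]
model = TokenCountModel()
model.train(training_samples)

new_training_set = self_training_round(
    model,
    training_samples,
    text_segments=["ssn 987-65-4321", "dinner at noon", "hello world"],
    threshold=0.55,
    verify=lambda sample: True,  # placeholder for the verification step
)
```

In this sketch `verify` accepts every candidate. In practice it would be replaced by whatever verification the method applies before a predicted sample enters the new training data set.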