March 2016 – August 2017
Our proposal: Deep Rule EXtraction (D-REX)
Within the scope of this project we intend to develop, implement, and evaluate a novel method for extracting rules from Deep Neural Networks. The method will be able:
1. to extract knowledge in the form of hierarchical rule representations that explain how Deep Neural Networks make their predictions, and
2. to preserve, as much as possible, the prediction accuracy of the original network.
Motivation
Artificial neural networks, and their most recent and powerful incarnation, Deep Neural Networks (DNNs), are well-known and accurate predictive models that learn from data.
However, a well-known drawback is that they tend to lock the knowledge they learn into a “black box”: little is known about the data or the causal relationships leading to their predictions. In the search for transparency, researchers have proposed rule-extraction methods, which can be seen as complementary to conventional neural networks in the sense that they extract knowledge from the network to explain the reasons leading to a prediction.
Because of this complementarity, we believe that recent advances in deep neural networks should be closely followed by a renewed interest in rule-extraction methods tailored to deeper architectures.
Moreover, despite their recent success, DNNs exhibit problematic properties that both attract criticism and trigger additional research; e.g., they may be “easily fooled” [1]. Evidence also suggests that the relationship between network modules is more meaning-driven than that between individual neurons [2]. In fact, little is still known about how DNN predictions are made, and it can be a challenge to understand what exactly goes on at each layer.
Our approach: local rule extraction and global knowledge consolidation
The substrate of D-REX will be a carefully conceived, rule-based knowledge representation able to capture the complexity of the knowledge contained in DNNs. The knowledge extraction will begin with a local rule-extraction stage in which a set of rules is created to explain the behavior of each module, or layer, of the DNN.
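To make the local stage concrete, the snippet below is a minimal, illustrative sketch rather than the D-REX method itself. It assumes a toy two-layer feed-forward network whose per-layer activations can be recorded on probe data, and it uses shallow decision trees from scikit-learn as a placeholder rule inducer: one local rule module per hidden unit (when does it fire?) and one for the output layer.

```python
# Hedged sketch of the local rule-extraction stage. Assumptions: a toy
# two-layer feed-forward network, randomly drawn probe data, and shallow
# decision trees (scikit-learn) as a placeholder rule-induction method.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Toy "trained" network: one ReLU hidden layer and a linear output layer.
W1, b1 = rng.normal(size=(4, 6)), rng.normal(size=6)
W2, b2 = rng.normal(size=(6, 2)), rng.normal(size=2)

def hidden(x):                         # hidden-layer activations
    return np.maximum(x @ W1 + b1, 0.0)

def output(h):                         # predicted class index
    return np.argmax(h @ W2 + b2, axis=1)

X = rng.normal(size=(500, 4))          # probe inputs fed through the network
H_bin = (hidden(X) > 0).astype(int)    # discretized hidden state (on/off)
y = output(hidden(X))

# Local rule modules: one per hidden unit and one for the output layer,
# each explaining that module's behavior on the probe data.
hidden_rules = [
    DecisionTreeClassifier(max_depth=3).fit(X, H_bin[:, j])
    for j in range(H_bin.shape[1])
]
output_rules = DecisionTreeClassifier(max_depth=3).fit(H_bin, y)

print(export_text(hidden_rules[0]))    # readable rules for hidden unit 0
print(export_text(output_rules))       # readable rules for the output layer
```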
Then, these modules of local knowledge will be connected following the structure and data flow of the trained network in order to compose a global, deep knowledge structure. This global knowledge could finally be consolidated and refined in order to improve its understandability and coherence. Throughout the process, several performance criteria related to numeric and linguistic behavior will be used to assess quality and, thus, to drive the search for the knowledge-oriented model.
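As an equally hedged illustration of the composition and assessment steps, the sketch below chains the local rule modules from the previous snippet along the network's data flow, and scores the resulting rule-only predictor by its fidelity (agreement with the DNN). Fidelity is offered here only as one plausible example of the numeric criteria mentioned above; the actual D-REX criteria are part of the project's work.

```python
# Hedged sketch of global composition: the local rule modules defined above
# are chained along the network's data flow, giving a rule-only predictor
# whose agreement (fidelity) with the original DNN can be measured.
def rule_based_predict(X):
    # Hidden layer replaced by its per-unit rule modules: each tree decides
    # whether its hidden unit would be active for the given input.
    H_hat = np.column_stack([t.predict(X) for t in hidden_rules])
    # The symbolic hidden pattern is passed to the output-layer rule module.
    return output_rules.predict(H_hat)

y_dnn = output(hidden(X))              # original network predictions
y_rules = rule_based_predict(X)        # predictions from the rules alone
fidelity = np.mean(y_dnn == y_rules)   # one possible numeric criterion
print(f"Fidelity of the global rule structure to the DNN: {fidelity:.2%}")
```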