
Here we include some useful resources for research in concept representation learning.

Relevant Papers and Materials

Below we list works in concept representation learning, particularly in interpretability/explainability, that are relevant to concept-based interpretable deep learning. We will discuss several of these papers in our tutorial, but we thought it would be beneficial to collect them in list form so that they are easier to access. Please keep in mind that this is in no way an exhaustive list of important works in concept learning: this is a fast-moving field and we have limited space here. Nevertheless, we hope you find this list helpful if you want to get a sense of where the field is and where it is heading.


These are some of the surveys that touch on concept representation learning and its use in interpretable/explainable AI:


  1. Concept-based Explainable Artificial Intelligence: A Survey
    Eleonora Poeta ,  Gabriele Ciravegna ,  Eliana Pastor , and 2 more authors
    arXiv preprint arXiv:2312.12936, 2023


  1. Concept embedding analysis: A review
    Gesina Schwalbe
    arXiv preprint arXiv:2203.13909, 2022


  1. Explainable AI: A review of machine learning interpretability methods
    Pantelis Linardatos ,  Vasilis Papastefanopoulos ,  and  Sotiris Kotsiantis
    Entropy, 2020

Supervised Concept Learning

Here we include some relevant works in concept representation learning that assume concept labels are provided in some manner. These labels are used to learn concept representations from which explanations can then be constructed:


  1. Concept Gradient: Concept-based Interpretation Without Linear Assumption
    Andrew Bai ,  Chih-Kuan Yeh ,  Pradeep Ravikumar , and 2 more authors
    ICLR, 2023
  2. Learning to Receive Help: Intervention-Aware Concept Embedding Models
    Mateo Espinosa Zarlenga ,  Katherine M Collins ,  Krishnamurthy Dvijotham , and 3 more authors
    NeurIPS, 2023
  3. Label-Free Concept Bottleneck Models
    Tuomas Oikarinen ,  Subhro Das ,  Lam M Nguyen , and 1 more author
    ICLR, 2023
  4. Post-hoc concept bottleneck models
    Mert Yuksekgonul ,  Maggie Wang ,  and  James Zou
    ICLR, 2023
  5. Probabilistic Concept Bottleneck Models
    Eunji Kim ,  Dahuin Jung ,  Sangha Park , and 2 more authors
    ICML, 2023


  1. Concept activation regions: A generalized framework for concept-based explanations
    Jonathan Crabbé ,  and  Mihaela van der Schaar
    Advances in Neural Information Processing Systems, 2022
  2. Glancenets: Interpretable, leak-proof concept-based models
    Emanuele Marconato ,  Andrea Passerini ,  and  Stefano Teso
    Advances in Neural Information Processing Systems, 2022
  3. Concept embedding models: Beyond the accuracy-explainability trade-off
    Mateo Espinosa Zarlenga ,  Pietro Barbiero ,  Gabriele Ciravegna , and 8 more authors
    Advances in Neural Information Processing Systems, 2022


  1. Concept bottleneck models
    Pang Wei Koh ,  Thao Nguyen ,  Yew Siang Tang , and 4 more authors
    In International conference on machine learning , 2020
  2. Concept whitening for interpretable image recognition
    Zhi Chen ,  Yijie Bei ,  and  Cynthia Rudin
    Nature Machine Intelligence, 2020


  1. Net2vec: Quantifying and explaining how concepts are encoded by filters in deep neural networks
    Ruth Fong ,  and  Andrea Vedaldi
    In Proceedings of the IEEE conference on computer vision and pattern recognition , 2018
  2. Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV)
    Been Kim ,  Martin Wattenberg ,  Justin Gilmer , and 4 more authors
    In International conference on machine learning , 2018


  1. Network dissection: Quantifying interpretability of deep visual representations
    David Bau ,  Bolei Zhou ,  Aditya Khosla , and 2 more authors
    In Proceedings of the IEEE conference on computer vision and pattern recognition , 2017

Unsupervised Concept Learning

In contrast to the works above, the following papers attempt to learn concept representations without implicit or explicit concept labels. This is done by means of concept discovery and represents a particularly active area of research in this field:


  1. Label-Free Concept Bottleneck Models
    Tuomas Oikarinen ,  Subhro Das ,  Lam M Nguyen , and 1 more author
    ICLR, 2023
  2. Tabcbm: Concept-based interpretable neural networks for tabular data
    Mateo Espinosa Zarlenga ,  Zohreh Shams ,  Michael Edward Nelson , and 2 more authors
    Transactions on Machine Learning Research, 2023


  1. Gcexplainer: Human-in-the-loop concept-based explanations for graph neural networks
    Lucie Charlotte Magister ,  Dmitry Kazhdan ,  Vikash Singh , and 1 more author
    3rd ICML Workshop on Human in the Loop Learning, 2021


  1. On completeness-aware concept-based explanations in deep neural networks
    Chih-Kuan Yeh ,  Been Kim ,  Sercan Arik , and 3 more authors
    Advances in neural information processing systems, 2020


  1. Towards automatic concept-based explanations
    Amirata Ghorbani ,  James Wexler ,  James Y Zou , and 1 more author
    Advances in neural information processing systems, 2019


  1. Towards robust interpretability with self-explaining neural networks
    David Alvarez Melis ,  and  Tommi Jaakkola
    Advances in neural information processing systems, 2018

Reasoning with Concepts

Finally, we include some papers that describe what one can do once concept representations have been learnt, regardless of whether they were learnt with or without concept supervision. These works are closely related to the field of neuro-symbolic reasoning, and we discuss them in more detail in our presentation:


  1. DiConStruct: Causal Concept-based Explanations through Black-Box Distillation
    Ricardo Moreira ,  Jacopo Bono ,  Mário Cardoso , and 3 more authors
    arXiv preprint arXiv:2401.08534, 2024


  1. Logic explained networks
    Gabriele Ciravegna ,  Pietro Barbiero ,  Francesco Giannini , and 4 more authors
    Artificial Intelligence, 2023
  2. Interpretable Neural-Symbolic Concept Reasoning
    Pietro Barbiero ,  Gabriele Ciravegna ,  Francesco Giannini , and 7 more authors
    ICML, 2023


  1. Logic tensor networks
    Samy Badreddine ,  Artur d’Avila Garcez ,  Luciano Serafini , and 1 more author
    Artificial Intelligence, 2022
  2. Entropy-based logic explanations of neural networks
    Pietro Barbiero ,  Gabriele Ciravegna ,  Francesco Giannini , and 3 more authors
    In Proceedings of the AAAI Conference on Artificial Intelligence , 2022
  3. Algorithmic concept-based explainable reasoning
    Dobrik Georgiev ,  Pietro Barbiero ,  Dmitry Kazhdan , and 2 more authors
    In Proceedings of the AAAI Conference on Artificial Intelligence , 2022


  1. Neural algorithmic reasoning
    Petar Veličković ,  and  Charles Blundell
    Patterns, 2021
  2. Meaningfully explaining model mistakes using conceptual counterfactuals
    Abubakar Abid ,  Mert Yuksekgonul ,  and  James Zou
    ICML, 2021


  1. Neural execution of graph algorithms
    Petar Veličković ,  Rex Ying ,  Matilde Padovano , and 2 more authors
    ICLR, 2020


  1. Explaining classifiers with causal concept effect (cace)
    Yash Goyal ,  Amir Feder ,  Uri Shalit , and 1 more author
    arXiv preprint arXiv:1907.07165, 2019


  1. Learning explanatory rules from noisy data
    Richard Evans ,  and  Edward Grefenstette
    Journal of Artificial Intelligence Research, 2018
  2. Deepproblog: Neural probabilistic logic programming
    Robin Manhaeve ,  Sebastijan Dumancic ,  Angelika Kimmig , and 2 more authors
    Advances in neural information processing systems, 2018


  1. Neural gpus learn algorithms
    Łukasz Kaiser ,  and  Ilya Sutskever
    ICLR, 2016


  1. Learning to execute
    Wojciech Zaremba ,  and  Ilya Sutskever
    arXiv preprint arXiv:1410.4615, 2014

Concept-Learning Public Codebases

Below we list some concept-based open-source libraries. As with our reference material, this is by no means an exhaustive list but rather one that contains libraries we have had the chance to work with in the past. If you would like your library listed here and it is related to concept learning, please do not hesitate to contact us and we will add it.