[joss] software paper comments #21

@hbaniecki


openjournals/joss-reviews#3934

Hi, I hope these comments help improve the paper.

Comments

  1. The paper's title could be changed. It currently reads "PySS3: A new interpretable and simple machine learning model for text classification", but the model is named "SS3" and is not new. The repository's title, "A Python package implementing a new simple and interpretable model for text classification", seems more accurate, but even then one could drop "new" and follow the PyPI package's title, e.g. "PySS3: A Python package implementing the SS3 interpretable text classifier [with interactive/visualization tools for explainable AI]". Just an example to be considered.
  2. I would recommend that the authors highlight in the article the software's "interactive" (explanation, analysis) and (model, machine learning) "monitoring" aspects, as these seem both novel and emerging in recent discussions.
  3. Finally, it would be useful to release a stable version 1.0 of the package (on GitHub and PyPI) and mark that in the paper, e.g. in the Summary section.

Summary

  • L10. "implements novel machine learning model" - the model might not be seen as novel, given that it was already published in 2019 and extended in 2020.
  • L11. Mentioning "two useful tools" without describing what the second one does seems off.

Statement of need
This part mainly discusses the need for open-source implementations of machine learning models. However, as I see it, the significant contributions of the software/paper, distinguishing it from the previous work, are the Live_Test/Evaluation tools, which allow for visual explanation and hyperparameter optimization (see the sketch below). This could be further underlined.
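
For concreteness, a minimal sketch of the kind of workflow I mean, roughly following the usage shown in the package's README/tutorials; the dataset paths and the grid-search value ranges here are hypothetical, chosen only for illustration:

```python
from pyss3 import SS3
from pyss3.util import Dataset, Evaluation, span
from pyss3.server import Live_Test

clf = SS3()

# hypothetical local paths, following the tutorials' folder layout
x_train, y_train = Dataset.load_from_files("datasets/movie_review/train")
x_test, y_test = Dataset.load_from_files("datasets/movie_review/test")

clf.fit(x_train, y_train)

# Live_Test: serves an interactive page in the browser with visual,
# word/sentence-level explanations of each classified document
Live_Test.run(clf, x_test, y_test)

# Evaluation: grid search over the model's s, l, and p hyperparameters
# (the value ranges below are illustrative)
best_s, best_l, best_p, _ = Evaluation.grid_search(
    clf, x_test, y_test,
    s=span(0.2, 0.8, 6), l=span(0.1, 2, 6), p=span(0.5, 2, 6)
)

# interactive 3D plot of the grid-search results
Evaluation.plot()
```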

State of the field
The paper lacks a brief discussion of packages in the field of interpretable and explainable machine learning. To that end, I suggest the authors reference/compare with the following software related to interactive explainability:

  1. Wexler et al. "The What-If Tool: Interactive Probing of Machine Learning Models" (IEEE TVCG, 2019) https://doi.org/10.1109/TVCG.2019.2934619
  2. Tenney et al. "The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models" (EMNLP, 2020) https://doi.org/10.18653/v1/2020.emnlp-demos.15
  3. Hoover et al. "exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformer Models" (ACL, 2020) https://doi.org/10.18653/v1/2020.acl-demos.22
  4. [Ours] "dalex: Responsible Machine Learning with Interactive Explainability and Fairness in Python" (JMLR, 2021) https://www.jmlr.org/papers/v22/20-1473.html

Other possibly missing/useful references:

  1. Pedregosa et al. "Scikit-learn: Machine Learning in Python" (JMLR, 2011) https://www.jmlr.org/papers/v12/pedregosa11a.html
  2. Christoph Molnar "Interpretable Machine Learning - A Guide for Making Black Box Models Explainable" (book, 2018) https://christophm.github.io/interpretable-ml-book
  3. Cynthia Rudin "Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead" (Nature Machine Intelligence, 2019) https://doi.org/10.1038/s42256-019-0048-x
  4. [Ours] "modelStudio: Interactive Studio with Explanations for ML Predictive Models" (JOSS, 2019) https://doi.org/10.21105/joss.01798

Implementation

  • L48 github -> GitHub
  • L54 "such as the one introduced later by the same authors" -> "by us" would be easier to read
  • L57 missing the citation of scikit-learn

Illustrative examples

  1. At the beginning, it lacks a brief description of the predictive task used for the example (dataset name, positive/negative text classification, etc.).
  2. Also, the example could now be updated to use the Dataset.load_from_url() function (see the sketch below).
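
A minimal sketch of what this could look like; the URL is hypothetical, and I am assuming load_from_url() follows the same (documents, labels) return convention as load_from_files():

```python
from pyss3.util import Dataset

# hypothetical URL; load_from_url() is assumed to download/unpack the
# dataset and return it as (documents, labels), like load_from_files()
x_train, y_train = Dataset.load_from_url("https://example.com/datasets/movie_review.zip")
```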

Conclusions
Again, I have doubts about calling the machine learning model "novel", as it has been previously published. The current phrasing might be misunderstood as "introducing a novel machine learning model".
