pyIDS is a custom implementation of IDS (Interpretable Decision Sets) algorithm introduced in
LAKKARAJU, Himabindu; BACH, Stephen H.; LESKOVEC, Jure. Interpretable decision sets: A joint framework for description and prediction. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, 2016. p. 1675-1684.
If you find this package useful in your research, please cite our paper on this Interpretable Decision Sets Implementation:
Jiri Filip, Tomas Kliegr. PyIDS - Python Implementation of Interpretable Decision Sets Algorithm by Lakkaraju et al, 2016. RuleML+RR2019@Rule Challenge 2019. http://ceur-ws.org/Vol-2438/paper8.pdf
The pyarc, pandas, scipy and numpy packages need to be installed before using pyIDS.
All of these packages can be installed using pip.
For pyarc, please refer to the Installation section of its README file.
training a simple IDS model
import pandas as pd
from pyids.algorithms.ids_classifier import mine_CARs
from pyids.algorithms.ids import IDS
from pyarc.qcba.data_structures import QuantitativeDataFrame
import io
import requests
url = "https://raw.githubusercontent.com/kliegr/arcBench/master/data/folds_discr/train/iris0.csv"
s = requests.get(url).content
df = pd.read_csv(io.StringIO(s.decode('utf-8')))
cars = mine_CARs(df, rule_cutoff=50)
lambda_array = [1, 1, 1, 1, 1, 1, 1]
quant_dataframe = QuantitativeDataFrame(df)
ids = IDS(algorithm="SLS")
ids.fit(quant_dataframe=quant_dataframe, class_association_rules=cars, lambda_array=lambda_array)
acc = ids.score(quant_dataframe)optimizing for best lambda parameters using coordinate ascent, as described in the original paper
import pandas as pd
import io
import requests
from pyids.algorithms.ids_classifier import mine_CARs
from pyids.algorithms.ids import IDS
from pyids.model_selection.coordinate_ascent import CoordinateAscent
from pyarc.qcba.data_structures import QuantitativeDataFrame
url = "https://raw.githubusercontent.com/jirifilip/pyids/master/data/titanic.csv"
s = requests.get(url).content
df = pd.read_csv(io.StringIO(s.decode('utf-8')))
quant_df = QuantitativeDataFrame(df)
cars = mine_CARs(df, 20)
def fmax(lambda_dict):
print(lambda_dict)
ids = IDS(algorithm="SLS")
ids.fit(class_association_rules=cars, quant_dataframe=quant_df, lambda_array=list(lambda_dict.values()))
auc = ids.score_auc(quant_df)
print(auc)
return auc
coord_asc = CoordinateAscent(
func=fmax,
func_args_ranges=dict(
l1=(1, 1000),
l2=(1, 1000),
l3=(1, 1000),
l4=(1, 1000),
l5=(1, 1000),
l6=(1, 1000),
l7=(1, 1000)
),
ternary_search_precision=50,
max_iterations=3
)
best_lambdas = coord_asc.fit()