Commit ab9a0eff authored by Charles Vernerey's avatar Charles Vernerey
Browse files

Update README.md

parent acb49027
Loading
Loading
Loading
Loading
+11 −11
Original line number Diff line number Diff line
@@ -47,17 +47,6 @@ for (Pattern closed : closedPatterns) {

![Summary of constraints implemented with Choco-mining \label{fig:app}](paper/app.svg)

We can model different problems using these constraints. The above figure shows examples of mining tasks (in blue) with the constraints (in red) involved in their modelling:

- Skypattern Mining: Given a set of measures $M$, find all the itemsets $x$ such that there exists no other itemset $y$ that dominates $x$. We say that $y$ dominates $x$ iff $\forall m \in M : m(y) \ge m(x)$ and $\exists m \in M : m(y) \gt m(x)$.
- Maximal Frequent Itemset Mining: Given a threshold $s$, find all the itemsets $x$ such that $freq(x) \ge s$ and $\forall y \supset x : freq(y) \lt s$.
- Minimal Infrequent Itemset Mining: Given a threshold $s$, find all the itemsets $x$ such that $freq(x) \lt s$ and $\forall y \subset x : freq(y) \ge s$.
- Generator Mining: Find all the itemsets $x$ such that $\nexists y \subset x : freq(y) = freq(x)$.
- Association Rule Mining: Find all the association rules $x \Rightarrow y$ that respect the constraints specified by the user.
- Diverse Itemset Mining: Given a diversity threshold $j$ and a minimum frequency threshold $s$, find all the diverse itemsets that are closed w.r.t. the frequency and such that $freq(x) \ge s$.
- Frequent Itemset Mining: Given a threshold $s$, find all the itemsets $x$ such that $freq(x) \ge s$.
- Closed Itemset Mining: Given a threshold $s$, find all the itemsets $x$ such that $freq(x) \ge s$ and $\nexists y \supset x : freq(x) = freq(y)$.

The following constraints are available in Choco-Mining:

- $CoverSize_{D}(x,f)$ [SchausAG17]: Given an integer variable $f$ that represents the frequency (noted $freq$) of an itemset $x$, the constraint ensures that $f = freq(x)$.
@@ -68,6 +57,17 @@ The following constraints are available in Choco-Mining:
- $Generator_{D}(x)$ [BelaidBL19]: The constraint ensures that $x$ is a generator, i.e. $\nexists y \subset x : freq(y) = freq(x)$.
- $ClosedDiversity_{D,\mathcal{H},j,s}(x)$ [HienLALLOZ20]: Given a history of itemsets $\mathcal{H}$, a diversity threshold $j$ and a minimum frequency threshold $s$, the constraint ensures that $x$ is a diverse itemset (i.e. $\nexists y \in \mathcal{H} : jaccard(x,y) \ge j$), $x$ is closed w.r.t. the frequency and $freq(x) \ge s$.

We can model different problems using these constraints. The above figure shows examples of mining tasks (in blue) with the constraints (in red) involved in their modelling:

- Frequent Itemset Mining: Given a threshold $s$, find all the itemsets $x$ such that $freq(x) \ge s$.
- Closed Itemset Mining: Given a threshold $s$, find all the itemsets $x$ such that $freq(x) \ge s$ and $\nexists y \supset x : freq(x) = freq(y)$.
- Skypattern Mining: Given a set of measures $M$, find all the itemsets $x$ such that there exists no other itemset $y$ that dominates $x$. We say that $y$ dominates $x$ iff $\forall m \in M : m(y) \ge m(x)$ and $\exists m \in M : m(y) \gt m(x)$.
- Maximal Frequent Itemset Mining: Given a threshold $s$, find all the itemsets $x$ such that $freq(x) \ge s$ and $\forall y \supset x : freq(y) \lt s$.
- Minimal Infrequent Itemset Mining: Given a threshold $s$, find all the itemsets $x$ such that $freq(x) \lt s$ and $\forall y \subset x : freq(y) \ge s$.
- Generator Mining: Find all the itemsets $x$ such that $\nexists y \subset x : freq(y) = freq(x)$.
- Association Rule Mining: Find all the association rules $x \Rightarrow y$ that respect the constraints specified by the user.
- Diverse Itemset Mining: Given a diversity threshold $j$ and a minimum frequency threshold $s$, find all the diverse itemsets that are closed w.r.t. the frequency and such that $freq(x) \ge s$.

## Installation

To use the Choco-mining library, you need to have Java 8+ and [Maven 3](https://maven.apache.org/) installed on your computer. Then, you can simply install the library with the following command: