Data Mining


  1. Association rules slides: Apriori, Eclat, FP-Growth (pretty complete)
  2. Terms: lift, confidence
FP-Growth
  1. The same example, but with a graph showing that at lower support thresholds FP-Growth costs less in computation time.
  2. Another video clip
  3. How to validate these algorithms: probably the best way is to look at confidence, support, and lift
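It depends on your task, but usually you want all three to be high:
  • high support: the rule should apply to a large share of cases
  • high confidence: the rule should be correct often
  • high lift: indicates the association is not just a coincidence (lift above 1 means the items co-occur more often than they would if they were independent)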
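
To make the three measures concrete, here is a minimal sketch in plain Python that computes them for a single rule A -> B. The toy transactions and the function names (support, confidence, lift) are made up for illustration and do not come from the linked material; the formulas are the standard ones: support(A -> B) = P(A and B), confidence(A -> B) = P(A and B) / P(A), lift(A -> B) = confidence(A -> B) / P(B).

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy data: each transaction is a set of purchased items.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
    frozenset({"bread", "milk", "eggs"}),
]


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent) -> float:
    """P(consequent | antecedent); assumes the antecedent occurs at least once."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent) -> float:
    """Confidence divided by the baseline frequency of the consequent."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )


if __name__ == "__main__":
    a, b = {"bread"}, {"milk"}
    print("support   ", round(support(TRANSACTIONS, a | b), 3))
    print("confidence", round(confidence(TRANSACTIONS, a, b), 3))
    print("lift      ", round(lift(TRANSACTIONS, a, b), 3))
```

On this toy data the rule bread -> milk has fairly high support and confidence but a lift slightly below 1, because milk is already in most baskets. This is why lift is worth checking alongside the other two.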