Machine Learning Fundamentals - Probability Theory - Bayesian Networks by Example
Introduction
- What are the learning objectives and how do they fit into the domain?
Student Example
- 'This assumption can be encoded in a Bayesian Network with two nodes for each variable...' is confusing, because it makes the reader expect a graph with four nodes. Suggested rewording: 'This assumption can be encoded in a Bayesian Network with two nodes, one for each variable, ...'
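Suggestion: a minimal two-node sketch right after that sentence would remove the ambiguity. The variable names and CPT numbers below are placeholders, not taken from the notebook:

```python
# Placeholder two-node network A -> B: one node per variable, not four.
P_A = {0: 0.6, 1: 0.4}                          # P(A)
P_B_given_A = {0: {0: 0.9, 1: 0.1},             # P(B | A)
               1: {0: 0.5, 1: 0.5}}

# The single edge A -> B encodes the assumption: P(A, B) = P(A) * P(B | A).
P_AB = {(a, b): P_A[a] * P_B_given_A[a][b] for a in (0, 1) for b in (0, 1)}
print(sum(P_AB.values()))  # 1.0 -> a valid joint distribution over two variables
```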
Example: Common Cause
- Notation question: why does P entail the independence of S and G given I? What does P denote in the equation and in the problem domain? The equation is only partially understandable from the given description.
P \models S \perp G \mid I
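Suggestion: a short numeric check would make the \models notation concrete. Here P is the joint distribution defined by the network; a sketch like the following (binary variables, invented CPT numbers) verifies that P satisfies S \perp G \mid I in the common-cause structure I -> S, I -> G:

```python
import itertools

# Invented CPTs for the common-cause network I -> S, I -> G (binary variables).
P_I = {0: 0.7, 1: 0.3}                                   # P(I)
P_S_given_I = {0: {0: 0.95, 1: 0.05},                    # P(S | I)
               1: {0: 0.20, 1: 0.80}}
P_G_given_I = {0: {0: 0.70, 1: 0.30},                    # P(G | I)
               1: {0: 0.10, 1: 0.90}}

def joint(i, s, g):
    # The network structure defines P as P(I) * P(S | I) * P(G | I).
    return P_I[i] * P_S_given_I[i][s] * P_G_given_I[i][g]

# "P |= S ⊥ G | I" means: under P, P(s, g | i) = P(s | i) * P(g | i) everywhere.
for i, s, g in itertools.product((0, 1), repeat=3):
    p_sg_given_i = joint(i, s, g) / P_I[i]
    assert abs(p_sg_given_i - P_S_given_I[i][s] * P_G_given_I[i][g]) < 1e-12
print("P satisfies S ⊥ G | I")
```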
Observed Variables
- What does 'active' in 'Active variables are depicted as nodes filled with a background color' mean? Is it an observed variable?
Inference and Subsections
- After going through the whole notebook, I got the impression that the recommended 'additional resources' are mandatory. The chapter and its subsections have excellent, clear examples, but as a reader you have no idea why you are learning this at that moment. This part lacks explanation, especially regarding observed variables. More comments or a guiding story would help the reader, and an outlook on what can be done with the learned content would also be helpful.
Inference: Indirect Causal Effect
- 'First we can calculate the joint distribution P(I, G, L) and then sum over the hidden variables I and G.' Why is G hidden? Is it not possible to measure the grade? The explanation given above leaves the reader confused here.
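Suggestion: showing the marginalization as a small computation would clarify that 'hidden' here only means 'not observed in this particular query'. A sketch, assuming the chain I -> G -> L with binary variables and invented CPTs:

```python
import itertools

# Invented CPTs for the chain I -> G -> L (binary variables).
P_I = {0: 0.7, 1: 0.3}                                   # P(I)
P_G_given_I = {0: {0: 0.80, 1: 0.20},                    # P(G | I)
               1: {0: 0.25, 1: 0.75}}
P_L_given_G = {0: {0: 0.90, 1: 0.10},                    # P(L | G)
               1: {0: 0.40, 1: 0.60}}

# Joint distribution from the chain factorization: P(I, G, L) = P(I) P(G|I) P(L|G).
def joint(i, g, l):
    return P_I[i] * P_G_given_I[i][g] * P_L_given_G[g][l]

# P(L): sum the joint over I and G, which are simply not observed in this query.
P_L = {l: sum(joint(i, g, l) for i, g in itertools.product((0, 1), repeat=2))
       for l in (0, 1)}
print(P_L)  # the two values sum to 1
```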
Inference: Indirect Evidential Effect
- 'Normalization by a factor \frac{1}{Z} is required for probabilities in table P(I \mid l) to not sum to 1.' It is not clear to the reader why this normalization is necessary.
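Suggestion: a short continuation of the chain sketch above would show why the \frac{1}{Z} step is needed: the unnormalized values P(I, L=l) sum to P(l), not to 1. Same invented CPTs as before:

```python
# Same invented chain CPTs as in the sketch above (I -> G -> L, binary values).
P_I = {0: 0.7, 1: 0.3}
P_G_given_I = {0: {0: 0.80, 1: 0.20}, 1: {0: 0.25, 1: 0.75}}
P_L_given_G = {0: {0: 0.90, 1: 0.10}, 1: {0: 0.40, 1: 0.60}}

# Evidence L = 1. The unnormalized table P(I, L=1) = sum_g P(I) P(g|I) P(L=1|g)
# sums to P(L=1), not to 1 -- that is why dividing by Z is necessary.
unnormalized = {i: sum(P_I[i] * P_G_given_I[i][g] * P_L_given_G[g][1]
                       for g in (0, 1))
                for i in (0, 1)}
Z = sum(unnormalized.values())          # Z = P(L=1), in general != 1
posterior = {i: p / Z for i, p in unnormalized.items()}
print(Z, posterior)                     # posterior values now sum to 1
```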