Full Example

We have covered how to learn the structure and the parameters of a Bayesian Belief Network (BBN). Now let’s combine the two to create a BBN and then use it for exact inference.

Load data

Let’s read our data into a Spark DataFrame (SDF) and wrap it in a DiscreteData object.

[1]:
from pyspark_bbn.discrete.data import DiscreteData

sdf = spark.read.csv("hdfs://localhost/data-1479668986461.csv", header=True)
data = DiscreteData(sdf)

Structure learning

Let’s pick the naive Bayes algorithm to learn the structure.

[2]:
from pyspark_bbn.discrete.scblearn import Naive

naive = Naive(data, "n3")
g = naive.get_network()
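
For intuition, a naive Bayes structure is a star-shaped graph: the class variable is the sole parent of every other variable. The sketch below builds such an edge list in plain Python. It does not use pyspark_bbn; the helper name naive_bayes_edges is hypothetical, the variable names n1 through n5 are taken from the query results later in this example, and reading "n3" as the class variable is an assumption about what the second argument to Naive means.

```python
def naive_bayes_edges(variables, class_variable):
    """Return the directed (parent, child) edges of a naive Bayes structure."""
    if class_variable not in variables:
        raise ValueError(f"{class_variable!r} is not among the variables")
    # The class variable points at every other variable; no other edges exist.
    return [(class_variable, v) for v in variables if v != class_variable]


# Assumption: n3 is the class variable, as suggested by Naive(data, "n3").
edges = naive_bayes_edges(["n1", "n2", "n3", "n4", "n5"], "n3")
print(edges)
```

The graph returned by get_network() may be represented differently; the sketch only illustrates the shape of the structure.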

Parameter learning

After we have a structure, we can learn the parameters.

[3]:
from pyspark_bbn.discrete.plearn import ParamLearner

param_learner = ParamLearner(data, g)
p = param_learner.get_params()
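
One common way to learn the parameters of a discrete BBN is to estimate each conditional probability table by relative frequency (maximum likelihood). The sketch below shows that idea in plain Python for a single child and a single parent. It is not how ParamLearner is known to work, which may differ (for example, by applying smoothing); the helper estimate_cpt and the toy records are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_cpt(rows, child, parent):
    """Estimate P(child | parent) by relative frequency."""
    joint = Counter((r[parent], r[child]) for r in rows)
    parent_totals = Counter(r[parent] for r in rows)
    cpt = defaultdict(dict)
    for (p_val, c_val), n in joint.items():
        # Share of records with this parent value that have this child value.
        cpt[p_val][c_val] = n / parent_totals[p_val]
    return dict(cpt)


# Hypothetical toy records, not the data set used in this example.
rows = [
    {"n3": "t", "n1": "f"},
    {"n3": "t", "n1": "f"},
    {"n3": "t", "n1": "t"},
    {"n3": "f", "n1": "f"},
]
cpt = estimate_cpt(rows, child="n1", parent="n3")
print(cpt)
```

Each row of the resulting table (one per parent value) sums to 1, which is what makes it a valid conditional distribution.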

BBN

Now that we have the structure and parameters, we can build a BBN. The get_pybbn_data utility function brings the structure and parameters together.

[4]:
from pyspark_bbn.discrete.bbn import get_pybbn_data

bbn = get_pybbn_data(g, p)

Inference

With a BBN defined, we can use py-bbn to proceed with exact inference.

[5]:
from pybbn.factory import create_reasoning_model

model = create_reasoning_model(bbn["d"], bbn["p"])
[6]:
q = model.pquery()
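
Assuming a query with no evidence yields the marginal distribution of each node, the marginal of a child in a naive Bayes network can be computed by hand by summing over the class variable: P(x) = sum over c of P(c) P(x | c). The sketch below does this in plain Python. All probabilities in it are hypothetical and are not the learned parameters from this example; the helper name marginal is also hypothetical.

```python
def marginal(prior, cpt):
    """Compute P(x) = sum_c P(c) * P(x | c) for one child variable."""
    result = {}
    for c_val, p_c in prior.items():
        for x_val, p_x_given_c in cpt[c_val].items():
            result[x_val] = result.get(x_val, 0.0) + p_c * p_x_given_c
    return result


# Hypothetical numbers for illustration only.
prior = {"t": 0.5, "f": 0.5}                 # P(class)
cpt = {
    "t": {"t": 0.8, "f": 0.2},               # P(x | class = t)
    "f": {"t": 0.3, "f": 0.7},               # P(x | class = f)
}
p_x = marginal(prior, cpt)
print(p_x)
```

Exact inference engines generalize this summation to arbitrary network structures and to queries with evidence.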
[7]:
q["n1"]
[7]:
  n1    __p__
0  f  0.75065
1  t  0.24935
[8]:
q["n2"]
[8]:
  n2   __p__
0  f  0.6509
1  t  0.3491
[9]:
q["n3"]
[9]:
  n3    __p__
0  f  0.47345
1  t  0.52655
[10]:
q["n4"]
[10]:
  n4    __p__
0  f  0.40255
1  t  0.59745
[11]:
q["n5"]
[11]:
      n5    __p__
0  maybe  0.29585
1     no  0.29825
2    yes  0.40590
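
A quick sanity check on results like these is that each node's probabilities sum to 1. The sketch below copies the values printed above into a plain dictionary and verifies that property; it does not depend on the structure of the object returned by pquery().

```python
import math

# Marginals copied from the query results above.
marginals = {
    "n1": {"f": 0.75065, "t": 0.24935},
    "n2": {"f": 0.6509, "t": 0.3491},
    "n3": {"f": 0.47345, "t": 0.52655},
    "n4": {"f": 0.40255, "t": 0.59745},
    "n5": {"maybe": 0.29585, "no": 0.29825, "yes": 0.40590},
}

# Every distribution should sum to 1, up to floating-point rounding.
totals = {name: sum(dist.values()) for name, dist in marginals.items()}
for name, total in totals.items():
    assert math.isclose(total, 1.0, abs_tol=1e-9), (name, total)
```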