Generic Data Structure

groundBN module

The Ground Bayes Network (GBN) in ProbReM is the smallest subset of data required to answer a specific query. While the PRM uses a first-order representation of the world, the inference process needs a propositional representation of the data. The network.groundBN module implements an efficient data structure for that purpose.

class network.groundBN.GBNGraph[source]

A GBNGraph is a dictionary that contains a set of vertices of type GBNvertex.

The GBNGraph instance itself is a dictionary that stores vertices as {key=vertex_id : value=GBNvertex}. Several additional dictionaries index the same GBNvertex objects to allow fast retrieval of sets of GBN vertices.

addEvidenceVertex(ID, attr, obj, value)[source]

Instantiates a new evidence GBNvertex and updates the corresponding GBN data structures.

Parameters:
  • ID – Unique ID
  • attr – Attribute
  • obj – Primary Key of attribute object
  • value – Value of vertex
addReferenceVertex(gbnV, dependency)[source]

Adds a ReferenceVertex to the ground Bayesian network.

For now, the reference attributes (all exist attributes) are assumed to be sampling nodes (i.e. neither in the evidence nor among the event variables).

addSamplingVertex(ID, attr, obj)[source]

Instantiates a new sampling GBNvertex and updates the corresponding GBN data structures.

Parameters:
  • ID – Unique ID
  • attr – Attribute
  • obj – Primary Key of attribute object
addVertex(ID, attr, obj, **args)[source]

General method to add a vertex to the graph.

allByAttribute

A dictionary that groups all GBNvertex instances according to their attribute class, e.g. {key=Attribute : value=[ list of GBNvertex ]}

eventVertices

A dictionary of all event GBNvertex instances, keyed by vertex ID, e.g. {key=vertex_id : value=GBNvertex}

logLikelihood()[source]

Returns the log-likelihood of the GBNGraph.

samplingVertices

A dictionary of all sampling GBNvertex instances (event & latent vertices), e.g. {key=vertex_id : value=GBNvertex}

samplingVerticesByAttribute

A dictionary that groups all sampling GBNvertex instances (event & latent vertices) according to their attribute class, e.g. {key=Attribute : value=[ list of GBNvertex ]}
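
The layout described above can be sketched as a dict subclass that maintains secondary indexes. This is a minimal illustration, not the ProbReM implementation: `SketchVertex` and `SketchGraph` are hypothetical stand-ins, attributes are plain strings instead of Attribute instances, and the rule that every non-fixed vertex is a sampling vertex is an assumption based on the descriptions above.

```python
from collections import defaultdict


class SketchVertex:
    """Hypothetical stand-in for GBNvertex; only the fields used below."""

    def __init__(self, ID, attr, obj, event=False, fixed=False, value=None):
        self.ID = ID
        self.attr = attr
        self.obj = obj
        self.event = event
        self.fixed = fixed
        self.value = value


class SketchGraph(dict):
    """Dictionary {vertex_id: vertex} plus secondary indexes (assumed layout)."""

    def __init__(self):
        super().__init__()
        self.allByAttribute = defaultdict(list)
        self.eventVertices = {}
        self.samplingVertices = {}
        self.samplingVerticesByAttribute = defaultdict(list)

    def addVertex(self, ID, attr, obj, **args):
        vertex = SketchVertex(ID, attr, obj, **args)
        self[ID] = vertex
        self.allByAttribute[attr].append(vertex)
        if vertex.event:
            self.eventVertices[ID] = vertex
        if not vertex.fixed:
            # Assumption: sampling vertices are all non-evidence vertices.
            self.samplingVertices[ID] = vertex
            self.samplingVerticesByAttribute[attr].append(vertex)
        return vertex

    def addEvidenceVertex(self, ID, attr, obj, value):
        return self.addVertex(ID, attr, obj, fixed=True, value=value)

    def addSamplingVertex(self, ID, attr, obj):
        return self.addVertex(ID, attr, obj)


graph = SketchGraph()
graph.addEvidenceVertex("Student.iq.1", "Student.iq", (1,), value="high")
graph.addSamplingVertex("Student.success.1", "Student.success", (1,))
print(sorted(graph))                   # IDs of all vertices
print(sorted(graph.samplingVertices))  # IDs of the non-evidence vertices
```

Keeping the indexes in sync inside a single `addVertex` method means a lookup by attribute never has to scan the whole graph.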

class network.groundBN.GBNqueue[source]

A queue that keeps track of vertices that need to be processed when constructing the Ground Bayesian network.

This class is also a dictionary as the information is stored in groups that correspond to sets of vertices that share the same local distribution (= the same attribute).

{ key=Attribute : value=[ list of GBNvertex ] }

pop()[source]

Return a set of GBNvertex instances of the same attribute A \in A(X). This allows us to retrieve the required data in one call to the data interface. We choose an attribute, remove the key from the dictionary and return the associated list.

Returns:List of GBNvertex
push(gbnVertex)[source]

If another GBNvertex is pushed onto the queue, it is added to the list associated with gbnVertex.attr.

Parameters:gbnVertex – GBNvertex
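
The grouping behaviour of push() and pop() can be sketched as follows. `SketchQueue` is a hypothetical stand-in for GBNqueue, the vertices are simple namespaces, and picking the first key in pop() is an assumption, since the text only says that an attribute is chosen.

```python
from types import SimpleNamespace


class SketchQueue(dict):
    """{attribute: [vertices]}; hypothetical stand-in for GBNqueue."""

    def push(self, gbnVertex):
        # Group the vertex with the others that share its attribute.
        self.setdefault(gbnVertex.attr, []).append(gbnVertex)

    def pop(self):
        # Choose an attribute (here simply the first key), remove the key
        # from the dictionary and return the associated list.
        attr = next(iter(self))
        return dict.pop(self, attr)


queue = SketchQueue()
for ID, attr in [
    ("Student.success.1", "Student.success"),
    ("Student.success.2", "Student.success"),
    ("Professor.fame.1", "Professor.fame"),
]:
    queue.push(SimpleNamespace(ID=ID, attr=attr))

batch = queue.pop()
print([vertex.ID for vertex in batch])  # vertices sharing one attribute
```

Because a whole group is returned at once, the data for all vertices of that attribute can be fetched in one call to the data interface.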

The ground Bayesian network implemented in network.groundBN consists of different kinds of vertices implemented in network.vertices.

class network.vertices.GBNvertex(attr, obj=None, ID=None, event=False, fixed=False, value=None, deterministic=False, aggregation=None)[source]

A GBNvertex represents a vertex in the Ground Bayes net. It is a variable in the GBN representing an attribute object that is distributed according to the CPD of its attribute class, i.e. all attribute objects of the same attribute class share the same CPD. A GBNvertex instance can take on a value from the domain of the associated attribute.

A node is associated with a specific attribute object x \in \sigma_{ER}. We use sets to identify an obj, where the set contains a value for each primary key in self.pk of attr.erClass

  • a GBNvertex for rating would have self.obj = (x,y) where x=`User.user_id` and y=`Item.item_id`
  • a GBNvertex for gender would have self.obj = (x) where x=`User.user_id`
ID

An identifier for the unrolled Attribute object, e.g. Student.success.1

addParent(parentVertex)[source]

Adds the parentVertex to the list of parent vertices associated with the parent vertex attribute. It adds the corresponding information to the children dictionary of the parent node.

Parameters:parentVertex – GBNvertex
attr

The associated attribute class

children

The dictionary of children attribute objects {key=`child.attribute` : value= { key=`id` : value = GBNvertex}}. child.attribute is of type Attribute and the gbnVertices of type GBNvertex

conditionalDist()[source]

Returns the conditional probability distribution of the gbnV given its parent values.

Parameters:gbnV – GBN instance
Returns:A 1 x |attr.domain| numpy.array probability distribution
erID

An identifier for the unrolled ERClass object, e.g. Student.1

event

Boolean. If True, we are interested in the posterior distribution of the vertex.

fixed

Boolean. If the value is fixed the vertex is part of the evidence

hasParents(paAttr)[source]

Returns True if the number of parents for the attribute paAttr is not zero. If paAttr is not a parent of self.attr, a KeyError is raised.

Parameters:paAttr – Attribute
indegree(attr=None)[source]

Returns the number of parents for attr. If attr==None the total number of parents is returned.

Parameters:attr – Attribute
logLikelihood()[source]

Returns the log-likelihood of the value of the GBN vertex.
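
As an illustration of how conditionalDist() and logLikelihood() might relate, the sketch below looks up one CPD row by parent assignment and takes the log-probability of the current value. The table layout, the function names and the numbers are hypothetical, and plain lists stand in for the numpy.array mentioned above; the source does not spell out the exact computation.

```python
import math

# Hypothetical CPD for Student.success with a single parent (Student.iq).
# Each row is a distribution over `domain`, indexed by the parent values.
domain = ["low", "high"]
cpd = {
    ("low",): [0.8, 0.2],
    ("high",): [0.3, 0.7],
}


def conditional_dist(parent_assignment):
    """Return the distribution over `domain` for the given parent values."""
    return cpd[tuple(parent_assignment)]


def log_likelihood(value, parent_assignment):
    """Log-probability of `value` under the matching CPD row."""
    row = conditional_dist(parent_assignment)
    return math.log(row[domain.index(value)])


print(conditional_dist(["high"]))
print(log_likelihood("high", ["high"]))
```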

obj

List identifier for the vertex

outdegree(attr=None)[source]

Returns the number of children for attr. If attr==None the total number of children is returned.

Parameters:attr – Attribute
parentAss

The parent assignment of the parents of this node. The order of the parent values is the same as the self.attr.parents list. It can be updated using parentAssignments()

parentAssignments()[source]

Computes the values of the parents of that GBN vertex (using aggregation if necessary). Note that since there is a GBNvertex instance for every node in the GBN, the parent assignments are stored in the instance variable self.parentAss. In the case of the local distribution instance of an attribute, this is not the case as the distribution is shared among many attribute objects.
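
A rough sketch of the idea: one value per parent attribute, in the order of the attribute's parent list, with several parent vertices of the same attribute collapsed by an aggregation function. The data layout, the function name and the choice of the mean as aggregator are assumptions made for illustration only.

```python
from statistics import mean

# Hypothetical data: the ordered parent attributes of one attribute, and the
# parent values of one vertex grouped as {parent attribute: {ID: value}}.
attr_parents = ["Professor.fame", "Student.iq"]
parents = {
    "Professor.fame": {"Professor.fame.1": 2, "Professor.fame.2": 4},
    "Student.iq": {"Student.iq.1": 3},
}


def parent_assignments(attr_parents, parents, aggregate=mean):
    """Return one value per parent attribute, ordered like `attr_parents`."""
    assignment = []
    for pa_attr in attr_parents:
        values = list(parents[pa_attr].values())
        # Several parent vertices of one attribute collapse to one value.
        assignment.append(values[0] if len(values) == 1 else aggregate(values))
    return assignment


parentAss = parent_assignments(attr_parents, parents)
print(parentAss)
```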

parents

The dictionary of parents attribute objects {key=`parent.attribute` : value= { key=`id` : value = GBNvertex}}. parent.attribute is of type Attribute and the gbnVertices of type GBNvertex
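
The nested parents and children dictionaries, together with addParent(), hasParents(), indegree() and outdegree(), can be sketched like this. `SketchVertex` is a hypothetical stand-in, attributes are plain strings, and the `parent_attrs` constructor argument is an assumption used to pre-create one inner dictionary per parent attribute.

```python
class SketchVertex:
    """Hypothetical stand-in for GBNvertex with only the link bookkeeping."""

    def __init__(self, ID, attr, parent_attrs=()):
        self.ID = ID
        self.attr = attr
        # One inner dictionary per parent attribute: {attr: {ID: vertex}}.
        self.parents = {pa: {} for pa in parent_attrs}
        self.children = {}

    def addParent(self, parentVertex):
        # Register the link on both ends.
        self.parents[parentVertex.attr][parentVertex.ID] = parentVertex
        parentVertex.children.setdefault(self.attr, {})[self.ID] = self

    def hasParents(self, paAttr):
        # Raises KeyError if paAttr is not a parent attribute of self.attr.
        return len(self.parents[paAttr]) != 0

    def indegree(self, attr=None):
        if attr is None:
            return sum(len(group) for group in self.parents.values())
        return len(self.parents[attr])

    def outdegree(self, attr=None):
        if attr is None:
            return sum(len(group) for group in self.children.values())
        return len(self.children.get(attr, {}))


fame = SketchVertex("Professor.fame.1", "Professor.fame")
success = SketchVertex(
    "Student.success.1",
    "Student.success",
    parent_attrs=["Professor.fame", "Student.iq"],
)
success.addParent(fame)
print(success.indegree(), fame.outdegree("Student.success"))
```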

sample()[source]

Samples a new value for that GBN vertex. Warning: self.value will be overwritten even if self.fixed=True. We opt for performance and trust our implementation.
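
The warning can be illustrated with a small sketch in which sampling never checks the fixed flag. `SketchVertex` is hypothetical, and unlike the documented sample(), this version receives the distribution as an argument to stay self-contained.

```python
import random

domain = ["low", "high"]


class SketchVertex:
    """Hypothetical stand-in for GBNvertex; only value and fixed are kept."""

    def __init__(self, value, fixed=False):
        self.value = value
        self.fixed = fixed

    def sample(self, dist, rng=random):
        # No check of self.fixed: callers must only sample free vertices.
        self.value = rng.choices(domain, weights=dist, k=1)[0]


vertex = SketchVertex("low", fixed=True)
vertex.sample([0.0, 1.0])  # all probability mass on "high"
print(vertex.value, vertex.fixed)
```

Skipping the check saves one branch per sample, at the cost of relying on the caller to iterate only over sampling vertices.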

value

Current value, must be in the domain of attr

class network.vertices.ReferenceVertex(ID, gbnV, dep)[source]

The class ReferenceVertex is a compact representation of the probabilistic variables required to represent reference uncertainty for one connection. A relationship r connects two entities e1, e2 with a certain type of connection: either an n:1 or an m:n connection. In the case of an n:1 connection, as in the student-professor example, each object in e1 is associated with exactly one object in e2, whereas an object in e2 can be associated with multiple objects in e1.

For example, when inferring the success of a student s1.s, there will be one ReferenceVertex instance that contains a data structure representing the shaded nodes in the network displayed below.

../_images/ref_unc_ex4.png

Note

For now this works, but there is a problem.

If there are multiple dependencies leading through the uncertain dependency dep, they all must use the same mapping of course (i.e. the same exist attributes)

Student/Prof example: student.success depends on Professor.fame, and student.phd also depends on Professor.fame.

Assuming that we do inference for student1 on student.success and student.phd, then of course all exist attributes with student1 should be identical. This means that if we sample the exist attributes, then the edges for student.success and student.phd should be changed!

At this point, a GBN reference vertex is associated with only 1 GBN vertex (e.g. student1.success) of the n-Entity. In reality it should be associated with 1 object (e.g. student1) of the n-Entity.

This is not hard to do:

`self.refGBNvertex` should be a dictionary holding all attribute objects (e.g. student.success.1 student.iq.1 and student) of a certain object (e.g. student.1)

`self.dependency = dep` should be a dictionary holding all uncertain dependencies 
addReference(gbnV_new)[source]

Adds one reference in self.references and updates the parent/children information of the involved vertices.

Parameters:gbnV_new – GBNvertex to be added.
dependency

The uncertain Dependency instance

existParents

{ key = k_entity_ID (e.g. Professor.2) : value = { key = parent.attr (e.g. prof.funding) : value = { key = parent.ID (e.g. ‘prof.funding.2’) : value = parent.Vertex (e.g. prof.funding.2.vertex)} } }

k

The relationship is assumed to be of type n:k, where k serves as a fixed parameter that limits the size of the state space of the Markov chain used for inference. Assume that a relationship R of type n:k connects the entities (E1, E2). Then every object in E1 is connected to at most k objects in E2. By definition, E1 and E2 refer to the first and second entry in the relationship.pk list, respectively.

parentAssignments(k_gbnV)[source]

Note that this method overrides the one in GBNvertex. As a reference vertex is a representation of multiple (i.e. self.k) exist attributes, there are also multiple parent assignments. The method takes the k_gbnV_erID of the k-entity as argument and returns the parent assignment list of the exist attribute object associated with k_gbnV_erID. This is probably neither fast nor pretty; another way would be to also override GBNvertex.parentAss to use a dictionary for all entries in ReferenceVertex.references. As is, a new list is returned at each execution.

Parameters:k_gbnV_erID – erID of GBNvertex instance
Returns:List of parents assignments
refGBNvertex

The referenced GBNvertex (which is on the n-side of the n:k relationship)

references

The theoretical exist attributes do not have to be stored explicitly. The deterministic constraint (k) limits the number of non-zero exist variables to k. ReferenceVertex.references is a compact representation of all exist attributes for one n-side attribute object (e.g. a student's success). The dictionary of length ReferenceVertex.k stores all links that exist (i.e. the exist attribute is 1) in the format {key = k_entity_ID : value = gbnV_E2 }.

The methods addReference(), removeReference() and replaceReference() can be used to manipulate this datastructure.
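
The compact representation can be sketched as a dictionary capped at k entries. `SketchReference` is a hypothetical stand-in: its methods take the k-entity ID explicitly instead of deriving it from a vertex, they skip the parent/children bookkeeping, and raising ValueError when the cap is exceeded is an assumption.

```python
class SketchReference:
    """Hypothetical stand-in for ReferenceVertex: at most k existing links."""

    def __init__(self, k):
        self.k = k
        self.references = {}  # {k_entity_ID: vertex on the k-side}

    def addReference(self, erID, gbnV_new):
        if len(self.references) >= self.k:
            raise ValueError("at most k references may exist")
        self.references[erID] = gbnV_new

    def removeReference(self, erID):
        del self.references[erID]

    def replaceReference(self, erID_new, gbnV_new, erID_old):
        # Remove first so that the cap of k links is never exceeded.
        self.removeReference(erID_old)
        self.addReference(erID_new, gbnV_new)


ref = SketchReference(k=1)
ref.addReference("Professor.1", "Professor.fame.1")
ref.replaceReference("Professor.2", "Professor.fame.2", "Professor.1")
print(ref.references)
```

Only the links whose exist attribute is 1 are stored, so the structure holds at most k entries no matter how many objects the k-side entity has.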

relationship

The uncertain Relationship instance

removeAllReferences()[source]

Removes all references from self.references.

removeReference(gbnV_old)[source]

Removes one reference in self.references and updates the parent/children information of the involved vertices.

Parameters:gbnV_old – GBNvertex to be removed.
replaceReference(gbnV_new, gbnV_old)[source]

Replaces one reference in self.references by another.

Parameters:
network.vertices.computeERID(er, obj)[source]

A simple helper function that computes a unique ID from an object (e.g. a student). It identifies an object (e.g. student.1), rather than an attribute object (e.g. student.success.1) as computed by computeID().

Parameters:er – Instance of ERClass
Returns:A unique string ID for the object
network.vertices.computeID(attr, obj)[source]

A simple helper function that computes a unique ID from an attr and obj, the primary key of the attribute object which is part of the GBN.

Parameters:
Returns:A unique string ID for the attribute object
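
The two kinds of identifiers can be illustrated with a small sketch that reproduces the formats shown in the examples (Student.1 and Student.success.1). The function names, the use of plain strings for the ER class and attribute, and the way multiple primary keys are joined with dots are assumptions, not the actual computeERID() and computeID() code.

```python
def compute_er_id(er_name, obj):
    """ID of an object, e.g. ("Student", (1,)) -> "Student.1"."""
    return ".".join([er_name] + [str(key) for key in obj])


def compute_id(er_name, attr_name, obj):
    """ID of an attribute object, e.g. "Student.success.1"."""
    return ".".join([er_name, attr_name] + [str(key) for key in obj])


print(compute_er_id("Student", (1,)))
print(compute_id("Student", "success", (1,)))
```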

network.vertices.computeRefID(gbnV)[source]

A simple helper function that computes a unique reference ID from a gbnV vertex.

Parameters:gbnV – Instance of GBNvertex
Returns:A unique string ID for the reference vertex