AIB - Definition.
Functions
void | vl_aib_normalize_P (double *P, vl_uint nelem) |
Normalizes an array of probabilities to sum to 1.
vl_uint * | vl_aib_new_nodelist (vl_uint nentries) |
Allocates and creates a list of nodes.
double * | vl_aib_new_Px (double *Pcx, vl_uint nvalues, vl_uint nlabels) |
Allocates and creates the marginal distribution Px.
double * | vl_aib_new_Pc (double *Pcx, vl_uint nvalues, vl_uint nlabels) |
Allocates and creates the marginal distribution Pc.
void | vl_aib_min_beta (VlAIB *aib, vl_uint *besti, vl_uint *bestj, double *minbeta) |
Finds the two nodes which have minimum beta.
void | vl_aib_merge_nodes (VlAIB *aib, vl_uint i, vl_uint j, vl_uint new) |
Merges two nodes i,j in the internal data structure.
void | vl_aib_update_beta (VlAIB *aib) |
Updates aib->beta and aib->bidx according to aib->which.
void | vl_aib_calculate_information (VlAIB *aib, double *I, double *H) |
Calculates the current information and entropy.
VlAIB * | vl_aib_new (double *Pcx, vl_uint nvalues, vl_uint nlabels) |
Allocates and initializes the internal data structure.
void | vl_aib_delete (VlAIB *aib) |
Deletes AIB data structure.
void | vl_aib_process (VlAIB *aib) |
Runs AIB on Pcx.
Detailed Description
Function Documentation
◆ vl_aib_calculate_information()
void vl_aib_calculate_information | ( | VlAIB * | aib, |
double * | I, | ||
double * | H | ||
) |
- Parameters
-
aib A pointer to the internal data structure.
I The current mutual information (out).
H The current entropy (out).
Calculates the current mutual information and entropy of Pcx and sets I and H to these new values.
◆ vl_aib_delete()
void vl_aib_delete | ( | VlAIB * | aib | ) |
- Parameters
-
aib data structure to delete.
◆ vl_aib_merge_nodes()
- Parameters
-
aib A pointer to the internal data structure.
i The index of one member of the pair to merge.
j The index of the other member of the pair to merge.
new The index of the new node which corresponds to the union of (i, j).
Nodes are merged by replacing the entry i with the union of i and j, moving the node stored in the last position (called lastnode) back to the jth position, and removing the entry at the end.
After the nodes have been merged, the function updates which nodes should be considered on the next iteration, based on which beta values could potentially change. The merged node will always be part of this list.
◆ vl_aib_min_beta()
- Parameters
-
aib A pointer to the internal data structure.
besti The index of one member of the pair which has minimum beta.
bestj The index of the other member of the pair which minimizes beta.
minbeta The minimum beta value corresponding to (i, j).
Searches aib->beta to find the minimum value and fills minbeta and besti and bestj with this information.
◆ vl_aib_new()
- Parameters
-
Pcx A pointer to a 2D array of probabilities.
nvalues The number of rows in the array.
nlabels The number of columns in the array.
Creates a new VlAIB struct containing pointers to all the data that will be used during the AIB process.
Allocates memory for the following:
- Px (nvalues*sizeof(double))
- Pc (nlabels*sizeof(double))
- nodelist (nvalues*sizeof(vl_uint))
- which (nvalues*sizeof(vl_uint))
- beta (nvalues*sizeof(double))
- bidx (nvalues*sizeof(vl_uint))
- parents ((2*nvalues-1)*sizeof(vl_uint))
- costs (nvalues*sizeof(double))
Since it simply copies the pointer to Pcx, the total additional memory requirement is:
(3*nvalues+nlabels)*sizeof(double) + (5*nvalues-1)*sizeof(vl_uint)
- Returns
- An allocated and initialized VlAIB pointer
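The arithmetic can be checked by tallying the per-array sizes in the allocation list. The sketch below is illustrative only: `sketch_aib_extra_bytes` is a hypothetical helper, `sketch_uint` stands in for `vl_uint` under the assumption that it is an `unsigned int`, and `nvalues` is assumed to be at least 1.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int sketch_uint;   /* assumed stand-in for vl_uint */

/* Hypothetical helper: bytes allocated in addition to Pcx itself,
 * obtained by summing the arrays in the allocation list. */
size_t sketch_aib_extra_bytes(size_t nvalues, size_t nlabels)
{
    size_t ndoubles, nuints;

    assert(nvalues >= 1);

    ndoubles = nvalues              /* Px       */
             + nlabels              /* Pc       */
             + nvalues              /* beta     */
             + nvalues;             /* costs    */

    nuints   = nvalues              /* nodelist */
             + nvalues              /* which    */
             + nvalues              /* bidx     */
             + (2 * nvalues - 1);   /* parents  */

    return ndoubles * sizeof(double) + nuints * sizeof(sketch_uint);
}
```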
◆ vl_aib_new_nodelist()
- Parameters
-
nentries The size of the list which will be created
- Returns
- an array of nentries elements containing the values 0, 1, ..., nentries-1
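A minimal sketch of such an identity list, assuming it holds exactly nentries elements so that entry n starts out representing tree node n. `sketch_new_nodelist` is a hypothetical name, and `unsigned` stands in for `vl_uint`.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate a list where entry n holds the value n.
 * Returns NULL if the allocation fails. The caller frees the result. */
unsigned *sketch_new_nodelist(unsigned nentries)
{
    unsigned n;
    unsigned *list = malloc(nentries * sizeof *list);

    if (list == NULL)
        return NULL;

    for (n = 0; n < nentries; n++)
        list[n] = n;

    return list;
}
```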
◆ vl_aib_new_Pc()
- Parameters
-
Pcx A two-dimensional array of probabilities.
nvalues The number of rows in Pcx.
nlabels The number of columns in Pcx.
- Returns
- an array of size nlabels which contains the marginal distribution over the columns
◆ vl_aib_new_Px()
- Parameters
-
Pcx A two-dimensional array of probabilities.
nvalues The number of rows in Pcx.
nlabels The number of columns in Pcx.
- Returns
- an array of size nvalues which contains the marginal distribution over the rows.
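Both marginals are plain sums over the joint table: Px sums each row and Pc sums each column. The sketch below shows this under the assumption of row-major storage (`Pcx[r*nlabels + c]`). The function names are hypothetical and the code is not taken from the library.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: marginal over rows, Px[r] = sum_c Pcx[r][c].
 * Returns NULL on allocation failure. The caller frees the result. */
double *sketch_new_Px(const double *Pcx, unsigned nvalues, unsigned nlabels)
{
    unsigned r, c;
    double *Px;

    assert(Pcx != NULL);
    Px = calloc(nvalues, sizeof *Px);
    if (Px == NULL)
        return NULL;

    for (r = 0; r < nvalues; r++)
        for (c = 0; c < nlabels; c++)
            Px[r] += Pcx[(size_t)r * nlabels + c];

    return Px;
}

/* Hypothetical helper: marginal over columns, Pc[c] = sum_r Pcx[r][c]. */
double *sketch_new_Pc(const double *Pcx, unsigned nvalues, unsigned nlabels)
{
    unsigned r, c;
    double *Pc;

    assert(Pcx != NULL);
    Pc = calloc(nlabels, sizeof *Pc);
    if (Pc == NULL)
        return NULL;

    for (r = 0; r < nvalues; r++)
        for (c = 0; c < nlabels; c++)
            Pc[c] += Pcx[(size_t)r * nlabels + c];

    return Pc;
}
```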
◆ vl_aib_normalize_P()
void vl_aib_normalize_P | ( | double * | P, |
vl_uint | nelem | ||
) |
- Parameters
-
P The array of probabilities.
nelem The number of elements in the array.
- Returns
- Modifies P to contain values which sum to 1
◆ vl_aib_process()
void vl_aib_process | ( | VlAIB * | aib | ) |
- Parameters
-
aib AIB object to process
The function runs Agglomerative Information Bottleneck (AIB) on the joint probability table aib->Pcx, which has labels along the columns and feature values along the rows. AIB iteratively merges the two values of the feature x that cause the smallest decrease in mutual information between the random variables x and c.
Merge operations are arranged in a binary tree. The nodes of the tree correspond to the original feature values and any other value obtained as a result of a merge operation. The nodes are indexed in breadth-first order, starting from the leaves. The first index is zero. In this way, the leaves correspond directly to the original feature values. In total there are 2*nvalues-1 nodes.
The results may be accessed through vl_aib_get_parents, which returns an array with one element per tree node. Each element is the index of the parent node. The parent of the root node is set to zero. The array has 2*nvalues-1 elements.
Feature values with null probability are ignored by the algorithm and their nodes have parents indexing a non-existent tree node (a value bigger than 2*nvalues-1).
The function also computes the information level after each merge. vl_aib_get_costs returns a vector with these values. The vector has nvalues entries: the first is the value of the cost functional before any merge, and the others are the costs after each of the nvalues-1 merges.
◆ vl_aib_update_beta()
void vl_aib_update_beta | ( | VlAIB * | aib | ) |
- Parameters
-
aib AIB data structure.
The function calculates beta[i] and bidx[i] for the nodes i listed in aib->which. beta[i] is the minimal variation of mutual information (or other score) caused by merging entry i with another entry and bidx[i] is the index of this best matching entry.
Notice that for each entry i that we need to update, a full scan of all the other entries must be performed.