DAGNN
- Directed acyclic graph neural network
DagNN is a CNN wrapper alternative to SimpleNN. It is object oriented and allows constructing networks with a directed acyclic graph (DAG) topology. It is therefore far more flexible, although a little more complex and slightly slower for small CNNs.
A DAG object contains the following data members:
- `layers`: The network layers.
- `vars`: The network variables.
- `params`: The network parameters.
- `meta`: Additional information relative to the CNN (e.g. input image format specification).
There are additional transient data members:
- `mode` [`normal`]: This flag can be either `normal` or `test`. In the latter case, certain blocks switch to a test mode suitable for validation or evaluation as opposed to training. For instance, dropout becomes a pass-through block in `test` mode.
- `accumulateParamDers` [`false`]: If this flag is set to `true`, then the derivatives of the network parameters are accumulated rather than rewritten the next time the derivatives are computed.
- `conserveMemory` [`true`]: If this flag is set to `true`, the DagNN will discard intermediate variable values as soon as they are not needed anymore in the calculations. This is particularly important to save memory on GPUs.
- `device` [`cpu`]: This flag tells whether the DagNN resides in CPU or GPU memory. Use the `DagNN.move()` function to move the DagNN between devices. See the example after this list.
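A minimal sketch of setting these members, assuming `net` is an existing `dagnn.DagNN` instance:
net.mode = 'test' ;          % use test-time behaviour (e.g. dropout pass-through)
net.conserveMemory = false ; % keep intermediate variable values for inspection
net.move('gpu') ;            % transfer the network to GPU memory ('cpu' to move back)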
The DagNN is a copyable handle, i.e. it allows creating a deep copy using the `copy` operator: `deep_copy = copy(dagnet);`. In all cases the deep copy is located in CPU memory (i.e. it is transferred from the GPU before copying).
Remark: As a side effect, the original network is reset (all variables are cleared); only the network structure and parameters are copied.
See Also: matlab.mixin.Copyable
DAGNN
- Initialize an empty DaG
OBJ = DAGNN() initializes an empty DaG.
See Also addLayer(), loadobj(), saveobj().
GETINPUTS
- Get the names of the input variables
INPUTS = GETINPUTS(obj) returns a cell array containing the names of the input variables of the DaG obj, i.e. the sources of the DaG (excluding the network parameters, which can also be considered sources).
GETOUTPUTS
- Get the names of the output variables
OUTPUTS = GETOUTPUTS(obj) returns a cell array containing the names of the output variables of the DaG obj, i.e. the sinks of the DaG.
GETLAYERINDEX
- Get the index of a layer
INDEX = GETLAYERINDEX(obj, NAME) returns the index of the layer NAME. NAME can also be a cell array of strings. If no layer with such a name is found, the value NaN is returned for the index.
Layers can then be accessed as the obj.layers(INDEX)
property of the DaG.
Indexes are stable unless the DaG is modified (e.g. by adding or removing layers); hence they can be cached for faster layer access.
See Also getParamIndex(), getVarIndex().
GETVARINDEX
- Get the index of a variable
INDEX = GETVARINDEX(obj, NAME) obtains the index of the variable with the specified NAME. NAME can also be a cell array of strings. If no variable with such a name is found, the value NaN is returned for the index.
Variables can then be accessed as the obj.vars(INDEX)
property of the DaG.
Indexes are stable unless the DaG is modified (e.g. by adding or removing layers); hence they can be cached for faster variable access.
See Also getParamIndex(), getLayerIndex().
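For example, an index obtained once can be reused across accesses (the variable name 'prediction' is hypothetical):
predIdx = net.getVarIndex('prediction') ;  % cache once; stable until the DaG is modified
prediction = net.vars(predIdx).value ;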
GETPARAMINDEX
- Get the index of a parameter
INDEX = GETPARAMINDEX(obj, NAME) obtains the index of the parameter with the specified NAME. NAME can also be a cell array of strings. If no parameter with such a name is found, the value NaN is returned for the index.
Parameters can then be accessed as the obj.params(INDEX)
property of the DaG.
Indexes are stable unless the DaG is modified (e.g. by adding or removing layers); hence they can be cached for faster parameter access.
See Also getVarIndex(), getLayerIndex().
GETLAYER
- Get a copy of a layer definition
LAYER = GETLAYER(obj, NAME) returns a copy of the layer definition structure with the specified NAME. NAME can also be a cell array of strings or an array of indexes. If no layer with a specified name or index exists, an error is thrown.
See Also getLayerIndex().
GETVAR
- Get a copy of a network variable
VAR = GETVAR(obj, NAME) returns a copy of the network variable with the specified NAME. NAME can also be a cell array of strings or an array of indexes. If no variable with a specified name or index exists, an error is thrown.
See Also getVarIndex().
GETPARAM
- Get a copy of a layer parameter
PARAM = GETPARAM(obj, NAME) returns a copy of the network parameter with the specified NAME. NAME can also be a cell array of strings or an array of indexes. If no parameter with a specified name or index exists, an error is thrown.
See Also getParamIndex().
GETLAYEREXECUTIONORDER
- Get the order in which layers are evaluated
ORDER = GETLAYEREXECUTIONORDER(obj) returns a vector with the indexes of the layers in the order in which they are executed. This need not be the trivial order 1, 2, ..., L, as it depends on the graph topology.
SETPARAMETERSERVER
- Set a parameter server for the parameter derivatives
SETPARAMETERSERVER(obj, PS) uses the specified ParameterServer PS to store and accumulate parameter derivatives across multiple MATLAB processes.
After setting this option, net.params.der is always empty and the derivative value must be retrieved from the server.
CLEARPARAMETERSERVER
- Remove the parameter server
CLEARPARAMETERSERVER(obj) stops using the parameter server.
ADDVAR
- Add a variable to the DaG
V = ADDVAR(obj, NAME) adds a variable with the specified NAME to the DaG. This is an internal function; variables are automatically added when adding layers to the network.
ADDPARAM
- Add a parameter to the DaG
V = ADDPARAM(obj, NAME) adds a parameter with the specified NAME to the DaG. This is an internal function; parameters are automatically added when adding layers to the network.
ADDLAYER
- Adds a layer to a DagNN
ADDLAYER(NAME, LAYER, INPUTS, OUTPUTS, PARAMS) adds the specified layer to the network. NAME is a string with the layer name, used as a unique identifier. LAYER is the object implementing the layer, which should be a subclass of dagnn.Layer. INPUTS and OUTPUTS are cell arrays of variable names, and PARAMS of parameter names.
See Also REMOVELAYER().
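For example, a sketch building a one-layer network (the layer, variable, and parameter names are illustrative):
net = dagnn.DagNN() ;
net.addLayer('conv1', dagnn.Conv('size', [5 5 3 16]), ...
  {'input'}, {'x1'}, {'conv1f', 'conv1b'}) ;
net.initParams() ;  % randomly initialize the newly created parameters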
EVAL
- Evaluate the DAGNN
EVAL(obj, inputs) evaluates the DaG for the specified input values. `inputs` is a cell array of the type `{'inputName', inputValue, ...}`. This call results in a forward pass through the graph, computing the values of the output variables. These can then be accessed using the `obj.vars(outputIndex)` property of the DaG object. The index of an output can be obtained using the `obj.getOutputIndex(outputName)` call.
EVAL(obj, inputs, derOutputs) evaluates the DaG forward and then backward, performing backpropagation. Similar to `inputs`, `derOutputs` is a cell array of the type `{'outputName', outputDerValue, ...}` of output derivatives.
Understanding backpropagation
Only those outputs for which `outputDerValue` is non-empty are involved in backpropagation, while the others are ignored. This is useful to attach auxiliary layers to the graph to compute errors or other statistics, without involving them in backpropagation.
Usually one starts backpropagation from scalar outputs, corresponding to loss functions. In this case `outputDerValue` can be interpreted as the weight of that output and is usually set to one. For example: `{'objective', 1}` backpropagates from the `'objective'` output variable with a weight of 1.
However, in some cases the DaG may contain more than one such node, for example because one has more than one loss function. In this case `{'objective1', w1, 'objective2', w2, ...}` allows balancing the different objectives.
Finally, one can backpropagate from outputs that are not scalars. While this is unusual, it is possible by specifying a value of `outputDerValue` that has the same dimensionality as the output; in this case, this value is used as a matrix of weights, or projection.
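A minimal sketch of a forward pass followed by a forward-backward pass (the input, label, and objective names are hypothetical):
% Forward only: compute the output variable values.
net.eval({'data', images}) ;
% Forward and backward: backpropagate from 'objective' with weight 1.
net.eval({'data', images, 'label', labels}, {'objective', 1}) ;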
Factors affecting evaluation
There are several factors affecting evaluation:
- The evaluation mode can be either `normal` or `test`. Layers may behave differently depending on the mode. For example, dropout becomes a pass-through layer in test mode and batch normalization uses fixed moments (this usually improves the test performance significantly).
- By default, the DaG aggressively conserves memory. This is particularly important on the GPU, where memory is scarce. However, this also means that the values of most variables and of their derivatives are dropped during the computation. For debugging purposes, it may be interesting to observe these variables; in this case you can set the `obj.conserveMemory` property of the DaG to `false`. It is also possible to preserve individual variables by setting the property `obj.vars(v).precious` to `true`, as in the sketch after this list.
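A debugging sketch along these lines (the variable name 'x5' and input name 'data' are hypothetical):
v = net.getVarIndex('x5') ;    % index of the variable to inspect
net.vars(v).precious = true ;  % keep its value despite conserveMemory
net.eval({'data', images}) ;
activation = net.vars(v).value ;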
FROMSIMPLENN
- Initialize a DagNN object from a SimpleNN network
FROMSIMPLENN(NET) initializes the DagNN object from the specified CNN using the SimpleNN format.
SimpleNN objects are linear chains of computational layers. These layers exchange information through variables and parameters that are not explicitly named. Hence, FROMSIMPLENN() uses a number of rules to assign such names automatically:
- From the input to the output of the CNN, variables are called `x0` (input of the first layer), `x1`, `x2`, .... In this manner `xi` is the output of the i-th layer.
- Any loss layer requires two inputs, the second being a label. These are called `label` (for the first such layer), and then `label2`, `label3`, ... for any other similar layer.
Additionally, given the option `CanonicalNames` the function can change the names of some variables to make them more convenient to use. With this option turned on:
- The network input is called `input` instead of `x0`.
- The output of each SoftMax layer is called `prob` (or `prob2`, ...).
- The output of each Loss layer is called `objective` (or `objective2`, ...).
- The input of each SoftMax or Loss layer of type softmax log loss is called `prediction` (or `prediction2`, ...). If a Loss layer immediately follows a SoftMax layer, then the rule above takes precedence and the input name is not changed.
FROMSIMPLENN(___, 'OPT', VAL, ...) accepts the following options:
- `CanonicalNames` [false]: If `true`, use the rules above to assign more meaningful names to some of the variables.
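For example, a sketch converting a SimpleNN model loaded from disk (the file name is hypothetical):
simpleNet = load('imagenet-simplenn.mat') ;  % a network in SimpleNN format
net = dagnn.DagNN.fromSimpleNN(simpleNet, 'CanonicalNames', true) ;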
GETVARRECEPTIVEFIELDS
- Get the receptive field of a variable
RFS = GETVARRECEPTIVEFIELDS(OBJ, VAR) gets the receptive fields RFS of all the variables of the DagNN OBJ into variable VAR. VAR is a variable name or index.
RFS has one entry for each variable in the DagNN, following the same format as DAGNN.GETRECEPTIVEFIELDS(). For example, RFS(i) is the receptive field of the i-th variable in the DagNN into variable VAR. If the i-th variable is not a descendent of VAR in the DAG, then there is no receptive field, indicated by `rfs(i).size == []`. If the receptive field cannot be computed (e.g. because it depends on the values of the variables and not just on the network topology, or if it cannot be expressed as a sliding window), then `rfs(i).size = [NaN NaN]`.
GETVARSIZES
- Get the size of the variables
SIZES = GETVARSIZES(OBJ, INPUTSIZES) computes the SIZES of the DagNN variables given the size of the inputs. `inputSizes` is a cell array of the type `{'inputName', inputSize, ...}`. Returns a cell array with the sizes of all network variables.
Example, compute the storage needed for a batch size of 256 for an imagenet-like network:
batch_size = 256; single_num_bytes = 4;
input_size = [net.meta.normalization.imageSize, batch_size];
var_sizes = net.getVarSizes({'data', input_size});
fprintf('Network activations will take %.2f MiB in single precision.\n', ...
  sum(cellfun(@prod, var_sizes)) * single_num_bytes / 1024^2);
INITPARAM
- Initialize the parameters of the DagNN
OBJ.INITPARAM() uses the INIT() method of each layer to initialize the corresponding parameters (usually randomly).
LOADOBJ
- Initialize a DagNN object from a structure.
OBJ = LOADOBJ(S) initializes a DagNN object from the structure S. It is the opposite of S = OBJ.SAVEOBJ().
If S is a string, initializes the DagNN object with data from the mat-file S. Otherwise, if S is an instance of `dagnn.DagNN`, returns S.
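For example, a sketch loading a network saved in structure form (the file name is hypothetical):
net = dagnn.DagNN.loadobj(load('net-deployed.mat')) ;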
MOVE
- Move the DagNN to either CPU or GPU
MOVE(obj, 'cpu') moves the DagNN obj to the CPU.
MOVE(obj, 'gpu') moves the DagNN obj to the GPU.
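A minimal sketch (assumes a CUDA-capable GPU and a GPU-enabled build):
net.move('gpu') ;  % run training or evaluation on the GPU
net.move('cpu') ;  % transfer back to CPU memory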
PRINT
- Print information about the DagNN object
PRINT(OBJ) displays a summary of the functions and parameters in the network. STR = PRINT(OBJ) returns the summary as a string instead of printing it.
PRINT(OBJ, INPUTSIZES) where INPUTSIZES is a cell array of the type {'input1name', input1size, 'input2name', input2size, ...} prints information using the specified size for each of the listed inputs.
PRINT(___, 'OPT', VAL, ...) accepts the following options:
- `All` [false]: Display all the information below.
- `Layers` ['*']: Specify which layers to print. This can be either a list of indexes, a cell array of layer names, or the string '*', meaning all layers.
- `Parameters` ['*']: Specify which parameters to print, similar to the option above.
- `Variables` [[]]: Specify which variables to print, similar to the option above.
- `Dependencies` [false]: Whether to display the dependency (geometric transformation) of each variable on each input.
- `Format` ['ascii']: Choose between `ascii`, `latex`, `csv`, `digraph`, and `dot`. The first three formats print tables; `digraph` uses the plot function for a digraph (supported in MATLAB >= R2015b) and the last one prints a graph in `dot` format. In case of zero outputs, it attempts to compile and visualise the dot graph using the `dot` command and `start` (Windows), `display` (Linux) or `open` (Mac OSX) on your system. In the latter case, all variables and layers are included in the graph, regardless of the other parameters.
- `FigurePath` ['tempname.pdf']: Sets the path where any generated `dot` figure will be saved. Currently, this is useful only in combination with the format `dot`. By default, a unique temporary filename is used (`tempname` is replaced with a `tempname()` call). The extension specifies the output format (passed to dot as a `-Text` parameter). If no extension is provided, PDF is used by default. Additionally, the .dot file used to generate the figure is stored in the same location.
- `dotArgs` ['']: Additional dot arguments, e.g. '-Gsize="7"' to generate a smaller output (for a review of the network structure etc.).
- `MaxNumColumns` [18]: Maximum number of columns in each table.
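For example, a sketch printing a full summary for a hypothetical input name and size:
net.print({'data', [224 224 3 1]}, 'All', true) ;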
See also: DAGNN, DAGNN.GETVARSIZES().
REBUILD
- Rebuild the internal data structures of a DagNN object
REBUILD(obj) rebuilds the internal data structures of the DagNN obj. It is a helper function used internally to update the network when layers are added or removed.
REMOVELAYER
- Remove a layer from the network
REMOVELAYER(OBJ, NAME) removes the layer NAME from the DagNN object OBJ. NAME can be a string or a cell array of strings.
RENAMELAYER
- Rename a layer
RENAMELAYER(OLDNAME, NEWNAME) changes the name of the layer OLDNAME into NEWNAME. NEWNAME should not be the name of an existing layer.
RENAMEPARAM
- Rename a parameter
RENAMEPARAM(OLDNAME, NEWNAME) changes the name of the parameter OLDNAME into NEWNAME. NEWNAME should not be the name of an existing parameter.
RENAMEVAR
- Rename a variable
RENAMEVAR(OLDNAME, NEWNAME) changes the name of the variable OLDNAME into NEWNAME. NEWNAME should not be the name of an existing variable.
RESET
- Reset the DagNN
RESET(obj) resets the DagNN obj. The function clears any intermediate value stored in the DagNN object, including parameter gradients. It also calls the reset function of every layer.
SAVEOBJ
- Save a DagNN to a vanilla MATLAB structure
S = OBJ.SAVEOBJ() saves the DagNN OBJ to a vanilla MATLAB structure S. This is particularly convenient to preserve future compatibility and to ship networks as pure structures, instead of embedding dependencies on code.
The object can be reconstructed by `obj = DagNN.loadobj(s)`.
As a side effect, the network is reset (all variables are cleared) and transferred to the CPU.
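A sketch of the save/load round trip (the file name is hypothetical):
s = net.saveobj() ;
save('net.mat', '-struct', 's') ;  % store the structure fields in a mat-file
net = dagnn.DagNN.loadobj(load('net.mat')) ;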
See Also: dagnn.DagNN.loadobj, dagnn.DagNN.reset
SETLAYERINPUTS
- Set or change the inputs to a layer
Example: NET.SETLAYERINPUTS('layerName', {'input1', 'input2', ...})
SETLAYEROUTPUTS
- Set or change the outputs of a layer
Example: NET.SETLAYEROUTPUTS('layerName', {'output1', 'output2', ...})
SETLAYERPARAMS
- Set or change the parameters of a layer
Example: NET.SETLAYERPARAMS('layerName', {'param1', 'param2', ...})