CategoricalEntailmentEnsembleOptimizationContext Class

Represents a Cross-Entropy context supporting the optimization of objective functions whose arguments are collections of CategoricalEntailment instances.
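Inheritance Hierarchy
System.Object
  Novacta.Analytics.Advanced.CrossEntropyContext
    Novacta.Analytics.Advanced.SystemPerformanceOptimizationContext
      Novacta.Analytics.Advanced.CategoricalEntailmentEnsembleOptimizationContext
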
Namespace:  Novacta.Analytics.Advanced
Assembly:  Novacta.Analytics (in Novacta.Analytics.dll) Version: 2.0.0
Syntax
public sealed class CategoricalEntailmentEnsembleOptimizationContext : SystemPerformanceOptimizationContext

The CategoricalEntailmentEnsembleOptimizationContext type exposes the following members.

Constructors
CategoricalEntailmentEnsembleOptimizationContext (public)
    Initializes a new instance of the CategoricalEntailmentEnsembleOptimizationContext class, aimed at training an ensemble of categorical entailments by optimizing the specified objective function, with the given range of iterations and probability smoothing coefficient.
Properties
AllowEntailmentPartialTruthValues (public)
    Gets a value indicating whether the truth value of a sampled categorical entailment must be equal to the homogeneity of the probability distribution from which its conclusion has been drawn. If false, the truth value is always set to unity.
EliteSampleDefinition (protected)
    Gets the elite sample definition for this context. (Inherited from SystemPerformanceOptimizationContext.)
FeatureCategoryCounts (public)
    Gets the collection of category counts for the features on which the premises of the categorical entailments searched by this context are defined.
InitialParameter (public)
    Gets the parameter initially exploited to sample from the state-space of the system defined by this context. (Inherited from CrossEntropyContext.)
MaximumNumberOfIterations (public)
    Gets the maximum number of iterations allowed by this context. (Inherited from SystemPerformanceOptimizationContext.)
MinimumNumberOfIterations (public)
    Gets the minimum number of iterations required by this context. (Inherited from SystemPerformanceOptimizationContext.)
NumberOfCategoricalEntailments (public)
    Gets the number of categorical entailments.
NumberOfResponseCategories (public)
    Gets the number of categories in the response variable.
OptimizationGoal (public)
    Gets a constant specifying whether the performance function in this context must be minimized or maximized. (Inherited from SystemPerformanceOptimizationContext.)
OverallNumberOfFeatureCategories (public)
    Gets the overall number of feature categories.
ProbabilitySmoothingCoefficient (public)
    Gets the coefficient that defines the smoothing scheme for the probabilities of the Cross-Entropy parameters exploited by this context.
StateDimension (public)
    Gets or sets the dimension of a vector representing a system's state when a CrossEntropyProgram executes in this context. (Inherited from CrossEntropyContext.)
TraceExecution (public)
    Gets or sets a value indicating whether the execution of this context must be traced. (Inherited from CrossEntropyContext.)
Methods
Equals (public)
    Determines whether the specified object is equal to the current object. (Inherited from Object.)
Finalize (protected)
    Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
GetCategoricalEntailmentEnsembleClassifier (public)
    Gets the CategoricalEntailmentEnsembleClassifier instance represented by an element of the state-space defined by this context, having the specified feature and response variables.
GetHashCode (public)
    Serves as the default hash function. (Inherited from Object.)
GetOptimalState (protected)
    Gets the argument that optimizes the objective function in this context, according to the specified Cross-Entropy sampling parameter. (Overrides SystemPerformanceOptimizationContext.GetOptimalState(DoubleMatrix).)
GetType (public)
    Gets the Type of the current instance. (Inherited from Object.)
MemberwiseClone (protected)
    Creates a shallow copy of the current Object. (Inherited from Object.)
OnExecutedIteration (protected, with code example)
    Called after completion of each iteration of a CrossEntropyProgram executing in this context. (Overrides SystemPerformanceOptimizationContext.OnExecutedIteration(Int32, DoubleMatrix, LinkedList<Double>, LinkedList<DoubleMatrix>).)
PartialSample (protected)
    Draws the specified subset of a sample from a distribution characterized by the given parameter, using the stated random number generator. Used when executing the sampling step of a CrossEntropyProgram running in this context. (Overrides CrossEntropyContext.PartialSample(Double[], Tuple<Int32, Int32>, RandomNumberGenerator, DoubleMatrix, Int32).)
Performance (protected)
    Computes the objective function at a specified argument as the performance defined in this context. (Overrides CrossEntropyContext.Performance(DoubleMatrix).)
SmoothParameter (protected)
    Provides the smoothing of the updated sampling parameter of a SystemPerformanceOptimizer executing in this context. (Overrides SystemPerformanceOptimizationContext.SmoothParameter(LinkedList<DoubleMatrix>).)
StopAtIntermediateIteration (protected)
    Specifies conditions under which a SystemPerformanceOptimizer executing in this context should be considered as terminated after completing an intermediate iteration. (Inherited from SystemPerformanceOptimizationContext.)
StopExecution (protected)
    Specifies conditions under which a CrossEntropyProgram executing in this context should be considered as terminated. (Inherited from SystemPerformanceOptimizationContext.)
ToString (public)
    Returns a string that represents the current object. (Inherited from Object.)
UpdateLevel (protected)
    Updates the performance level for the current iteration of a CrossEntropyProgram executing in this context and determines the corresponding elite sample. (Inherited from SystemPerformanceOptimizationContext.)
UpdateParameter (protected)
    Updates the sampling parameter attending the generation of the sample in the next iteration of a CrossEntropyProgram executing in this context. (Overrides CrossEntropyContext.UpdateParameter(LinkedList<DoubleMatrix>, DoubleMatrix).)
Remarks

Class CategoricalEntailmentEnsembleOptimizationContext derives from SystemPerformanceOptimizationContext and defines a Cross-Entropy context able to solve optimization problems regarding the selection of an ensemble of categorical entailments. A categorical entailment is an object that defines a collection of premises, about a given set of feature variables, that imply a specific response category. The conclusion is entailed by the premises with a possibly partial truth value, ranging from completely false to completely true.

Categorical entailments can be exploited when items from a given feature space LaTeX equation must be classified into a set LaTeX equation of labels. If LaTeX equation categorical feature variables are taken into account, and if variable LaTeX equation has finite domain LaTeX equation, then LaTeX equation can be represented as the Cartesian product LaTeX equation. A categorical entailment is a triple LaTeX equation, where LaTeX equation is a proper subset of LaTeX equation representing the entailment premises (LaTeX equation, with LaTeX equation), LaTeX equation is the concluded response category, and LaTeX equation is the entailment truth value.

Class SystemPerformanceOptimizationContext thoroughly defines a system whose performance must be optimized. Class CategoricalEntailmentEnsembleOptimizationContext specializes that system by assuming that its performance, say LaTeX equation, is defined on a collection of possible entailments about specific features and response variables in a given categorical data set.

Let LaTeX equation be the number of entailments to be selected. For LaTeX equation, let LaTeX equation be the number of categories in the domain of the LaTeX equation-th feature, say LaTeX equation, and let LaTeX equation be the number of categories in the response domain, say LaTeX equation. An entailment LaTeX equation can be represented by a partitioned row vector, say LaTeX equation, whose blocks are defined as follows. For LaTeX equation, LaTeX equation, with LaTeX equation being unity if the LaTeX equation-th category of LaTeX equation is included in the corresponding premise LaTeX equation, zero otherwise, or, using indicator functions,

LaTeX equation

while block LaTeX equation is a binary vector LaTeX equation in which, using indicator functions,

LaTeX equation

i.e., there is only one entry equal to unity, corresponding to the response category concluded by the entailment.

An argument for the Cross-Entropy program hence admits the partitioned form LaTeX equation, having dimensions LaTeX equation, with LaTeX equation being the overall number of available feature categories. Given a DoubleMatrix instance representing an argument, the collection of CategoricalEntailment instances it represents can be inspected by calling method GetCategoricalEntailmentEnsembleClassifier(DoubleMatrix, List<CategoricalVariable>, CategoricalVariable).

The system's state-space LaTeX equation, i.e. the domain of LaTeX equation, can thus be represented as the Cartesian product of LaTeX equation copies of the set LaTeX equation, with the last closed interval representing the range of available truth values.
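
The partitioned representation described above can be sketched as follows. This is only an illustration written with plain arrays and C# top-level statements: the helper EncodeEntailment is hypothetical and not part of Novacta.Analytics, and the layout it uses (premise indicators, then response indicators, then the truth value as the last entry of each block) is an assumption inferred from the description of the state-space.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not a Novacta.Analytics API): encodes one entailment as
// [premise indicators | response indicators | truth value].
static double[] EncodeEntailment(
    int[] featureCategoryCounts,
    int[][] premiseCategoryIndexes,
    int numberOfResponseCategories,
    int responseCategoryIndex,
    double truthValue)
{
    int overallNumberOfFeatureCategories = featureCategoryCounts.Sum();
    var block = new double[
        overallNumberOfFeatureCategories + numberOfResponseCategories + 1];

    // Premise blocks: one indicator per category of each feature.
    int offset = 0;
    for (int k = 0; k < featureCategoryCounts.Length; k++)
    {
        foreach (int h in premiseCategoryIndexes[k])
        {
            block[offset + h] = 1.0;
        }
        offset += featureCategoryCounts[k];
    }

    // Response block: exactly one entry equal to unity.
    block[offset + responseCategoryIndex] = 1.0;

    // Assumed position of the truth value: last entry of the block.
    block[block.Length - 1] = truthValue;
    return block;
}

// Two features with 3 and 2 categories, and a response with 2 categories.
int[] counts = { 3, 2 };
double[] entailment = EncodeEntailment(
    counts,
    new[] { new[] { 0, 2 }, new[] { 1 } },
    numberOfResponseCategories: 2,
    responseCategoryIndex: 1,
    truthValue: 0.8);

// An ensemble of several entailments concatenates as many such blocks.
int numberOfEntailments = 4;
int stateDimension = numberOfEntailments * entailment.Length;
Console.WriteLine(
    $"Block length: {entailment.Length}, state dimension: {stateDimension}");
```

Under these assumptions each entailment occupies as many entries as the overall number of feature categories, plus the number of response categories, plus one for the truth value.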

A Cross-Entropy optimizer is designed to identify the optimal arguments at which the performance function of a complex system reaches its minimum or maximum value. To get the optimal state, the system's state-space LaTeX equation is traversed iteratively by sampling, at each iteration, from a specific density function, member of a parametric family

LaTeX equation

where LaTeX equation is a possible argument of LaTeX equation, and LaTeX equation is the set of allowable values for parameter LaTeX equation. The parameter exploited at a given iteration LaTeX equation is referred to as the reference parameter of that iteration and is denoted by LaTeX equation. At least a minimum number of iterations, say LaTeX equation, must be executed, and at most a maximum number, say LaTeX equation, is allowed.

Implementing a context for optimizing on categorical entailments

The Cross-Entropy method provides an iterative multi-step procedure. In the context of combinatorial optimization, at each iteration LaTeX equation a sampling step is executed in order to generate diverse candidate arguments of the objective function, sampled from a distribution characterized by the reference parameter of the iteration, say LaTeX equation. The sample is then exploited in the updating step, in which a new reference parameter LaTeX equation is identified to modify the distribution from which the samples will be obtained in the next iteration. This modification is executed in order to increase the probability of sampling relevant arguments, i.e. those arguments corresponding to the function values of interest (see the documentation of class CrossEntropyProgram for a thorough discussion of the Cross-Entropy method).

When the Cross-Entropy method is applied in an optimization context, a final optimizing step is executed, in which the argument corresponding to the searched extremum is actually identified.

These steps have been implemented as follows.

Sampling step

In a CategoricalEntailmentEnsembleOptimizationContext, the parametric family LaTeX equation is outlined as follows. Each component LaTeX equation of an argument LaTeX equation of LaTeX equation is attached to a parameter LaTeX equation, and the Cross-Entropy sampling parameter LaTeX equation is a partitioned row vector whose parts are those LaTeX equation blocks, each one governing the sampling of a different entailment LaTeX equation as follows. Firstly, a finite discrete distribution LaTeX equation is defined on the label domain LaTeX equation and sampled to obtain label LaTeX equation. Secondly, the entailment premise LaTeX equation must be sampled too. This is equivalent to selecting at random, for each input attribute LaTeX equation, a subset LaTeX equation of domain LaTeX equation, and hence defining LaTeX equation. To obtain this result, a Bernoulli distribution, say LaTeX equation, is assigned to each category LaTeX equation in the attribute domain LaTeX equation: sampling LaTeX equation from LaTeX equation enables us to set LaTeX equation if and only if LaTeX equation. For LaTeX equation, LaTeX equation is thus defined as LaTeX equation, where, for LaTeX equation, one has LaTeX equation.

Finally, the truth value LaTeX equation is set equal to unity if partial truth values are not allowed; otherwise, one has

LaTeX equation

so that higher truth values will be assigned to entailments whose corresponding response distributions are less heterogeneous.

As a consequence of the previous discussion, the Cross-Entropy sampling parameter LaTeX equation can be represented as vector LaTeX equation.

The parametric space LaTeX equation should include a parameter under which all possible states have a real chance of being selected: this parameter is specified as the initial reference parameter LaTeX equation. A CategoricalEntailmentEnsembleOptimizationContext defines LaTeX equation as a vector whose entries LaTeX equation, corresponding to feature categories, are all set equal to LaTeX equation, while entries LaTeX equation, corresponding to response categories, are all set equal to LaTeX equation.
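
The sampling of a single entailment can be sketched as follows, as an illustration only. The helper SampleEntailmentBlock is hypothetical and does not reproduce the library's PartialSample method; it uses System.Random and plain arrays. The initial probabilities used in the example (0.5 for each feature category and a uniform distribution over the response categories) are assumptions chosen so that every state can be selected, and the truth value is fixed to unity, i.e. the case in which partial truth values are not allowed.

```csharp
using System;
using System.Linq;

// Hypothetical helper: samples one entailment from its parameter block
// [Bernoulli probabilities of feature categories | response probabilities].
static double[] SampleEntailmentBlock(
    double[] parameterBlock, int featureCategoryCount, Random random)
{
    int responseCategoryCount = parameterBlock.Length - featureCategoryCount;
    var block = new double[parameterBlock.Length + 1];

    // Premises: one Bernoulli draw per feature category.
    for (int j = 0; j < featureCategoryCount; j++)
    {
        block[j] = random.NextDouble() < parameterBlock[j] ? 1.0 : 0.0;
    }

    // Conclusion: one draw from the finite discrete distribution on the labels,
    // obtained by inverting the cumulative probabilities.
    double u = random.NextDouble();
    double cumulative = 0.0;
    int selected = responseCategoryCount - 1;
    for (int l = 0; l < responseCategoryCount; l++)
    {
        cumulative += parameterBlock[featureCategoryCount + l];
        if (u < cumulative)
        {
            selected = l;
            break;
        }
    }
    block[featureCategoryCount + selected] = 1.0;

    // Truth value: unity, since partial truth values are not allowed here.
    block[block.Length - 1] = 1.0;
    return block;
}

// Assumed initial block: 5 feature categories and 2 response categories.
double[] initialBlock = Enumerable.Repeat(0.5, 5)
    .Concat(Enumerable.Repeat(1.0 / 2, 2))
    .ToArray();
var rng = new Random(12345);
double[] sampled = SampleEntailmentBlock(initialBlock, 5, rng);
Console.WriteLine(string.Join(" ", sampled));
```

Sampling an ensemble would repeat this draw once per entailment, each with its own parameter block.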

Updating step

At iteration LaTeX equation, let us represent the sample drawn as LaTeX equation, where LaTeX equation is the Cross-Entropy sample size, and the LaTeX equation-th sample point is the sequence LaTeX equation. The parameter's updating formula is, for LaTeX equation and LaTeX equation,

LaTeX equation

where LaTeX equation is the elite sample in this context, i.e. the set of sample points having the lowest performances observed during the LaTeX equation-th iteration when minimizing, or the highest ones when maximizing, and LaTeX equation is its indicator function.

Analogously, one has, for LaTeX equation,

LaTeX equation
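
As an illustration of the updating step, the following sketch applies the standard Cross-Entropy update for Bernoulli and finite discrete families: each probability becomes the fraction of elite sample points in which the corresponding entry equals unity. The helper UpdateParameter is hypothetical, the elite sample size is passed explicitly as an assumption, and each state block is assumed to end with a truth value that has no counterpart in the parameter.

```csharp
using System;
using System.Linq;

// Hypothetical helper: returns, for each parameter entry, the fraction of
// elite sample points whose corresponding state entry equals unity.
static double[] UpdateParameter(
    double[][] sample, double[] performances, int eliteSize,
    bool minimize, int numberOfEntailments)
{
    int stateBlockLength = sample[0].Length / numberOfEntailments;
    int parameterBlockLength = stateBlockLength - 1; // no truth value entry

    // Indexes of the sample points, best performances first.
    int[] order = Enumerable.Range(0, sample.Length).ToArray();
    Array.Sort(order, (a, b) => minimize
        ? performances[a].CompareTo(performances[b])
        : performances[b].CompareTo(performances[a]));
    int[] elite = order.Take(eliteSize).ToArray();

    var parameter = new double[numberOfEntailments * parameterBlockLength];
    for (int i = 0; i < numberOfEntailments; i++)
    {
        for (int j = 0; j < parameterBlockLength; j++)
        {
            double count = 0.0;
            foreach (int s in elite)
            {
                count += sample[s][i * stateBlockLength + j];
            }
            parameter[i * parameterBlockLength + j] = count / elite.Length;
        }
    }
    return parameter;
}

// One entailment, 2 feature categories, 2 response categories, truth value last.
double[][] drawn =
{
    new[] { 1.0, 0.0, 1.0, 0.0, 1.0 },
    new[] { 1.0, 1.0, 1.0, 0.0, 1.0 },
    new[] { 0.0, 0.0, 0.0, 1.0, 1.0 },
    new[] { 0.0, 1.0, 0.0, 1.0, 1.0 },
};
double[] observed = { 0.9, 0.8, 0.1, 0.2 };

// Maximizing with an elite sample of size 2 keeps the first two points.
double[] updated = UpdateParameter(drawn, observed, 2, false, 1);
Console.WriteLine(string.Join(" ", updated));
```

In this example the first and third probabilities become 1, the second becomes 0.5, and the fourth becomes 0, since both elite points conclude the first response category.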

Applying a smoothing scheme to updated parameters

In a CategoricalEntailmentEnsembleOptimizationContext, the sampling parameter is smoothed by applying the following formula (see Rubinstein and Kroese, Remark 5.2, p. 189 [1]):

LaTeX equation

where LaTeX equation.
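
A smoothed update in the spirit of the cited remark can be sketched as a convex combination of the newly updated parameter and the previous reference parameter. The helper Smooth below is hypothetical; that the coefficient weights the updated parameter, and that it must lie strictly between 0 and 1, are assumptions rather than statements about the library's SmoothParameter method.

```csharp
using System;
using System.Linq;

// Hypothetical helper: entrywise convex combination of the updated parameter
// and the previous reference parameter.
static double[] Smooth(
    double[] updatedParameter, double[] previousParameter, double lambda)
{
    // Assumption: the smoothing coefficient lies strictly between 0 and 1.
    if (lambda <= 0.0 || lambda >= 1.0)
    {
        throw new ArgumentOutOfRangeException(nameof(lambda));
    }
    return updatedParameter
        .Zip(previousParameter, (u, p) => lambda * u + (1.0 - lambda) * p)
        .ToArray();
}

double[] smoothed = Smooth(
    new[] { 1.0, 0.0 },
    new[] { 0.5, 0.5 },
    0.8);
Console.WriteLine(string.Join(" ", smoothed));
```

Smoothing keeps the probabilities away from exactly 0 or 1 in early iterations, so that categories excluded by one elite sample can still be drawn later.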

Optimizing step

The optimizing step is executed after the underlying Cross-Entropy program has converged. In a specified context, it is expected that, given a reference parameter LaTeX equation, a corresponding reasonable value could be guessed for the optimizing argument of LaTeX equation, say LaTeX equation, with LaTeX equation a function from LaTeX equation to LaTeX equation. Function LaTeX equation is defined by overriding method GetOptimalState(DoubleMatrix), which should return LaTeX equation given a specific reference parameter LaTeX equation.

Given the optimal parameter (the parameter corresponding to the last iteration LaTeX equation executed by the algorithm before stopping),

LaTeX equation

the argument LaTeX equation at which the searched extremum is considered as reached according to the Cross-Entropy method will be returned as follows.

For LaTeX equation, one has LaTeX equation, where for LaTeX equation, LaTeX equation, with LaTeX equation equal to unity if LaTeX equation, zero otherwise. Block LaTeX equation is a binary vector LaTeX equation, in which there is only one entry equal to unity, taken at random among those corresponding to probabilities LaTeX equation equal to

LaTeX equation

Finally, LaTeX equation is unity if partial truth values are not allowed; otherwise, it is set equal to LaTeX equation.
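
The mapping from an optimal parameter block to a state block can be sketched as follows. The helper GetOptimalBlock is hypothetical and is not the library's GetOptimalState override. The threshold applied to the premise probabilities is passed as an argument because its value is an assumption here (0.5 in the example), and the truth value is fixed to unity, i.e. the case in which partial truth values are not allowed.

```csharp
using System;
using System.Linq;

// Hypothetical helper: converts a parameter block into a state block.
static double[] GetOptimalBlock(
    double[] parameterBlock, int featureCategoryCount,
    double threshold, Random random)
{
    var block = new double[parameterBlock.Length + 1];

    // Premises: keep the categories whose probability exceeds the threshold.
    for (int j = 0; j < featureCategoryCount; j++)
    {
        block[j] = parameterBlock[j] > threshold ? 1.0 : 0.0;
    }

    // Conclusion: a response category having maximal probability,
    // taken at random among ties.
    double[] responseProbabilities =
        parameterBlock.Skip(featureCategoryCount).ToArray();
    double maximum = responseProbabilities.Max();
    int[] candidates = Enumerable.Range(0, responseProbabilities.Length)
        .Where(l => responseProbabilities[l] == maximum)
        .ToArray();
    int selected = candidates[random.Next(candidates.Length)];
    block[featureCategoryCount + selected] = 1.0;

    // Truth value: unity, since partial truth values are not allowed here.
    block[block.Length - 1] = 1.0;
    return block;
}

// 3 feature categories followed by 2 response categories.
double[] finalBlock = { 0.9, 0.2, 0.7, 0.1, 0.9 };
double[] optimal = GetOptimalBlock(finalBlock, 3, 0.5, new Random(1));
Console.WriteLine(string.Join(" ", optimal));
```

With no ties among the response probabilities the result does not depend on the random number generator: the first and third feature categories enter the premise and the second response category is concluded.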

Stopping criterion

A CategoricalEntailmentEnsembleOptimizationContext never stops before MinimumNumberOfIterations iterations have been executed, and always stops once the number of executed iterations is greater than or equal to MaximumNumberOfIterations.

For intermediate iterations, default method StopAtIntermediateIteration(Int32, LinkedList<Double>, LinkedList<DoubleMatrix>) is called to check whether a Cross-Entropy program executing in this context should stop.
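
The stopping logic described above can be summarized by the following sketch. The function ShouldStop and the delegate standing in for the intermediate check are hypothetical simplifications: the actual StopAtIntermediateIteration method also receives the performance levels and parameters of the previous iterations.

```csharp
using System;

// Hypothetical helper mirroring the stopping criterion described above.
static bool ShouldStop(
    int executedIterations,
    int minimumNumberOfIterations,
    int maximumNumberOfIterations,
    Func<int, bool> stopAtIntermediateIteration)
{
    // Never stop before the minimum number of iterations has been executed.
    if (executedIterations < minimumNumberOfIterations)
    {
        return false;
    }

    // Always stop once the maximum number of iterations has been reached.
    if (executedIterations >= maximumNumberOfIterations)
    {
        return true;
    }

    // Otherwise, defer to the intermediate check.
    return stopAtIntermediateIteration(executedIterations);
}

// Hypothetical intermediate rule: stop once 5 iterations have been executed.
Func<int, bool> intermediateRule = iterations => iterations >= 5;

bool[] decisions =
{
    ShouldStop(2, 3, 10, intermediateRule),  // below the minimum
    ShouldStop(4, 3, 10, intermediateRule),  // intermediate, rule not met
    ShouldStop(6, 3, 10, intermediateRule),  // intermediate, rule met
    ShouldStop(10, 3, 10, _ => false),       // maximum reached
};
Console.WriteLine(string.Join(" ", decisions));
```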

Instantiating a context for optimizing on entailments

At instantiation, the constructor of a CategoricalEntailmentEnsembleOptimizationContext object will receive information about the optimization under study by means of parameters representing the objective function LaTeX equation, the number of categories in the feature and response variables, LaTeX equation and LaTeX equation respectively, the number of entailments to be searched LaTeX equation, the extremes of the allowed range of intermediate iterations, LaTeX equation and LaTeX equation, and a constant stating if the optimization goal is a maximization or a minimization. In addition, the smoothing parameter LaTeX equation and a boolean constant signaling if entailment partial truth values should be allowed are also passed to the constructor.

After construction, LaTeX equation and LaTeX equation can be inspected, respectively, via properties MinimumNumberOfIterations and MaximumNumberOfIterations. The smoothing coefficient LaTeX equation is also available via property ProbabilitySmoothingCoefficient. Count constants LaTeX equation, LaTeX equation and LaTeX equation are returned by NumberOfCategoricalEntailments, FeatureCategoryCounts, and NumberOfResponseCategories, respectively. In addition, property OptimizationGoal signals that the performance function must be maximized if it evaluates to the constant Maximization, or that a minimization is requested if it evaluates to the constant Minimization.

To evaluate the objective function LaTeX equation at a specific argument, one can call method Performance(DoubleMatrix) passing the argument as a parameter. The objective function is expected to accept a row vector that is a valid representation of an argument.

Bibliography
[1] Rubinstein, R.Y. and Kroese, D.P., The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning, Springer, New York (2004).