
PartitionOptimizationContext Class

Represents a Cross-Entropy context supporting the optimization of objective functions whose arguments are specific partitions of a given finite set of items.
Inheritance Hierarchy
Object
  CrossEntropyContext
    SystemPerformanceOptimizationContext
      PartitionOptimizationContext

Namespace:  Novacta.Analytics.Advanced
Assembly:  Novacta.Analytics (in Novacta.Analytics.dll) Version: 2.0.0
Syntax
public sealed class PartitionOptimizationContext : SystemPerformanceOptimizationContext

The PartitionOptimizationContext type exposes the following members.

Constructors
Public method PartitionOptimizationContext
Initializes a new instance of the PartitionOptimizationContext class, aimed to optimize the specified objective function with the given optimization goal, range of iterations, and probability smoothing coefficient.
Properties
Protected property EliteSampleDefinition
Gets the elite sample definition for this context.
(Inherited from SystemPerformanceOptimizationContext.)
Public property InitialParameter
Gets the parameter initially exploited to sample from the state-space of the system defined by this context.
(Inherited from CrossEntropyContext.)
Public property MaximumNumberOfIterations
Gets the maximum number of iterations allowed by this context.
(Inherited from SystemPerformanceOptimizationContext.)
Public property MinimumNumberOfIterations
Gets the minimum number of iterations required by this context.
(Inherited from SystemPerformanceOptimizationContext.)
Public property OptimizationGoal
Gets a constant specifying if the performance function in this context must be minimized or maximized.
(Inherited from SystemPerformanceOptimizationContext.)
Public property PartitionDimension
Gets the dimension of a partition represented by a system's state when a CrossEntropyProgram executes in this context.
Public property ProbabilitySmoothingCoefficient
Gets the coefficient that defines the smoothing scheme for the probabilities of the Cross-Entropy parameters exploited by this context.
Public property StateDimension
Gets or sets the dimension of a vector representing a system's state when a CrossEntropyProgram executes in this context.
(Inherited from CrossEntropyContext.)
Public property TraceExecution
Gets or sets a value indicating whether the execution of this context must be traced.
(Inherited from CrossEntropyContext.)
Methods
Public method Equals
Determines whether the specified object is equal to the current object.
(Inherited from Object.)
Protected method Finalize
Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection.
(Inherited from Object.)
Public method GetHashCode
Serves as the default hash function.
(Inherited from Object.)
Protected method GetOptimalState
Gets the argument that optimizes the objective function in this context, according to the specified Cross-Entropy sampling parameter.
(Overrides SystemPerformanceOptimizationContext.GetOptimalState(DoubleMatrix).)
Public method GetType
Gets the Type of the current instance.
(Inherited from Object.)
Protected method MemberwiseClone
Creates a shallow copy of the current Object.
(Inherited from Object.)
Protected method OnExecutedIteration (code example)
Called after completion of each iteration of a CrossEntropyProgram executing in this context.
(Overrides SystemPerformanceOptimizationContext.OnExecutedIteration(Int32, DoubleMatrix, LinkedList<Double>, LinkedList<DoubleMatrix>).)
Protected method PartialSample
Draws the specified subset of a sample from a distribution characterized by the given parameter, using the stated random number generator. Used when executing the sampling step of a CrossEntropyProgram running in this context.
(Overrides CrossEntropyContext.PartialSample(Double, Tuple<Int32, Int32>, RandomNumberGenerator, DoubleMatrix, Int32).)
Protected method Performance
Computes the objective function at a specified argument as the performance defined in this context.
(Overrides CrossEntropyContext.Performance(DoubleMatrix).)
Protected method SmoothParameter
Provides the smoothing of the updated sampling parameter of a SystemPerformanceOptimizer executing in this context.
(Overrides SystemPerformanceOptimizationContext.SmoothParameter(LinkedList<DoubleMatrix>).)
Protected method StopAtIntermediateIteration
Specifies conditions under which a SystemPerformanceOptimizer executing in this context should be considered as terminated after completing an intermediate iteration.
(Overrides SystemPerformanceOptimizationContext.StopAtIntermediateIteration(Int32, LinkedList<Double>, LinkedList<DoubleMatrix>).)
Protected method StopExecution
Specifies conditions under which a CrossEntropyProgram executing in this context should be considered as terminated.
(Inherited from SystemPerformanceOptimizationContext.)
Public method ToString
Returns a string that represents the current object.
(Inherited from Object.)
Protected method UpdateLevel
Updates the performance level for the current iteration of a CrossEntropyProgram executing in this context and determines the corresponding elite sample.
(Inherited from SystemPerformanceOptimizationContext.)
Protected method UpdateParameter
Updates the sampling parameter attending the generation of the sample in the next iteration of a CrossEntropyProgram executing in this context.
(Overrides CrossEntropyContext.UpdateParameter(LinkedList<DoubleMatrix>, DoubleMatrix).)
Remarks

Class PartitionOptimizationContext derives from SystemPerformanceOptimizationContext, and defines a Cross-Entropy context able to solve combinatorial problems aimed at identifying the optimal partition of a set according to a specified criterion.

Class SystemPerformanceOptimizationContext thoroughly defines a system whose performance must be optimized. Class PartitionOptimizationContext specializes the system by assuming that its performance, say H, has domain included in the family of those partitions into which the items of a given collection can be split.

The system's state-space X, i.e. the domain of H, can thus be represented as the Cartesian product of n copies of the set {0, ..., k-1}, where n is the number of available items, and k is the maximum number of parts allowed in the partitions under study. An argument x = (x_0, ..., x_{n-1}) defines a partition by signaling that the j-th item is included in the i-th part by setting x_j = i, with i in {0, ..., k-1}.

Notice that k represents a maximum number of parts: there may exist arguments in which, for a given part index i, no entry is equal to i, so that the corresponding partition has fewer than k non-empty parts.
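
For instance, the following minimal sketch (assuming the DoubleMatrix and IndexPartition types from the Novacta.Analytics namespace, also exploited in the Examples section below) shows how a state could encode a partition of 6 items into at most 3 parts; it is only an illustration, not part of the optimization workflow.

// Hypothetical illustration: n = 6 items, k = 3 maximum parts.
// Entry j of the state signals the part to which item j is assigned:
// items 0-1 go to part 0, items 2-3 to part 2, and items 4-5 to part 1.
var state = DoubleMatrix.Dense(1, 6, new double[] { 0, 0, 2, 2, 1, 1 });

// Group the item indexes by part label, as also done in the Examples
// section below.
var partition = IndexPartition.Create(state);
Console.WriteLine(partition);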

A Cross-Entropy optimizer is designed to identify the optimal arguments at which the performance function of a complex system reaches its minimum or maximum value. To get the optimal state, the system's state-space X, i.e. the domain of H, is traversed iteratively by sampling, at each iteration, from a specific density function, member of a parametric family

{ f(x; v) : v in V },

where x is a possible argument of H, and V is the set of allowable values for the parameter v. The parameter exploited at a given iteration t is referred to as the reference parameter of such iteration and indicated as w_t. A minimum number of iterations, say m, must be executed, while a number of them up to a maximum, say M, is allowed.

Implementing a context for optimizing on set partitions

The Cross-Entropy method provides an iterative multi-step procedure. In the context of combinatorial optimization, at each iteration t a sampling step is executed in order to generate diverse candidate arguments of the objective function, sampled from a distribution characterized by the reference parameter of the iteration, say w_{t-1}. Such a sample is then exploited in the updating step, in which a new reference parameter w_t is identified to modify the distribution from which the samples will be obtained in the next iteration: such modification is executed in order to improve the probability of sampling relevant arguments, i.e. those arguments corresponding to the function values of interest (see the documentation of class CrossEntropyProgram for a thorough discussion of the Cross-Entropy method).

When the Cross-Entropy method is applied in an optimization context, a final optimizing step is executed, in which the argument corresponding to the searched extremum is effectively identified.

These steps have been implemented as follows.

Sampling step

In a PartitionOptimizationContext, the parametric family { f(x; v) : v in V } is outlined as follows. Each component x_j of an argument x = (x_0, ..., x_{n-1}) of H is attached to an independent finite discrete distribution having parameter (p_{0,j}, ..., p_{k-1,j}), where, for i in {0, ..., k-1}, p_{i,j} is the probability of including the j-th item in the i-th part. The Cross-Entropy sampling parameter v can thus be represented as the k x n matrix

v = [ p_{0,0}     ...  p_{0,n-1}
      ...         ...  ...
      p_{k-1,0}   ...  p_{k-1,n-1} ].

The parametric space V should include a parameter under which all possible states must have a real chance of being selected: this parameter is specified as the initial reference parameter w_0. A PartitionOptimizationContext defines w_0 as a constant matrix whose entries are all equal to 1/k.
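
The following plain C# sketch is only illustrative and does not reproduce the library's internal PartialSample implementation: it shows how a single candidate argument could be drawn, component by component, under a given k x n probability matrix.

// Illustrative only: draw one candidate argument x = (x_0, ..., x_{n-1}),
// where x_j is sampled from the finite discrete distribution whose
// probabilities are stored in column j of the k x n matrix p.
static int[] SampleCandidate(double[,] p, Random random)
{
    int k = p.GetLength(0);
    int n = p.GetLength(1);
    var x = new int[n];
    for (int j = 0; j < n; j++)
    {
        double u = random.NextDouble();
        double cumulated = 0.0;
        x[j] = k - 1; // fallback guarding against rounding effects
        for (int i = 0; i < k; i++)
        {
            cumulated += p[i, j];
            if (u < cumulated)
            {
                x[j] = i;
                break;
            }
        }
    }
    return x;
}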

Updating step

At iteration t, let us represent the sample drawn as x_{t,0}, ..., x_{t,N-1}, where N is the Cross-Entropy sample size, and the l-th sample point x_{t,l} is the sequence (x_{t,l,0}, ..., x_{t,l,n-1}). The parameter's updating formula is, for i in {0, ..., k-1} and j in {0, ..., n-1},

w_{t,i,j} = ( sum_{l in E_t} I_i(x_{t,l,j}) ) / |E_t|,

where E_t is the elite sample in this context, i.e. the set of indexes of the sample points having the lowest performances observed during the t-th iteration, if minimizing, or the highest ones otherwise, while the I_i functions are the indicators of the sets {i}.
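
As a plain C# illustration of this formula (not the library's internal UpdateParameter implementation), the updated probabilities can be computed as relative frequencies over the elite sample:

// Illustrative only: given the elite sample points of iteration t
// (each an array of n part labels), entry (i, j) of the updated
// parameter is the fraction of elite points assigning item j to part i.
static double[,] UpdateParameter(int[][] elitePoints, int k, int n)
{
    var w = new double[k, n];
    foreach (var x in elitePoints)
    {
        for (int j = 0; j < n; j++)
        {
            w[x[j], j] += 1.0;
        }
    }
    for (int i = 0; i < k; i++)
    {
        for (int j = 0; j < n; j++)
        {
            w[i, j] /= elitePoints.Length;
        }
    }
    return w;
}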

Applying a smoothing scheme to updated parameters

In a PartitionOptimizationContext, the sampling parameter is smoothed by applying the following formula (see Rubinstein and Kroese, Remark 5.2, p. 189 [1]):

w_{t,i,j} <- alpha * w_{t,i,j} + (1 - alpha) * w_{t-1,i,j},

where alpha is the probability smoothing coefficient, with 0 < alpha < 1.
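
A corresponding element-wise sketch in plain C# (illustrative only, not the library's SmoothParameter implementation) could be:

// Illustrative only: element-wise smoothing of the updated parameter
// with the parameter exploited at the previous iteration; alpha is the
// probability smoothing coefficient, with 0 < alpha < 1.
static double[,] SmoothParameter(double[,] updated, double[,] previous, double alpha)
{
    int k = updated.GetLength(0);
    int n = updated.GetLength(1);
    var smoothed = new double[k, n];
    for (int i = 0; i < k; i++)
    {
        for (int j = 0; j < n; j++)
        {
            smoothed[i, j] = alpha * updated[i, j] + (1.0 - alpha) * previous[i, j];
        }
    }
    return smoothed;
}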

Optimizing step

The optimizing step is executed after the underlying Cross-Entropy program has converged. In a specified context, it is expected that, given a reference parameter w, a corresponding reasonable value could be guessed for the optimizing argument of H, say OA(w), with OA a function defined on V and valued in X. Function OA is defined by overriding method GetOptimalState(DoubleMatrix), which should return OA(w) given a specific reference parameter w.

Given the optimal parameter (the parameter corresponding to the last iteration, say T, executed by the algorithm before stopping),

w_T = [ w_{T,0,0}     ...  w_{T,0,n-1}
        ...           ...  ...
        w_{T,k-1,0}   ...  w_{T,k-1,n-1} ],

the argument at which the searched extremum is considered as reached according to the Cross-Entropy method will be returned as follows. For each j in {0, ..., n-1}, the probabilities w_{T,0,j}, ..., w_{T,k-1,j} are sorted in increasing order, say obtaining the following ordering:

w_{T,i_0,j} <= w_{T,i_1,j} <= ... <= w_{T,i_{k-1},j},

with the corresponding sequence of indexes (i_0, ..., i_{k-1}) being such that

{ i_0, ..., i_{k-1} } = { 0, ..., k-1 }.

Hence OA(w_T) will return

OA(w_T) = (x_0, ..., x_{n-1}),

where x_j is i_{k-1}, the index of the largest probability in the j-th column of w_T. This is equivalent to including, in the optimal partition, the j-th item in the part having the maximum probability of containing it given parameter w_T.
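
The per-column argmax just described can be sketched in plain C# as follows (illustrative only, not the library's GetOptimalState implementation):

// Illustrative only: the returned state assigns each item j to the part
// whose probability is the largest in column j of the optimal parameter.
static int[] GetOptimalState(double[,] w)
{
    int k = w.GetLength(0);
    int n = w.GetLength(1);
    var optimalState = new int[n];
    for (int j = 0; j < n; j++)
    {
        int argMax = 0;
        for (int i = 1; i < k; i++)
        {
            if (w[i, j] > w[argMax, j])
            {
                argMax = i;
            }
        }
        optimalState[j] = argMax;
    }
    return optimalState;
}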

Stopping criterion

A PartitionOptimizationContext never stops before the number of executed iterations reaches MinimumNumberOfIterations, and always stops when that number reaches MaximumNumberOfIterations.

For intermediate iterations, method StopAtIntermediateIteration(Int32, LinkedList<Double>, LinkedList<DoubleMatrix>) is called to check if a Cross-Entropy program executing in this context should stop or not.

In a PartitionOptimizationContext, the method analyzes the currently updated reference parameter, say

w_t = [ w_{t,0,0}     ...  w_{t,0,n-1}
        ...           ...  ...
        w_{t,k-1,0}   ...  w_{t,k-1,n-1} ],

as follows. Define the sequence c_{t,0}, ..., c_{t,n-1} such that, for j = 0, ..., n-1,

c_{t,j} = argmax_{i in {0, ..., k-1}} w_{t,i,j}.

If the condition

c_{t,j} = c_{t-1,j} = ... = c_{t-m+1,j}, for j = 0, ..., n-1,

can be verified, the method returns true; otherwise, false is returned. Equivalently, the algorithm converges if the indexes of the largest probabilities in each of the n columns of the reference parameter coincide for m consecutive iterations.
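
The check can be sketched in plain C# as follows (illustrative only, not the library's StopAtIntermediateIteration implementation), assuming that argMaxHistory[s][j] stores the index c_{s,j} computed at iteration s:

// Illustrative only: stop at intermediate iteration t (with t >= m - 1) if,
// for every item j, the index of the largest probability in column j
// has not changed during the last m iterations.
static bool StopAtIntermediateIteration(int[][] argMaxHistory, int m)
{
    int t = argMaxHistory.Length - 1;
    if (t + 1 < m)
    {
        return false;
    }
    int n = argMaxHistory[t].Length;
    for (int s = t - m + 1; s < t; s++)
    {
        for (int j = 0; j < n; j++)
        {
            if (argMaxHistory[s][j] != argMaxHistory[t][j])
            {
                return false;
            }
        }
    }
    return true;
}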

Instantiating a context for optimizing on set partitions

At instantiation, the constructor of a PartitionOptimizationContext object receives information about the optimization under study by means of parameters representing the objective function H, the number of items n, the maximum number of parts k, the extremes m and M of the allowed range of iterations, and a constant stating whether the optimization goal is a maximization or a minimization. In addition, the smoothing coefficient alpha is also passed to the constructor.

After construction, m and M can be inspected, respectively, via properties MinimumNumberOfIterations and MaximumNumberOfIterations. The smoothing coefficient alpha is also available via property ProbabilitySmoothingCoefficient. The constants n and k are returned by properties StateDimension and PartitionDimension, respectively. In addition, property OptimizationGoal signals that the performance function must be maximized if it evaluates to the constant Maximization, or that a minimization is requested if it evaluates to the constant Minimization.

To evaluate the objective function H at a specific argument, one can call method Performance(DoubleMatrix), passing the argument as a parameter. It is expected that the objective function will accept a row vector having entries included in {0, ..., k-1} as a valid representation of an argument.

Examples

In the following example, an optimal partition of 12 items is discovered given an artificial data set regarding the items under study.

The optimality criterion is defined as the minimization of the Davies-Bouldin Index.

Discovering the optimal partition of a data set by Davies-Bouldin Index minimization.
using System;
using Novacta.Analytics.Advanced;

namespace Novacta.Analytics.CodeExamples.Advanced
{
    public class PartitionOptimizationContextExample0  
    {
        public void Main()
        {
            // Set the number of items and features under study.
            const int numberOfItems = 12;
            int numberOfFeatures = 7;

            // Create a matrix that will represent
            // an artificial data set,
            // having 12 items (rows) and 7 features (columns).
            // This will store the observations which
            // partition discovery will be based on.
            var data = DoubleMatrix.Dense(
                numberOfRows: numberOfItems,
                numberOfColumns: numberOfFeatures);

            // Fill the data rows by sampling from a different 
            // distribution while, respectively, drawing observations 
            // for items 0 to 3, 4 to 7, and 8 to 11: these will be the 
            // three different parts expected to be included in the 
            // optimal partition.
            double mu = 1.0;
            var g = new GaussianDistribution(mu: mu, sigma: .01);

            IndexCollection range = IndexCollection.Range(0, 3);
            for (int j = 0; j < numberOfFeatures; j++)
            {
                data[range, j] = g.Sample(sampleSize: range.Count);
            }

            mu += 5.0;
            g.Mu = mu;
            range = IndexCollection.Range(4, 7);
            for (int j = 0; j < numberOfFeatures; j++)
            {
                data[range, j] = g.Sample(sampleSize: range.Count);
            }

            mu += 5.0;
            g.Mu = mu;
            range = IndexCollection.Range(8, 11);
            for (int j = 0; j < numberOfFeatures; j++)
            {
                data[range, j] = g.Sample(sampleSize: range.Count);
            }

            Console.WriteLine("The data set:");
            Console.WriteLine(data);

            // Define the optimization problem as
            // the minimization of the Davies-Bouldin Index
            // of a candidate partition.
            double objectiveFunction(DoubleMatrix x)
            {
                // An argument x has 12 entries, each belonging
                // to the set {0,...,k-1}, where k is the
                // maximum number of allowed parts, so
                // x[j]==i signals that, at x, item j
                // has been assigned to part i.
                IndexPartition<double> selected =
                    IndexPartition.Create(x);

                var performance = IndexPartition.DaviesBouldinIndex(
                    data: data,
                    partition: selected);

                return performance;
            }

            var optimizationGoal = OptimizationGoal.Minimization;

            // Define the maximum number of parts allowed in the
            // partition to be discovered.
            int maximumNumberOfParts = 3;

            // Create the required context.
            var context = new PartitionOptimizationContext(
                objectiveFunction: objectiveFunction,
                stateDimension: numberOfItems,
                partitionDimension: maximumNumberOfParts,
                probabilitySmoothingCoefficient: .8,
                optimizationGoal: optimizationGoal,
                minimumNumberOfIterations: 3,
                maximumNumberOfIterations: 1000);

            // Create the optimizer.
            var optimizer = new SystemPerformanceOptimizer()
            {
                PerformanceEvaluationParallelOptions = { MaxDegreeOfParallelism = -1 },
                SampleGenerationParallelOptions = { MaxDegreeOfParallelism = -1 }
            };

            // Set optimization parameters.
            double rarity = 0.01;
            int sampleSize = 2000;

            // Solve the problem.
            var results = optimizer.Optimize(
                context,
                rarity,
                sampleSize);

            IndexPartition<double> optimalPartition =
                IndexPartition.Create(results.OptimalState);

            // Show the results.
            Console.WriteLine(
                "The Cross-Entropy optimizer has converged: {0}.",
                results.HasConverged);

            Console.WriteLine();
            Console.WriteLine("Initial guess parameter:");
            Console.WriteLine(context.InitialParameter);

            Console.WriteLine();
            Console.WriteLine("The minimizer of the performance is:");
            Console.WriteLine(results.OptimalState);

            Console.WriteLine();
            Console.WriteLine(
                "The optimal partition is:");
            Console.WriteLine(optimalPartition);

            Console.WriteLine();
            Console.WriteLine("The minimum performance is:");
            Console.WriteLine(results.OptimalPerformance);

            Console.WriteLine();
            Console.WriteLine("The Dunn Index for the optimal partition is:");
            var di = IndexPartition.DunnIndex(
                data,
                optimalPartition);
            Console.WriteLine(di);
        }
    }
}

// Executing method Main() produces the following output:
// 
// The data set:
// 1.00443413       1.00220674       0.998272394      1.00269053       0.996789222      1.00089097       1.00413588       
// 0.997933228      0.993625618      1.01576579       1.02088407       1.00505243       1.00840849       0.996769171      
// 1.01157148       0.993373518      1.01292534       1.00980881       0.985070715      0.999051426      1.00490173       
// 0.984314579      0.978064595      0.991518637      0.992424337      1.01228209       0.995401166      0.990271674      
// 5.99388101       5.98782501       5.99234977       6.00720306       5.99391035       5.99483769       6.01287673       
// 5.9959073        6.00295384       5.99060985       6.00210944       5.9964148        6.00144991       5.9847536        
// 6.00236976       6.00235032       6.00327886       6.01032821       5.98648754       6.0157819        5.98911535       
// 6.01095691       5.98513671       5.99782074       5.989109         6.0223965        6.01074213       6.00275269       
// 10.9874226       11.0066379       11.0105315       10.9944047       10.9989296       11.009567        10.9938497       
// 11.0044736       10.9993181       11.0126423       10.9974367       10.9946015       10.9885624       11.0013196       
// 10.9766909       10.9908217       11.012074        10.9924937       10.9886034       11.007384        10.9891406       
// 10.9974049       10.9925417       10.9969562       11.0226839       10.9973201       11.007219        11.005672        
// 
// 
// The Cross-Entropy optimizer has converged: True.
// 
// Initial guess parameter:
// 0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      
// 0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      
// 0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      0.333333333      
// 
// 
// 
// The minimizer of the performance is:
// 0                0                0                0                2                2                2                2                1                1                1                1                
// 
// 
// 
// The optimal partition is:
// [(0),  0, 1, 2, 3]
// [(1),  8, 9, 10, 11]
// [(2),  4, 5, 6, 7]
// 
// 
// The minimum performance is:
// 0.0034476806919971383
// 
// The Dunn Index for the optimal partition is:
// 253.85812808428824

Bibliography
[1] Rubinstein, R.Y. and Kroese, D.P. (2004). The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning. Springer, New York.