AdaEnsemble is a library for building adaptive ensembles: a multi-armed bandit algorithm chooses among machine learning models and gradually learns which model is best, either overall or given a particular context. An adaptive ensemble consists of an Ensemble object that “contains” two or more Model objects and an equal number of Distribution objects, each representing the reward associated with one model. This document outlines the specifics of the different types of ensembles, models and rewards, and will be updated as the library expands.
The library “AdaEnsemble” contains three broad types of ensembles: “stackable” ensembles of types 1 and 2, and contextual ensembles.
All three types of ensembles are stackable: they can contain other ensembles as models. This results in a hierarchy of ensembles, each choosing (on selection by the layer above) a member model, which can be an ensemble in its own right or another kind of model. The only limitation is that “stackable” ensembles of types 1 and 2 cannot be mixed with contextual ensembles.
Ensemble type | Description |
---|---|
Stackable ensemble of type 1 | Model selection does not depend on data |
Stackable ensemble of type 2 | Model selection depends on data |
Contextual ensemble | Model selection depends on a context different from the data passed to the selected Model object |
The different elements of an adaptive ensemble (the Ensemble object, the Models and the Distributions that represent rewards) are tied together by type parameters. The five type parameters, not all of which apply to every object, are ModelID, Context, ModelData, ModelAction and AggregateReward.
These type parameters structure the library as a whole, and putting together an adaptive ensemble usually requires that they match for the different components.
Constraints on type parameters
Ensembles that use unconditional Thompson Sampling are constrained to employ an AggregateReward that is a BetaDistribution, and Ensembles that use conditional Thompson Sampling are constrained to use BayesianSampleRegressionDistribution.
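To illustrate why unconditional Thompson Sampling pairs naturally with a Beta reward distribution, here is a minimal, self-contained sketch in plain Scala. It is illustrative only: the `Beta`, `draw`, `update` and `selectAndUpdate` names below are hypothetical stand-ins, not AdaEnsemble's BetaDistribution API.

```scala
import scala.util.Random

// Sketch of unconditional Thompson Sampling over Beta reward distributions.
// Each model keeps a Beta(alpha, beta) posterior over its success probability.
final case class Beta(var alpha: Double, var beta: Double) {
  // Draw a Beta sample from two Gamma draws. The sum-of-exponentials trick
  // used here is valid only for integer shapes, which holds because alpha
  // and beta start at 1 and are incremented by 1 on each update.
  def draw(rng: Random): Double = {
    def gamma(shape: Double): Double =
      (1 to shape.toInt).map(_ => -math.log(rng.nextDouble())).sum
    val x = gamma(alpha)
    val y = gamma(beta)
    x / (x + y)
  }
  // Bernoulli reward: a success raises alpha, a failure raises beta.
  def update(reward: Boolean): Unit =
    if (reward) alpha += 1 else beta += 1
}

// Pick the model whose posterior sample is largest, observe its reward,
// and update that model's distribution. Returns the chosen model id.
def selectAndUpdate(models: Vector[Beta], observe: Int => Boolean, rng: Random): Int = {
  val chosen = models.indices.maxBy(i => models(i).draw(rng))
  models(chosen).update(observe(chosen))
  chosen
}
```

Over repeated rounds, the model with the higher true success rate accumulates a sharper, higher-mean posterior and is chosen increasingly often.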
So far, three kinds of models are available: static models, ONNX models and Bayesian regression models.
The most versatile of these are the ONNX models, as they make it possible to incorporate almost any machine learning model developed and trained in scikit-learn, TensorFlow, PyTorch and many other machine learning frameworks. The only downside is that ONNX does not allow continued training of the models: ONNX models are static.
Constraints on type parameters
Models impose constraints on the type parameter values that are available to an ensemble. They are summarised in the following table.
Model type | Constraints |
---|---|
Static models | No constraints |
Onnx models | In OnnxTensor.createTensor(env, data), data is of type ModelData and must be compatible with the createTensor function |
Bayesian regression models | ModelData must be of type Array[Double] and ModelAction of type Double |
Distribution is the type that any representation of the reward associated with a model must take. Such a representation is either a SimpleDistribution, if the reward does not depend on any data, or a ConditionalDistribution[A], if it does.
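To make the role of a SimpleDistribution concrete, here is a minimal running-mean reward tracker in plain Scala. The name `RunningMean` and its methods are a hypothetical stand-in in the spirit of MeanDouble, not the library's implementation: the ensemble reads the current estimate when selecting a model and updates it after observing a reward.

```scala
// A simple reward representation that does not depend on any data:
// it keeps a running mean of all rewards observed so far.
final class RunningMean {
  private var n = 0L
  private var mean = 0.0

  // The ensemble reads this value when comparing models.
  def draw: Double = mean

  // Incremental mean update: O(1) per observed reward, no stored history.
  def update(reward: Double): Unit = {
    n += 1
    mean += (reward - mean) / n
  }
}
```

A ConditionalDistribution differs in that both reading and updating additionally take a context argument, so the estimate varies with the data.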
The available implementations of SimpleDistribution are the following:
The available implementations of ConditionalDistribution are the following:
Constraints on type parameters
ConditionalDistribution imposes the constraint that its Context type parameter is a subtype of Array[Double]. For a contextual ensemble, the ensemble type parameter Context must thus be a subtype of Array[Double], while for a stackable ensemble of type 2, ModelData must be a subtype of Array[Double].
Here, for example, is the way to create an epsilon-greedy ensemble, which picks the model with the highest reward with probability (1 – epsilon) and randomly picks one of the others with probability epsilon. The ModelID type is Int, the ModelData type is Array[Double], the ModelAction type is Double and the reward distribution type is MeanDouble.
1. Create a static model, which always returns the same value:

```scala
val staticModel = new StaticModel[Int, Array[Double], MeanDouble](0.0)
```

2. Create a Bayesian regression model:

```scala
val regressionModel = new BayesianMeanRegressionModel[Int, MeanDouble](5, alpha=0.3, beta=5.0)
```

3. Create the rewards:

```scala
val rewards = List(new MeanDouble, new MeanDouble)
```

4. Create the ensemble:

```scala
val modelsMap = Map(0 -> staticModel, 1 -> regressionModel)
new GreedyEnsemble[Int, Array[Double], Double, MeanDouble](
  models=(i: Int) => modelsMap(i),
  modelKeys=() => List(0, 1),
  modelRewards=rewards,
  epsilon=0.3)
```
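The selection rule such an ensemble applies can be sketched in a few lines of plain Scala. This is an illustration of epsilon-greedy selection in general, not the internals of GreedyEnsemble; `epsilonGreedy` and its parameters are hypothetical names.

```scala
import scala.util.Random

// Epsilon-greedy selection: with probability 1 - epsilon pick the model
// with the highest estimated reward; otherwise pick one of the remaining
// models uniformly at random. Assumes at least one model is present.
def epsilonGreedy(rewards: Map[Int, Double], epsilon: Double, rng: Random): Int = {
  val best = rewards.maxBy(_._2)._1
  if (rng.nextDouble() >= epsilon) best
  else {
    val others = rewards.keys.filter(_ != best).toVector
    if (others.isEmpty) best else others(rng.nextInt(others.size))
  }
}
```

With epsilon = 0.3, the best-performing model is chosen roughly 70% of the time, while the remaining 30% of selections keep exploring the other models so that their reward estimates stay up to date.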
My hope is that the hierarchical combination of exploratory multi-armed bandits with fixed or slowly evolving neural networks and other machine learning algorithms has many applications.
Three I have thought of: