General Instructions

To use PyXAB, follow the instructions below. The domain and the algorithm must be defined before the optimization starts. The hierarchical partition is optional; the binary partition normally works well. The objective must be able to evaluate each point the algorithm pulls and return the objective value at that point.

Domain

A continuous domain must be written as a list of lists, with one [lower, upper] pair per dimension. For example, if the parameter range is [0.01, 1], then the domain should be written as

domain = [[0.01, 1]]

If the parameter has two dimensions, say [-1, 1] x [2, 10], then the domain should be written as

domain = [[-1, 1], [2, 10]]
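
A common mistake is to pass a flat list such as [0.01, 1] instead of [[0.01, 1]]. The following is a minimal sketch of a sanity check for the list-of-lists layout described above. The helper name validate_domain is hypothetical and is not part of PyXAB.

```python
def validate_domain(domain):
    """Check that `domain` is a list of [low, high] pairs with low < high.

    Hypothetical helper for illustration only; not part of PyXAB.
    Returns the number of dimensions.
    """
    if not isinstance(domain, list) or len(domain) == 0:
        raise ValueError("domain must be a non-empty list of [low, high] pairs")
    for i, bounds in enumerate(domain):
        if not isinstance(bounds, (list, tuple)) or len(bounds) != 2:
            raise ValueError(f"dimension {i}: expected [low, high], got {bounds!r}")
        low, high = bounds
        if not low < high:
            raise ValueError(f"dimension {i}: need low < high, got {bounds!r}")
    return len(domain)


dim_1d = validate_domain([[0.01, 1]])          # one-dimensional domain
dim_2d = validate_domain([[-1, 1], [2, 10]])   # two-dimensional domain
```

Calling validate_domain([0.01, 1]) raises a ValueError, because each entry of the outer list must itself be a pair.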

(Optional) Partition

The hierarchical partition is a core part of many X-armed bandit algorithms. It hierarchically discretizes the infinite parameter space into a finite number of arms in each layer, so that finite-armed bandit algorithm designs can be utilized.

However, designing a partition is optional: users do not need to write one to run experiments. PyXAB provides several partition designs for users to choose from, e.g., a standard binary partition would be

from PyXAB.partition.BinaryPartition import BinaryPartition
partition = BinaryPartition

By default, every algorithm uses the standard binary partition if no partition is specified.
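
To make the idea of a hierarchical partition concrete, here is a minimal sketch of how a binary partition can split a domain into 2^h cells at layer h. It only illustrates the concept and is not PyXAB's implementation. The functions binary_split and build_layers are hypothetical, and splitting along the widest dimension is an assumption made for this example.

```python
def binary_split(cell):
    """Split a hyperrectangle into two halves along its widest dimension.

    `cell` is a list of [low, high] pairs, using the same layout as `domain`.
    Illustrative only; PyXAB's partition classes may choose the split differently.
    """
    widths = [high - low for low, high in cell]
    d = widths.index(max(widths))  # first dimension with the largest width
    low, high = cell[d]
    mid = (low + high) / 2
    left = [list(bounds) for bounds in cell]
    right = [list(bounds) for bounds in cell]
    left[d] = [low, mid]
    right[d] = [mid, high]
    return left, right


def build_layers(domain, depth):
    """Return a list where layers[h] holds the 2**h cells at depth h."""
    layers = [[domain]]
    for _ in range(depth):
        next_layer = []
        for cell in layers[-1]:
            next_layer.extend(binary_split(cell))
        layers.append(next_layer)
    return layers


layers = build_layers([[-1, 1], [2, 10]], depth=2)
for h, cells in enumerate(layers):
    print(h, cells)
```

Each cell in a layer plays the role of one arm, which is how an infinite domain is reduced to finitely many arms per layer.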

(User Defined) Objective

Note

The objective function f should be bounded by -1 and 1 for the best performance of most algorithms, i.e., -1 <= f(x) <= 1
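
If an objective naturally takes values outside this range and its bounds are known, one option is to rescale it linearly. The sketch below assumes the user knows a lower bound f_min and an upper bound f_max. The wrapper rescale_to_unit and the example function raw are hypothetical and are not part of PyXAB.

```python
import math


def rescale_to_unit(f, f_min, f_max):
    """Wrap `f` so that values in [f_min, f_max] map linearly onto [-1, 1]."""
    if not f_min < f_max:
        raise ValueError("need f_min < f_max")

    def g(x):
        return 2.0 * (f(x) - f_min) / (f_max - f_min) - 1.0

    return g


# Example objective: f(x) = 5 + 3 * sin(x[0]) takes values in [2, 8].
def raw(x):
    return 5.0 + 3.0 * math.sin(x[0])


scaled = rescale_to_unit(raw, f_min=2.0, f_max=8.0)
print(scaled([0.0]))
```

If the objective also exposes a maximum reward, that value would need the same linear transformation so that it stays consistent with the rescaled rewards.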

Note

You do not have to define the objective function in the following way, but we recommend doing so for consistency. As long as the objective function returns a reward for each point pulled by the algorithm, the optimization process can run.

The objective class has an attribute fmax, which is the maximum reward obtainable. It should also have a method f(x), which returns the reward at the point x. The following simple example illustrates this.

from PyXAB.synthetic_obj.Objective import Objective
import numpy as np

# The sine function f(x) = sin(x)
class Sine(Objective):
    def __init__(self):
        self.fmax = 1

    def f(self, x):
        x = np.array(x)
        return np.sin(x)

Algorithm

Note

The point returned by the algorithm is a list, so make sure your objective can handle this data type. For example, if the algorithm wants the objective value at the point x = 0.8, it returns [0.8]. If it wants the objective value at x = (0, 0.5), it returns [0, 0.5].
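
As a minimal sketch of what handling a list means in practice, the functions below unpack the point before evaluating it and return a plain float. The names sine_1d and mean_sine_2d are hypothetical examples and are not PyXAB objectives.

```python
import math


def sine_1d(point):
    """Objective on a 1-D domain: `point` arrives as a one-element list."""
    (x,) = point  # unpacks [0.8] into 0.8; raises if the dimension is wrong
    return math.sin(x)


def mean_sine_2d(point):
    """Objective on a 2-D domain: `point` arrives as a two-element list."""
    x1, x2 = point
    return 0.5 * (math.sin(x1) + math.sin(x2))  # stays within [-1, 1]


r1 = sine_1d([0.8])
r2 = mean_sine_2d([0, 0.5])
```

Unpacking explicitly makes a dimension mismatch between the domain and the objective fail immediately rather than silently produce a reward of the wrong shape.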

Every algorithm has one function named pull, which outputs a point for evaluation, and another function named receive_reward, which takes the feedback. Therefore, in the online learning process, the following lines of code should be used.

from PyXAB.algos.HOO import T_HOO
T = 1000
algo = T_HOO(rounds=T, domain=domain, partition=partition)
target = Sine()

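# either a for-loop or a while-loop works
for t in range(1, T+1):
    point = algo.pull(t)                                     # point to evaluate, returned as a list
    reward = target.f(point) + np.random.uniform(-0.1, 0.1)  # noisy evaluation of the objective
    algo.receive_reward(t, reward)                           # feed the reward back to the algorithm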

Note

If the objective function is not defined by inheriting the PyXAB.synthetic_obj.Objective.Objective class, simply replace the second-to-last line in the above snippet (the one that computes reward) with your own evaluation of the objective.
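
The sketch below shows the same loop with a plain Python function as the objective. So that it runs without PyXAB installed, it uses a hypothetical stand-in class, UniformSearch, that only mimics the pull / receive_reward interface by sampling points uniformly at random. It is not an X-armed bandit algorithm and is not part of PyXAB. The function my_objective is likewise an assumption made for illustration.

```python
import math
import random


class UniformSearch:
    """Hypothetical stand-in with the same pull / receive_reward interface.

    Samples points uniformly from the domain and remembers the best reward seen.
    """

    def __init__(self, domain, seed=0):
        self.domain = domain
        self.rng = random.Random(seed)
        self.last_point = None
        self.best_point = None
        self.best_reward = -math.inf

    def pull(self, time):
        # Return the next point to evaluate, as a list with one entry per dimension.
        self.last_point = [self.rng.uniform(low, high) for low, high in self.domain]
        return self.last_point

    def receive_reward(self, time, reward):
        # Record the feedback for the most recently pulled point.
        if reward > self.best_reward:
            self.best_reward = reward
            self.best_point = self.last_point


def my_objective(point):
    # Plain function instead of an Objective subclass; `point` is a list.
    return math.sin(point[0])


domain = [[0.01, 1]]
T = 1000
algo = UniformSearch(domain)
noise = random.Random(1)

for t in range(1, T + 1):
    point = algo.pull(t)
    reward = my_objective(point) + noise.uniform(-0.1, 0.1)  # the only line that changes
    algo.receive_reward(t, reward)

print(algo.best_point, algo.best_reward)
```

With a real PyXAB algorithm, only the construction of algo would differ; the body of the loop keeps the same three steps.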