X-Armed Bandit Algorithms¶

API reference for the X-armed bandit algorithms. For the general algorithm class, see the API Cheatsheet.

Zooming Algorithm¶

class PyXAB.algos.Zooming.Zooming(nu=1, rho=0.9, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Bases: PyXAB.algos.Algo.Algorithm

The implementation of the Zooming algorithm

__init__(nu=1, rho=0.9, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Initialization of the Zooming algorithm

Parameters

nu (float) – smoothness parameter nu of the Zooming algorithm
rho (float) – smoothness parameter rho of the Zooming algorithm
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm

get_last_point()¶

The function to get the last point of Zooming

Returns: chosen_point – The point chosen by the algorithm
Return type: list

make_active(node)¶

The function to make a node (an arm in the node) active

Parameters: node – the node to be made active

pull(time)¶

The pull function of Zooming that returns a point in every round

Parameters: time (int) – time stamp parameter
Returns: point – the point to be evaluated
Return type: list

receive_reward(time, reward)¶

The receive_reward function of Zooming to obtain the reward and update the statistics, then expand the active arms

Parameters

time (int) – time stamp parameter
reward (float) – the reward of the evaluation
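The pull and receive_reward methods above are meant to be called alternately, once per round. The following is a minimal sketch of that loop. To stay self-contained it uses a hypothetical stand-in class, UniformRandomSearch, which is not part of PyXAB and only mimics the three method names documented here. It also assumes the domain is a list of [low, high] pairs, one per dimension.

```python
import random


class UniformRandomSearch:
    """Hypothetical stand-in with the same method names as the classes above."""

    def __init__(self, domain, seed=0):
        self.domain = domain  # assumed: one [low, high] pair per dimension
        self.rng = random.Random(seed)
        self.last_point = None
        self.best_point, self.best_reward = None, float("-inf")

    def pull(self, time):
        # Draw one coordinate per [low, high] pair of the domain.
        self.last_point = [self.rng.uniform(lo, hi) for lo, hi in self.domain]
        return self.last_point

    def receive_reward(self, time, reward):
        # Keep the best point seen so far.
        if reward > self.best_reward:
            self.best_point, self.best_reward = self.last_point, reward

    def get_last_point(self):
        return self.best_point


def run(algo, objective, rounds):
    """Alternate pull and receive_reward for the given number of rounds."""
    for t in range(1, rounds + 1):
        point = algo.pull(t)
        algo.receive_reward(t, objective(point))
    return algo.get_last_point()


def objective(point):
    # Toy objective with its maximum at x = 0.3.
    return -(point[0] - 0.3) ** 2


best = run(UniformRandomSearch(domain=[[0.0, 1.0]]), objective, rounds=200)
```

Any object exposing the same three methods could be passed to run in place of the stand-in.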

Truncated HOO Algorithm¶

class PyXAB.algos.HOO.HOO_node(depth, index, parent, domain)¶

Bases: PyXAB.partition.Node.P_node

Implementation of the HOO_node

__init__(depth, index, parent, domain)¶

Initialization of the HOO node

Parameters

depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents

compute_u_value(nu, rho, rounds)¶

The function to compute the u_{h,i} value of the node

Parameters

nu (float) – parameter nu in the HOO algorithm
rho (float) – parameter rho in the HOO algorithm
rounds (int) – the number of rounds in the HOO algorithm
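As an illustration of what the u_{h,i} value measures, the sketch below follows the upper confidence bound from the HOO literature: the empirical mean, plus a confidence width, plus the smoothness term nu * rho^h. The function name and its explicit arguments are hypothetical, and the library's own computation may differ in detail.

```python
import math


def hoo_u_value(mean_reward, visited_times, depth, nu, rho, rounds):
    """u = mean + sqrt(2 ln(rounds) / visits) + nu * rho**depth."""
    if visited_times == 0:
        # Unvisited nodes are treated as maximally optimistic.
        return math.inf
    confidence = math.sqrt(2.0 * math.log(rounds) / visited_times)
    return mean_reward + confidence + nu * rho ** depth
```

Deeper nodes get a smaller smoothness term, and frequently visited nodes get a smaller confidence width.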

get_b_value()¶: The function to get the b_{h,i} value of the node

get_children()¶: The function to get the children of the node

get_cpoint()¶: The function to get the center point of the domain

get_depth()¶: The function to get the depth of the node

get_domain()¶: The function to get the domain of the node

get_index()¶: The function to get the index of the node

get_mean_reward()¶: The function to get the mean reward of the node

get_parent()¶: The function to get the parent of the node

get_u_value()¶: The function to get the u_{h,i} value of the node

get_visited_times()¶: The function to get the number of visited times of the node

update_b_value(b_value)¶

The function to update the b_{h,i} value of the node

Parameters: b_value (float) – The new b_{h,i} value to be updated

update_children(children)¶

The function to update the children of the node

Parameters: children – The children nodes to be updated

update_reward(reward)¶

The function to update the reward list of the node

Parameters: reward (float) – the reward for evaluating the node
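The getters and update methods above can be pictured with a small stand-alone node. This is a hypothetical sketch, not the P_node implementation. It assumes that the domain is a list of [low, high] pairs, that the center point is the per-dimension midpoint, and that the mean reward is 0.0 before any reward arrives.

```python
class SketchNode:
    """Hypothetical node; not the PyXAB P_node implementation."""

    def __init__(self, depth, index, parent, domain):
        self.depth = depth
        self.index = index
        self.parent = parent
        self.domain = domain  # assumed: one [low, high] pair per dimension
        self.rewards = []

    def get_cpoint(self):
        # Midpoint of every [low, high] interval.
        return [(low + high) / 2.0 for low, high in self.domain]

    def update_reward(self, reward):
        self.rewards.append(reward)

    def get_visited_times(self):
        return len(self.rewards)

    def get_mean_reward(self):
        # Assumed convention: 0.0 before the first reward arrives.
        if not self.rewards:
            return 0.0
        return sum(self.rewards) / len(self.rewards)


node = SketchNode(depth=0, index=1, parent=None, domain=[[0.0, 1.0], [2.0, 4.0]])
node.update_reward(0.5)
node.update_reward(1.5)
```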

class PyXAB.algos.HOO.T_HOO(nu=1, rho=0.5, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Bases: PyXAB.algos.Algo.Algorithm

Implementation of the T_HOO algorithm

__init__(nu=1, rho=0.5, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Initialization of the T_HOO algorithm

Parameters

nu (float) – parameter nu of the T_HOO algorithm
rho (float) – parameter rho of the T_HOO algorithm
rounds (int) – total number of rounds
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm

expand(parent)¶

The function to expand the tree after pulling the parent node

Parameters: parent – The parent node to be expanded
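With a binary partition, expanding a node means splitting its domain into two child domains. The sketch below shows one plausible rule: halve the domain along a dimension chosen by cycling through the dimensions with depth. Both the rule and the binary_split helper are assumptions made for illustration. The actual BinaryPartition class may choose the split differently.

```python
def binary_split(domain, depth):
    """Split a domain in half along one dimension and return both halves."""
    dim = depth % len(domain)  # assumed rule: cycle through the dimensions
    low, high = domain[dim]
    mid = (low + high) / 2.0
    # Copy the intervals so the parent domain is left untouched.
    left = [list(interval) for interval in domain]
    right = [list(interval) for interval in domain]
    left[dim] = [low, mid]
    right[dim] = [mid, high]
    return left, right
```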

get_last_point()¶

The function to get the last point of T_HOO

Returns: chosen_point – The point chosen by the algorithm
Return type: list

optTraverse()¶

The function to traverse the exploration tree to find the best path and the best node to pull at this moment.

Returns

curr_node (Node) – The last node selected by the algorithm
path (List of Node) – The best path to traverse the partition selected by the algorithm
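The traversal can be sketched as a greedy descent: starting at the root, repeatedly move to the child with the largest b-value until a leaf is reached. The SketchNode class and opt_traverse function below are hypothetical stand-ins. Ties are broken by list order here, which is an assumption.

```python
class SketchNode:
    """Hypothetical tree node holding only a name, a b-value, and children."""

    def __init__(self, name, b_value, children=None):
        self.name = name
        self.b_value = b_value
        self.children = children or []


def opt_traverse(root):
    """Follow the child with the largest b-value down to a leaf."""
    path = [root]
    curr_node = root
    while curr_node.children:
        curr_node = max(curr_node.children, key=lambda child: child.b_value)
        path.append(curr_node)
    return curr_node, path


leaf_a = SketchNode("a", 0.4)
leaf_b = SketchNode("b", 0.9)
left = SketchNode("left", 0.7, [leaf_a, leaf_b])
right = SketchNode("right", 0.6)
root = SketchNode("root", 1.0, [left, right])
curr_node, path = opt_traverse(root)
```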

pull(time)¶

The pull function of T_HOO that returns a point in every round

Parameters: time (int) – time stamp parameter
Returns: point – the point to be evaluated
Return type: list

receive_reward(time, reward)¶

The receive_reward function of T_HOO to obtain the reward and update the Statistics

Parameters

time (int) – time stamp parameter
reward (float) – the reward of the evaluation

updateAllTree(path, reward)¶

The function to update everything in the tree

Parameters

path (list) – the path from the root to the chosen node
reward (float) – the reward to update

updateBackwardTree()¶: The function to update all the b_{h,i} value backwards in the tree
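In the HOO literature, the backward pass sets each b-value to the minimum of the node's own u-value and the largest b-value among its children, with unexplored children counting as +infinity. The sketch below applies that rule to a tree stored in plain dictionaries. The data layout and the function name are assumptions for illustration only.

```python
import math


def update_b_values(node, u_values, children, b_values):
    """Fill b_values bottom-up: b = min(u, max of the children's b)."""
    kids = children.get(node, [])
    # A leaf has no explored children, so its children count as +inf.
    best_child = max(
        (update_b_values(kid, u_values, children, b_values) for kid in kids),
        default=math.inf,
    )
    b_values[node] = min(u_values[node], best_child)
    return b_values[node]


u_values = {"root": 2.0, "left": 1.5, "right": 1.0}
children = {"root": ["left", "right"]}
b_values = {}
update_b_values("root", u_values, children, b_values)
```

Here the root's b-value is capped by its best child, so it drops from 2.0 to 1.5.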

updateRewardTree(path, reward)¶

The function to update the reward of each node in the path

Parameters

path (list) – the path to find the best node
reward (float) – the reward to update

updateUvalueTree()¶: The function to update the u_{h,i} value in the whole tree

DOO Algorithm¶

class PyXAB.algos.DOO.DOO_node(depth, index, parent, domain)¶

Bases: PyXAB.partition.Node.P_node

Implementation of the node in the DOO algorithm

__init__(depth, index, parent, domain)¶

Initialization of the DOO node

Parameters

depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents

compute_b_value(delta)¶

The function to compute the b_{h,i} value of the node

Parameters: delta (float) – The delta value in the b_{h,i} term, which depends on the depth of the node (Munos, 2011)
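In DOO as described by Munos (2011), the b-value of a cell is its observed reward plus a depth-dependent term delta(h). A minimal sketch follows. The example_delta function is a hypothetical choice made for illustration and is not the library's default delta_init.

```python
def doo_b_value(reward, depth, delta):
    """b = reward at the cell's point + delta evaluated at the cell's depth."""
    return reward + delta(depth)


def example_delta(h):
    # Hypothetical choice for illustration; not the library default.
    return 0.5 ** h


# Each leaf is (name, reward, depth); DOO would expand the largest b-value.
leaves = [("a", 0.30, 1), ("b", 0.20, 1), ("c", 0.35, 3)]
best = max(leaves, key=lambda leaf: doo_b_value(leaf[1], leaf[2], example_delta))
```

Leaf c has the highest reward, but its deeper cell carries a smaller bonus, so leaf a is selected.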

get_b_value()¶: The function to get the b_{h,i} value of the node

get_children()¶: The function to get the children of the node

get_cpoint()¶: The function to get the center point of the domain

get_depth()¶: The function to get the depth of the node

get_domain()¶: The function to get the domain of the node

get_index()¶: The function to get the index of the node

get_parent()¶: The function to get the parent of the node

get_reward()¶: The function to get the reward of the node

update_children(children)¶

The function to update the children of the node

Parameters: children – The children nodes to be updated

update_reward(reward)¶

The function to update the reward of the node

Parameters: reward (float) – the reward for evaluating the node

visit()¶: The function to visit the node

class PyXAB.algos.DOO.DOO(n=100, delta=None, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Bases: PyXAB.algos.Algo.Algorithm

The implementation of the DOO algorithm (Munos, 2011)

__init__(n=100, delta=None, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

The initialization of the DOO algorithm

Parameters

n (int) – The total number of rounds (budget)
delta (function) – The function to compute the delta value for each depth
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm

delta_init(h)¶

The default delta function used in the algorithm (Munos, 2011)

Parameters: h (int) – The depth parameter
Returns: max_value – The delta value in that depth
Return type: float

get_last_point()¶

The function to get the last point in DOO

Returns: point – The final point output by the DOO algorithm
Return type: list

pull(time)¶

The pull function of DOO that returns a point in every round

Parameters: time (int) – time stamp parameter
Returns: point – the point to be evaluated
Return type: list

receive_reward(time, reward)¶

The receive_reward function of DOO to obtain the reward and update Statistics

Parameters

time (int) – The time stamp parameter
reward (float) – The reward of the evaluation

SOO Algorithm¶

class PyXAB.algos.SOO.SOO_node(depth, index, parent, domain)¶

Bases: PyXAB.partition.Node.P_node

Implementation of the node in the SOO algorithm

__init__(depth, index, parent, domain)¶

Initialization of the SOO node

Parameters

depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents

get_children()¶: The function to get the children of the node

get_cpoint()¶: The function to get the center point of the domain

get_depth()¶: The function to get the depth of the node

get_domain()¶: The function to get the domain of the node

get_index()¶: The function to get the index of the node

get_parent()¶: The function to get the parent of the node

get_reward()¶: The function to get the reward of the node

update_children(children)¶

The function to update the children of the node

Parameters: children – The children nodes to be updated

update_reward(reward)¶

The function to update the reward of the node

Parameters: reward (float) – the reward for evaluating the node

visit()¶: The function to visit the node

class PyXAB.algos.SOO.SOO(n=100, h_max=100, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Bases: PyXAB.algos.Algo.Algorithm

The implementation of the SOO algorithm (Munos, 2011)

__init__(n=100, h_max=100, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

The initialization of the SOO algorithm

Parameters

n (int) – The total number of rounds (budget)
h_max (int) – The largest searching depth
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm

get_last_point()¶

The function to get the last point in SOO

Returns: point – The final point output by the SOO algorithm
Return type: list

pull(time)¶

The pull function of SOO that returns a point in every round

Parameters: time (int) – time stamp parameter
Returns: point – the point to be evaluated
Return type: list

receive_reward(time, reward)¶

The receive_reward function of SOO to obtain the reward and update Statistics (for current node)

Parameters

time (int) – The time stamp parameter
reward (float) – The reward of the evaluation
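For intuition about h_max, the selection rule of SOO as described by Munos (2011) can be sketched as one sweep over depths: at each depth up to h_max, take the leaf with the largest value and expand it only if that value is at least as large as every value selected at shallower depths in the same sweep. The data layout and the soo_sweep helper are hypothetical, and the library's bookkeeping may differ.

```python
def soo_sweep(leaves_by_depth, h_max):
    """Return the names of the leaves that one sweep would expand."""
    v_max = float("-inf")
    selected = []
    for depth in sorted(leaves_by_depth):
        if depth > h_max:
            break
        leaves = leaves_by_depth[depth]
        if not leaves:
            continue
        # Leaves are (value, name) pairs, so max picks the largest value.
        value, name = max(leaves)
        if value >= v_max:
            selected.append(name)
            v_max = value
    return selected


leaves_by_depth = {0: [(0.2, "a")], 1: [(0.1, "b"), (0.5, "c")], 2: [(0.3, "d")]}
expanded = soo_sweep(leaves_by_depth, h_max=100)
```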

StoSOO Algorithm¶

class PyXAB.algos.StoSOO.StoSOO_node(depth, index, parent, domain)¶

Bases: PyXAB.partition.Node.P_node

Implementation of the node in the StoSOO algorithm

__init__(depth, index, parent, domain)¶

Initialization of the StoSOO node

Parameters

depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents

compute_b_value(n, k, delta)¶

The function to compute the b_{h,i} value of the node

Parameters

n (int) – The total number of rounds (budget)
k (int) – The maximum number of pulls per node
delta (float) – The confidence parameter
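The b-value in StoSOO (Valko et al., 2013) adds a confidence width to the empirical mean, and the width depends on n, k, and delta. The sketch below writes that bound out as a stand-alone function. The function name and the explicit mean_reward and visited_times arguments are assumptions, since the documented method reads those statistics from the node itself.

```python
import math


def stosoo_b_value(mean_reward, visited_times, n, k, delta):
    """b = mean + sqrt(log(n * k / delta) / (2 * visits))."""
    if visited_times == 0:
        # Unvisited nodes are treated as maximally optimistic.
        return math.inf
    width = math.sqrt(math.log(n * k / delta) / (2.0 * visited_times))
    return mean_reward + width
```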

get_b_value()¶: The function to get the b_{h,i} value of the node

get_children()¶: The function to get the children of the node

get_cpoint()¶: The function to get the center point of the domain

get_depth()¶: The function to get the depth of the node

get_domain()¶: The function to get the domain of the node

get_index()¶: The function to get the index of the node

get_mean_reward()¶: The function to get the mean reward of the node

get_parent()¶: The function to get the parent of the node

get_visited_times()¶: The function to get the number of visited times of the node

update_children(children)¶

The function to update the children of the node

Parameters: children – The children nodes to be updated

update_reward(reward)¶

The function to update the reward list of the node

Parameters: reward (float) – the reward for evaluating the node

class PyXAB.algos.StoSOO.StoSOO(n=100, k=None, h_max=100, delta=None, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Bases: PyXAB.algos.Algo.Algorithm

The implementation of the StoSOO algorithm (Valko et al., 2013)

__init__(n=100, k=None, h_max=100, delta=None, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

The initialization of the StoSOO algorithm

Parameters

n (int) – The total number of rounds (budget)
k (int) – The maximum number of pulls per node
h_max (int) – The maximum depth limit
delta (float) – The confidence parameter delta
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm

get_last_point()¶

The function to get the last point in StoSOO

Returns: point – The final point output by the StoSOO algorithm
Return type: list

pull(time)¶

The pull function of StoSOO that returns a point in every round

Parameters: time (int) – time stamp parameter
Returns: point – the point to be evaluated
Return type: list

receive_reward(time, reward)¶

The receive_reward function of StoSOO to obtain the reward and update the Statistics

Parameters

time (int) – The time stamp parameter
reward (float) – the reward of the evaluation

HCT Algorithm¶

class PyXAB.algos.HCT.HCT_node(depth, index, parent, domain)¶

Bases: PyXAB.partition.Node.P_node

Implementation of HCT node

__init__(depth, index, parent, domain)¶

Initialization of the HCT node

Parameters

depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents

compute_u_value(nu, rho, c, delta_tilde)¶

The function to compute the u_{h,i} value of the node

Parameters

nu (float) – parameter nu in the HCT algorithm
rho (float) – parameter rho in the HCT algorithm
c (float) – parameter c in the HCT algorithm
delta_tilde (float) – modified confidence parameter delta_tilde of the HCT algorithm
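For the signature compute_u_value(nu, rho, c, delta_tilde), the upper bound used in the HCT literature is the empirical mean, plus nu * rho^h, plus a confidence width driven by c and delta_tilde. The sketch below writes it as a stand-alone function. The name hct_u_value and the explicit statistics arguments are hypothetical, and delta_tilde is assumed to lie in (0, 1].

```python
import math


def hct_u_value(mean_reward, visited_times, depth, nu, rho, c, delta_tilde):
    """u = mean + nu * rho**depth + sqrt(c**2 * log(1 / delta_tilde) / visits)."""
    if visited_times == 0:
        # Unvisited nodes are treated as maximally optimistic.
        return math.inf
    confidence = math.sqrt(c ** 2 * math.log(1.0 / delta_tilde) / visited_times)
    return mean_reward + nu * rho ** depth + confidence
```

A smaller delta_tilde asks for higher confidence and therefore widens the bound.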

get_b_value()¶: The function to get the b_{h,i} value of the node

get_children()¶: The function to get the children of the node

get_cpoint()¶: The function to get the center point of the domain

get_depth()¶: The function to get the depth of the node

get_domain()¶: The function to get the domain of the node

get_index()¶: The function to get the index of the node

get_mean_reward()¶: The function to get the mean reward of the node

get_parent()¶: The function to get the parent of the node

get_u_value()¶: The function to get the u_{h,i} value of the node

get_visited_times()¶: The function to get the number of visited times of the node

update_b_value(b_value)¶

The function to update the b_{h,i} value of the node

Parameters: b_value (float) – The new b_{h,i} value to be updated

update_children(children)¶

The function to update the children of the node

Parameters: children – The children nodes to be updated

update_reward(reward)¶

The function to update the reward list of the node

Parameters: reward (float) – the reward for evaluating the node

class PyXAB.algos.HCT.HCT(nu=1, rho=0.5, c=0.1, delta=0.01, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Bases: PyXAB.algos.Algo.Algorithm

Implementation of the HCT algorithm

__init__(nu=1, rho=0.5, c=0.1, delta=0.01, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Initialization of the HCT algorithm

Parameters

nu (float) – parameter nu of the HCT algorithm
rho (float) – parameter rho of the HCT algorithm
c (float) – parameter c of the HCT algorithm
delta (float) – confidence parameter delta of the HCT algorithm
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm

expand(parent)¶

The function to expand the tree at the parent node

Parameters: parent – The parent node to be expanded

get_last_point()¶

The function to get the last point of HCT

Returns: chosen_point – The point chosen by the algorithm
Return type: list

optTraverse()¶

The function to traverse the exploration tree to find the best path and the best node to pull at this moment.

Returns

curr_node (Node) – The last node selected by the algorithm
path (List of Node) – The best path to traverse the partition selected by the algorithm

pull(time)¶

The pull function of HCT that returns a point in every round

Parameters: time (int) – time stamp parameter
Returns: point – the point to be evaluated
Return type: list

receive_reward(time, reward)¶

The receive_reward function of HCT to obtain the reward and update the Statistics

Parameters

time (int) – time stamp parameter
reward (float) – the reward of the evaluation

updateAllTree(path, reward)¶

The function to update everything in the tree

Parameters

path (list) – the path from the root to the chosen node
reward (float) – the reward to update

updateBackwardTree()¶: The function to update all the b_{h,i} value backwards in the tree

updateRewardTree(path, reward)¶

The function to update the reward of each node in the path

Parameters

path (list) – the path to find the best node
reward (float) – the reward to update

updateUvalueTree()¶: The function to update the u_{h,i} value in the whole tree

POO Algorithm¶

class PyXAB.algos.POO.POO(numax=1, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>, algo=None)¶

Bases: PyXAB.algos.Algo.Algorithm

Implementation of the Parallel Optimistic Optimization (POO) algorithm (Grill et al., 2015), with the general definition in Shang et al., 2019.

__init__(numax=1, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>, algo=None)¶

Parameters

numax (float) – parameter nu_max in the algorithm
rhomax (float) – parameter rho_max in the algorithm, the maximum rho used
rounds (int) – the number of rounds/budget
domain (list(list)) – the domain of the objective function
partition – the partition used in the optimization process
algo – the baseline algorithm used by the wrapper, such as T_HOO or HCT

get_last_point()¶: The function that returns the last point chosen by POO

pull(time)¶

The pull function of POO that returns a point to be evaluated

Parameters: time (int) – The time step of the online process.
Returns: point – The point chosen by the POO algorithm
Return type: list

receive_reward(time, reward)¶

The receive_reward function of POO to receive the reward for the chosen point

Parameters

time (int) – The time step of the online process.
reward (float) – The (Stochastic) reward of the pulled point

GPO Algorithm¶

class PyXAB.algos.GPO.GPO(numax=1.0, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>, algo=None)¶

Bases: PyXAB.algos.Algo.Algorithm

Implementation of the General Parallel Optimization (GPO) algorithm (Shang et al., 2019)

__init__(numax=1.0, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>, algo=None)¶

Initialization of the wrapper algorithm

Parameters

numax (float) – parameter nu_max in the algorithm (default 1.0)
rhomax (float) – parameter rho_max in the algorithm, the maximum rho used (default 0.9)
rounds (int) – the number of rounds/budget (default 1000)
domain (list(list)) – the domain of the objective function
partition – the partition used in the optimization process
algo – the baseline algorithm used by the wrapper, such as T_HOO or HCT

get_last_point()¶: The function to get the last point in GPO

pull(time)¶

The pull function of GPO that returns a point to be evaluated

Parameters: time (int) – The time step of the online process.
Returns: point – The point chosen by the GPO algorithm
Return type: list

receive_reward(time, reward)¶

The receive_reward function of GPO to receive the reward for the chosen point

Parameters

time (int) – The time step of the online process.
reward (float) – The (Stochastic) reward of the pulled point

PCT Algorithm¶

class PyXAB.algos.PCT.PCT(numax=1, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Bases: PyXAB.algos.Algo.Algorithm

Implementation of the Parallel Confidence Tree (PCT) algorithm (Shang et al., 2019)

__init__(numax=1, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Initialization of the PCT algorithm

Parameters

numax (float) – parameter nu_max in the algorithm
rhomax (float) – parameter rho_max in the algorithm, the maximum rho used
rounds (int) – the number of rounds/budget
domain (list(list)) – the domain of the objective function
partition – the partition used in the optimization process

get_last_point()¶: The function to get the last point for PCT

pull(time)¶

The pull function of PCT that returns a point to be evaluated

Parameters: time (int) – The time step of the online process.
Returns: point – The point chosen by the PCT algorithm
Return type: list

receive_reward(time, reward)¶

The receive_reward function of PCT to receive the reward for the chosen point

Parameters

time (int) – The time step of the online process.
reward (float) – The (Stochastic) reward of the pulled point

SequOOL Algorithm¶

class PyXAB.algos.SequOOL.SequOOL_node(depth, index, parent, domain)¶

Bases: PyXAB.partition.Node.P_node

Implementation of the SequOOL node

__init__(depth, index, parent, domain)¶

Initialization of the SequOOL node

Parameters

depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents

get_children()¶: The function to get the children of the node

get_cpoint()¶: The function to get the center point of the domain

get_depth()¶: The function to get the depth of the node

get_domain()¶: The function to get the domain of the node

get_index()¶: The function to get the index of the node

get_parent()¶: The function to get the parent of the node

get_reward()¶: The function to get the reward of the node

not_opened()¶: The function to get the status of the node (opened or not)

open()¶: The function to open a node

update_children(children)¶

The function to update the children of the node

Parameters: children – The children nodes to be updated

update_reward(reward)¶

The function to update the reward list of the node

Parameters: reward (float) – the reward for evaluating the node

class PyXAB.algos.SequOOL.SequOOL(n=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Bases: PyXAB.algos.Algo.Algorithm

The implementation of the SequOOL algorithm (Bartlett, 2019)

__init__(n=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

The initialization of the SequOOL algorithm

Parameters

n (int) – The total number of rounds (budget)
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm

get_last_point()¶

The function to get the last point in SequOOL

Returns: point – The final point output by the SequOOL algorithm
Return type: list

static harmonic_series_sum(n)¶

A static method for computing the summation of harmonic series

Parameters: n (int) – The number of terms in the summation
Returns: res – The sum of the series
Return type: float
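A minimal sketch of such a helper, assuming the sum runs over 1/i for i from 1 to n inclusive. The library's own static method may handle edge cases differently.

```python
def harmonic_series_sum(n):
    """Return 1/1 + 1/2 + ... + 1/n as a float (0.0 when n is 0)."""
    res = 0.0
    for i in range(1, n + 1):
        res += 1.0 / i
    return res
```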

pull(t)¶

The pull function of SequOOL that returns a point in every round

Parameters: t (int) – time stamp parameter
Returns: point – the point to be evaluated
Return type: list

receive_reward(t, reward)¶

The receive_reward function of SequOOL to obtain the reward and update Statistics

Parameters

t (int) – The time stamp parameter
reward (float) – The reward of the evaluation

StroquOOL Algorithm¶

class PyXAB.algos.StroquOOL.StroquOOL_node(depth, index, parent, domain)¶

Bases: PyXAB.partition.Node.P_node

Implementation of the node in the StroquOOL algorithm

__init__(depth, index, parent, domain)¶

Initialization of the StroquOOL node

Parameters

depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents

compute_mean_reward()¶: The function to compute the mean of the reward list of the node

get_children()¶: The function to get the children of the node

get_cpoint()¶: The function to get the center point of the domain

get_depth()¶: The function to get the depth of the node

get_domain()¶: The function to get the domain of the node

get_index()¶: The function to get the index of the node

get_mean_reward()¶: The function to get the mean of the reward list of the node

get_parent()¶: The function to get the parent of the node

get_visited_times()¶: The function to get the number of visited times of the node

not_opened()¶: The function to get the status of the node (opened or not)

open_node()¶: The function to open a node

remove_reward()¶: The function to clear the reward list of a node

update_children(children)¶

The function to update the children of the node

Parameters: children – The children nodes to be updated

update_reward(reward)¶

The function to update the reward list of the node

Parameters: reward (float) – the reward for evaluating the node

class PyXAB.algos.StroquOOL.StroquOOL(n=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Bases: PyXAB.algos.Algo.Algorithm

The implementation of the StroquOOL algorithm (Bartlett, 2019)

__init__(n=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

The initialization of the StroquOOL algorithm

Parameters

n (int) – The total number of rounds (budget)
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm

get_last_point()¶

The function to get the last point in StroquOOL

Returns: point – The final point output by the StroquOOL algorithm
Return type: list

static harmonic_series_sum(n)¶

A static method for computing the summation of harmonic series

Parameters: n (int) – The number of terms in the summation
Returns: res – The sum of the series
Return type: float

pull(time)¶

The pull function of StroquOOL that returns a point in every round

Parameters: time (int) – time stamp parameter
Returns: point – the point to be evaluated
Return type: list

receive_reward(t, reward)¶

The receive_reward function of StroquOOL to obtain the reward and update Statistics. If the algorithm has already finished but rounds remain, this function does nothing

Parameters

t (int) – The time stamp parameter
reward (float) – The reward of the evaluation

reset_p()¶: The function to reset p for current situation

VROOM Algorithm¶

class PyXAB.algos.VROOM.VROOM_node(depth, index, parent, domain)¶

Bases: PyXAB.partition.Node.P_node

Implementation of the node in the VROOM algorithm

__init__(depth, index, parent, domain)¶

Initialization of the VROOM node

Parameters

depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents

add_rank(rank)¶

The method to set the rank of the cell

Parameters: rank (int) – the rank of the cell at current depth

get_children()¶: The function to get the children of the node

get_cpoint()¶: The function to get the center point of the domain

get_depth()¶: The function to get the depth of the node

get_domain()¶: The function to get the domain of the node

get_eval_time()¶: The function to get the evaluation time of the node

get_index()¶: The function to get the index of the node

get_mean_reward()¶: The function to get the mean of the reward of the node

get_parent()¶: The function to get the parent of the node

get_rank()¶

The function to get the rank of the cell

Returns: rank – the rank of the cell at current depth
Return type: int

get_reward_tilde()¶: The function to get the reward tilde statistic of the node

sample_uniform()¶

The function to uniformly sample a point from the domain of the node

Returns: res – the point sampled by the sampler
Return type: list
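Uniform sampling from a node's domain can be sketched as one independent draw per dimension. The stand-alone function below assumes the domain is a list of [low, high] pairs. The rng argument is a hypothetical addition that makes the draw reproducible.

```python
import random


def sample_uniform(domain, rng=random):
    """Draw one coordinate uniformly from each [low, high] interval."""
    return [rng.uniform(low, high) for low, high in domain]


point = sample_uniform([[0.0, 1.0], [-2.0, 2.0]], rng=random.Random(0))
```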

update_children(children)¶

The function to update the children of the node

Parameters: children – The children nodes to be updated

update_reward(reward)¶

The function to update the reward of the node

Parameters: reward (float) – the reward for evaluating the node

update_reward_tilde(reward)¶

The function to update the reward tilde of the node

Parameters: reward (float) – the reward tilde statistic of the node

class PyXAB.algos.VROOM.VROOM(n=100, h_max=100, b=None, f_max=None, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Bases: PyXAB.algos.Algo.Algorithm

The implementation of the VROOM algorithm (Ammar et al., 2020)

__init__(n=100, h_max=100, b=None, f_max=None, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

The initialization of the VROOM algorithm

Parameters

n (int) – The total number of rounds (budget)
h_max (int) – The upper bound on the depth of the search tree
b (float) – The parameter that measures the variation of the function
f_max (float) – An upper bound of the objective function
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm

get_last_point()¶

The function to get the last point in VROOM

Returns: point – The final point output by the VROOM algorithm
Return type: list

pull(time)¶

The pull function of VROOM that returns a point in every round

Parameters: time (int) – time stamp parameter
Returns: point – the point to be evaluated
Return type: list

rank(nodes)¶

The rank function of VROOM that ranks nodes at the same depth

Parameters: nodes (list) – a list of nodes at the same depth
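A ranking of cells at one depth can be sketched as a sort followed by rank assignment. The version below assumes cells are ordered by mean reward, highest first, and that ranks start at 1. Both conventions, and the dictionary input, are assumptions for illustration rather than the library's behavior.

```python
def rank_nodes(mean_rewards):
    """Map each node name to its rank, where rank 1 has the highest mean reward."""
    ordered = sorted(mean_rewards, key=mean_rewards.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


ranks = rank_nodes({"a": 0.2, "b": 0.9, "c": 0.5})
```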

receive_reward(time, reward)¶

The receive_reward function of VROOM to obtain the reward and update Statistics (for current node)

Parameters

time (int) – The time stamp parameter
reward (float) – The reward of the evaluation

VHCT Algorithm¶

class PyXAB.algos.VHCT.VHCT_node(depth, index, parent, domain)¶

Bases: PyXAB.partition.Node.P_node

Implementation of VHCT node

__init__(depth, index, parent, domain)¶

Initialization of the VHCT node

Parameters

depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents

compute_tau_hi_value(nu, rho, c, bound, delta_tilde)¶

The function to compute the threshold tau_hi value for the VHCT node

Parameters

nu (float) – parameter nu of the VHCT algorithm
rho (float) – parameter rho of the VHCT algorithm
c (float) – parameter c of the VHCT algorithm
bound (float) – parameter bound of the VHCT algorithm, the noise bound
delta_tilde (float) – modified confidence parameter delta_tilde of the VHCT algorithm

compute_u_value(nu, rho, c, bound, delta_tilde)¶

The function to compute the u_{h,i} value of the node

Parameters

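nu (float) – parameter nu of the VHCT algorithm
rho (float) – parameter rho of the VHCT algorithm
c (float) – parameter c of the VHCT algorithm
bound (float) – parameter bound of the VHCT algorithm, the noise bound
delta_tilde (float) – modified confidence parameter delta_tilde of the VHCT algorithm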

get_b_value()¶: The function to get the b_{h,i} value of the node

get_children()¶: The function to get the children of the node

get_cpoint()¶: The function to get the center point of the domain

get_depth()¶: The function to get the depth of the node

get_domain()¶: The function to get the domain of the node

get_index()¶: The function to get the index of the node

get_mean_reward()¶: The function to get the mean reward of the node

get_parent()¶: The function to get the parent of the node

get_tau_hi_value()¶: The function to get the tau_hi value of the node

get_u_value()¶: The function to get the u_{h,i} value of the node

get_visited_times()¶: The function to get the number of visited times of the node

update_b_value(b_value)¶

The function to update the b_{h,i} value of the node

Parameters: b_value (float) – The new b_{h,i} value to be updated

update_children(children)¶

The function to update the children of the node

Parameters: children – The children nodes to be updated

update_reward(reward)¶

The function to update the reward list of the node

Parameters: reward (float) – the reward for evaluating the node
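
The accessor and update methods above describe simple per-node bookkeeping. The following is a minimal sketch of how such statistics could be kept, not PyXAB's actual VHCT_node: the class name SketchNode, the stored reward list, and the choice to report a mean of 0.0 for an unvisited node are all assumptions made for illustration.

```python
from typing import List, Optional


class SketchNode:
    """Hypothetical stand-in for a tree node; not PyXAB's VHCT_node."""

    def __init__(
        self,
        depth: int,
        index: int,
        parent: Optional["SketchNode"],
        domain: List[List[float]],
    ):
        self.depth = depth
        self.index = index
        self.parent = parent
        self.domain = domain  # axis-aligned box given as [[low, high], ...]
        self.rewards: List[float] = []  # every reward observed at this node

    def update_reward(self, reward: float) -> None:
        # Record one more evaluation of this node.
        self.rewards.append(reward)

    def get_visited_times(self) -> int:
        return len(self.rewards)

    def get_mean_reward(self) -> float:
        # Assumption: an unvisited node reports a mean of 0.0.
        if not self.rewards:
            return 0.0
        return sum(self.rewards) / len(self.rewards)

    def get_cpoint(self) -> List[float]:
        # Center point of the box represented by this node.
        return [(low + high) / 2 for low, high in self.domain]


node = SketchNode(depth=0, index=1, parent=None, domain=[[0.0, 1.0], [-2.0, 2.0]])
for r in (0.25, 0.75, 0.5):
    node.update_reward(r)
```

Here the visit count is derived from the reward list, so the two statistics cannot drift apart; a real implementation may store running sums instead.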

class PyXAB.algos.VHCT.VHCT(nu=1, rho=0.5, c=0.1, delta=0.01, bound=1, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Bases: PyXAB.algos.Algo.Algorithm

The implementation of the Variance High Confidence Tree algorithm

__init__(nu=1, rho=0.5, c=0.1, delta=0.01, bound=1, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Initialization of the VHCT algorithm

Parameters

nu (float) – parameter nu of the VHCT algorithm
rho (float) – parameter rho of the VHCT algorithm
c (float) – parameter c of the VHCT algorithm
delta (float) – confidence parameter delta of the VHCT algorithm
bound (float) – the noise upper bound parameter bound
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm

expand(parent)¶

The function to expand the tree at the parent node

Parameters: parent – The parent node to be expanded

get_last_point()¶

The function to get the last point of VHCT

Returns: chosen_point – The point chosen by the algorithm
Return type: list

optTraverse()¶

The function to traverse the exploration tree to find the best path and the best node to pull at this moment.

Returns

curr_node (Node) – The last node selected by the algorithm
path (List of Node) – The best path to traverse the partition selected by the algorithm
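
As a rough illustration of what a traversal returning a node and a path can look like, here is a sketch in the style of HCT-type algorithms. It is not taken from PyXAB's source: the TNode class, the opt_traverse function, and the stopping rule (descend only while the node has children and has been visited at least tau times) are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TNode:
    """Hypothetical node used only for this sketch."""

    name: str
    b_value: float = float("inf")
    visits: int = 0
    tau: float = 1.0
    children: List["TNode"] = field(default_factory=list)


def opt_traverse(root: TNode) -> Tuple[TNode, List[TNode]]:
    """Follow the child with the largest b-value from the root downward."""
    curr = root
    path = [curr]
    # Assumed stopping rule: stop at a leaf, or at a node that has not yet
    # been visited at least tau times.
    while curr.children and curr.visits >= curr.tau:
        curr = max(curr.children, key=lambda child: child.b_value)
        path.append(curr)
    return curr, path


left = TNode("left", b_value=0.4)
right = TNode("right", b_value=0.9)
root = TNode("root", b_value=1.0, visits=5, tau=2.0, children=[left, right])
node, path = opt_traverse(root)
```

In this toy tree the root has been visited often enough to descend, so the traversal moves to the child with the larger b-value and stops there because that child is a leaf.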

pull(time)¶

The pull function of VHCT that returns a point in every round

Parameters: time (int) – time stamp parameter
Returns: point – the point to be evaluated
Return type: list

receive_reward(time, reward)¶

The receive_reward function of VHCT to obtain the reward and update the statistics

Parameters

time (int) – time stamp parameter
reward (float) – the reward of the evaluation
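
The pull and receive_reward methods are meant to be called alternately, once per round. The sketch below shows that loop with a hypothetical stand-in class (RandomSearchStandIn) so that it runs without PyXAB installed; the stand-in, the toy objective, the choice to start the time stamp at 1, and the stand-in's get_last_point behavior (it returns the best point seen) are assumptions, not PyXAB behavior.

```python
import random
from typing import List, Optional


class RandomSearchStandIn:
    """Hypothetical stand-in exposing the three methods documented for VHCT."""

    def __init__(self, domain: List[List[float]], seed: int = 0):
        self.domain = domain
        self.rng = random.Random(seed)
        self.curr_point: Optional[List[float]] = None
        self.best_point: Optional[List[float]] = None
        self.best_reward = float("-inf")

    def pull(self, time: int) -> List[float]:
        # Sample a point uniformly from the box domain.
        self.curr_point = [self.rng.uniform(low, high) for low, high in self.domain]
        return self.curr_point

    def receive_reward(self, time: int, reward: float) -> None:
        # Keep the best point seen so far.
        if reward > self.best_reward:
            self.best_reward = reward
            self.best_point = self.curr_point

    def get_last_point(self) -> Optional[List[float]]:
        return self.best_point


def objective(point: List[float]) -> float:
    # Toy objective (an assumption): maximized at x = 0.5 with value 1.0.
    return 1.0 - (point[0] - 0.5) ** 2


# With PyXAB installed, an instance built from the documented constructor,
# e.g. VHCT(domain=[[0.0, 1.0]]), would take the place of the stand-in.
algo = RandomSearchStandIn(domain=[[0.0, 1.0]])
for t in range(1, 201):
    point = algo.pull(t)            # ask for the next point
    reward = objective(point)       # evaluate the objective there
    algo.receive_reward(t, reward)  # feed the reward back

last_point = algo.get_last_point()
```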

updateAllTree(path, reward)¶

The function to update everything in the tree

Parameters

path (list) – the path from the root to the chosen node
reward (float) – the reward to update

updateBackwardTree()¶: The function to update all the b_{h,i} value backwards in the tree
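
For orientation, a backward update of b-values in HOO- and HCT-style trees is commonly written as: a leaf takes its own u-value, and an internal node takes the minimum of its u-value and the largest b-value among its children. The sketch below implements that rule on a hypothetical BNode class; whether PyXAB's updateBackwardTree follows exactly this rule is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BNode:
    """Hypothetical node used only for this sketch."""

    u_value: float
    b_value: float = float("inf")
    children: List["BNode"] = field(default_factory=list)


def update_backward(node: BNode) -> float:
    """Recompute b-values bottom-up and return the b-value of `node`."""
    if not node.children:
        # Leaf: the b-value is the node's own upper bound.
        node.b_value = node.u_value
    else:
        # Internal node: tighten the bound using the best child.
        best_child = max(update_backward(child) for child in node.children)
        node.b_value = min(node.u_value, best_child)
    return node.b_value


leaf_a = BNode(u_value=0.6)
leaf_b = BNode(u_value=0.8)
root = BNode(u_value=1.5, children=[leaf_a, leaf_b])
update_backward(root)
```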

updateRewardTree(path, reward)¶

The function to update the reward of each node in the path

Parameters

path (list) – the path to find the best node
reward (float) – the reward to update

updateUvalueTree()¶: The function to update the u_{h,i} value in the whole tree

VPCT Algorithm¶

class PyXAB.algos.VPCT.VPCT(numax=1, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Bases: PyXAB.algos.Algo.Algorithm

Implementation of Variance-reduced Parallel Confidence Tree algorithm (VHCT + GPO)

__init__(numax=1, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶

Initialization of the VPCT algorithm

Parameters

numax (float) – parameter nu_max in the algorithm
rhomax (float) – parameter rho_max in the algorithm, the maximum rho used
rounds (int) – the number of rounds/budget
domain (list(list)) – the domain of the objective function
partition – the partition used in the optimization process

get_last_point()¶: The function to get the last point of VPCT.

pull(time)¶

The pull function of VPCT that returns a point to be evaluated

Parameters: time (int) – The time step of the online process.
Returns: point – The point chosen by the VPCT algorithm
Return type: list

receive_reward(time, reward)¶

The receive_reward function of VPCT to receive the reward for the chosen point

Parameters

time (int) – The time step of the online process.
reward (float) – The (stochastic) reward of the pulled point
