X-Armed Bandit Algorithms¶
API reference for the X-armed bandit algorithms. For the general algorithm class, see the API Cheatsheet.
Zooming Algorithm¶
- class PyXAB.algos.Zooming.Zooming(nu=1, rho=0.9, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Bases: PyXAB.algos.Algo.Algorithm
The implementation of the Zooming algorithm
- __init__(nu=1, rho=0.9, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Initialization of the Zooming algorithm
- Parameters
nu (float) – smoothness parameter nu of the Zooming algorithm
rho (float) – smoothness parameter rho of the Zooming algorithm
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm
- get_last_point()¶
The function to get the last point of Zooming
- Returns
chosen_point – The point chosen by the algorithm
- Return type
list
- make_active(node)¶
The function to make a node (an arm in the node) active
- Parameters
node – the node to be made active
- pull(time)¶
The pull function of Zooming that returns a point in every round
- Parameters
time (int) – time stamp parameter
- Returns
point – the point to be evaluated
- Return type
list
- receive_reward(time, reward)¶
The receive_reward function of Zooming to obtain the reward and update the statistics, then expand the active arms
- Parameters
time (int) – time stamp parameter
reward (float) – the reward of the evaluation
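Every algorithm class on this page lists the same three methods: pull(time), receive_reward(time, reward) and get_last_point(). The minimal sketch below shows how a driver loop might use them. The RandomSearch class, the run helper and the toy objective are hypothetical stand-ins written for this illustration and are not part of PyXAB; the assumption is that any object offering these three methods could be driven the same way.

```python
import random


class RandomSearch:
    """Hypothetical stand-in exposing the three methods documented on this page."""

    def __init__(self, domain, seed=0):
        self.domain = domain  # list(list), one [lower, upper] pair per dimension
        self.rng = random.Random(seed)
        self.last_point = None
        self.best_point = None
        self.best_reward = float("-inf")

    def pull(self, time):
        # Return the point to be evaluated in this round.
        self.last_point = [self.rng.uniform(lo, hi) for lo, hi in self.domain]
        return self.last_point

    def receive_reward(self, time, reward):
        # Update the statistics with the reward of the last pulled point.
        if reward > self.best_reward:
            self.best_reward = reward
            self.best_point = self.last_point

    def get_last_point(self):
        # Return the point this stand-in would output at the end.
        return self.best_point


def run(algo, objective, rounds):
    """Drive any object that offers pull / receive_reward / get_last_point."""
    for t in range(1, rounds + 1):
        point = algo.pull(t)
        reward = objective(point)
        algo.receive_reward(t, reward)
    return algo.get_last_point()


def objective(point):
    # Toy objective on [0, 1] with its maximum at x = 0.5.
    return 1.0 - abs(point[0] - 0.5)


best = run(RandomSearch(domain=[[0, 1]]), objective, rounds=200)
print(best)
```

The loop passes the same time stamp to pull and receive_reward, matching the (time) and (time, reward) signatures listed above.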
Truncated HOO Algorithm¶
- class PyXAB.algos.HOO.HOO_node(depth, index, parent, domain)¶
Bases: PyXAB.partition.Node.P_node
Implementation of the HOO_node
- __init__(depth, index, parent, domain)¶
Initialization of the HOO node
- Parameters
depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents
- compute_u_value(nu, rho, rounds)¶
The function to compute the u_{h,i} value of the node
- Parameters
nu (float) – parameter nu in the HOO algorithm
rho (float) – parameter rho in the HOO algorithm
rounds (int) – the number of rounds in the HOO algorithm
- get_b_value()¶
The function to get the b_{h,i} value of the node
- get_children()¶
The function to get the children of the node
- get_cpoint()¶
The function to get the center point of the domain
- get_depth()¶
The function to get the depth of the node
- get_domain()¶
The function to get the domain of the node
- get_index()¶
The function to get the index of the node
- get_mean_reward()¶
The function to get the mean reward of the node
- get_parent()¶
The function to get the parent of the node
- get_u_value()¶
The function to get the u_{h,i} value of the node
- get_visited_times()¶
The function to get the number of visited times of the node
- update_b_value(b_value)¶
The function to update the b_{h,i} value of the node
- Parameters
b_value (float) – The new b_{h,i} value to be updated
- update_children(children)¶
The function to update the children of the node
- Parameters
children – The children nodes to be updated
- update_reward(reward)¶
The function to update the reward list of the node
- Parameters
reward (float) – the reward for evaluating the node
- class PyXAB.algos.HOO.T_HOO(nu=1, rho=0.5, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Bases: PyXAB.algos.Algo.Algorithm
Implementation of the T_HOO algorithm
- __init__(nu=1, rho=0.5, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Initialization of the T_HOO algorithm
- Parameters
nu (float) – parameter nu of the T_HOO algorithm
rho (float) – parameter rho of the T_HOO algorithm
rounds (int) – total number of rounds
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm
- expand(parent)¶
The function to expand the tree after pulling the parent node
- Parameters
parent – The parent node to be expanded
- get_last_point()¶
The function to get the last point of HOO
- Returns
chosen_point – The point chosen by the algorithm
- Return type
list
- optTraverse()¶
The function to traverse the exploration tree to find the best path and the best node to pull at this moment.
- Returns
curr_node (Node) – The last node selected by the algorithm
path (List of Node) – The best path to traverse the partition selected by the algorithm
- pull(time)¶
The pull function of T_HOO that returns a point in every round
- Parameters
time (int) – time stamp parameter
- Returns
point – the point to be evaluated
- Return type
list
- receive_reward(time, reward)¶
The receive_reward function of T_HOO to obtain the reward and update the statistics
- Parameters
time (int) – time stamp parameter
reward (float) – the reward of the evaluation
- updateAllTree(path, reward)¶
The function to update everything in the tree
- Parameters
path (list) – the path from the root to the chosen node
reward (float) – the reward to update
- updateBackwardTree()¶
The function to update all the b_{h,i} value backwards in the tree
- updateRewardTree(path, reward)¶
The function to update the reward of each node in the path
- Parameters
path (list) – the path to find the best node
reward (float) – the reward to update
- updateUvalueTree()¶
The function to update the u_{h,i} value in the whole tree
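As a rough illustration of the u_{h,i} and b_{h,i} values that compute_u_value, updateUvalueTree and updateBackwardTree refer to, the sketch below uses the usual HOO definitions: an empirical mean plus a confidence width plus the smoothness term nu * rho ** depth, and a b-value that is the smaller of the node's u-value and the best b-value among its children. The function names and argument layout are hypothetical, and the code is not taken from PyXAB, whose implementation may differ in its details.

```python
import math


def u_value(mean_reward, visited_times, depth, nu, rho, rounds):
    """Upper confidence bound of a node; unvisited nodes get +infinity."""
    if visited_times == 0:
        return math.inf
    confidence = math.sqrt(2.0 * math.log(rounds) / visited_times)
    return mean_reward + confidence + nu * rho ** depth


def b_value(u, children_b_values):
    """Tighten a node's bound using the best bound among its children.

    Children that are not yet in the tree count as +infinity, so a node
    without children keeps its own u-value.
    """
    best_child = max(children_b_values) if children_b_values else math.inf
    return min(u, best_child)


u = u_value(mean_reward=0.5, visited_times=2, depth=1, nu=1.0, rho=0.5, rounds=1000)
print(u)
print(b_value(u, [1.0, 1.5]))
```

Updating b-values backwards, from the deepest nodes to the root, ensures each parent sees the already-updated values of its children.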
DOO Algorithm¶
- class PyXAB.algos.DOO.DOO_node(depth, index, parent, domain)¶
Bases: PyXAB.partition.Node.P_node
Implementation of the node in the DOO algorithm
- __init__(depth, index, parent, domain)¶
Initialization of the DOO node
- Parameters
depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents
- compute_b_value(delta)¶
The function to compute the b_{h,i} value of the node
- Parameters
delta (float) – The delta value in the b_{h,i} term, which depends on the depth of the node (Munos, 2011)
- get_b_value()¶
The function to get the b_{h,i} value of the node
- get_children()¶
The function to get the children of the node
- get_cpoint()¶
The function to get the center point of the domain
- get_depth()¶
The function to get the depth of the node
- get_domain()¶
The function to get the domain of the node
- get_index()¶
The function to get the index of the node
- get_parent()¶
The function to get the parent of the node
- get_reward()¶
The function to get the reward of the node
- update_children(children)¶
The function to update the children of the node
- Parameters
children – The children nodes to be updated
- update_reward(reward)¶
The function to update the reward of the node
- Parameters
reward (float) – the reward for evaluating the node
- visit()¶
The function to visit the node
- class PyXAB.algos.DOO.DOO(n=100, delta=None, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Bases: PyXAB.algos.Algo.Algorithm
The implementation of the DOO algorithm (Munos, 2011)
- __init__(n=100, delta=None, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
The initialization of the DOO algorithm
- Parameters
n (int) – The total number of rounds (budget)
delta (function) – The function to compute the delta value for each depth
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm
- delta_init(h)¶
The default delta function used in the algorithm (Munos, 2011)
- Parameters
h (int) – The depth parameter
- Returns
max_value – The delta value in that depth
- Return type
float
- get_last_point()¶
The function to get the last point in DOO
- Returns
point – The final output of the DOO algorithm
- Return type
list
- pull(time)¶
The pull function of DOO that returns a point in every round
- Parameters
time (int) – time stamp parameter
- Returns
point – the point to be evaluated
- Return type
list
- receive_reward(time, reward)¶
The receive_reward function of DOO to obtain the reward and update the statistics
- Parameters
time (int) – The time stamp parameter
reward (float) – The reward of the evaluation
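The delta argument of DOO is documented as a function of the depth, and the b_{h,i} value of a node combines the node's reward with the delta value at its depth. The sketch below shows one way such a function could be written. It assumes a geometric form c * rho ** h, which is a common choice in the optimistic-optimization literature but not necessarily what delta_init computes; the helper names and constants are hypothetical.

```python
def make_geometric_delta(c=1.0, rho=0.5):
    """Build a depth-dependent delta function of the assumed form c * rho ** h."""

    def delta(h):
        return c * rho ** h

    return delta


def doo_b_value(reward, depth, delta):
    """Optimistic bound of a cell: observed reward plus the delta at its depth."""
    return reward + delta(depth)


delta = make_geometric_delta(c=2.0, rho=0.5)
print([delta(h) for h in range(4)])  # [2.0, 1.0, 0.5, 0.25]
print(doo_b_value(0.25, 2, delta))   # 0.75
```

A delta that shrinks faster with depth makes the bound less optimistic for deep cells, so the search commits earlier to regions that already look good.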
SOO Algorithm¶
- class PyXAB.algos.SOO.SOO_node(depth, index, parent, domain)¶
Bases: PyXAB.partition.Node.P_node
Implementation of the node in the SOO algorithm
- __init__(depth, index, parent, domain)¶
Initialization of the SOO node
- Parameters
depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents
- get_children()¶
The function to get the children of the node
- get_cpoint()¶
The function to get the center point of the domain
- get_depth()¶
The function to get the depth of the node
- get_domain()¶
The function to get the domain of the node
- get_index()¶
The function to get the index of the node
- get_parent()¶
The function to get the parent of the node
- get_reward()¶
The function to get the reward of the node
- update_children(children)¶
The function to update the children of the node
- Parameters
children – The children nodes to be updated
- update_reward(reward)¶
The function to update the reward of the node
- Parameters
reward (float) – the reward for evaluating the node
- visit()¶
The function to visit the node
- class PyXAB.algos.SOO.SOO(n=100, h_max=100, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Bases: PyXAB.algos.Algo.Algorithm
The implementation of the SOO algorithm (Munos, 2011)
- __init__(n=100, h_max=100, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
The initialization of the SOO algorithm
- Parameters
n (int) – The total number of rounds (budget)
h_max (int) – The maximum search depth
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm
- get_last_point()¶
The function to get the last point in SOO
- Returns
point – The final output of the SOO algorithm
- Return type
list
- pull(time)¶
The pull function of SOO that returns a point in every round
- Parameters
time (int) – time stamp parameter
- Returns
point – the point to be evaluated
- Return type
list
- receive_reward(time, reward)¶
The receive_reward function of SOO to obtain the reward and update the statistics (for the current node)
- Parameters
time (int) – The time stamp parameter
reward (float) – The reward of the evaluation
StoSOO Algorithm¶
- class PyXAB.algos.StoSOO.StoSOO_node(depth, index, parent, domain)¶
Bases: PyXAB.partition.Node.P_node
Implementation of the node in the StoSOO algorithm
- __init__(depth, index, parent, domain)¶
Initialization of the StoSOO node
- Parameters
depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents
- compute_b_value(n, k, delta)¶
The function to compute the b_{h,i} value of the node
- Parameters
n (int) – The total number of rounds (budget)
k (int) – The maximum number of pulls per node
delta (float) – The confidence parameter
- get_b_value()¶
The function to get the b_{h,i} value of the node
- get_children()¶
The function to get the children of the node
- get_cpoint()¶
The function to get the center point of the domain
- get_depth()¶
The function to get the depth of the node
- get_domain()¶
The function to get the domain of the node
- get_index()¶
The function to get the index of the node
- get_mean_reward()¶
The function to get the mean reward of the node
- get_parent()¶
The function to get the parent of the node
- get_visited_times()¶
The function to get the number of visited times of the node
- update_children(children)¶
The function to update the children of the node
- Parameters
children – The children nodes to be updated
- update_reward(reward)¶
The function to update the reward list of the node
- Parameters
reward (float) – the reward for evaluating the node
- class PyXAB.algos.StoSOO.StoSOO(n=100, k=None, h_max=100, delta=None, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Bases: PyXAB.algos.Algo.Algorithm
The implementation of the StoSOO algorithm (Valko et al., 2013)
- __init__(n=100, k=None, h_max=100, delta=None, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
The initialization of the StoSOO algorithm
- Parameters
n (int) – The total number of rounds (budget)
k (int) – The maximum number of pulls per node
h_max (int) – The maximum depth limit
delta (float) – The confidence parameter delta
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm
- get_last_point()¶
The function to get the last point in StoSOO
- Returns
point – The final output of the StoSOO algorithm
- Return type
list
- pull(time)¶
The pull function of StoSOO that returns a point in every round
- Parameters
time (int) – time stamp parameter
- Returns
point – the point to be evaluated
- Return type
list
- receive_reward(time, reward)¶
The receive_reward function of StoSOO to obtain the reward and update the statistics
- Parameters
time (int) – The time stamp parameter
reward (float) – the reward of the evaluation
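To make the roles of n, k and delta in compute_b_value(n, k, delta) more concrete, here is a sketch of the b-value in the form given for StoSOO by Valko et al. (2013): the empirical mean plus a confidence width that depends on log(n * k / delta) and on the number of visits. This is an assumption about the formula, not a copy of PyXAB's code, and the function name and argument layout are hypothetical.

```python
import math


def stosoo_b_value(mean_reward, visited_times, n, k, delta):
    """Upper confidence bound of a node; unvisited nodes get +infinity."""
    if visited_times == 0:
        return math.inf
    width = math.sqrt(math.log(n * k / delta) / (2.0 * visited_times))
    return mean_reward + width


# The width shrinks as the node is evaluated more often.
for visits in (1, 4, 16):
    print(visits, stosoo_b_value(0.5, visits, n=100, k=10, delta=0.1))
```

Because each node is pulled at most k times, the width stops shrinking once a node reaches that limit, which is the point where such a node would typically be expanded.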
HCT Algorithm¶
- class PyXAB.algos.HCT.HCT_node(depth, index, parent, domain)¶
Bases: PyXAB.partition.Node.P_node
Implementation of the HCT node
- __init__(depth, index, parent, domain)¶
Initialization of the HCT node
- Parameters
depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents
- compute_u_value(nu, rho, c, delta_tilde)¶
The function to compute the u_{h,i} value of the node
- Parameters
nu (float) – parameter nu in the HCT algorithm
rho (float) – parameter rho in the HCT algorithm
c (float) – parameter c in the HCT algorithm
delta_tilde (float) – modified confidence parameter delta_tilde in the HCT algorithm
- get_b_value()¶
The function to get the b_{h,i} value of the node
- get_children()¶
The function to get the children of the node
- get_cpoint()¶
The function to get the center point of the domain
- get_depth()¶
The function to get the depth of the node
- get_domain()¶
The function to get the domain of the node
- get_index()¶
The function to get the index of the node
- get_mean_reward()¶
The function to get the mean reward of the node
- get_parent()¶
The function to get the parent of the node
- get_u_value()¶
The function to get the u_{h,i} value of the node
- get_visited_times()¶
The function to get the number of visited times of the node
- update_b_value(b_value)¶
The function to update the b_{h,i} value of the node
- Parameters
b_value (float) – The new b_{h,i} value to be updated
- update_children(children)¶
The function to update the children of the node
- Parameters
children – The children nodes to be updated
- update_reward(reward)¶
The function to update the reward list of the node
- Parameters
reward (float) – the reward for evaluating the node
- class PyXAB.algos.HCT.HCT(nu=1, rho=0.5, c=0.1, delta=0.01, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Bases: PyXAB.algos.Algo.Algorithm
Implementation of the HCT algorithm
- __init__(nu=1, rho=0.5, c=0.1, delta=0.01, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Initialization of the HCT algorithm
- Parameters
nu (float) – parameter nu of the HCT algorithm
rho (float) – parameter rho of the HCT algorithm
c (float) – parameter c of the HCT algorithm
delta (float) – confidence parameter delta of the HCT algorithm
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm
- expand(parent)¶
The function to expand the tree at the parent node
- Parameters
parent – The parent node to be expanded
- get_last_point()¶
The function to get the last point of HCT
- Returns
chosen_point – The point chosen by the algorithm
- Return type
list
- optTraverse()¶
The function to traverse the exploration tree to find the best path and the best node to pull at this moment.
- Returns
curr_node (Node) – The last node selected by the algorithm
path (List of Node) – The best path to traverse the partition selected by the algorithm
- pull(time)¶
The pull function of HCT that returns a point in every round
- Parameters
time (int) – time stamp parameter
- Returns
point – the point to be evaluated
- Return type
list
- receive_reward(time, reward)¶
The receive_reward function of HCT to obtain the reward and update the statistics
- Parameters
time (int) – time stamp parameter
reward (float) – the reward of the evaluation
- updateAllTree(path, reward)¶
The function to update everything in the tree
- Parameters
path (list) – the path from the root to the chosen node
reward (float) – the reward to update
- updateBackwardTree()¶
The function to update all the b_{h,i} value backwards in the tree
- updateRewardTree(path, reward)¶
The function to update the reward of each node in the path
- Parameters
path (list) – the path to find the best node
reward (float) – the reward to update
- updateUvalueTree()¶
The function to update the u_{h,i} value in the whole tree
POO Algorithm¶
- class PyXAB.algos.POO.POO(numax=1, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>, algo=None)¶
Bases: PyXAB.algos.Algo.Algorithm
Implementation of the Parallel Optimistic Optimization (POO) algorithm (Grill et al., 2015), with the general definition in Shang et al., 2019.
- __init__(numax=1, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>, algo=None)¶
- Parameters
numax (float) – parameter nu_max in the algorithm
rhomax (float) – parameter rho_max in the algorithm, the maximum rho used
rounds (int) – the number of rounds/budget
domain (list(list)) – the domain of the objective function
partition – the partition used in the optimization process
algo – the baseline algorithm used by the wrapper, such as T_HOO or HCT
- get_last_point()¶
The function that returns the last point chosen by POO
- pull(time)¶
The pull function of POO that returns a point to be evaluated
- Parameters
time (int) – The time step of the online process.
- Returns
point – The point chosen by the POO algorithm
- Return type
list
- receive_reward(time, reward)¶
The receive_reward function of POO to receive the reward for the chosen point
- Parameters
time (int) – The time step of the online process.
reward (float) – The (Stochastic) reward of the pulled point
GPO Algorithm¶
- class PyXAB.algos.GPO.GPO(numax=1.0, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>, algo=None)¶
Bases: PyXAB.algos.Algo.Algorithm
Implementation of the General Parallel Optimization (GPO) algorithm (Shang et al., 2019)
- __init__(numax=1.0, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>, algo=None)¶
Initialization of the wrapper algorithm
- Parameters
numax (float) – parameter nu_max in the algorithm (default 1.0)
rhomax (float) – parameter rho_max in the algorithm, the maximum rho used (default 0.9)
rounds (int) – the number of rounds/budget (default 1000)
domain (list(list)) – the domain of the objective function
partition – the partition used in the optimization process
algo – the baseline algorithm used by the wrapper, such as T_HOO or HCT
- get_last_point()¶
The function to get the last point in GPO
- pull(time)¶
The pull function of GPO that returns a point to be evaluated
- Parameters
time (int) – The time step of the online process.
- Returns
point – The point chosen by the GPO algorithm
- Return type
list
- receive_reward(time, reward)¶
The receive_reward function of GPO to receive the reward for the chosen point
- Parameters
time (int) – The time step of the online process.
reward (float) – The (Stochastic) reward of the pulled point
PCT Algorithm¶
- class PyXAB.algos.PCT.PCT(numax=1, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Bases: PyXAB.algos.Algo.Algorithm
Implementation of the Parallel Confidence Tree (PCT) algorithm (Shang et al., 2019)
- __init__(numax=1, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Initialization of the PCT algorithm
- Parameters
numax (float) – parameter nu_max in the algorithm
rhomax (float) – parameter rho_max in the algorithm, the maximum rho used
rounds (int) – the number of rounds/budget
domain (list(list)) – the domain of the objective function
partition – the partition used in the optimization process
- get_last_point()¶
The function to get the last point for PCT
- pull(time)¶
The pull function of PCT that returns a point to be evaluated
- Parameters
time (int) – The time step of the online process.
- Returns
point – The point chosen by the PCT algorithm
- Return type
list
- receive_reward(time, reward)¶
The receive_reward function of PCT to receive the reward for the chosen point
- Parameters
time (int) – The time step of the online process.
reward (float) – The (Stochastic) reward of the pulled point
SequOOL Algorithm¶
- class PyXAB.algos.SequOOL.SequOOL_node(depth, index, parent, domain)¶
Bases: PyXAB.partition.Node.P_node
Implementation of the SequOOL node
- __init__(depth, index, parent, domain)¶
Initialization of the SequOOL node
- Parameters
depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents
- get_children()¶
The function to get the children of the node
- get_cpoint()¶
The function to get the center point of the domain
- get_depth()¶
The function to get the depth of the node
- get_domain()¶
The function to get the domain of the node
- get_index()¶
The function to get the index of the node
- get_parent()¶
The function to get the parent of the node
- get_reward()¶
The function to get the reward of the node
- not_opened()¶
The function to get the status of the node (opened or not)
- open()¶
The function to open a node
- update_children(children)¶
The function to update the children of the node
- Parameters
children – The children nodes to be updated
- update_reward(reward)¶
The function to update the reward list of the node
- Parameters
reward (float) – the reward for evaluating the node
- class PyXAB.algos.SequOOL.SequOOL(n=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Bases: PyXAB.algos.Algo.Algorithm
The implementation of the SequOOL algorithm (Bartlett, 2019)
- __init__(n=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
The initialization of the SequOOL algorithm
- Parameters
n (int) – The total number of rounds (budget)
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm
- get_last_point()¶
The function to get the last point in SequOOL
- Returns
point – The final output of the SequOOL algorithm
- Return type
list
- static harmonic_series_sum(n)¶
A static method for computing the summation of harmonic series
- Parameters
n (int) – The number of terms in the summation
- Returns
res – The sum of the series
- Return type
float
- pull(t)¶
The pull function of SequOOL that returns a point in every round
- Parameters
t (int) – time stamp parameter
- Returns
point – the point to be evaluated
- Return type
list
- receive_reward(t, reward)¶
The receive_reward function of SequOOL to obtain the reward and update the statistics
- Parameters
t (int) – The time stamp parameter
reward (float) – The reward of the evaluation
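The static method harmonic_series_sum(n) is described as returning the sum of a harmonic series with n terms. A minimal sketch of that computation, assuming it means 1 + 1/2 + ... + 1/n, is shown below. The standalone function name harmonic_sum is hypothetical, and the behavior for n = 0 (an empty sum equal to 0.0) is a choice made here rather than something the documentation states.

```python
def harmonic_sum(n):
    """Return 1 + 1/2 + ... + 1/n as a float (0.0 when n is 0)."""
    res = 0.0
    for i in range(1, n + 1):
        res += 1.0 / i
    return res


for n in (1, 2, 4, 100):
    print(n, harmonic_sum(n))
```

The sum grows roughly like the natural logarithm of n, so it stays small even for large budgets.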
StroquOOL Algorithm¶
- class PyXAB.algos.StroquOOL.StroquOOL_node(depth, index, parent, domain)¶
Bases: PyXAB.partition.Node.P_node
Implementation of the node in the StroquOOL algorithm
- __init__(depth, index, parent, domain)¶
Initialization of the StroquOOL node
- Parameters
depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents
- compute_mean_reward()¶
The function to compute the mean of the reward list of the node
- get_children()¶
The function to get the children of the node
- get_cpoint()¶
The function to get the center point of the domain
- get_depth()¶
The function to get the depth of the node
- get_domain()¶
The function to get the domain of the node
- get_index()¶
The function to get the index of the node
- get_mean_reward()¶
The function to get the mean of the reward list of the node
- get_parent()¶
The function to get the parent of the node
- get_visited_times()¶
The function to get the number of visited times of the node
- not_opened()¶
The function to get the status of the node (opened or not)
- open_node()¶
The function to open a node
- remove_reward()¶
The function to clear the reward list of a node
- update_children(children)¶
The function to update the children of the node
- Parameters
children – The children nodes to be updated
- update_reward(reward)¶
The function to update the reward list of the node
- Parameters
reward (float) – the reward for evaluating the node
- class PyXAB.algos.StroquOOL.StroquOOL(n=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Bases: PyXAB.algos.Algo.Algorithm
The implementation of the StroquOOL algorithm (Bartlett, 2019)
- __init__(n=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
The initialization of the StroquOOL algorithm
- Parameters
n (int) – The total number of rounds (budget)
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm
- get_last_point()¶
The function to get the last point in StroquOOL
- Returns
point – The final output of the StroquOOL algorithm
- Return type
list
- static harmonic_series_sum(n)¶
A static method for computing the summation of harmonic series
- Parameters
n (int) – The number of terms in the summation
- Returns
res – The sum of the series
- Return type
float
- pull(time)¶
The pull function of StroquOOL that returns a point in every round
- Parameters
time (int) – time stamp parameter
- Returns
point – the point to be evaluated
- Return type
list
- receive_reward(t, reward)¶
The receive_reward function of StroquOOL to obtain the reward and update the statistics. If the algorithm has already finished but rounds remain, this function does nothing
- Parameters
t (int) – The time stamp parameter
reward (float) – The reward of the evaluation
- reset_p()¶
The function to reset p for current situation
VROOM Algorithm¶
- class PyXAB.algos.VROOM.VROOM_node(depth, index, parent, domain)¶
Bases: PyXAB.partition.Node.P_node
Implementation of the node in the VROOM algorithm
- __init__(depth, index, parent, domain)¶
Initialization of the VROOM node
- Parameters
depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents
- add_rank(rank)¶
The method to set the rank of the cell
- Parameters
rank (int) – the rank of the cell at current depth
- get_children()¶
The function to get the children of the node
- get_cpoint()¶
The function to get the center point of the domain
- get_depth()¶
The function to get the depth of the node
- get_domain()¶
The function to get the domain of the node
- get_eval_time()¶
The function to get the evaluation time of the node
- get_index()¶
The function to get the index of the node
- get_mean_reward()¶
The function to get the mean of the reward of the node
- get_parent()¶
The function to get the parent of the node
- get_rank()¶
The function to get the rank of the cell
- Returns
rank – the rank of the cell at current depth
- Return type
int
- get_reward_tilde()¶
The function to get the reward tilde statistic of the node
- sample_uniform()¶
The function to uniformly sample a point from the domain of the node
- Returns
res – the point sampled by the sampler
- Return type
list
- update_children(children)¶
The function to update the children of the node
- Parameters
children – The children nodes to be updated
- update_reward(reward)¶
The function to update the reward of the node
- Parameters
reward (float) – the reward for evaluating the node
- update_reward_tilde(reward)¶
The function to update the reward tilde of the node
- Parameters
reward (float) – the reward tilde statistic of the node
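The node methods get_cpoint and sample_uniform both derive a point from the node's domain. The sketch below illustrates the two operations under the assumption that a domain is a list with one [lower, upper] pair per dimension. The standalone functions are hypothetical helpers for illustration and are not PyXAB's implementation.

```python
import random


def center_point(domain):
    """Return the midpoint of each [lower, upper] interval."""
    return [(lo + hi) / 2.0 for lo, hi in domain]


def sample_uniform(domain, rng=random):
    """Draw one point uniformly at random from the box described by domain."""
    return [rng.uniform(lo, hi) for lo, hi in domain]


domain = [[0.0, 1.0], [-2.0, 2.0]]
print(center_point(domain))  # [0.5, 0.0]

point = sample_uniform(domain, random.Random(0))
print(point)
```

Passing a seeded random.Random instance keeps the sampled points reproducible across runs.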
- class PyXAB.algos.VROOM.VROOM(n=100, h_max=100, b=None, f_max=None, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Bases: PyXAB.algos.Algo.Algorithm
The implementation of the VROOM algorithm (Ammar, Haitham, et al., 2020)
- __init__(n=100, h_max=100, b=None, f_max=None, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
The initialization of the VROOM algorithm
- Parameters
n (int) – The total number of rounds (budget)
h_max (int) – The upper bound on the depth of the search tree
b (float) – The parameter that measures the variation of the function
f_max (float) – An upper bound of the objective function
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm
- get_last_point()¶
The function to get the last point in VROOM
- Returns
point – The final output of the VROOM algorithm
- Return type
list
- pull(time)¶
The pull function of VROOM that returns a point in every round
- Parameters
time (int) – time stamp parameter
- Returns
point – the point to be evaluated
- Return type
list
- rank(nodes)¶
The rank function of VROOM that ranks the nodes at the same depth
- Parameters
nodes (list) – a list of nodes at the same depth
- receive_reward(time, reward)¶
The receive_reward function of VROOM to obtain the reward and update the statistics (for the current node)
- Parameters
time (int) – The time stamp parameter
reward (float) – The reward of the evaluation
VHCT Algorithm¶
- class PyXAB.algos.VHCT.VHCT_node(depth, index, parent, domain)¶
Bases: PyXAB.partition.Node.P_node
Implementation of the VHCT node
- __init__(depth, index, parent, domain)¶
Initialization of the VHCT node
- Parameters
depth (int) – depth of the node
index (int) – index of the node
parent – parent node of the current node
domain (list(list)) – domain that this node represents
- compute_tau_hi_value(nu, rho, c, bound, delta_tilde)¶
The function to compute the threshold tau_hi value for the VHCT node
- Parameters
nu (float) – parameter nu of the VHCT algorithm
rho (float) – parameter rho of the VHCT algorithm
c (float) – parameter c of the VHCT algorithm
bound (float) – parameter bound of the VHCT algorithm, the noise bound
delta_tilde (float) – modified confidence parameter delta_tilde of the VHCT algorithm
- compute_u_value(nu, rho, c, bound, delta_tilde)¶
The function to compute the u_{h,i} value of the node
- Parameters
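nu (float) – parameter nu of the VHCT algorithm
rho (float) – parameter rho of the VHCT algorithm
c (float) – parameter c of the VHCT algorithm
bound (float) – parameter bound of the VHCT algorithm, the noise bound
delta_tilde (float) – modified confidence parameter delta_tilde of the VHCT algorithm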
- get_b_value()¶
The function to get the b_{h,i} value of the node
- get_children()¶
The function to get the children of the node
- get_cpoint()¶
The function to get the center point of the domain
- get_depth()¶
The function to get the depth of the node
- get_domain()¶
The function to get the domain of the node
- get_index()¶
The function to get the index of the node
- get_mean_reward()¶
The function to get the mean reward of the node
- get_parent()¶
The function to get the parent of the node
- get_tau_hi_value()¶
The function to get the tau_hi value of the node
- get_u_value()¶
The function to get the u_{h,i} value of the node
- get_visited_times()¶
The function to get the number of visited times of the node
- update_b_value(b_value)¶
The function to update the b_{h,i} value of the node
- Parameters
b_value (float) – The new b_{h,i} value to be updated
- update_children(children)¶
The function to update the children of the node
- Parameters
children – The children nodes to be updated
- update_reward(reward)¶
The function to update the reward list of the node
- Parameters
reward (float) – the reward for evaluating the node
- class PyXAB.algos.VHCT.VHCT(nu=1, rho=0.5, c=0.1, delta=0.01, bound=1, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Bases: PyXAB.algos.Algo.Algorithm
The implementation of the Variance High Confidence Tree algorithm
- __init__(nu=1, rho=0.5, c=0.1, delta=0.01, bound=1, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Initialization of the VHCT algorithm
- Parameters
nu (float) – parameter nu of the VHCT algorithm
rho (float) – parameter rho of the VHCT algorithm
c (float) – parameter c of the VHCT algorithm
delta (float) – confidence parameter delta of the VHCT algorithm
bound (float) – the noise upper bound parameter bound
domain (list(list)) – The domain of the objective to be optimized
partition – The partition choice of the algorithm
- expand(parent)¶
The function to expand the tree at the parent node
- Parameters
parent – The parent node to be expanded
- get_last_point()¶
The function to get the last point of VHCT
- Returns
chosen_point – The point chosen by the algorithm
- Return type
list
- optTraverse()¶
The function to traverse the exploration tree to find the best path and the best node to pull at this moment.
- Returns
curr_node (Node) – The last node selected by the algorithm
path (List of Node) – The best path to traverse the partition selected by the algorithm
- pull(time)¶
The pull function of VHCT that returns a point in every round
- Parameters
time (int) – time stamp parameter
- Returns
point – the point to be evaluated
- Return type
list
- receive_reward(time, reward)¶
The receive_reward function of VHCT to obtain the reward and update the statistics
- Parameters
time (int) – time stamp parameter
reward (float) – the reward of the evaluation
- updateAllTree(path, reward)¶
The function to update everything in the tree
- Parameters
path (list) – the path from the root to the chosen node
reward (float) – the reward to update
- updateBackwardTree()¶
The function to update all the b_{h,i} value backwards in the tree
- updateRewardTree(path, reward)¶
The function to update the reward of each node in the path
- Parameters
path (list) – the path to find the best node
reward (float) – the reward to update
- updateUvalueTree()¶
The function to update the u_{h,i} value in the whole tree
VPCT Algorithm¶
- class PyXAB.algos.VPCT.VPCT(numax=1, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Bases: PyXAB.algos.Algo.Algorithm
Implementation of Variance-reduced Parallel Confidence Tree algorithm (VHCT + GPO)
- __init__(numax=1, rhomax=0.9, rounds=1000, domain=None, partition=<class 'PyXAB.partition.BinaryPartition.BinaryPartition'>)¶
Initialization of the VPCT algorithm
- Parameters
numax (float) – parameter nu_max in the algorithm
rhomax (float) – parameter rho_max in the algorithm, the maximum rho used
rounds (int) – the number of rounds/budget
domain (list(list)) – the domain of the objective function
partition – the partition used in the optimization process
- get_last_point()¶
The function to get the last point of VPCT.
- pull(time)¶
The pull function of VPCT that returns a point to be evaluated
- Parameters
time (int) – The time step of the online process.
- Returns
point – The point chosen by the VPCT algorithm
- Return type
list
- receive_reward(time, reward)¶
The receive_reward function of VPCT to receive the reward for the chosen point
- Parameters
time (int) – The time step of the online process.
reward (float) – The (Stochastic) reward of the pulled point