First note that the number of points that will be computed exactly is predetermined by the initial batch size, T (assuming infinite sampling budget). Let's call this number of points X.
The number of surviving points and samples per point looks like:
N points, each sampled T times
N/2 points, each sampled 2T times
N/4 points, each sampled 4T times
...
N/2^r = X points computed exactly, where r is roughly log_2(N/T), so X is roughly T.
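As a quick sanity check, the schedule can be enumerated directly. The sketch below (the function name and the example values of N and T are illustrative assumptions, not code from this repo) shows that the number of points reaching exact computation is pinned to T:

```python
import math

def csh_schedule(N, T):
    """Enumerate (surviving points, samples per point) for successive halving
    with N initial points and initial batch size T."""
    points, samples = N, T
    schedule = []
    while samples < N:                   # a point is "exact" once its budget reaches N
        schedule.append((points, samples))
        points = math.ceil(points / 2)   # cull the worse half
        samples *= 2                     # double the per-point budget
    schedule.append((points, min(samples, N)))   # the final X exactly-computed points
    return schedule

sched = csh_schedule(N=1_000_000, T=1_000)
print(sched[-1])   # roughly (T, N): X is pinned to T, independent of the data
```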
Now consider the case where there are 2X highly degenerate points that are all good candidates for the best arm. In the last halving step, half of these will get culled, potentially including the best medoid, due to random noise. For this reason, the final set of X points may not contain the true best arm.
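To make the failure mode concrete, here is a toy simulation of just that last halving step. Everything in it (the Gaussian noise model, the gap `eps`, the per-arm budget) is an assumed illustration rather than the actual distance estimator:

```python
import numpy as np

rng = np.random.default_rng(0)

X = 50           # number of points that survive to exact computation
eps = 1e-3       # tiny gap separating the true best arm from the rest
sigma = 0.1      # noise of a single sampled estimate
budget = 200     # samples per arm going into the last halving step

# 2X nearly degenerate candidates; arm 0 is the true best (lowest loss)
true_means = np.full(2 * X, 1.0)
true_means[0] -= eps

misses, trials = 0, 1000
for _ in range(trials):
    # noisy estimates of each arm after `budget` samples
    estimates = true_means + rng.normal(0.0, sigma / np.sqrt(budget), size=2 * X)
    survivors = np.argsort(estimates)[:X]   # keep the best-looking half
    if 0 not in survivors:
        misses += 1

# when the gap is buried in the noise, the true best arm is culled a large
# fraction of the time, so the final X points may not contain it
print(f"true best arm culled in {misses / trials:.1%} of trials")
```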
One possible explanation for why this might be a "worse" problem in CSH than in UCB: in UCB, we observe the number of points that need to be computed exactly, which gives us a sense of the "degeneracy" of the problem. We likely need to fine-tune T so that X is at least as large as the number of points computed exactly in UCB; any lower, and we are likely to miss the true medoid.
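For comparison, a toy UCB-style elimination loop (again a sketch: the confidence-bound form, the constants, and the minimization setup are simplified assumptions, not the actual UCB implementation) shows how the number of exactly-computed points is determined by the data rather than fixed up front; that observed count is roughly what X, and hence T, would need to cover:

```python
import numpy as np

def ucb_exact_count(true_means, n_refs, sigma=0.1, batch=100, delta=1e-3, seed=0):
    """Toy UCB elimination for a minimization bandit: count how many arms
    survive long enough to be sampled n_refs times (i.e. computed exactly)."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    alive = np.ones(k, dtype=bool)     # not yet eliminated
    counts = np.zeros(k, dtype=int)
    sums = np.zeros(k)

    # keep sampling arms that are neither eliminated nor already exact
    while (sampling := alive & (counts < n_refs)).any():
        idx = np.flatnonzero(sampling)
        noise = rng.normal(0.0, sigma * np.sqrt(batch), size=len(idx))
        sums[idx] += true_means[idx] * batch + noise
        counts[idx] += batch

        means = sums[idx] / counts[idx]
        width = sigma * np.sqrt(2 * np.log(1 / delta) / counts[idx])
        # eliminate arms whose lower bound clears the best upper bound
        best_ucb = np.min(means + width)
        alive[idx[means - width > best_ucb]] = False

    # arms that had to be computed exactly before they could be separated
    return int((counts >= n_refs).sum())

# a nearly degenerate instance: 100 arms within 1e-3 of each other
means = np.full(100, 1.0)
means[0] -= 1e-3
# essentially all arms end up computed exactly here, so T would need X >= that count
print(ucb_exact_count(means, n_refs=10_000))
```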
A lot of this boils down to the "fallback"/"degeneracy" threshold, i.e. the number of points that survive to exact computation, being prespecified by T in the CSH algorithm.