Home
This FAQ is curated by Luigi Acerbi and is constantly being expanded.
For an IBS tutorial and example, see ibs_example.m (in MATLAB); other languages to be added.
If you have questions not covered here, please feel free to ask me at [email protected] (putting 'IBS' in the subject of the email).
Acknowledgments: Most of the questions currently answered here originated in a live Q&A session with the Ma lab. Thanks to Hsin-Hung Li for taking notes.
-
No, this is not okay: doing so would essentially revert IBS to a fixed-sampling method, with all the associated problems discussed in the paper. A more principled way is to put an early-stopping threshold on the log-likelihood, as described in the paper.
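As a rough illustration of what an early-stopping threshold on the log-likelihood could look like, here is a minimal Python sketch of a single IBS run. It is not the toolbox implementation: `ibs_loglik`, `simulate`, `coin_model`, and the choice of a chance-level bound are all hypothetical names and assumptions made for this example. The sketch uses the fact that each failed draw on a trial contributes -1/k to the estimate, so the running total can only decrease and sampling can stop as soon as it reaches the bound.

```python
import math
import random


def ibs_loglik(simulate, stimuli, responses, rng, lower_bound=-math.inf):
    """Single IBS run over all trials, with an optional early-stopping bound.

    `simulate(stimulus, rng)` is a hypothetical user-supplied function that
    draws one synthetic response from the model at fixed parameters.
    """
    total = 0.0
    for stim, resp in zip(stimuli, responses):
        k = 1
        # Keep sampling until the simulated response matches the observed one.
        while simulate(stim, rng) != resp:
            total -= 1.0 / k  # the k-th miss on a trial contributes -1/k
            k += 1
            if total <= lower_bound:
                # Early stop: the estimate is already at or below the bound.
                return lower_bound
    return total


def coin_model(stim, rng):
    """Toy model (assumption): responds 1 with probability 0.7."""
    return int(rng.random() < 0.7)


rng = random.Random(0)
stimuli = [None] * 50
responses = [1] * 35 + [0] * 15
# Assumed bound for this example: log-likelihood of a chance model (p = 0.5).
chance_bound = len(responses) * math.log(0.5)
ll = ibs_loglik(coin_model, stimuli, responses, rng, lower_bound=chance_bound)
print(ll)
```

Note that once the bound is hit, the returned value is a floor rather than an unbiased estimate, so the threshold should sit well below the log-likelihood values you actually care about.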
-
Is it important to provide the standard deviation of the IBS estimator to the optimization/inference algorithm for every parameter combination evaluated?
It depends:
- If you are optimizing the target log-likelihood (e.g., for maximum-likelihood or maximum-a-posteriori estimation), then it might help, but it is not necessary because the IBS estimator variance, somewhat surprisingly, is nearly constant across the parameter space. Moreover, the BADS optimizer (which we recommend using in combination with IBS; see also below) does not currently support user-provided, input-dependent noise, so in that case it is not even an option.
- If you are performing Bayesian inference, for example using the VBMC toolbox, then it is necessary to provide the standard deviation of the IBS estimate to the algorithm. Bayesian inference is very sensitive to noisy estimates of the log-likelihood (or log-posterior), so it is crucial to provide the inference algorithm with all available information about the magnitude of observation noise.
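To make concrete what "providing the standard deviation" involves, the sketch below returns both an IBS log-likelihood estimate and an estimate of its standard deviation. It assumes the per-trial variance estimate is the sum of 1/k² over the failed draws (our reading of the variance estimator given in the paper), and that repeats are independent and averaged. The names `ibs_estimate` and `simulate` are hypothetical; this is not the toolbox API, and how the pair is passed to a given inference algorithm depends on that algorithm's own interface.

```python
import math
import random


def ibs_estimate(simulate, stimuli, responses, n_reps, rng):
    """Return (log-likelihood estimate, estimated SD), averaged over repeats.

    `simulate(stimulus, rng)` is a hypothetical function drawing one
    synthetic response from the model at fixed parameters.
    """
    ll_sum = 0.0
    var_sum = 0.0
    for _ in range(n_reps):
        for stim, resp in zip(stimuli, responses):
            k = 1
            while simulate(stim, rng) != resp:
                ll_sum -= 1.0 / k        # contribution to the estimate
                var_sum += 1.0 / k ** 2  # contribution to its variance
                k += 1
    # Averaging n_reps independent runs divides the variance of the sum
    # by n_reps**2, hence the SD by n_reps.
    return ll_sum / n_reps, math.sqrt(var_sum) / n_reps


def coin_model(stim, rng):
    """Toy model (assumption): responds 1 with probability 0.7."""
    return int(rng.random() < 0.7)


rng = random.Random(1)
ll_hat, sd_hat = ibs_estimate(coin_model, [None] * 40, [1] * 28 + [0] * 12, 10, rng)
print(ll_hat, sd_hat)
```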
-
For computational reasons, we often cannot afford to evaluate the log-likelihood of every parameter combination with high precision while optimizing the parameters. However, once we have found the (supposedly) best parameter combination, we could increase the precision (e.g., the number of IBS repeats). Is this advisable?
Yes, absolutely. It should be considered standard practice, regardless of IBS. Whenever optimizing a noisy target function, after obtaining a candidate solution from an optimization method, one should evaluate the target function at the solution with higher precision.
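A minimal sketch of this practice, independent of IBS: given any noisy target and a candidate solution returned by an optimizer, average many fresh evaluations at that point and report the mean with its standard error. The names `reevaluate` and `noisy_target` are hypothetical, and the toy target below merely stands in for a noisy log-likelihood.

```python
import math
import random
import statistics


def reevaluate(noisy_fun, x, n_evals):
    """Average `n_evals` fresh noisy evaluations at x; return (mean, SEM)."""
    if n_evals < 2:
        raise ValueError("need at least 2 evaluations to estimate the SEM")
    vals = [noisy_fun(x) for _ in range(n_evals)]
    mean = statistics.fmean(vals)
    sem = statistics.stdev(vals) / math.sqrt(n_evals)
    return mean, sem


rng = random.Random(2)


def noisy_target(x):
    """Toy stand-in (assumption): a quadratic plus unit Gaussian noise."""
    return -(x - 1.0) ** 2 + rng.gauss(0.0, 1.0)


x_candidate = 0.9  # pretend this came out of the optimizer
mean, sem = reevaluate(noisy_target, x_candidate, n_evals=200)
print(mean, sem)
```

When comparing several candidate solutions (e.g., from multiple optimization runs), the same re-evaluation can be applied to each before picking the best one.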
-
In an ideal world, would you let the number of IBS repeats depend on how close the optimization algorithm thinks it is to the maximum, i.e., some form of adaptive precision?
Yes, this is a good idea and a topic of ongoing research.
-
I am trying to get more precise results. As I increase the number of IBS repeats, the standard deviation of the estimated log-likelihood goes down slowly, but the computational time increases linearly. Is this trend normal?
Well, think about it. The number of repeats is literally the number of times the IBS estimator is run, so the computational time has to be linear in the number of repeats. On the other hand, increasing the number of repeats increases the number of independent log-likelihood estimates you are averaging over. As is well known, the standard error of the mean decreases as the inverse square root of the number of independent estimates (in this case, the number of repeats).
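The trade-off can be written compactly. The symbols here are introduced only for illustration: R is the number of repeats, σ₁ the standard deviation of a single-repeat IBS estimate, and c₁ the computational cost of one repeat.

```latex
\hat{L}_R = \frac{1}{R}\sum_{r=1}^{R} \hat{L}^{(r)}, \qquad
\operatorname{SD}\big[\hat{L}_R\big] = \frac{\sigma_1}{\sqrt{R}}, \qquad
\operatorname{cost}(R) \approx R \, c_1
```

For example, going from R = 10 to R = 40 repeats quadruples the cost but only halves the standard deviation.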
-
Any guideline on how to balance computation time and precision of the IBS estimates (i.e., number of repeats)?
The algorithm you are using (e.g., BADS or VBMC) will often have some recommendation for how much noise in the log-likelihood it can handle, usually of order ~1. If you cannot decrease the log-likelihood observation noise to be ~1 or less, try to be as precise as possible within the available computational resources.
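As a back-of-the-envelope aid, the inverse-square-root scaling of the standard deviation with the number of repeats gives a quick way to estimate how many repeats a target noise level would need. The helper below is a hypothetical sketch: `repeats_for_target_sd` is not part of any toolbox, and the single-repeat standard deviation is assumed to have been measured beforehand (e.g., at a representative parameter setting).

```python
import math


def repeats_for_target_sd(sd_one_repeat, target_sd=1.0):
    """Smallest number of repeats R such that sd_one_repeat / sqrt(R) <= target_sd."""
    if sd_one_repeat <= 0 or target_sd <= 0:
        raise ValueError("standard deviations must be positive")
    return max(1, math.ceil((sd_one_repeat / target_sd) ** 2))


# If one repeat has SD 3 and the algorithm tolerates noise of about 1,
# roughly 9 repeats are needed.
print(repeats_for_target_sd(3.0, target_sd=1.0))
```

If the resulting number of repeats is beyond the available budget, the advice above applies: use as many repeats as the computational resources allow.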
-
The main assumption for it to work well is roughly that the trial likelihoods are correlated across (reasonable) regions of parameter space, such that you can compute the resource allocation for a given "representative" set of parameters, and that this allocation of resources is still beneficial across iterations of the optimization or inference algorithm.
-
For which cases are trial-dependent repeats more preferable than fixed repeats (e.g., 20 repeats for every trial)?
In theory, whenever the above assumption holds, which often seems to be the case in practice. However, more empirical studies are needed.
-
I am interested in using IBS for problems with continuous responses. Should I try and implement ABC-IBS or is discretization enough? And how do I set the number of bins in the discretization (or, equivalently, the epsilon radius for ABC-IBS)?
While ABC-IBS as briefly described in the paper is a slightly better approach statistically, for most problems it would not make a big difference if one simply discretizes the response space. Using ABC-IBS (or discretizing the space) is roughly equivalent to adding localized uniform noise to the response of the model being fit, with radius equal to half the bin size (or equal to epsilon). So, as a rule of thumb, one wants this added noise to be (much) less than the magnitude of the noise present in the data.
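A minimal sketch of the discretization route, under stated assumptions: responses are binned on a regular grid, two responses "match" when they fall in the same bin, and the bin width is chosen so that half the bin width is a small fraction of the noise in the data. The function names, the grid origin `lo`, and the default fraction of 0.25 are all hypothetical choices for illustration, not recommendations from the paper.

```python
import math


def discretize(response, lo, bin_width):
    """Map a continuous response to an integer bin index on a regular grid."""
    return math.floor((response - lo) / bin_width)


def bin_width_from_noise(noise_sd, fraction=0.25):
    """Pick a bin width whose half-width is `fraction` of the data noise SD."""
    if noise_sd <= 0 or fraction <= 0:
        raise ValueError("noise_sd and fraction must be positive")
    return 2.0 * fraction * noise_sd


def responses_match(simulated, observed, lo, bin_width):
    """Matching criterion for IBS after discretization: same bin."""
    return discretize(simulated, lo, bin_width) == discretize(observed, lo, bin_width)


width = bin_width_from_noise(noise_sd=2.0)         # half-width = 0.5
print(width)
print(responses_match(3.1, 3.4, 0.0, width))       # both fall in bin 3
print(responses_match(3.9, 4.1, 0.0, width))       # bins 3 and 4
```

Smaller bins add less artificial noise but make matches rarer, so each IBS trial needs more samples on average.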
-
BADS is a robust optimizer that works well with stochastic target functions, and in particular with the noisy estimates produced by IBS. Answers to many questions about using BADS can be found in the BADS general FAQ. In particular, you might want to start with the section of the FAQ dedicated to noisy objective functions (but do not stop there; all sections of the FAQ are relevant).