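    set.seed(1)
    # Draw 10 observations from N(0, 6); rnorm() takes the standard deviation
    data = rnorm(10, 0, sqrt(6))
    # Plug-in estimate; the outer parentheses print the assigned value
    (thetahat = exp(mean(data)))
    #> [1] 1.382411
    # m = 1000 bootstrap replicates of the plug-in estimator
    bs = replicate(1000, {
      resample = sample(data, 10, replace = TRUE)
      exp(mean(resample))
    })
    # Bootstrap estimate of the bias
    (biashat = mean(bs) - thetahat)
    #> [1] 0.2973734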
Using the bootstrap for bias reduction
I came across a neat example in Horowitz (2001, p. 3174) which demonstrates that, in these specific circumstances at least, the bias-corrected bootstrap estimator has a substantially lower MSE than the plug-in estimator. The setup is as follows. We have a sample of 10 iid observations \(X_{i}\sim N(0,6)\), where 6 is the variance. The goal is to estimate \(\theta=\exp(\text{E}[X_{i}])\), whose true value is \(\theta=\exp(0)=1\). The plug-in estimator is \(\hat{\theta}=\exp\left(\frac{1}{10}\sum_{i=1}^{10}X_{i}\right)\).
Given a realized sample \(\mathbf{x}=(x_{1},\ldots,x_{10})\), the usual bootstrap estimates are obtained by resampling \(m\) times from \(\mathbf{x}\) with replacement, generating the bootstrap samples \(\mathbf{x}_{j}^{*}=(x_{1j}^{*},\ldots,x_{10,j}^{*})\) and the bootstrap estimates \(\hat{\theta}_{j}^{*}=\exp\left(\frac{1}{10}\sum_{i=1}^{10}x_{ij}^{*}\right)\). Let \(\hat{\theta}^{*}=\frac{1}{m}\sum_{j=1}^{m}\hat{\theta}_{j}^{*}\) be the average across all \(\hat{\theta}_{j}^{*}\). We can then estimate the bias as \(\widehat{\text{Bias}}[\hat{\theta}]=\hat{\theta}^{*}-\hat{\theta}\). In R code, with \(m=1000\) resamples, this is `mean(bs) - thetahat`, where `bs` is the vector of bootstrap estimates and `thetahat` is the plug-in estimate.
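The same computation can be wrapped in a small function so that it works for any statistic. This is only a minimal sketch: the name `boot_bias` and its arguments `x`, `stat`, and `m` are hypothetical choices for illustration and do not come from Horowitz (2001).

```r
# Hypothetical helper (illustrative, not from Horowitz 2001): bootstrap
# estimate of the bias of the plug-in estimator stat(x), using m resamples
# drawn from x with replacement.
boot_bias = function(x, stat, m = 1000) {
  thetahat = stat(x)
  # Recompute the statistic on each bootstrap sample
  bs = replicate(m, stat(sample(x, length(x), replace = TRUE)))
  # Average bootstrap estimate minus the plug-in estimate
  mean(bs) - thetahat
}

set.seed(1)
x = rnorm(10, 0, sqrt(6))
biashat = boot_bias(x, function(z) exp(mean(z)))
```

Passing the statistic as a function keeps the resampling logic separate from the estimator, so the same helper could be reused for other plug-in estimators.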
The “debiased” estimate is therefore \(\hat{\theta}-\widehat{\text{Bias}}[\hat{\theta}]=2\hat{\theta}-\hat{\theta}^{*}\). For the sample drawn with `set.seed(1)`, this is \(1.382-0.297=1.085\), much closer to the true value \(\theta=1\) than the plug-in estimate of \(1.382\).
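As a quick check of this arithmetic, the snippet below plugs in the two numbers reported in this post and confirms that both forms of the debiased estimate agree. The variable `thetastar` is an illustrative name for \(\hat{\theta}^{*}\), recovered here from the reported bias estimate rather than recomputed.

```r
# Values copied from the output reported in this post
thetahat = 1.382411
biashat = 0.2973734

# Mean of the bootstrap estimates, recovered from biashat = thetastar - thetahat
thetastar = thetahat + biashat

# Debiased estimate: plug-in estimate minus estimated bias
debiased = thetahat - biashat

# Both expressions for the debiased estimate agree
all.equal(debiased, 2 * thetahat - thetastar)
#> [1] TRUE
round(debiased, 3)
#> [1] 1.085
```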
Because we control the data-generating process and know the true value of \(\theta\), we can repeat the above procedure any number of times and obtain approximations for the MSEs of \(\hat{\theta}\) and \(\hat{\theta}-\widehat{\text{Bias}}[\hat{\theta}]\). The following code accomplishes that for 100 repetitions:
    res = replicate(100, {
      data = rnorm(10, 0, sqrt(6))
      thetahat = exp(mean(data))
      bs = replicate(1000, {
        resample = sample(data, 10, replace = TRUE)
        exp(mean(resample))
      })
      debiased = 2 * thetahat - mean(bs)
      # Errors and squared errors relative to the true value theta = 1
      c(thetahat - 1, debiased - 1, (thetahat - 1)^2, (debiased - 1)^2)
    })
    # Rows: bias of thetahat, bias of debiased, MSE of thetahat, MSE of debiased
    apply(res, 1, mean)
    #> [1] 0.37878143 -0.04919049 1.10729457 0.47833810
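With only 100 repetitions, it may be worth looking at how noisy these averages are. The sketch below wraps the simulation in a function and reports a Monte Carlo standard error next to each average. The function name `simulate_bias`, its arguments, and the row labels are hypothetical and chosen for illustration only.

```r
# Hypothetical wrapper around the simulation in this post; the function name,
# argument names, and labels are illustrative.
simulate_bias = function(reps = 100, n = 10, m = 1000, theta = 1) {
  res = replicate(reps, {
    data = rnorm(n, 0, sqrt(6))
    thetahat = exp(mean(data))
    bs = replicate(m, exp(mean(sample(data, n, replace = TRUE))))
    debiased = 2 * thetahat - mean(bs)
    # Errors and squared errors relative to the true value theta
    c(bias_plugin = thetahat - theta,
      bias_debiased = debiased - theta,
      mse_plugin = (thetahat - theta)^2,
      mse_debiased = (debiased - theta)^2)
  })
  # Monte Carlo mean and standard error of each quantity across repetitions
  rbind(estimate = rowMeans(res),
        std_error = apply(res, 1, sd) / sqrt(reps))
}

set.seed(1)
out = simulate_bias(reps = 100)
```

Because \(\exp(\cdot)\) of a normal variable is right-skewed, a few large draws can move the averages noticeably, so increasing `reps` is a reasonable way to stabilize the estimates.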
By making use of the identity \(\text{MSE}[\cdot]=\text{Bias}^{2}[\cdot]+\text{Var}[\cdot]\), we obtain the following results:
| Estimator | MSE | Bias | Variance |
|---|---|---|---|
| \(\hat{\theta}\) | 1.107 | 0.379 | 0.964 |
| \(\hat{\theta}-\widehat{\text{Bias}}[\hat{\theta}]\) | 0.478 | -0.049 | 0.476 |
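The variance column is not simulated separately; it follows from rearranging the identity to \(\text{Var}[\cdot]=\text{MSE}[\cdot]-\text{Bias}^{2}[\cdot]\). A short sketch of that step, using the simulation output reported in this post (the vector names `bias`, `mse`, and `variance` are illustrative):

```r
# Bias and MSE estimates copied from the simulation output in this post
bias = c(plugin = 0.37878143, debiased = -0.04919049)
mse  = c(plugin = 1.10729457, debiased = 0.47833810)

# Rearranged identity: Var = MSE - Bias^2
variance = mse - bias^2

# Rounds to 0.964 (plug-in) and 0.476 (debiased), matching the table
round(variance, 3)
```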
Similar to the results reported in Horowitz (2001, p. 3175), there is a large reduction in both bias and MSE. Not reported by Horowitz, but also significant, is the reduction in variance. The true bias¹ of \(\hat{\theta}\) is \(\exp(0.3) - 1 \approx 0.35\), so the simulation estimate of 0.379 is not far off.
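The true bias can be checked both analytically and by brute force. The sketch below uses the log-normal mean \(\exp(\mu+\sigma^{2}/2)\) with \(\mu=0\) and \(\sigma^{2}=0.6\), and then cross-checks it by drawing the sample mean directly. The variable names and the choice of one million draws are assumptions made for illustration.

```r
# Y = mean of 10 draws from N(0, 6), so Y ~ N(0, 0.6)
sigma2 = 6 / 10

# Log-normal mean exp(mu + sigma^2 / 2) with mu = 0, minus the true theta = 1
true_bias = exp(0 + sigma2 / 2) - 1
round(true_bias, 3)
#> [1] 0.35

# Monte Carlo cross-check: draw Y directly and average exp(Y) - 1
set.seed(1)
mc_bias = mean(exp(rnorm(1e6, 0, sqrt(sigma2)))) - 1
```

Here `mc_bias` should land close to `true_bias`; with this many draws the Monte Carlo standard error is on the order of 0.001.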
References
Horowitz, Joel L. 2001. “The Bootstrap.” In: Handbook of Econometrics, Volume 5, edited by J. J. Heckman and E. Leamer. Elsevier.
Footnotes
1. Let \(Y = \frac{1}{10} \sum_{i=1}^{10} X_i\). Then \(Y \sim N(0, 0.6)\), and \(\hat{\theta}= \exp (Y) \sim \text{LogNormal}(0, 0.6)\). A log-normal random variable with parameters \(\mu\) and \(\sigma^2\) has mean \(\exp \left( \mu + \frac{\sigma^2}{2} \right)\), hence \(\text{E}[\hat{\theta}] = \exp(0.3)\).