GH-15958: GLM fix Tweedie ML dispersion estimation #15959

Merged
merged 4 commits into from
Dec 15, 2023

Conversation

tomasfryda
Contributor

#15958

I looked at Torczon's multi-directional search method, but I don't think it is suitable for this task because this is just a one-dimensional optimization problem. Furthermore, the expansion (2) and contraction (0.5) constants proposed in the thesis (https://repository.rice.edu/server/api/core/bitstreams/6bfc12f5-a69e-44bf-9e67-1d2e738bec5a/content) would be suboptimal for a one-dimensional problem (the golden section search derivation explains why: https://homepages.math.uic.edu/~jan/mcs471f05/Lec9/gss.pdf). So I used golden section search.
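
For readers unfamiliar with the method, here is a minimal sketch of golden section search for a one-dimensional maximization. It is written in Python for illustration and is not the code from this PR; the function name, the bracket arguments `lo`/`hi`, and the toy objective are all assumptions.

```python
import math

# 1/phi, the ratio by which the bracket shrinks on every iteration (~0.618).
INV_PHI = (math.sqrt(5) - 1) / 2


def golden_section_maximize(f, lo, hi, tol=1e-8, max_iter=200):
    """Maximize a unimodal function f on the bracket [lo, hi] (hypothetical helper)."""
    # Two interior probe points, placed symmetrically inside the bracket.
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        if fc > fd:
            # The maximum lies in [lo, d]; the old c becomes the new d,
            # so only one new function evaluation is needed.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # The maximum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


# Toy objective standing in for a log-likelihood that peaks at 2.
best = golden_section_maximize(lambda x: -(x - 2.0) ** 2, 0.0, 5.0)
```

Because each step reuses one of the two previous probe points, the search needs only one new objective evaluation per iteration, which matters when each evaluation is an expensive likelihood computation.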

Description

I found that the problem is not only with the gradient and Hessian but also with the Tweedie likelihood calculated using the series method. So the fix is to calculate the log-likelihood once in a while (every 10th iteration) using the Tweedie estimator I implemented for variance power and dispersion estimation (it combines the series and Fourier inversion methods to achieve a more precise log-likelihood estimate) and compare it with the best value so far. I call this a sanity check: basically, check whether we are improving, and if we got worse, switch to the golden section search. The last sanity check happens when the algorithm thinks it has converged; sometimes the value explodes so fast that it would never reach the periodic sanity check that would reveal we actually got worse.
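
The control flow described above can be sketched roughly as follows. This is an illustrative Python outline, not the Java code in this PR: `newton_step`, `accurate_loglik`, and `fallback_search` are hypothetical callables supplied by the caller, and the guard against non-finite or non-positive values is an extra assumption that the description does not mention.

```python
import math


def estimate_with_sanity_checks(newton_step, accurate_loglik, fallback_search,
                                phi0, eps=1e-6, check_every=10, max_iter=100):
    """Newton-style iteration with periodic sanity checks (hypothetical sketch)."""
    best_phi, best_ll = phi0, accurate_loglik(phi0)
    phi = phi0
    for it in range(1, max_iter + 1):
        phi_new = newton_step(phi)
        # Assumption (not from the PR text): treat invalid dispersion values
        # as a failed step and fall back immediately.
        if not math.isfinite(phi_new) or phi_new <= 0:
            return fallback_search(best_phi)
        converged = abs(phi_new - phi) < eps
        phi = phi_new
        # Sanity check every `check_every` iterations and once more at convergence.
        if converged or it % check_every == 0:
            ll = accurate_loglik(phi)
            if not (ll >= best_ll):  # also true when ll is NaN
                # We got worse: switch to the robust fallback (e.g. golden section).
                return fallback_search(best_phi)
            best_phi, best_ll = phi, ll
        if converged:
            return phi
    # Iteration budget exhausted: return the last value that passed a check.
    return best_phi
```

The accurate log-likelihood is evaluated only at the check points, so its extra cost is paid on a fraction of the iterations rather than on every one.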

Speed concerns

Basically the same as for the Tweedie variance power and dispersion estimation: for some values (e.g., Tweedie variance power close to 1, i.e. p < 1.2) it takes longer to estimate the likelihood.

Golden section search has linear convergence, so it is asymptotically slower than Newton's method, but it appears more robust to noise, and since it doesn't require calculating the gradient and Hessian, it doesn't seem that much slower for practical purposes.
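
To make "linear convergence" concrete: golden section search shrinks the bracket by a constant factor of 1/φ ≈ 0.618 per iteration, so the iteration count grows with the logarithm of the requested precision. The helper below is a hypothetical Python sketch for estimating that count; the bracket width and tolerance used in the example are made-up values, not settings from this PR.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # bracket shrink factor per iteration (~0.618)


def golden_section_iterations(width, tol):
    """Iterations needed to shrink a bracket of the given width below tol."""
    if tol >= width:
        return 0
    return math.ceil(math.log(tol / width) / math.log(INV_PHI))


# Example: shrinking a unit-width bracket to 1e-6 takes 29 iterations.
n = golden_section_iterations(1.0, 1e-6)
```

Each of those iterations costs a single likelihood evaluation, with no gradient or Hessian.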

Results

I managed to get the results close to R's MLE dispersion estimation (often a little bit better than R; see the test). Note that the value we get from summary(glm)$dispersion is not the MLE, but it seems close.

Estimation from summary:

    sum((object$weights * object$residuals^2)[object$weights > 0])/df.r
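
As a rough Python equivalent of that R expression (a sketch, assuming that `object$weights` and `object$residuals` are the working weights and working residuals of the fitted `glm` and that `df.r` is the residual degrees of freedom; the function and argument names are hypothetical):

```python
def summary_dispersion(working_weights, working_residuals, df_residual):
    """Pearson-type dispersion estimate, mirroring the R expression above."""
    total = sum(
        w * r * r
        for w, r in zip(working_weights, working_residuals)
        if w > 0  # observations with zero weight are skipped, as in R
    )
    return total / df_residual


# Tiny made-up example: the third observation has zero weight and is ignored.
disp = summary_dispersion([1.0, 2.0, 0.0], [1.0, 2.0, 100.0], 3)
```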

MLE dispersion in R:

    tp <- tweedie.profile( yr ~ xt,
                           p.vec= tweedie_p,
                           link.power = 0,
                           data = simData,
                           weights = weight,
                           offset = offset_col,
                           phi.method = "mle",
                           do.smooth = FALSE,
                           verbose = 0
    )
    rdispersion <- tp$phi.max

The problem with R's MLE dispersion estimation is that it sometimes takes a very long time (and I don't know whether it finishes).

So I used the estimate from summary for the following plots.

These plots show how it used to behave for Tweedie variance powers in [1.2, 1.8]:
[image: tweedie_disp]

and how it behaves after the change:
[image: tweedie_disp_new]

And these two show the same thing on a log scale.

Before:
[image: tweedie_disp_log]

After:
[image: tweedie disp new log]

The previous plots were generated using the dispersion estimate from summary(glm), so I recalculated the same thing with the true MLE. It matches up to Tweedie variance power = 1.7, where R gets stuck (shown as the MLE threshold in the plot). For the rest of the values ([1.7, 1.85]) I use the summary-style calculation.
[image: tweedie]

@tomasfryda tomasfryda added this to the 3.44.0.3 milestone Nov 30, 2023
@tomasfryda tomasfryda self-assigned this Nov 30, 2023
@wendycwong
Contributor

The Torczon multi-directional search method is meant to avoid simplex degeneration over the iterations. If you do not run into the degeneration problem, then you won't need it.

@tomasfryda
Contributor Author

Since it is just a one-dimensional optimization, I think it's not a problem. If the 1-simplex (a line segment) degenerates, it becomes just a point. That could happen, but only due to finite precision, and that 0-simplex should be the local optimum. In practice it shouldn't happen, since we would converge well before hitting finite-precision problems, unless the user sets the dispersion epsilon to machine epsilon or smaller (zero or a negative number). So I think it should be OK unless I'm missing something. Does that sound reasonable, @wendycwong?

@wendycwong
Contributor

Yes, your reasoning sounds good. I obviously did not read your message carefully and it is the very first one. Really have no excuse here.

wendycwong
wendycwong previously approved these changes Dec 11, 2023
 /**
  * This method estimates the tweedie dispersion parameter. It will use Newton's update if the new update will
  * increase the loglikelihood. Otherwise, the dispersion will be updated as
  * dispersionNew = dispersionCurr + learningRate * update.
  * In addition, line search is used to increase the magnitude of the update when the update magnitude is too small
  * (< 1e-3).
  *
- * For details, please see seciton IV.I, IV.II, and IV.III in document here:
+ * For details, please see section IV.I, IV.II, and IV.III in document here:
  */
Contributor

replace section to sections

Contributor Author

Will do. Looking at the nice documentation you wrote in the comments made me realize that I didn't update it. I'll do it so it's up-to-date with the code. Thank you for directing my attention here!

@tomasfryda
Contributor Author

> Yes, your reasoning sounds good. I obviously did not read your message carefully and it is the very first one. Really have no excuse here.

Don't worry about it @wendycwong. No excuse needed. It happens to me all the time :)

@tomasfryda tomasfryda merged commit bfc1bd0 into rel-3.44.0 Dec 15, 2023
2 checks passed
@tomasfryda tomasfryda deleted the tomf_GH-15958_fix_tweedie_dispersion branch December 15, 2023 17:34
2 participants