Skip to content

HSGP: Birthdays Example #627

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 37 commits into from
May 4, 2024
Merged

Conversation

juanitorduz
Copy link
Collaborator

@juanitorduz juanitorduz commented Jan 16, 2024

Closes #626

Scope:

TODO:

  • Intro
  • EDA
  • Simple model
  • Complete model
  • References
  • Additional packages in metadata.

📚 Documentation preview 📚: https://pymc-examples--627.org.readthedocs.build/en/627/

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@juanitorduz juanitorduz changed the title init HSGP: Birthdays Example Jan 16, 2024
@juanitorduz juanitorduz marked this pull request as draft January 16, 2024 20:31
@juanitorduz juanitorduz requested a review from bwengals January 16, 2024 20:31
@juanitorduz
Copy link
Collaborator Author

The simple model is ready for a first round of review.

@AlexAndorra
Copy link
Collaborator

Nice @juanitorduz !! So this is an extension of your original blog post? I was gonna read when I got the time, but I might read and review this NB if that's the same and helpful to you

@juanitorduz
Copy link
Collaborator Author

Well, we decided with Bill that we will present a simpler model to illustrate HSGP and then Bill will add at the end the complete model from Aki. I was not able to add the Horseshoe prior (very bad r hats and divergences) in my post so Bill will give it a go :)

All feedback is welcome!

@AlexAndorra
Copy link
Collaborator

Ok, so it sounds like it's your post updated + simpler model, so I'll definitely review this PR instead! Please feel free to ping me when it's ready for review

@juanitorduz
Copy link
Collaborator Author

juanitorduz commented Jan 19, 2024

As discussed with Bill, we want to keep the model specification closer to the Stsan one for this example. In 08f4bc8 , I changed the day_of_week parametrization from zero-sum-normal to one-hot-encoding setting Monday coefficient to zero. The relative contribution between days (say difference between Tuesday and Sunday) remained the same, it is more about the interpretation.

Remark: Note this corresponds to Model 3: Slow trend + yearly seasonal trend + day of week. In particular, if you go to the Stan code we see there is actually not an intercept as we are modeling the standardized births (vector[N] intercept = 0.0 + f_day_of_week[day_of_week], see https://github.com/avehtari/casestudies/blob/master/Birthdays/gpbf3.stan#L47C3-L47C58)

@juanitorduz
Copy link
Collaborator Author

@bwengals I added a small comment on the first basis vectors in 67ef3fc . Let me know what you think and how much you think we should expand (related to pymc-devs/pymc#7115)

@juanitorduz
Copy link
Collaborator Author

In view of #626 (comment) I think the very first iteration is ready for review in order to collect feedback.

@juanitorduz
Copy link
Collaborator Author

The pre-commit-ci jobs fails with a strange error. Still the pre-commit on the GitHub actions passes :)

Copy link
Collaborator Author

ok thanks!


View entire conversation on ReviewNB

@juanitorduz
Copy link
Collaborator Author

@bwengals just a small ping 🤗 for a last review 😅

Copy link
Collaborator

maybe missed it earlier, but I do think it'd be nice to say something like "Next we're going to systematically look at seasonality, first yearly, then monthly" ... etc. Just wanted to bring it up again in case it got lost in the shuffle -- also totally ok if you think it messes your flow


View entire conversation on ReviewNB

Copy link
Collaborator

Wanna re-mention this one:

All of these building blocks should not come as a surprise after looking into the EDA section.

I think it's better, especially for beginners, to not say things are obvious or not a surprise, because it doesn't help understanding and only potentially makes them feel dumb. To be honest after doing the EDA above, I dunno if I'd make the exact same model (unless I was copying the Vehtari case study).


View entire conversation on ReviewNB

Copy link

review-notebook-app bot commented Mar 27, 2024

View / edit / reply to this conversation on ReviewNB

bwengals commented on 2024-03-27T21:48:02Z
----------------------------------------------------------------

the classic birthdays example dataset (see {cite:p}gelman2013bayesian [Chapter 21])

I think it'd be good to also cite Vehtari et al's GPTools matlab package. I think this is where birthdays makes it's first appearance, but would need to double check.

rather focus on the implementation and how to use PyMC's {class}~pymc.gp.HSGP API.

and pm.gp.HSGPPeriodic right?

As at the moment there is no HSGP wrapper in NumPyro, this example is a great resource to learn about the method internals. 

could consider removing the "at the moment there is no HSGP wrapper in Numpyro", just so you no one will need to update this later if they do add it someday

juanitorduz commented on 2024-03-29T09:10:28Z
----------------------------------------------------------------

Done!

juanitorduz commented on 2024-03-29T09:11:02Z
----------------------------------------------------------------

I also added a link to theis note about the source of the birthdays dataset https://statmodeling.stat.columbia.edu/2020/10/25/birthday-data/

Copy link

review-notebook-app bot commented Mar 27, 2024

View / edit / reply to this conversation on ReviewNB

bwengals commented on 2024-03-27T21:48:03Z
----------------------------------------------------------------

Maybe no need to mention heteroscedasticity, since I don't think it comes up again later right?

Nvm, it comes up later


Copy link

review-notebook-app bot commented Mar 27, 2024

View / edit / reply to this conversation on ReviewNB

bwengals commented on 2024-03-27T21:48:04Z
----------------------------------------------------------------

Is this a raw cell or a markdown cell?


juanitorduz commented on 2024-03-29T08:29:38Z
----------------------------------------------------------------

No, it is a box-cell, see the rendered docs https://pymcio--627.org.readthedocs.build/projects/examples/en/627/gaussian_processes/GP-Births.html ;)

Copy link

review-notebook-app bot commented Mar 28, 2024

View / edit / reply to this conversation on ReviewNB

bwengals commented on 2024-03-28T23:39:49Z
----------------------------------------------------------------

this is nifty...


Copy link

review-notebook-app bot commented Mar 28, 2024

View / edit / reply to this conversation on ReviewNB

bwengals commented on 2024-03-28T23:39:50Z
----------------------------------------------------------------

Maybe a smaller markdown header size instead of a bullet for these?


juanitorduz commented on 2024-03-29T08:51:21Z
----------------------------------------------------------------

done!

Copy link

review-notebook-app bot commented Mar 28, 2024

View / edit / reply to this conversation on ReviewNB

bwengals commented on 2024-03-28T23:39:51Z
----------------------------------------------------------------

I wonder if the conclusion could be expanded a bit? It feels a bit like trailing off here, especially since you've beefed up the intro (which was really nice!). Some takeaways I have are:

  • It's great to be able to strategically fold GPs into larger models. It's "possible" with GPs, but HSGPs make that actually possible. This example is a great demonstration of this.
  • It might be good to point out how fast these things sample! If you were to replace just one of the HSGPs with gp.Latent, what would the estimated sampling time be, probably days? Months?
  • You could mention the main limitations of HSGPs, that they only apply to stationary covariances (in practice, ExpQuad, Matern52, Matern32), and they don't scale well with input dimension. But emphasize that in practice this isnt a huge deal. Most of the time this is what you're probably trying to do.


juanitorduz commented on 2024-03-29T09:08:33Z
----------------------------------------------------------------

Added! Let me know what you think :)

Copy link
Collaborator Author

Copy link
Collaborator Author

You are 100% right! Let me rephrase it 🙏


View entire conversation on ReviewNB

Copy link
Collaborator Author

Let me know what you think about the change :)


View entire conversation on ReviewNB

Copy link
Collaborator Author

done!


View entire conversation on ReviewNB

Copy link
Collaborator Author

Added! Let me know what you think :)


View entire conversation on ReviewNB

Copy link
Collaborator Author

Done!


View entire conversation on ReviewNB

Copy link
Collaborator Author

I also added a link to theis note about the source of the birthdays dataset https://statmodeling.stat.columbia.edu/2020/10/25/birthday-data/


View entire conversation on ReviewNB

@juanitorduz
Copy link
Collaborator Author

juanitorduz commented Mar 29, 2024

thank you for your comments @bwengals ! I think I addressed them all. Let me know what you think about the changes :)

https://pymcio--627.org.readthedocs.build/projects/examples/en/627/gaussian_processes/GP-Births.html is the current rendered version

Copy link
Collaborator

bwengals commented May 2, 2024

One last detail in the conclusion: The complexity for HSGP is $O(nm + m)$ instead of $O(nm^2)$. From the "Practical HSGP" paper on efficiency:


 While the pre-computation cost of the basis functions is O(m2n), the compu-
tational cost of learning the covariance function parameters is
O(mn + m) in every step of the optimizer or sampler. This is
a big advantage in terms of speed for iterative algorithms such
as Markov chain Monte Carlo (MCMC). Another advantage
is the reduced memory requirements of automatic differentia-
tion methods used in modern probabilistic programming frame-
works, such as Stan (Carpenter et al., 2017) and others. This is
because the memory requirements of automatic differentiation
scale with the size of the autodiff expression tree which in direct
implementations is simpler for basis function than covariance
matrix-based approaches

@juanitorduz
Copy link
Collaborator Author

Thank you @bwengals ! Fixed in adf2170 :)

@AlexAndorra
Copy link
Collaborator

Shall I merge @juanitorduz ?

@juanitorduz
Copy link
Collaborator Author

Shall I merge @juanitorduz ?

Yes :) 🙏

@AlexAndorra AlexAndorra merged commit 4875104 into pymc-devs:main May 4, 2024
2 checks passed
@juanitorduz juanitorduz deleted the hsgp_births branch May 4, 2024 19:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Notebook Proposal : Time Series Modeling with HSGP: Baby Births Example
4 participants