Fix random sampling in Mixture #3004
Wow this is much more difficult than I thought with
Merge #2984 first? It doesn't break anything new...
Totally - is that ready?
Yep! Tests all pass now, and I think it just passes
Also updated dependent_density_regression notebook
looks like a tricky change! I made a few tiny suggestions (and a docstring suggestion that might be wrong?)
pymc3/distributions/mixture.py
Outdated
@@ -197,21 +212,21 @@ class NormalMixture(Mixture):
the component standard deviations
tau : array of floats
the component precisions
distshape : shape of the Normal component
it looks like this is already called dist_shape in generate_samples (instead of distshape)... is it easy to change that here as well?
I will change it to comp_shape to avoid confusion.
pymc3/distributions/mixture.py
Outdated
distshape : shape of the Normal component
notice that it should be different than the shape
of the mixture distribution, with one axis being
the number of component.
small typo: "number of components". I am also not sure I understand the docstring -- would adding "a mixture of three 2d normal distributions would have shape=(3, 2) and dist_shape=(2,)" be accurate?
Yeah, this is a tricky one.
"a mixture of three 2d normal distributions would have shape=(3, 2) and dist_shape=(2,)": this is not correct. A mixture of three 2d normal distributions would have shape=(..., 2) and dist_shape=(2, 3). But I am not sure a multi-D normal mixture actually works...
In practice, if you have a NormalMixture RV with shape=(a,b,c,...), the component shape gets one more axis, comp_shape=(a,b,c,...,k), with k components. However, passing comp_shape=(a,b,c,...,k) (which is also what the previous code was trying to do) breaks cases where you are using a theano.shared observed and sample_ppc.
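To make the shape relationship above concrete, here is a minimal NumPy-only sketch of drawing from a normal mixture whose components sit on the last axis. It is not the code in this PR: the function name `sample_normal_mixture` and its arguments are made up for illustration, and it assumes a single weight vector of length k shared by every element of the mixture.

```python
import numpy as np


def sample_normal_mixture(w, mu, sd, size=(), rng=None):
    """Draw from a normal mixture with the k components on the last axis.

    w  : (k,) mixture weights summing to 1
    mu : (..., k) component means
    sd : (..., k) component standard deviations
    Returns an array of shape size + mu.shape[:-1].
    """
    rng = np.random.default_rng() if rng is None else rng
    mu, sd = np.broadcast_arrays(mu, sd)
    comp_shape = mu.shape          # (a, b, c, ..., k)
    mix_shape = comp_shape[:-1]    # (a, b, c, ...)
    k = comp_shape[-1]

    # Draw every component, then pick one component per mixture element.
    comp_samples = rng.normal(mu, sd, size=size + comp_shape)
    idx = np.asarray(rng.choice(k, size=size + mix_shape, p=w))
    picked = np.take_along_axis(comp_samples, idx[..., None], axis=-1)
    return picked[..., 0]


w = np.ones(3) / 3
mu = np.zeros((4, 5, 3))
sd = np.ones((4, 5, 3))
draws = sample_normal_mixture(w, mu, sd, size=(10,), rng=np.random.default_rng(0))
print(draws.shape)  # (10, 4, 5)
```

In this sketch a comp_shape of (4, 5, 3) yields a mixture of shape (4, 5); the extra trailing axis only exists for the components.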
Will merge if no more comments.
Of course, I am glad that bugs with this distribution are fixed 😄. However, I found that some of my code which relied on mixtures over multiple dimensions broke. Have you considered updating the Thanks! PS: I can provide a minimal breaking example if needed.
@ahmadsalim Please provide an example.
Here you go:

    import pymc3 as pm
    import numpy as np

    with pm.Model() as model:
        mus = pm.Normal('mus', shape=(6, 12))
        taus = pm.Gamma('taus', alpha=1, beta=1, shape=(6, 12))
        ws = pm.Dirichlet('ws', np.ones(12))
        mixture = pm.NormalMixture('m', w=ws, mu=mus, tau=taus)

I get the error:
Can confirm - could you please raise an issue?
Yes, of course!
The output from sample_ppc sometimes does not match the shape of the observed data, because the distribution.shape of ObservedRVs is []. Initially, this PR added the specific shape (i.e., the shape of the data) to the ObservedRVs, so that generated random samples would have the correct shape, which makes working with the random method easier (e.g., PR #2984). However, doing so turned out to break a lot of examples where a theano.shared observed is used, as updating the value via set_value changes the shape.
Now this PR just fixes the mixture random error (close #2954).