Definition of "noise distribution" terms #191

Merged
merged 14 commits into from
Nov 7, 2014

cmaumet
Member

@cmaumet cmaumet commented Oct 24, 2014

This issue is a companion for #176 to define the terms created to model noise distributions, namely:

  • nidm:hasNoiseDistribution
  • nidm:NoiseDistribution
  • nidm:NonParametricDistribution
  • nidm:NonParametricSymmetricDistribution

(nidm:BinomialDistribution, nidm:GaussianDistribution and nidm:PoissonDistribution are already defined as synonyms of STATO terms following a comment by @nicholst at #176)

Proposed definition:

  • nidm:hasNoiseDistribution: "Property that associates a NoiseDistribution with a NoiseModel."
  • nidm:NoiseDistribution: "Probability distribution used to model the residuals."
  • nidm:NonParametricDistribution: "Probability distribution defined empirically on the data without assumption on the shape of the probability distribution. Non-parametric distributions are usually defined using permutation of the class labels or sign-flipping."
  • nidm:NonParametricSymmetricDistribution: "Probability distribution defined empirically on the data assuming a symmetric probability distribution. Non-parametric distributions are usually defined using permutation of the class labels or sign-flipping."

I did not find the term "non-parametric distribution" in BioPortal (but "non-parametric test" is in NCIT). Once we settle on definitions, it might be worth suggesting them to STATO.

Please let me know what you think. @nicholst: would you like to comment on those definitions?

@cmaumet
Member Author

cmaumet commented Oct 1, 2014

Could we mention "voxel-wise" somewhere? I think those distributions are defined on a voxel-by-voxel basis (otherwise we would have multivariate distributions?).

@nicholst
Contributor

nicholst commented Oct 1, 2014

Some comments...

nidm:NoiseDistribution: "Probability distribution used to model the residuals." -> "Probability distribution used to model the noise."

There are some annoying subtleties here: For Gaussian data with GLM, it is very straightforward: Y=X beta + epsilon, for data Y, regressors in design matrix X, predictors beta and (unobserved) random error epsilon. The assumptions are that the X's account for all systematic variation in the data Y, and what's left over, the error, has a given nidm:NoiseDistribution and "noise" properties.
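As a minimal sketch of the Gaussian GLM case, assuming a single regressor plus an intercept and two hypothetical voxels (the names `fit_simple_glm`, `voxel_a` and `voxel_b` are illustrative and not part of NIDM), the same model can be fitted independently at each voxel. Note that the fit only yields residuals, which are observable estimates of the unobserved error epsilon:

```python
import random

def fit_simple_glm(y, x):
    """Ordinary least squares for y = b0 + b1*x + error.

    Returns (b0, b1, residuals). The residuals are observed; the error
    term itself is not.
    """
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return b0, b1, residuals

rng = random.Random(0)
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # one regressor shared by every voxel

# Hypothetical ground truth (intercept, slope) for two simulated voxels.
true_betas = {"voxel_a": (1.0, 2.0), "voxel_b": (-0.5, 0.0)}

# "Mass univariate": the same univariate model is fitted at each voxel.
fits = {}
for voxel, (b0, b1) in true_betas.items():
    y = [b0 + b1 * xi + rng.gauss(0.0, 0.1) for xi in x]  # Gaussian error
    fits[voxel] = fit_simple_glm(y, x)
```

Each entry of `fits` holds the estimated coefficients and the residuals for one voxel; no information is shared between voxels.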

This accounts for the vast majority of brain imaging, where the mass univariate model is used to fit the GLM at each voxel/element. However, as soon as you fall out of that, e.g. have binomial count data, you can no longer really disentangle noise from the data. That is, e.g., each binary count in binomial data might be modelled with logit(P(Xi)) = log{P(Xi)/(1-P(Xi))} = X beta. What's the "noise" here? It isn't explicitly modeled with any variable (say, epsilon) but rather random variation about the deterministic response X beta is accounted for by the statistical model (which gives a likelihood, which has to be optimised iteratively to find estimates of beta).
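For reference, the binomial example can be written out as follows. This is a sketch in assumed notation that does not appear in the thread: y_i is the i-th binary observation, x_i is the i-th row of the design matrix X, and p_i is the success probability written as P(Xi) above.

```latex
Y_i \sim \mathrm{Bernoulli}(p_i), \qquad
\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i} = x_i^{\top}\beta

\ell(\beta) = \sum_{i} \left[ y_i \, x_i^{\top}\beta
              - \log\!\left(1 + e^{x_i^{\top}\beta}\right) \right]
```

No epsilon term appears anywhere: the randomness sits in the Bernoulli distribution itself, and the log-likelihood has no closed-form maximiser in beta, hence the iterative optimisation.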

So!!!!! What I'm saying is that we need to be very clear what exactly we're doing. And I propose that we make a very loud and precise distinction that we are modelling the use of a mass univariate linear model, fit at each voxel or other type of spatial element. "Univariate" excludes multivariate; "linear" excludes general_ised_ linear models.

GIVEN that, yes, we can say these are univariate models (of course, by our "mass univariate" qualifier), not multivariate.

@cmaumet
Member Author

cmaumet commented Oct 2, 2014

+1 for nidm:NoiseDistribution definition

+1 for being more specific and clearly stating that we are modelling the use of mass univariate linear models. I think this could be an attribute of each activity (Model Parameters Estimation, Contrast Estimation, Inference). I will open a new issue to discuss this point (cf. #207).

@nicholst
Contributor

nicholst commented Oct 2, 2014

Following on my comment on #192, NoiseDistribution -> ErrorDistribution or DistributionOfError

@cmaumet
Member Author

cmaumet commented Oct 24, 2014

The following definitions are now implemented:

  • nidm:hasErrorDistribution: "Property that associates an ErrorDistribution with an ErrorModel."
  • nidm:ErrorDistribution: "Probability distribution used to model the error."
  • nidm:NonParametricDistribution: "Probability distribution defined empirically on the data without assumption on the shape of the probability distribution. Non-parametric distributions are usually defined using permutation of the class labels or sign-flipping."
  • nidm:NonParametricSymmetricDistribution: "Probability distribution defined empirically on the data assuming a symmetric probability distribution. Non-parametric distributions are usually defined using permutation of the class labels or sign-flipping."

Do you think this is good to go?

@nicholst
Contributor

One tweak: for nonparametric change "defined" to "estimated". Just seems a little more accurate.

  • nidm:NonParametricDistribution: "Probability distribution defined empirically on the data without assumption on the shape of the probability distribution. Non-parametric distributions are usually estimated using permutation of the class labels or sign-flipping."
  • nidm:NonParametricSymmetricDistribution: "Probability distribution defined empirically on the data assuming a symmetric probability distribution. Non-parametric distributions are usually estimated using permutation of the class labels or sign-flipping."

@cmaumet
Member Author

cmaumet commented Nov 6, 2014

Those definitions are now "pending final vetting". Would someone like to comment or +1?

Latest version:

  • nidm:hasErrorDistribution: "Property that associates an ErrorDistribution with an ErrorModel."
  • nidm:ErrorDistribution: "Probability distribution used to model the error."
  • nidm:NonParametricDistribution: "Probability distribution estimated empirically on the data without assumption on the shape of the probability distribution. Non-parametric distributions are usually estimated using permutation of the class labels or sign-flipping."
  • nidm:NonParametricSymmetricDistribution: "Probability distribution estimated empirically on the data assuming a symmetric probability distribution. Non-parametric distributions are usually estimated using permutation of the class labels or sign-flipping."
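To make "sign-flipping" concrete, here is a minimal sketch of how a distribution could be estimated empirically under the symmetry assumption alone. It assumes a one-sample setting with six hypothetical per-subject values at a single voxel and enumerates all sign patterns exhaustively; the function names are illustrative and this is not taken from any NIDM tool.

```python
from itertools import product

def sign_flip_null(values):
    """Exhaustive sign-flipping distribution of the sample mean.

    If the error distribution is symmetric about zero, every sign
    pattern is equally likely under the null hypothesis of zero mean.
    """
    n = len(values)
    return sorted(
        sum(s * v for s, v in zip(signs, values)) / n
        for signs in product((1, -1), repeat=n)
    )

def p_value(values):
    """One-sided p-value: share of sign-flipped means >= the observed mean."""
    observed = sum(values) / len(values)
    null = sign_flip_null(values)
    # Small tolerance guards against floating-point ties with the observed mean.
    return sum(m >= observed - 1e-12 for m in null) / len(null)

contrast = [0.8, 1.1, 0.4, 0.9, 1.3, 0.7]  # hypothetical per-subject values
null = sign_flip_null(contrast)  # 2**6 = 64 sign-flipped means
p = p_value(contrast)
```

Permutation of class labels works the same way for a two-group comparison, with group labels reshuffled instead of signs flipped; in that case no symmetry assumption is needed, only exchangeability of the observations under the null hypothesis.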

@khelm
Contributor

khelm commented Nov 7, 2014

+1 for those - except that I would leave out: "Non-parametric distributions are usually estimated using permutation of the class labels or sign-flipping." since that is, while helpful, extraneous to the definition, and then you would have to define what you mean by "permutation of class labels" and "sign-flipping".

@nicholst
Contributor

nicholst commented Nov 7, 2014

+1 for @khelm's mod... makes sense. Two tiny follow-up edits

  • nidm:NonParametricDistribution: "Probability distribution estimated empirically on the data without assumptions on the shape of the probability distribution."

(Added "s" to assumption).

  • nidm:NonParametricSymmetricDistribution: "Probability distribution estimated empirically on the data assuming only symmetry of the probability distribution."

(revised end of sentence, to stress that "only symmetry" is assumed).

@khelm
Contributor

khelm commented Nov 7, 2014

+1 for @nicholst's follow-up edits

@cmaumet
Member Author

cmaumet commented Nov 7, 2014

+1 for those edits (now implemented).

@cmaumet
Member Author

cmaumet commented Nov 7, 2014

The tests passed. I think that this pull request is good to merge.

cmaumet pushed a commit that referenced this pull request Nov 7, 2014
Definition of "noise distribution" terms
@cmaumet cmaumet merged commit 18888fd into incf-nidash:master Nov 7, 2014
@cmaumet cmaumet deleted the error_dist branch November 7, 2014 16:29
3 participants