Base class to define a PF shape parameter.
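Two types of parameters can be built: RAW parameters, which are declared directly, and DERIVED parameters, which are obtained from other parameters via operations.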
Examples
Declarations:
>>> import statspy as sp
>>> mu = sp.Param(name="mu",value=10.,label="\\mu")
>>> x = sp.Param("x = 10. +- 2.")
>>> x.value
10.0
>>> x.unc
2.0
>>> y = sp.Param("y = 5.", unc=1.)
Operations, building DERIVED parameters:
>>> x + y
x + y = 15.0 +- 2.2360679775
>>> z = x * y
>>> z.name = 'z'
>>> z
z = x * y = 50.0 +- 14.1421356237
>>> x**2
x ** 2 = 100.0 +- 40.0
Possible operations are +,-,*,/,**.
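The uncertainties printed in the examples above are consistent with first-order error propagation for uncorrelated parameters. The following is a minimal sketch of that arithmetic only; the helper functions add, mul and power are hypothetical and are not part of statspy.

```python
import math

def add(x, ux, y, uy):
    """Sum of two uncorrelated quantities: uncertainties add in quadrature."""
    return x + y, math.hypot(ux, uy)

def mul(x, ux, y, uy):
    """Product of two uncorrelated quantities (first-order propagation)."""
    return x * y, math.hypot(y * ux, x * uy)

def power(x, ux, n):
    """x raised to a fixed exponent n (first-order propagation)."""
    return x ** n, abs(n * x ** (n - 1) * ux)

# Same inputs as the examples above: x = 10 +- 2, y = 5 +- 1
sum_value, sum_unc = add(10.0, 2.0, 5.0, 1.0)
prod_value, prod_unc = mul(10.0, 2.0, 5.0, 1.0)
sq_value, sq_unc = power(10.0, 2.0, 2)
print(sum_value, sum_unc)
print(prod_value, prod_unc)
print(sq_value, sq_unc)
```

The three results match the values shown for x + y, x * y and x**2, under the assumption that x and y are uncorrelated.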
Attributes
name | str | Parameter name |
label | str | Parameter name for printing purposes |
value | float | Current numerical value |
unc | float | Parameter uncertainty (e.g. after minimization) |
neg_unc | float | Negative parameter uncertainty (only for intervals) |
pos_unc | float | Positive parameter uncertainty (only for intervals) |
bounds | list | Defines, if necessary, the lower and upper bounds |
formula | list (optional, only for DERIVED parameters) | List of operators and parameters used to parse an analytic function |
strform | str (optional, only for DERIVED parameters) | Representation of the formula as a string |
partype | Param.RAW or Param.DERIVED | Tells whether it is a RAW or a DERIVED parameter |
const | bool | Tells whether a parameter is fixed during a minimization process. It is not a constant in the programming sense. |
poi | bool | Tells whether a parameter is a parameter of interest in a hypothesis test. |
isuptodate | bool | Tells whether value needs to be computed again or not |
logger | logging.Logger | message logging system |
Methods
Add a parameter to another parameter or a numerical value.
Parameters : | self : Param other : Param, int, long, float |
---|---|
Returns : | new : Param
|
Return the parameter value.
Returns : | self.value : float
|
---|
Divide a parameter by another parameter or by a numerical value.
Parameters : | self : Param other : Param, int, long, float |
---|---|
Returns : | new : Param
|
Overload __getattribute__ to update the value attribute from the formula for a DERIVED parameter.
In-place addition (+=)
Parameters : | self : Param other : Param, int, long, float |
---|---|
Returns : | self : Param
|
In-place multiplication (*=)
Parameters : | self : Param other : Param, int, long, float |
---|---|
Returns : | self : Param
|
In-place subtraction (-=)
Parameters : | self : Param other : Param, int, long, float |
---|---|
Returns : | self : Param
|
Multiply a parameter by another parameter or by a numerical value.
Parameters : | self : Param other : Param, int, long, float |
---|---|
Returns : | new : Param
|
Raise a parameter to a power.
Parameters : | self : Param other : Param, int, long, float |
---|---|
Returns : | new : Param
|
Overload __setattr__ to make sure that quantities based on this parameter will be updated when its value is modified.
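The update mechanism described by the two overloads (recompute a DERIVED value on read, flag dependents on write) can be illustrated with a toy class. This is only a sketch of the general pattern: ToyParam, its _dependents list and its _invalidate helper are hypothetical names, and the code is not the statspy implementation.

```python
class ToyParam:
    """Hypothetical stand-in for a parameter with lazy recomputation."""

    def __init__(self, value=None, formula=None, parents=()):
        # object.__setattr__ bypasses the overloaded __setattr__ below.
        object.__setattr__(self, "_dependents", [])
        object.__setattr__(self, "formula", formula)
        object.__setattr__(self, "parents", tuple(parents))
        # A raw parameter is always up to date; a derived one is computed lazily.
        object.__setattr__(self, "isuptodate", formula is None)
        object.__setattr__(self, "value", value)
        for parent in self.parents:
            parent._dependents.append(self)

    def _invalidate(self):
        # Mark this parameter and everything built on it as stale.
        object.__setattr__(self, "isuptodate", False)
        for dep in self._dependents:
            dep._invalidate()

    def __setattr__(self, name, val):
        object.__setattr__(self, name, val)
        if name == "value":
            for dep in self._dependents:
                dep._invalidate()

    def __getattribute__(self, name):
        if name == "value" and not object.__getattribute__(self, "isuptodate"):
            formula = object.__getattribute__(self, "formula")
            parents = object.__getattribute__(self, "parents")
            object.__setattr__(self, "value", formula(*(p.value for p in parents)))
            object.__setattr__(self, "isuptodate", True)
        return object.__getattribute__(self, name)

x = ToyParam(10.0)
y = ToyParam(5.0)
z = ToyParam(formula=lambda a, b: a * b, parents=(x, y))
first = z.value      # computed on first access
x.value = 20.0       # flags z as stale
second = z.value     # recomputed on access
print(first, second)
```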
Subtract another parameter or a numerical value from a parameter.
Parameters : | self : Param other : Param, int, long, float |
---|---|
Returns : | new : Param
|
Transform the value of the parameter so that it has no bounds.
This method is used with minimization algorithms that require unbounded parameters. The transformation formulas from a double-sided or a one-sided parameter to an unbounded parameter are described in Section 1.3.1 of the MINUIT manual: Fred James, Matthias Winkler, MINUIT User’s Guide, June 16, 2004.
Returns : | new : float
|
---|
Transform the parameter value from an unbounded to a bounded representation.
This method is used with minimization algorithms that require unbounded parameters. The transformation formulas from an unbounded parameter to a double-sided or a one-sided parameter are described in Section 1.3.1 of the MINUIT manual: Fred James, Matthias Winkler, MINUIT User’s Guide, June 16, 2004. Since the transformation is non-linear, the transformation of the uncertainty is approximate and based on the error propagation formula. In particular, when the value is close to one of its limits, the uncertainty is not reliable and a more refined analysis should be performed.
Parameters : | val : float
unc : float
|
---|
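As an illustration of the two transformations, here is a minimal sketch of the double-sided case only, using the arcsine mapping given in the MINUIT manual. The function names and the lower/upper arguments are hypothetical and do not reflect the statspy method signatures; one-sided bounds are not covered.

```python
import math

def bounded_to_unbounded(value, lower, upper):
    """Map a value in [lower, upper] to an unbounded internal value."""
    return math.asin(2.0 * (value - lower) / (upper - lower) - 1.0)

def unbounded_to_bounded(val, unc, lower, upper):
    """Map an internal value and its uncertainty back to [lower, upper]."""
    half_width = 0.5 * (upper - lower)
    value = lower + half_width * (math.sin(val) + 1.0)
    # First-order error propagation: |d(value)/d(val)| * unc
    value_unc = abs(half_width * math.cos(val)) * unc
    return value, value_unc

internal = bounded_to_unbounded(2.5, 0.0, 10.0)
external, external_unc = unbounded_to_bounded(internal, 0.1, 0.0, 10.0)
print(internal, external, external_unc)

# At a limit the derivative vanishes, so the propagated uncertainty
# collapses towards zero whatever the internal uncertainty is.
at_limit, at_limit_unc = unbounded_to_bounded(math.pi / 2.0, 1.0, 0.0, 10.0)
print(at_limit, at_limit_unc)
```

The second call shows why the propagated uncertainty should not be trusted near a bound: the linear approximation returns an uncertainty close to zero there.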
Base class to define a Probability Function.
Probability Function is a generic name which includes both the probability mass function for discrete random variables and the probability density function for continuous random variables. The function itself is defined in self.func:
Examples
>>> import statspy as sp
>>> pmf_n = sp.PF("pmf_n=poisson(n;mu)",mu=10.)
Attributes
name | str (optional) | Function name |
func | scipy.stats.distributions.rv_generic, list | Probability Density Function object. |
params | statspy.core.Param list | List of shape parameters used to define the pf |
norm | Param | Normalization parameter set to 1 by default. It can be different from 1 when the PF is fitted to data. |
isuptodate | bool | Tells whether PF needs to be normalized or not |
options | dict | Potential list of options |
pftype | PF.RAW or PF.DERIVED | Tells whether a PF is a RAW PF or DERIVED from other PFs |
logger | logging.Logger | message logging system |
Methods
Add two PFs.
The norm parameters are also summed.
Parameters : | self : PF other : PF |
---|---|
Returns : | new : PF
|
Evaluate the Probability Function at x.
Parameters : | args : float, ndarray, optional, multiple values for multivariate pfs
kwargs : keyword arguments, optional
|
---|---|
Returns : | value : float, ndarray
|
Multiply a PF by another PF.
Parameters : | self : PF other : PF |
---|---|
Returns : | new : PF
|
Scale PF normalization value
Parameters : | self : PF scale : float |
---|---|
Returns : | self : PF
|
Compute the cumulative distribution function at x.
Parameters : | args : ndarray, tuple
kwargs : keyword arguments, optional
|
---|---|
Returns : | value : float, ndarray
|
Convolve two PFs, which is equivalent to adding two independent random variables.
The scipy.signal package is used to perform the convolution with different options set by mode.
Parameters : | self : PF other : PF kw : keyword arguments, dict
|
---|---|
Returns : | new : PF
|
Notes
This method currently works only for one-dimensional pdfs/pmfs.
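To illustrate why convolving two PFs corresponds to adding two independent random variables, the sketch below convolves two Poisson pmfs with a plain double loop (not scipy.signal, which the method itself uses) and compares the result with the pmf of their sum. The helper names are hypothetical.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability mass function."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

def convolve(p, q):
    """Discrete convolution: out[k] = sum over i + j == k of p[i] * q[j]."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out

n_max = 30
pmf_a = [poisson_pmf(k, 2.0) for k in range(n_max)]
pmf_b = [poisson_pmf(k, 3.0) for k in range(n_max)]
pmf_sum = convolve(pmf_a, pmf_b)

# The sum of independent Poisson(2) and Poisson(3) variables is Poisson(5).
for k in range(5):
    print(k, pmf_sum[k], poisson_pmf(k, 5.0))
```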
Returns the correlation matrix of the free parameters.
Returns : | corr : ndarray
|
---|
Compute the uncertainty on PF given the uncertainty on the shape and norm parameters.
This method can be used to show an error band on your fitted PF. To compute the uncertainty on the PF, the error propagation formula is used:
dF(x;th) = (F(x;th+dth) - F(x;th-dth))/2
dF(x)^2 = dF(x;th)^T * corr(th,th') * dF(x;th')
so keep in mind it is only an approximation.
Parameters : | x : float, ndarray
|
---|---|
Returns : | dF : float, ndarray
|
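The two formulas above can be sketched in a few lines for a normal pdf with two shape parameters. This is an illustrative transcription under stated assumptions, not the statspy code: pf_uncertainty, norm_pdf and the argument layout are hypothetical, and the correlation matrix is supplied by hand.

```python
import math

def norm_pdf(x, mu, sigma):
    """Normal probability density function."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def pf_uncertainty(func, x, theta, dtheta, corr):
    """Propagate parameter uncertainties dtheta to func(x; theta)."""
    deltas = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += dtheta[i]
        down[i] -= dtheta[i]
        # dF(x;th) = (F(x;th+dth) - F(x;th-dth)) / 2
        deltas.append(0.5 * (func(x, *up) - func(x, *down)))
    # dF(x)^2 = dF(x;th)^T * corr(th,th') * dF(x;th')
    n = len(deltas)
    var = sum(deltas[i] * corr[i][j] * deltas[j]
              for i in range(n) for j in range(n))
    return math.sqrt(max(var, 0.0))

theta = [20.0, 5.0]                 # mu, sigma
dtheta = [0.5, 0.2]                 # their uncertainties
corr = [[1.0, 0.0], [0.0, 1.0]]     # assumed uncorrelated
band = pf_uncertainty(norm_pdf, 20.0, theta, dtheta, corr)
print(band)
```

At x = mu the shifts in mu cancel by symmetry, so only the uncertainty on sigma contributes to the band at that point.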
Fit the PF to data using a least squares method.
The fit is performed using the scipy.optimize.leastsq function, which relies on the Levenberg-Marquardt algorithm to find the minimum. When calling this method, all PF parameters are free in the minimization except those which are set as ‘const’.
Parameters : | xdata : ndarray
ydata : ndarray
ey : ndarray (optional)
dx : ndarray (optional)
cond : boolean ndarray (optional)
kw : keyword arguments
|
---|---|
Returns : | free_params : statspy.core.Param list
pcov : 2d array
chi2min : float
pvalue : float
|
Derive a PF from another PF via a location parameter.
Parameters : | self : PF loc : Param, value |
---|---|
Returns : | new : PF
|
Compute the logarithm of the PF at x.
Parameters : | args : ndarray, tuple
kwargs : keyword arguments, optional
|
---|---|
Returns : | value : float, ndarray
|
Fit the PF to data using the maximum likelihood estimator method.
The fit is performed using the scipy.optimize.minimize function. If the keyword argument method is not specified, the BFGS algorithm is used. All PF parameters are free in the minimization except those which are set as ‘const’ before calling the method.
Parameters : | data : ndarray, tuple
kw : keyword arguments (optional)
|
---|---|
Returns : | free_params : statspy.core.Param list
nllfmin : float
|
Estimate the mean of the PF.
Warning
Parameters : | kw : keyword arguments, optional
|
---|---|
Returns : | mean : ndarray |
Evaluate the negative log-likelihood function:
nllf = -sum_i(log(pf(x_i;params)))
sum runs over the x-variates defined in data array.
Parameters : | data : ndarray, tuple
kw : keyword arguments (optional)
|
---|---|
Returns : | nllf : float
|
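The definition of the negative log-likelihood can be written out directly for a normal pdf. The sketch below is a plain transcription of the sum above; norm_logpdf and the standalone nllf function are hypothetical helpers, not the PF.nllf method.

```python
import math

def norm_logpdf(x, mu, sigma):
    """Logarithm of the normal probability density function."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def nllf(data, mu, sigma):
    """nllf = -sum_i(log(pf(x_i; params)))"""
    return -sum(norm_logpdf(x, mu, sigma) for x in data)

data = [9.0, 10.5, 11.0, 9.5, 10.0]
nll_at_mean = nllf(data, 10.0, 2.0)     # mu set to the sample mean
nll_shifted = nllf(data, 12.0, 2.0)     # mu away from the sample mean
print(nll_at_mean, nll_shifted)
```

For a fixed sigma, the value is smallest when mu equals the sample mean, which is what a maximum likelihood fit would find.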
Evaluate the profile log-likelihood ratio (multiplied by -2).
The profile likelihood ratio is defined by:
l = L(x|theta_r,\hat{\hat{theta_s}}) / L(x|\hat{theta_r},\hat{theta_s})
The profile log-likelihood ratio is then:
q = -2 * log(l)
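Where theta_r are the parameters of interest, theta_s are the nuisance parameters, \hat denotes the unconditional maximum likelihood estimates and \hat{\hat{theta_s}} the estimates of the nuisance parameters for fixed theta_r.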
pllr is used as a test statistic for problems with numerous nuisance parameters. Asymptotically, the pllr PF is described by a chi2 distribution (Wilks’ theorem). Further information on the likelihood ratio can be found in Chapter 22 of “Kendall’s Advanced Theory of Statistics, Volume 2A”.
Parameters : | data : ndarray, tuple
kw : keyword arguments (optional)
|
---|---|
Returns : | pllr : float
|
Compute the p-value at xobs.
The p-value is the probability of observing at least xobs, Pr(x >= xobs).
Parameters : | args : ndarray, tuple
kwargs : keyword arguments, optional
|
---|---|
Returns : | pvalue : float, ndarray
|
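The definition Pr(x >= xobs) can be sketched with the standard library for a continuous and a discrete case. The two functions below are hypothetical helpers that follow the definition as stated; they are not the PF.pvalue method and make no claim about how statspy treats the boundary value for discrete variables.

```python
import math
from statistics import NormalDist

def pvalue_normal(xobs, mu, sigma):
    """Pr(x >= xobs) for a normal random variable."""
    return 1.0 - NormalDist(mu, sigma).cdf(xobs)

def pvalue_poisson(nobs, mu):
    """Pr(n >= nobs) for a Poisson random variable, nobs included."""
    below = sum(math.exp(-mu) * mu ** k / math.factorial(k) for k in range(nobs))
    return 1.0 - below

p_norm = pvalue_normal(20.0, 20.0, 5.0)   # xobs at the mean
p_pois = pvalue_poisson(1, 10.0)          # at least one event
print(p_norm, p_pois)
```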
Get random variates from a PF
Returns : | data : ndarray
|
---|
Examples
>>> import statspy as sp
>>> pdf_x = sp.PF("pdf_x=norm(x;mu=20,sigma=5)")
>>> data = pdf_x.rvs(size=1000)
Derive a PF from another PF via a scale parameter.
Parameters : | self : PF scale : Param, value |
---|---|
Returns : | new : PF
|
Base class to define a Random Variable.
Examples
>>> import statspy as sp
>>> X = sp.RV("norm(x|mu=10,sigma=2)")
Attributes
name | str | Random Variable name |
pf | statspy.core.PF | Probability Function object associated to a Random Variable |
rvtype | RV.CONTINUOUS or RV.DISCRETE | Random Variable type |
logger | logging.Logger | message logging system |
Add two random variables, or a random variable and a parameter.
The associated PF of the sum of the two random variables is a convolution of the two PFs. Caveat: this method assumes independent random variables.
Parameters : | self : RV other : RV, Param, value |
---|---|
Returns : | new : RV
|
Get random variates
Returns : | data : ndarray
|
---|
Examples
>>> import statspy as sp
>>> X = sp.RV("norm(x;mu=20,sigma=5)")
>>> x = X(size=1000)
Divide two random variables or a random variable by a parameter.
Caveat: this method assumes independent random variables.
Parameters : | self : RV other : RV, Param, value |
---|---|
Returns : | new : RV
|
Multiply two random variables or a random variable by a parameter.
Caveat: this method assumes independent random variables.
Parameters : | self : RV other : RV, Param, value |
---|---|
Returns : | new : RV
|
Add a random variable and a parameter.
Parameters : | self : RV other : Param, value |
---|---|
Returns : | new : RV
|
Multiply a parameter by a random variable.
Parameters : | self : RV other : Param, value |
---|---|
Returns : | new : RV
|
Subtract two random variables, or a parameter from a random variable.
Caveat: this method assumes independent random variables.
Parameters : | self : RV other : RV, Param, value |
---|---|
Returns : | new : RV
|
module interval.py
This module contains functions to estimate confidence or credible intervals.
Compute confidence intervals using a profile likelihood ratio method.
Interval estimation is done in three steps:
1. The (minus log-)likelihood is computed from pf and data via the PF.nllf method.
2. Best estimates \hat{theta_i} for each parameter theta_i are computed with the PF.maxlikelihood_fit method.
3. The confidence interval of theta_i around its best estimate is computed from a profile log-likelihood ratio function q defined as:
l = L(x|theta_i,\hat{\hat{theta_s}}) / L(x|\hat{theta_i},\hat{theta_s})
q(theta_i) = -2 * log(l)
where L is the likelihood function and theta_s are the nuisance parameters.
q(theta_i) is assumed to be described as a chi2 distribution (Wilks’ theorem). Bounds corresponding to a given confidence level (CL) are found by searching values for which q(theta_i) is equal to the chi2 quantile of CL:
quantile = scipy.stats.chi2.ppf(cl, ndf)
Parameters : | pf : statspy.core.PF
data : ndarray, tuple
kw : keyword arguments (optional)
|
---|---|
Returns : | params : statspy.core.Param list
corr : ndarray
quantile : float
|
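The bound search described above can be sketched for the simplest case: the mean of a normal sample with known sigma and no nuisance parameters, for which q(mu) = n * (xbar - mu)^2 / sigma^2. The sketch uses only the standard library, so the chi2 quantile for one degree of freedom is obtained from the normal quantile instead of scipy.stats.chi2.ppf. All function names are hypothetical.

```python
import math
from statistics import NormalDist, fmean

def chi2_quantile_1dof(cl):
    """chi2 quantile for ndf = 1 (square of a two-sided normal quantile)."""
    return NormalDist().inv_cdf(0.5 * (1.0 + cl)) ** 2

def q(mu, data, sigma):
    """-2 log likelihood ratio for the mean of a normal sample, sigma known."""
    return len(data) * (fmean(data) - mu) ** 2 / sigma ** 2

def find_bound(func, target, inside, outside, tol=1e-10):
    """Bisection between a point with func < target and one with func > target."""
    for _ in range(200):
        mid = 0.5 * (inside + outside)
        if func(mid) < target:
            inside = mid
        else:
            outside = mid
        if abs(outside - inside) < tol:
            break
    return 0.5 * (inside + outside)

data = [9.0, 10.5, 11.0, 9.5, 10.0]
sigma = 2.0
xbar = fmean(data)
quantile = chi2_quantile_1dof(0.6827)

def scan(mu):
    return q(mu, data, sigma)

lower = find_bound(scan, quantile, xbar, xbar - 10.0 * sigma)
upper = find_bound(scan, quantile, xbar, xbar + 10.0 * sigma)
print(quantile, lower, upper)
```

In this special case the bounds can be checked against the closed form xbar +- sqrt(quantile) * sigma / sqrt(n); with nuisance parameters each evaluation of q would require a new conditional fit.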
module hypotest.py
This module contains functions to perform hypothesis tests.
Class to store results from a hypothesis test.
Among the variables stored in this class, there are:
Methods
x.__delitem__(y) <==> del x[y]
x.__setitem__(i, y) <==> x[i]=y
Convert a p-value to a Z-value.
Definition:
math.sqrt(2.) * scipy.special.erfcinv(mode * pvalue)
mode is equal to 2 for a two-sided Z-value and to 1 for a one-sided Z-value.
Parameters : | pvalue : float
mode : str
|
---|---|
Returns : | Zvalue : float
|
Convert a Z-value to a p-value.
Definition:
scipy.special.erfc(Zvalue / math.sqrt(2.)) / mode
mode is equal to 2 for a two-sided Z-value and to 1 for a one-sided Z-value.
Parameters : | Zvalue : float
mode : str
|
---|---|
Returns : | pvalue : float
|
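The two definitions above are inverses of each other for a given factor. The sketch below transcribes them with the standard library only, taking the numerical factor (1 or 2) directly as an argument instead of the mode string. Since the standard library has no erfcinv, it uses the identity sqrt(2) * erfcinv(y) = -Phi^{-1}(y / 2), where Phi^{-1} is the standard normal quantile function. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def pvalue_to_zvalue(pvalue, factor):
    """sqrt(2) * erfcinv(factor * pvalue), written with the normal quantile."""
    return -NormalDist().inv_cdf(0.5 * factor * pvalue)

def zvalue_to_pvalue(zvalue, factor):
    """erfc(zvalue / sqrt(2)) / factor"""
    return math.erfc(zvalue / math.sqrt(2.0)) / factor

p = zvalue_to_pvalue(3.0, 2)
z_back = pvalue_to_zvalue(p, 2)
print(p, z_back)
```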
Profile likelihood ratio test.
For the likelihood ratio test, the likelihood is maximized separately for the null and the alternative hypothesis. The word “profile” means that, in addition, the likelihood is maximized with respect to the nuisance parameters. The test statistic is then defined as:
l = L(x|theta_r,\hat{\hat{theta_s}}) / L(x|\hat{theta_r},\hat{theta_s})
q_obs = -2 * log(l)
and is distributed asymptotically as a chi2 distribution. q_obs can be used to compute a p-value = Pr(q >= q_obs).
Parameters : | pf : statspy.core.PF
data : ndarray, tuple
kw : keyword arguments (optional) |
---|---|
Returns : | result : statspy.hypotest.Result
|
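A case where the test statistic has a closed form is a test of the mean of a normal sample with sigma treated as a nuisance parameter: profiling sigma gives q_obs = n * log(var_fixed / var_free), where var_free and var_fixed are the maximum likelihood variances with mu free and with mu fixed to the tested value. The sketch below assumes one parameter of interest, so the asymptotic chi2 distribution has one degree of freedom and its tail probability is erfc(sqrt(q/2)). The function names are hypothetical and this is not the statspy implementation.

```python
import math
from statistics import fmean

def profile_llr_mean(data, mu0):
    """q = -2 log(l) for H0: mu = mu0, with sigma profiled out."""
    n = len(data)
    xbar = fmean(data)
    var_free = sum((x - xbar) ** 2 for x in data) / n    # unconditional estimate
    var_fixed = sum((x - mu0) ** 2 for x in data) / n    # conditional on mu0
    return n * math.log(var_fixed / var_free)

def chi2_sf_1dof(q):
    """Pr(chi2 >= q) for one degree of freedom."""
    return math.erfc(math.sqrt(0.5 * q))

data = [9.0, 10.5, 11.0, 9.5, 10.0]
q_obs = profile_llr_mean(data, 12.0)
pvalue = chi2_sf_1dof(q_obs)
print(q_obs, pvalue)
```

With only five data points the asymptotic chi2 approximation is rough; the example is meant to show the mechanics of the test, not a reliable p-value.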
Compute the exponential of a Parameter.
Parameters : | x : Param
|
---|---|
Returns : | y : Param
|
Examples
>>> import statspy as sp
>>> x = sp.Param("x = 4 +- 1")
>>> y = sp.exp(x)
Compute the logarithm of a Parameter.
Parameters : | x : Param
|
---|---|
Returns : | y : Param
|
Examples
>>> import statspy as sp
>>> x = sp.Param("x = 4 +- 1")
>>> y = sp.log(x)
Compute the square root of a Parameter.
Parameters : | x : Param
|
---|---|
Returns : | y : Param
|
Examples
>>> import statspy as sp
>>> x = sp.Param("x = 4 +- 1")
>>> y = sp.sqrt(x)
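For a function of a single parameter, first-order error propagation gives unc(y) = |f'(x)| * unc(x). The sketch below applies this to the three functions above with x = 4 +- 1. It assumes that the DERIVED parameters follow this first-order rule, as the arithmetic examples for Param suggest; the propagate helper is hypothetical and the values are not output copied from statspy.

```python
import math

def propagate(func, derivative, x, ux):
    """Return f(x) and its first-order uncertainty |f'(x)| * ux."""
    return func(x), abs(derivative(x)) * ux

x, ux = 4.0, 1.0
exp_value, exp_unc = propagate(math.exp, math.exp, x, ux)
log_value, log_unc = propagate(math.log, lambda v: 1.0 / v, x, ux)
sqrt_value, sqrt_unc = propagate(math.sqrt, lambda v: 0.5 / math.sqrt(v), x, ux)
print(exp_value, exp_unc)
print(log_value, log_unc)
print(sqrt_value, sqrt_unc)
```

For the exponential, a relative uncertainty of 25% on x becomes a 100% uncertainty on y, a regime where the linear approximation is itself questionable.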
Returns a Param, PF or RV object.
Look in the different dictionaries for an object named obj_name and return it if it exists.
Parameters : | obj_name : str
|
---|---|
Returns : | new : Param, PF, RV
|
Examples
>>> import statspy as sp
>>> mypmf = sp.PF('mypmf=poisson(n;lbda=5)')
>>> lbda = sp.get_obj('lbda')
>>> lbda.label = '\\lambda'