How to get started

This series of examples shows the very basics of how to get started with MRPY, using the different functionality. These examples aren’t “real world” ones, just toy ones to show the basic idea.

In [1]:
# General Imports
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

Core Functionality

Core functionality (i.e. calculation of the MRP function given input parameters, plus some other functions useful for normalising) is in the core module:

In [2]:
from mrpy import dndm      # resides in the base.core module, but imported into top-level namespace
m = np.logspace(10,15,500) # Create an array of masses
dn = dndm(m, logHs=14.0, alpha=-1.8, beta=0.85) # Create the MRP mass function
plt.plot(m,dn,lw=2)
plt.xscale('log')
plt.yscale('log')
../_images/examples_getting_started_5_0.png

Pure Stats

If you don’t care so much about the fact that the MRP is good for halo mass functions (or don’t know what a halo mass function is…), but want to use the statistical distribution, you’ll want the stats module. It contains an object called TGGD (short for Truncated Generalised Gamma Distribution), which has many statistical quantities and methods available (such as producing random variates, mean, mode etc.)

In [3]:
from mrpy import TGGD
tggd = TGGD(scale=1e14,a=1.0,b=0.85,xmin=1e10)
print "Mean: ", tggd.mean  # Mean of the distribution
print "Mode: ", tggd.mode  # Mode of the distirbution
print "Variance: ", tggd.variance # Variance of the distribution
print "Mean of sample: ", np.mean(tggd.rvs(1e5))  #Produce 1e5 random variates and take the mean
plt.plot(m, tggd.pdf(m))      #Plot the PDF
plt.xscale('log')
Mean:  2.84867015869e+14
Mode:  1.21070004341e+14
Variance:  4.79705281452e+28
Mean of sample:  2.83287193279e+14
../_images/examples_getting_started_8_1.png

Physical Dependence

The physical_dependence module contains a counterpart to the basic dndm function, called mrp_b13, which returns the best-fit MRP according to input physical variables (redshift, matter density, rms mass variance). These are derived from fits to the theoretical mass function of Behroozi+2013.

In [4]:
from mrpy import mrp_b13
dndm_z1 = mrp_b13(m,z=1)           # HMF at redshift 1
dndm_z0 = mrp_b13(m,sigma_8=0.85)  # HMF at redshift 0 but sigma_8=0.85
plt.plot(m,dndm_z1,label="z=1")
plt.plot(m,dndm_z0,label="z=0")
plt.legend(loc=0)
plt.xscale('log')
plt.yscale('log')
../_images/examples_getting_started_11_0.png

Fitting MRP

Fitting Curve Data

The fit_curve module contains routines to fit the MRP to binned/curve data. This can be a theoretical curve, or binned halos (or other variates). There are several options available, and the gradient of the objective function is specified analytically to improve performance. See Murray, Robotham, Power, 2016 (in prep.) for more details.

In [5]:
from mrpy.fitting.fit_curve import get_fit_curve
dn = dndm(m,logHs=14.0,alpha=-1.9,beta=0.75,norm=1)
result, curve_obj = get_fit_curve(m,dn,[14.5, -1.8, 0.85,0.],
                                  bounds = [[None,None], [-2,-1.5], [0.2,1.5], [None,None]]) #The bounds argument is very important at this stage.
print result
plt.plot(curve_obj.logm,curve_obj.dndm()/dn-1,lw=2)
      fun: 1.6704623362075916e-12
 hess_inv: <4x4 LbfgsInvHessProduct with dtype=float64>
      jac: array([ -8.60315661e-05,  -1.13296916e-04,   6.03274702e-05,
        -3.79420248e-06])
  message: 'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
     nfev: 28
      nit: 25
   status: 0
  success: True
        x: array([  1.40000001e+01,  -1.90000003e+00,   7.50000058e-01,
        -5.62234817e-07])
Out[5]:
[<matplotlib.lines.Line2D at 0x7f072e383dd0>]
../_images/examples_getting_started_15_2.png

In the previous example, we simply fit the four MRP parameters to the input curve. Options can be specified to constrain the normalisation via the known mean density of the Universe (see MRP, 2017 Appendix C.1 for details).

Fitting Samples

To fit actual samples of halos, use the fit_sample module. At this stage, only samples with no measurement uncertainties, and a constant volume (per subsample) are supported. Usefully, this covers the case of output halos from simulations. Within this context, either a downhill-gradient method or MCMC can be used.

We hope to provide more general fitting scenarios via configuration with other packages in the future.

An example of using the downhill method is as follows:

In [6]:
from mrpy.base.stats import TGGD
from mrpy.fitting.fit_sample import SimFit

# Create some mock data to fit
r = TGGD(scale=1e14,a=-1.8,b=1.0,xmin=1e12).rvs(1e5)
r = np.sort(r)

# Create fit object, specifying parameter bounds
fitobj = SimFit(r,hs_bounds=(12,16),alpha_bounds=(-1.99,-1.6),beta_bounds=(0.5,1.5))

# Run downhill gradient method
res, obj = fitobj.run_downhill()

# Print the resulting parameters
print "Resulting Parameters: ", res.x

# The "obj" returned is a list of PerObjLike objects defined at the result, containing lots of methods and quantities:
print "Hessian at solution: ", obj[0].hessian
print "Covariance at solution: ", obj[0].cov

# Plot the mass function from obj (the logm contains all the masses from the fit)
plt.plot(obj[0].logm,obj[0].dndm(log=True),lw=2)
Resulting Parameters:  [ 13.9900839   -1.79700413   0.94765685 -24.43903982]
Hessian at solution:  [[-1838656.7724181   1485394.57073756  -508029.87769594  -427802.81517913]
 [ 1485394.57073756 -1328344.72287371   412360.36967558   351970.2695528 ]
 [ -508029.87769594   412360.36967558  -142628.15625157  -118826.3858363 ]
 [ -427802.81517913   351970.2695528   -118826.3858363   -100000.08553355]]
Covariance at solution:  [[ 0.00077558 -0.00025638  0.00122131 -0.00567158]
 [-0.00025638  0.00010004 -0.00046645  0.00200319]
 [ 0.00122131 -0.00046645  0.00287933 -0.01028797]
 [-0.00567158  0.00200319 -0.01028797  0.04354857]]
Out[6]:
[<matplotlib.lines.Line2D at 0x7f072cbebd50>]
../_images/examples_getting_started_19_2.png