StatsTools · Rimu.jl

Module `Rimu/StatsTools`

The module Rimu/StatsTools contains helper function for analysis and post processing of Monte Carlo data.

Usage

The module is not exported by default. In oder to use its functions without explicitly specifying the submodule, import the module with

using Rimu, Rimu.StatsTools

Blocking analysis

For blocking analysis of a single time series use blocking_analysis, or mean_and_se.

For blocking analysis of a couple of time series where the ration of means is the quantity of interest use ratio_of_means.

Exported

Rimu.StatsTools — Module

Tools for the statistical analysis of Monte Carlo data.

Exports:

source

Rimu.StatsTools.blocking_analysis — Method

blocking_analysis(v::AbstractVector; α = 0.01, corrected = true)
-> BlockingResult(mean, err, err_err, p_cov, k, blocks)

Compute the sample mean mean and estimate the standard deviation of the mean (standard error) err of a correlated time series using the blocking algorithm from Flyvberg and Peterson JCP (1989) and the M test of Jonsson PRE (2018) at significance level $1-α$. k is the number of blocking transformations required to pass the hypothesis test for an uncorrelated time series and err_err the estimated standard error or err.

If decorrelating the time series fails according to the M test, NaN is returned as the standard error and -1 for k. corrected controls whether bias correction for variances is used.

See BlockingResult.

source

Rimu.StatsTools.mean_and_se — Method

mean_and_se(v::AbstractVector; α = 0.01, corrected::Bool=true) -> mean, err
mean_and_se(r::BlockingResult) -> mean, err

Return the mean and standard error (as a tuple) of a time series obtained from blocking_analysis. See also BlockingResult.

source

Rimu.StatsTools.ratio_of_means — Method

ratio_of_means(num, denom; α = 0.01, corrected = true, mc_samples = 10_000) -> r

Estimate the ratio of mean(num)/mean(denom) assuming that num and denom are possibly correlated time series. A blocking analysis with m-test is used to uncorrelate the time series, see blocking_analysis(). The remaining standard error and correlation of the means is propagated using MonteCarloMeasurements. The results are reported as a RatioBlockingResult.

Robust estimates for the ratio are obtained from median(r) and confidence intervals from quantile(), e.g. quantile(r, [0.025, 0.975]) for the 95% confidence interval.

Estimates from linear uncertainty propagation are returned as r.f and r.σ_f using x_by_y_linear().

source

Rimu.StatsTools.med_and_errs — Method

med_and_errs(p) -> (; med, err1_l, err1_u, err2_l, err2_u)

Convenience function for extracting plottable data from a distribution or an uncertain object created by the packages MonteCarloMeasurements or Measurements. Returns the median med and the lower err1_l and upper err1_u standard error (for 1σ or 68% confidence inteval). err2_l and err2_u provide the lower and upper error limits for the 2σ or 95% confidence interval.

Example:

julia> results = [Particles(d) for d in datasets] # Particles[]
julia> res_w_errs = med_and_errs.(results) # Vector of NamedTuple's with standard errors
julia> res_df = DataFrame(res_w_errs) # results as DataFrame with lower an upper error
1×5 DataFrame
 Row │ med      err1_l     err1_u     err2_l    err2_u
     │ Float64  Float64    Float64    Float64   Float64
─────┼────────────────────────────────────────────────────
   1 │ 1.01325  0.0173805  0.0183057  0.034042  0.0366713

source

Rimu.StatsTools.ratio_with_errs — Method

ratio_with_errs(r::RatioBlockingResult)
-> (;ratio=med, err1_l, err1_u, err2_l, err2_u, f, σ_f, δ_y, k, success)

Convenience function for extracting plottable data from RatioBlockingResult. Returns NamedTuple with median and standard error of r extracted by p_to_errs(). See also ratio_of_means().

Example:

julia> results = [ratio_of_means(n[i], d[i]; args...) for i in datasets]
julia> res_w_errs = ratio_with_errs.(results) # Vector of NamedTuple's with standard errors
julia> res_df = DataFrame(res_w_errs) # results as DataFrame with lower an upper error
1×10 DataFrame
 Row │ ratio    err1_l     err1_u     err2_l    err2_u     f        σ_f        δ_y        k      success
     │ Float64  Float64    Float64    Float64   Float64    Float64  Float64    Float64    Int64  Bool
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1.01325  0.0173805  0.0183057  0.034042  0.0366713  1.01361  0.0181869  0.0128806      2     true

source

Rimu.StatsTools.replica_fidelity — Method

replica_fidelity(rr::Tuple; p_field = :hproj, skip = 0)

Compute the fidelity of the average coefficient vector and the projector defined in p_field from the result of replica fci_qmc!() passed as argument rr (a tuple of DataFrames). Calls ratio_of_means() to perform a blocking analysis on a ratio of the means of separate time series and returns a RatioBlockingResult. The first skip steps in the time series are skipped.

The fidelity of states |ψ⟩ and |ϕ⟩ is defined as

\[F(ψ,ϕ) = \frac{|⟨ψ|ϕ⟩|^2}{⟨ψ|ψ⟩⟨ϕ|ϕ⟩} .\]

Specifically, replica_fidelity computes

\[F(\mathbf{v},⟨\mathbf{c}⟩) = \frac{⟨(\mathbf{c}_A⋅\mathbf{v})(\mathbf{v}⋅\mathbf{c}_B)⟩} {⟨\mathbf{c}_A⋅\mathbf{c}_B⟩} ,\]

where v is the projector specified by p_field, which is assumed to be normalised to unity with the two-norm (i.e. v⋅v == 1), and $\mathbf{c}_A$ and $\mathbf{c}_B$ are two replica coefficient vectors.

source

Rimu.StatsTools.to_measurement — Method

to_measurement(p::MonteCarloMeasurements.Particles) -> ::Measurements.measurement

Convert an uncertain number from MonteCarloMeasurements to Measurements format using the median as the central point. The new ± boundaries will include the 68% quantile around the median.

source

Rimu.StatsTools.growth_witness — Method

growth_witness(norm::AbstractArray, shift::AbstractArray, dt, [b]) -> g
growth_witness(df::DataFrame, [b]) -> g

Compute the growth witness

\[G^{(n)} = S^{(n)} - \frac{\vert\mathbf{c}^{(n+1)}\vert - \vert\mathbf{c}^{(n)}\vert}{\vert\mathbf{c}^{(n)}\vert d\tau},\]

where S is the shift and $\vert\mathbf{c}^{(n)}\vert ==$ norm[n, 1]. Setting b ≥ 1 a sliding average over b time steps is computed using smoothen().

source

Rimu.StatsTools.smoothen — Method

smoothen(noisy::AbstractVector, b)

Smoothen the array noisy by averaging over a sliding window of length b and wrapping noisy periodically. The mean(noisy) is preserved.

source

Rimu.StatsTools.growth_estimator — Method

growth_estimator(
    shift, wn, h, dτ;
    skip = 0,
    E_r = mean(shift[skip+1:end]),
    weights = w_exp,
    change_type = identity,
    kwargs...,
) -> (; E_gr::Particles, k, blocks, success)
growth_estimator(df::DataFrame, h; kwargs ...)

Compute the growth estimator with reference energy E_r by the reweighting technique described in Umirgar et al. (1993), see Eq. (20). shift and wn are equal length vectors containing the shift and walker number time series, respectively. Reweighting is done over h time steps and length(shift) - skip time steps are used for the blocking analysis done with ratio_of_means(). dτ is the time step and weights a function that calulates the weights. See w_exp() and w_lin().

\[E_{gr} = E_r - \frac{1}{dτ}\ln \frac{\sum_n w_{h+1}^{(n+1)} N_\mathrm{w}^{(n+1)}} {\sum_m w_{h}^{(m)} N_\mathrm{w}^{(m)}}\]

When h is greater than the autocorrelation time scale of the shift, then E_gr is an unbiased but approximate estimator for the ground state energy $E_0$ with an error $\mathcal{O}(dτ^2)$ and potentially increased confidence intervals compared to the (biased) shift estimator. Error propagation is done with MonteCarloMeasurements. Progagation through the logarithm can be modified by setting change_type to to_measurement in order to avoid NaN results from negative outliers.

If success==true the blocking analysis was successful in k-1 steps, using blocks uncorrelated data points.

Additional docstrings

MonteCarloMeasurements.Particles — Method

MonteCarloMeasurements.Particles(r::BlockingResult; mc_samples = 2000)
MonteCarloMeasurements.±(r::BlockingResult)

Convert a BlockingResult into a Particles object for nonlinear error propagation with MonteCarloMeasurements.

source

Rimu.StatsTools.BlockingResult — Type

BlockingResult(mean, err, err_err, p_cov, k, blocks)

Result of blocking_analysis().

Fields:

mean: sample mean
err: standard error (estimated standard deviation of the mean)
err_err: estimated uncertainty of err
p_cov: estimated pseudo covariance of mean, relevant for complex time series
k::Int: k-1 blocking steps were used to uncorrelate time series
blocks::Int: number of uncorrelated values after blocking

Has methods for mean_and_se, Measurements.:±, MonteCarloMeasurements.Particles, and Statistics.cov for Complex data.

source

Measurements.measurement — Method

measurement(r::BlockingResult)
Measurements.±(r::BlockingResult)

Convert a BlockingResult into a Measurement for linear error propagation with Measurements.

Limitation: Does not account for covariance in complex BlockingResult. Consider using MonteCarloMeasurements.Particles(r)!

source

Rimu.StatsTools.blocker — Method

blocker(v::Vector) -> new_v::Vector

Reblock the data by successively taking the mean of two adjacent data points to form a new vector with a half of the length(v). The last data point will be discarded if length(v) is odd.

source

Rimu.StatsTools.blocks_with_m — Method

blocks_with_m(v; corrected = true) -> (;blocks, mean, std_err, std_err_err, p_cov, mj)

Perform the blocking algorithm from Flyvberg and Peterson JCP (1989). Returns named tuple with the results from all blocking steps. See mtest().

source

Rimu.StatsTools.mtest — Method

mtest(mj::AbstractVector; α = 0.01, warn = true) -> k
mtest(table; α = 0.01, warn = true) -> k

Hypothesis test for decorrelation of a time series after blocking transformations with significance level $1-α$ after Jonson PRE (2018). mj or table.mj is expected to be a vector with relevant $M_j$ values from a blocking analysis as obtained from blocks_with_m(). Returns the row number k where the M-test is passed. If the M-test has failed mtest() returns the value -1 and optionally prints a warning message.

source

Statistics.cov — Method

cov(r::BlockingResult{<:Complex})

Return the covariance matrix of the multivariate normal distribution approximating the uncertainty of the blocking result r of a complex time series. See (https://en.wikipedia.org/wiki/Complexnormaldistribution).

source

Rimu.StatsTools.RatioBlockingResult — Type

RatioBlockingResult(ratio, f, σ_f, δ_y, k, success)

Result of ratio_of_means().

Fields:

ratio::P: ratio with uncertainties propagated by MonteCarloMeasurements
f::T: ratio of means
σ_f::T: std from linear propagation
δ_y::T: coefficient of variation for denominator (≤ 0.1 for normal approx)
k::Int: k-1 blocking steps were used to uncorrelate time series
blocks::Int: number of data values after blocking
success::Bool: false if any of the blocking steps failed

source

Rimu.StatsTools.ratio_estimators — Method

ratio_estimators(x, y, [k]; corrected=true, mc_samples=10_000) -> (; r, f, σ_f, δ_y, n)

Estimators for the ratio of means mean(x)/mean(y). If k is given, k-1 blocking steps are performed to remove internal correlations in the time series x and y. Otherwise these are assumed to be free of internal correlations. Correlations between x and y may be present and are taken into account.

Return values:

r::Particles is the Monte Carlo sampled ratio estimator, see Particles
f = mean(x)/mean(y)
σ_f standard deviation of f from linear error propagation (normal approximation)
δ_y = std(y)/mean(y) coefficient of variation; < 0.1 for normal approximation to work
n: number of uncorrelated data used for uncertainty estimation

source

Rimu.StatsTools.x_by_y_linear — Method

x_by_y_linear(μ_x,μ_y,σ_x,σ_y,ρ) -> f, σ_f

Linear error propagation for ratio f = x/y assuming x and y are correlated normal random variables and assuming the ratio can be approximated as a normal distribution. See wikipedia and Díaz-Francés, Rubio (2013).

\[σ_f = \sqrt{\frac{σ_x}{μ_y}^2 + \frac{μ_x σ_y}{μ_y^2}^2 - \frac{2 ρ μ_x}{μ_y^3}}\]

source

Rimu.StatsTools.autocovariance — Method

autocovariance(v::Vector,h::Int; corrected::Bool=true)

$\hat{\gamma}(h) =\frac{1}{n}\sum_{t=1}^{n-h}(v_{t+h}-\bar{v})(v_t-\bar{v})^*$ Calculate the autocovariance of dataset v with a delay h. If corrected is true (the default) then the sum is scaled with n-h, whereas the sum is scaled with n if corrected is false where n = length(v).

source

Rimu.StatsTools.pseudo_cov — Method

pseudo_cov(x, y; xmean = mean(x), ymean = mean(y), corrected = true)

Compute the pseudo covariance between collections x and y returning a scalar:

\[\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_{i} - \bar{y})\]

Optionally, precomputed means can be passed as keyword arguments. pseudo_cov(x,y) is functionally equivalent to Statistics.cov(x, conj(y); corrected = false) but it is found to be significantly faster and avoids allocations.

source

Module Rimu/StatsTools

Usage

Blocking analysis

Exported

Additional docstrings

Index

Module `Rimu/StatsTools`