Module StatsTools
The  module StatsTools contains helper function for analysis and post processing of Monte Carlo data.
Blocking analysis
After equilibration, FCIQMC produces information about observables through correlated time series. In order to estimate the statistical errors the time series need to be decorrelated. The main workhorse for achieving this is the blocking_analysis, which is based on the paper by Flyvberg and Peterson JCP (1989), and automated with the M test of Jonsson PRE (2018).
Analysing the stochastic errors of observables obtained from the ratio of sample means is done with ratio_of_means, where error propagation of correlated uncertainties is done with the help of the package MonteCarloMeasurements.
Many convenience functions are implemented for directly analysing data obtained from solve as a DataFrame. See, e.g., shift_estimator and projected_energy. Asymptotically unbiased estimators are implemented as mixed_estimator, growth_estimator and rayleigh_replica_estimator.
Exported
Rimu.StatsTools — ModuleTools for the statistical analysis of Monte Carlo data.
Exports:
- blocking_analysis
- blocking_analysis_data
- ratio_of_means
- growth_witness
- smoothen
- shift_estimator
- projected_energy
- variational_energy_estimator
- growth_estimator
- growth_estimator_analysis
- mixed_estimator
- mixed_estimator_analysis
- rayleigh_replica_estimator
- rayleigh_replica_estimator_analysis
- val_and_errs
- val
- mean_and_se
Rimu.StatsTools.blocking_analysis — Methodblocking_analysis(v::AbstractVector; α = 0.01, corrected = true, skip=0, warn=true)
-> BlockingResult(mean, err, err_err, p_cov, k, blocks)Compute the sample mean mean and estimate the standard deviation of the mean (standard error) err of a correlated time series. It uses the blocking algorithm from Flyvberg and Peterson JCP (1989) and the M test of Jonsson PRE (2018) at significance level $1-α$.
Use skip to skip the first skip elements in v. corrected controls whether bias correction for variances is used. If decorrelating the time series fails according to the M test, NaN is returned as the standard error and -1 for k. The keyword argument warn controls whether a warning message is logged.
The summary result is returned as a BlockingResult. k - 1 is the number of blocking transformations required to pass the hypothesis test for an uncorrelated time series and err_err the estimated standard error or err.
The detailed results from each reblocking step can be obtained with blocking_analysis_data.
See BlockingResult, shift_estimator, ratio_of_means, blocking_analysis_data.
Rimu.StatsTools.blocking_analysis_data — Methodblocking_analysis_data(v::AbstractVector; kwargs...) ->
(; br::BlockingResult, df::DataFrame)Perform a blocking_analysis and return the summary result br as well as a DataFrame df with information about the standard error in each blocking step.
For a description of the keyword arguments see blocking_analysis.
Example
julia> data = smoothen(rand(10_000), 2^6); # generate correlated data
julia> br, df = blocking_analysis_data(data)
(br = BlockingResult{Float64}
  mean = 0.5088 ± 0.0029
  with uncertainty of ± 0.00023454488294744232
  from 78 blocks after 7 transformations (k = 8).
, df = 13×6 DataFrame
 Row │ blocks  mean      std_err      std_err_err  p_cov       mj
     │ Int64   Float64   Float64      Float64      Float64     Float64
─────┼──────────────────────────────────────────────────────────────────────
   1 │  10000  0.508806  0.000375044  2.6521e-6    1.40658e-7  9715.08
   2 │   5000  0.508806  0.000528547  5.28599e-6   2.79361e-7  4778.14
   3 │   2500  0.508806  0.000743386  1.05152e-5   5.52622e-7  2298.64
   4 │   1250  0.508806  0.00104064   2.08212e-5   1.08293e-6  1056.24
   5 │    625  0.508806  0.00144177   4.08121e-5   2.07871e-6   427.949
   6 │    312  0.508736  0.00194209   7.78707e-5   3.77171e-6   128.711
   7 │    156  0.508736  0.00247921   0.00014081   6.14647e-6    17.3075
   8 │     78  0.508736  0.00291063   0.000234545  8.47174e-6     0.731386
   9 │     39  0.508736  0.00284613   0.000326474  8.10046e-6     0.901054
  10 │     19  0.508241  0.0026998    0.000449967  7.28892e-6     2.85915
  11 │      9  0.507939  0.00359907   0.000899766  1.29533e-5     1.08644
  12 │      4  0.509327  0.00440559   0.00179857   1.94092e-5     0.0370381
  13 │      2  0.509327  0.00432708   0.00305971   1.87237e-5     0.125)
julia> using StatsPlots; unicodeplots();
julia> plot([br.k,br.k],[0.0,maximum(df.std_err.+df.std_err_err)], label="m test");
julia> @df df plot!(
           1:length(:std_err), :std_err;
           err=:std_err_err, xlabel="k", label="std err",
           title="std err vs blocking steps"
       )
               ⠀⠀⠀⠀⠀⠀⠀⠀⠀std err vs blocking steps⠀⠀⠀⠀⠀⠀⠀⠀
               ┌────────────────────────────────────────┐
    0.00423501 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢠⠀⠀⠀⠀│ m test
               │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⠀⢸⠀⠀⠀⠀│ std err
               │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢸⠀⠀⢸⠀⠀⠀⠀│
               │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢸⠀⠀⢸⠀⠀⠀⠀│
               │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⡠⢺⠒⠒⢺⠀⠀⠀⠀│
               │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⡀⠀⠀⡆⣀⠤⡗⠉⠀⢸⠀⠀⢸⡆⠀⠀⠀│
               │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⡧⠤⠔⡗⠊⠉⡏⠀⠀⡇⠀⠀⢸⠀⠀⢸⢣⠀⠀⠀│
               │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠔⠁⡇⠀⠀⠁⠀⠀⠁⠀⠀⠁⠀⠀⠀⠀⠀⢸⠸⡀⠀⠀│
               │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠴⠁⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠸⠀⡇⠀⠀│
               │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠔⠁⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢸⠀⠀│
               │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠔⠊⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⣦⠀│
               │⠀⠀⠀⠀⠀⠀⠀⠀⡠⠔⠒⠁⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢹⠀│
               │⠀⠀⠀⢀⣀⠤⠒⠉⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢸⠀│
               │⠀⠒⠉⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢸⠀│
   -0.00012335 │⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠧⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤│
               └────────────────────────────────────────┘
               ⠀0.64⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀k⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀13.36⠀A vertical line at k==8 indicates the blocking step identified by hypothesis testing to decorrelate the time series data. The decorrelation length can thus be estimated at $2^{k-1} = 2^7 = 128$. Note that the data was correlated with a sliding window of $2^6$ steps.
Rimu.StatsTools.mean_and_se — Methodmean_and_se(v::AbstractVector; α = 0.01, corrected::Bool=true, skip=0) -> mean, err
mean_and_se(r::BlockingResult) -> mean, errReturn the mean and standard error (as a tuple) of a time series obtained from blocking_analysis. See also BlockingResult.
Statistics.cov — Methodcov(r::BlockingResult{<:Complex})Return the covariance matrix of the multivariate normal distribution approximating the uncertainty of the blocking result r of a complex time series. See (https://en.wikipedia.org/wiki/Complexnormaldistribution).
Rimu.StatsTools.ratio_of_means — Methodratio_of_means(num, denom; α=0.01, corrected=true, mc_samples=nothing, skip=0, warn=true)
-> r::RatioBlockingResultEstimate the ratio of mean(num)/mean(denom) assuming that num and denom are possibly correlated time series, skipping the first skip elements. A blocking analysis with m-test is used to uncorrelate the time series, see blocking_analysis. The remaining standard error and correlation of the means is propagated using MonteCarloMeasurements. The results are reported as a RatioBlockingResult.
Robust estimates for the ratio are obtained from pmedian(r) and confidence intervals from pquantile(), e.g. pquantile(r, [0.025, 0.975]) for the 95% confidence interval.
Estimates from linear uncertainty propagation are returned as r.f and r.σ_f using x_by_y_linear. The standard error estimate r.σ_f should only be trusted when the coefficient of variation std(denom)/mean(denom) is small: abs(r.δ_y) < 0.1. Under this condition can the ratio be approximated as a normal distribution. See wikipedia and Díaz-Francés, Rubio (2013)
The keyword mc_samples controls the number of samples used for error propagation by MonteCarloMeasurements. Use nothing for the default and Val(1000) to set the number to 1000 samples in a type-consistent way.
The keyword warn controls whether warning messages are logged when blocking fails or noisy denominators are encountered.
Note: to compute statistics on the RatioBlockingResult, use functions pmedian, pquantile, pmiddle, piterate, pextrema, pminimum, pmaximum, pmean, and pcov.
Rimu.StatsTools.errs — Methoderrs(x; n=1, p=nothing, name=:err) -> (; err_l, err_u)Return the lower and upper error bar for the uncertain value x.
See val_and_errs.
Rimu.StatsTools.to_measurement — Methodto_measurement(p::MonteCarloMeasurements.Particles) -> ::Measurements.measurementConvert an uncertain number from MonteCarloMeasurements  to Measurements format  using the median as the central point. The new ± boundaries will include  the 68% quantile around the median.
Rimu.StatsTools.val — Methodval(x)Return the best estimate value for an uncertain x. Defaults to the median for uncertain x represented by a (sampled) distribution. Supports  MonteCarloMeasurements and Measurements.
Rimu.StatsTools.val_and_errs — Methodval_and_errs(x; n=1, p=nothing, name=:val) -> (;val, val_l, val_u)Return the median and the lower and upper error bar for the uncertain value x as a NamedTuple. This is useful for plotting scripts. The interval [val - val_l, val + val_u] represents the confidence interval at level n*σ, or at probability p. Setting p overrides n. Supports  MonteCarloMeasurements and Measurements. The  names in the NamedTuple can be changed with name.
Example:
julia> results = [blocking_analysis(i:0.1:2i+20) for i in 1:3]; # mock results
julia> v = val_and_errs.(results, name="res"); # Vector of NamedTuple's with standard errors
julia> DataFrame(v)
3×3 DataFrame
 Row │ res      res_l    res_u
     │ Float64  Float64  Float64
─────┼───────────────────────────
   1 │    11.5  1.7282   1.7282
   2 │    13.0  1.7282   1.7282
   3 │    14.5  1.78885  1.78885See NamedTuple, val, errs, BlockingResult, RatioBlockingResult.
Rimu.StatsTools.growth_witness — Functiongrowth_witness(df::DataFrame, [b];
    shift=:shift,
    norm=:norm,
    time_step=determine_constant_time_step(df),
    skip=0
)
growth_witness(sim::PMCSimulation, [b]; kwargs...)Calculate the growth witness directly from the result (DataFrame or PMCSimulation) of solveing a ProjectorMonteCarloProblem. The keyword arguments shift and norm can be used to change the names of the relevant columns.
Rimu.StatsTools.growth_witness — Methodgrowth_witness(shift::AbstractArray, norm::AbstractArray, dt, [b]; skip=0)Compute the growth witness
\[G^{(n)} = S^{(n)} - \frac{\vert\mathbf{c}^{(n+1)}\vert - \vert\mathbf{c}^{(n)}\vert}{\vert\mathbf{c}^{(n)}\vert d\tau},\]
where S is the shift and $\vert\mathbf{c}^{(n)}\vert ==$ norm[n, 1]. Setting b ≥ 1 a sliding average over b time steps is computed using smoothen(). The first skip time steps are skipped. mean(growth_witness) is approximately the same as growth_estimator with h=0.
See also growth_estimator.
Rimu.StatsTools.smoothen — Methodsmoothen(noisy::AbstractVector, b)Smoothen the array noisy by averaging over a sliding window of length b and wrapping noisy periodically. The mean(noisy) is preserved.
Rimu.StatsTools.growth_estimator — Methodgrowth_estimator(
    shift, wn, h, time_step;
    skip = 0,
    E_r = mean(shift[skip+1:end]),
    weights = w_exp,
    change_type = identity,
    kwargs...
)
growth_estimator(
    df::DataFrame, h;
    shift_name=:shift,
    norm_name=:norm,
    time_step=determine_constant_time_step(df),
    kwargs...
)
growth_estimator(sim::PMCSimulation; kwargs...)
-> r::RatioBlockingResultCompute the growth estimator with reference energy E_r by the reweighting technique described in Umrigar et al. (1993), see Eq. (20). shift and wn are equal length vectors containing the shift and walker number time series, respectively.  Reweighting is done over h time steps and length(shift) - skip time steps are used for the blocking analysis done with ratio_of_means. weights is a function that calulates the weights. See w_exp and w_lin.
\[E_{gr} = E_r - \frac{1}{dτ}\ln \frac{\sum_n w_{h+1}^{(n+1)} N_\mathrm{w}^{(n+1)}} {\sum_m w_{h}^{(m)} N_\mathrm{w}^{(m)}} ,\]
where $dτ$ is the time_step
When h is greater than the autocorrelation time scale of the shift, then E_gr (returned as r.ratio) is an unbiased but approximate estimator for the ground state energy $E_0$ with an error $\mathcal{O}(dτ^2)$ and potentially increased confidence intervals compared to the (biased) shift estimator.  Error propagation is done with MonteCarloMeasurements. Propagation through the logarithm can be modified by setting change_type to to_measurement in order to avoid NaN results from negative outliers.
If success==true the blocking analysis was successful in k-1 steps, using blocks uncorrelated data points.
The second method calculates the growth estimator directly from a PMCSimulation or DataFrame returned by solve. The keyword arguments shift_name and norm_name can be used to change the names of the relevant columns.
See also mixed_estimator and RatioBlockingResult.
Rimu.StatsTools.growth_estimator_analysis — Methodgrowth_estimator_analysis(df::DataFrame; kwargs...)
growth_estimator_analysis(sim::PMCSimulation; kwargs...)
-> (; df_ge, correlation_estimate, se, se_l, se_u)Compute the growth_estimator on a DataFrame df or PMCSimulation sim. repeatedly over a range of reweighting depths.
Returns a NamedTuple with the fields
- df_ge:- DataFramewith reweighting depth and- growth_estimatordata. See example below.
- correlation_estimate: estimated correlation time from blocking analysis
- se, se_l, se_u:- shift_estimatorand error
Keyword arguments
- h_range: The default is about- h_valuesvalues from 0 to twice the estimated correlation time
- h_values = 100: minimum number of reweighting depths
- skip = 0: initial time steps to exclude from averaging
- threading = Threads.nthreads() > 1: if- falsea progress meter is displayed
- shift_name = :shiftname of column in- dfwith shift data
- norm_name = :normname of column in- dfwith walkernumber data
- time_step = determine_constant_time_step(df)the time step
- warn = truewhether to log warning messages when blocking fails or denominators are small
Example
sim = solve(...)
df_ge, correlation_estimate, se, se_l, se_u = growth_estimator_analysis(sim; skip=5_000)
using StatsPlots
@df df_ge plot(_ -> se, :h, ribbon = (se_l, se_u), label = "⟨S⟩") # constant line and ribbon for shift estimator
@df df_ge plot!(:h, :val, ribbon = (:val_l, :val_u), label="E_gr") # growth estimator as a function of reweighting depth
xlabel!("h")See also: growth_estimator, mixed_estimator_analysis.
Rimu.StatsTools.mixed_estimator — Methodmixed_estimator(
    hproj, vproj, shift, h, time_step;
    skip = 0,
    E_r = mean(shift[skip+1:end]),
    weights = w_exp,
    kwargs...
)
mixed_estimator(
    df::DataFrame, h;
    hproj_name=:hproj,
    vproj_name=:vproj,
    shift_name=:shift,
    time_step=determine_constant_time_step(df),
    kwargs...
)
mixed_estimator(sim::PMCSimulation, h; kwargs...)
-> r::RatioBlockingResultCompute the mixed estimator by the reweighting technique described in Umrigar et al. (1993), Eq. (19)
\[E_\mathrm{mix} = \frac{\sum_n w_{h}^{(n)} (Ĥ'\mathbf{v})⋅\mathbf{c}^{(n)}} {\sum_m w_{h}^{(m)} \mathbf{v}⋅\mathbf{c}^{(m)}} ,\]
where the time series hproj == $(Ĥ'\mathbf{v})⋅\mathbf{c}^{(n)}$ and vproj == $\mathbf{v}⋅\mathbf{c}^{(m)}$ have the same length as shift (See ProjectedEnergy on how to set these up).  Reweighting is done over h time steps and length(shift) - skip time steps are used for the blocking analysis done with ratio_of_means. weights is a function that calulates the weights. See w_exp and w_lin.  Additional keyword arguments are passed on to ratio_of_means.
When h is greater than the autocorrelation time scale of the shift, then r.ratio is an unbiased but approximate estimator for the ground state energy $E_0$ with an error $\mathcal{O}(dτ^2)$, where $dτ$ is the time_step, and potentially increased confidence intervals compared to the unweighted ratio.  Error propagation is done with MonteCarloMeasurements. Results are returned as RatioBlockingResult.
The second method calculates the mixed energy estimator directly from a DataFrame or PMCSimulation returned by solve. The keyword arguments hproj_name, vproj_name, and shift_name can be used to change the names of the relevant columns.
See also growth_estimator.
Rimu.StatsTools.mixed_estimator_analysis — Methodmixed_estimator_analysis(df::DataFrame; kwargs...)
mixed_estimator_analysis(sim::PMCSimulation; kwargs...)
-> (; df_me, correlation_estimate, se, se_l, se_u)Compute the mixed_estimator on a DataFrame df or PMCSimulation sim returned from solve repeatedly over a range of reweighting depths.
Returns a NamedTuple with the fields
- df_me:- DataFramewith reweighting depth and- mixed_estiamatordata. See example below.
- correlation_estimate: estimated correlation time from blocking analysis
- se, se_l, se_u:- shift_estimatorand error
Keyword arguments
- h_range: The default is about- h_valuesvalues from 0 to twice the estimated correlation time
- h_values = 100: minimum number of reweighting depths
- skip = 0: initial time steps to exclude from averaging
- threading = Threads.nthreads() > 1: if- falsea progress meter is displayed
- shift_name = :shiftname of column in- dfwith shift data
- hproj_name = :hprojname of column in- dfwith operator overlap data
- vproj_name = :vprojname of column in- dfwith projector overlap data
- time_step = determine_constant_time_step(df)the time step
- warn = truewhether to log warning messages when blocking fails or denominators are small
Example
sim = solve(...)
df_me, correlation_estimate, se, se_l, se_u = mixed_estimator_analysis(sim; skip=5_000)
using StatsPlots
@df df_me plot(_ -> se, :h, ribbon = (se_l, se_u), label = "⟨S⟩") # constant line and ribbon for shift estimator
@df df_me plot!(:h, :val, ribbon = (:val_l, :val_u), label="E_mix") # mixed estimator as a function of reweighting depth
xlabel!("h")See also: mixed_estimator, growth_estimator_analysis.
Rimu.StatsTools.projected_energy — Methodprojected_energy(df::DataFrame; skip=0, hproj=:hproj, vproj=:vproj, kwargs...)
projected_energy(sim::PMCSimulation; kwargs...)
-> r::RatioBlockingResultCompute the projected energy estimator
\[E_\mathrm{p} = \frac{\sum_n \mathbf{v}⋅Ĥ\mathbf{c}^{(n)}} {\sum_m \mathbf{v}⋅\mathbf{c}^{(m)}} ,\]
where the time series df.hproj == $\mathbf{v}⋅Ĥ\mathbf{c}^{(n)}$ and df.vproj == $\mathbf{v}⋅\mathbf{c}^{(m)}$ are taken from df, skipping the first skip entries (use post_step_strategy =ProjectedEnergy(...) to set these up in ProjectorMonteCarloProblem). projected_energy is equivalent to mixed_estimator with h=0.
The keyword arguments hproj and vproj can be used to change the names of the relevant columns. Other kwargs are passed on to ratio_of_means. Returns a RatioBlockingResult.
See NamedTuple, val_and_errs, val, errs for processing results.
Rimu.StatsTools.rayleigh_replica_estimator — Methodrayleigh_replica_estimator(
    op_ol, vec_ol, shift, h, time_step;
    skip = 0,
    E_r = mean(shift[skip+1:end]),
    weights = w_exp,
    kwargs...
)
rayleigh_replica_estimator(
    df::DataFrame;
    shift_name="shift",
    op_name="Op1",
    vec_name="dot",
    spectral_state=1,
    h=0,
    skip=0,
    Anorm=1,
    kwargs...
)
rayleigh_replica_estimator(sim::PMCSimulation; kwargs...)
-> r::RatioBlockingResultCompute the estimator of a Rayleigh quotient of operator $\hat{A}$ with reweighting,
\[A_\mathrm{est}(h) = \frac{\sum_{a<b} \sum_n w_{h,a}^{(n)} w_{h,b}^{(n)} \mathbf{c}_a^{(n)} \cdot \hat{A} \cdot \mathbf{c}_b^{(n)}} {\sum_{a<b} \sum_n w_{h,a}^{(n)} w_{h,b}^{(n)} \mathbf{c}_a^{(n)} \cdot \mathbf{c}_b^{(n)}},\]
using data from multiple replicas.
Argument op_ol holds data for the operator overlap $\mathbf{c}_a^{(n)} \hat{A} \mathbf{c}_b^{(n)}$ and vec_ol holds data for the vector overlap $\mathbf{c}_a^{(n)} \mathbf{c}_b^{(n)}$. They are of type Vector{Vector}, with each element Vector holding the data for a pair of replicas. Argument shift is of type Vector{Vector}, with each element Vector holding the shift data for each individual replica.
The second method computes the Rayleigh quotient directly from a DataFrame or PMCSimulation returned by solve. The keyword arguments shift_name, op_name and vec_name can be used to change the names of the relevant columns, see AllOverlaps for default formatting. The operator overlap data can be scaled by a prefactor Anorm. A specific reweighting depth can be set with keyword argument h. The default is h = 0 which calculates the Rayleigh quotient without reweighting. To compute the Rayleigh quotient for the nth spectral state, set spectral_state = n.
The reweighting is an extension of the mixed estimator using the reweighting technique described in Umrigar et al. (1993). Reweighting is done over h time steps and length(shift) - skip time steps are used for the blocking analysis done with ratio_of_means. weights is a function that calulates the weights. See w_exp and w_lin. Additional keyword arguments are passed on to ratio_of_means.
Error propagation is done with MonteCarloMeasurements. Results are returned as RatioBlockingResult.
See also mixed_estimator, growth_estimator.
Rimu.StatsTools.rayleigh_replica_estimator_analysis — Methodrayleigh_replica_estimator_analysis(df::DataFrame; kwargs...)
rayleigh_replica_estimator_analysis(sim::PMCSimulation; kwargs...)
-> (; df_rre, df_se)Compute the rayleigh_replica_estimator on a DataFrame df or PMCSimulation sim returned from solve repeatedly over a range of reweighting depths.
Returns a NamedTuple with the fields
- df_rre:- DataFramewith reweighting depth and- rayleigh_replica_estimatordata. See example below.
- df_se:- DataFramewith- shift_estimatoroutput, one row per replica
Keyword arguments
- h_range: The default is about- h_valuesvalues from 0 to twice the estimated correlation time
- h_values = 100: minimum number of reweighting depths
- skip = 0: initial time steps to exclude from averaging
- threading = Threads.nthreads() > 1: if- falsea progress meter is displayed
- shift_name = "shift": shift data corresponding to column in- dfwith names- <shift>_r1s1, ...
- op_name = "Op1": name of operator overlap corresponding to column in- dfwith names- r1s1_<op_ol>_r2s1, ...
- vec_name = "dot": name of vector-vector overlap corresponding to column in- dfwith names- r1s1_<vec_ol>_r2s1, ...
- spectral_state = 1: which spectral state to use
- Anorm = 1: a scalar prefactor to scale the operator overlap data
- warn = true: whether to log warning messages when blocking fails or denominators are small
Example
sim = solve(...)
df_rre, df_se = rayleigh_replica_estimator_analysis(sim; skip=5_000)
using StatsPlots
@df df_rre plot(_ -> se, :h, ribbon = (se_l, se_u), label = "⟨S⟩") # constant line and ribbon for shift estimator
@df df_rre plot!(:h, :val, ribbon = (:val_l, :val_u), label="E_mix") # Rayleigh quotient estimator as a function of reweighting depth
xlabel!("h")See also: rayleigh_replica_estimator, mixed_estimator_analysis, AllOverlaps.
Rimu.StatsTools.shift_estimator — Methodshift_estimator(df::DataFrame; shift=:shift, kwargs...)
shift_estimator(sim::PMCSimulation; kwargs...)
-> r::BlockingResultReturn the shift estimator from the data in df.shift. The keyword argument shift can be used to change the name of the relevant column. Other keyword arguments are passed on to blocking_analysis. Returns a BlockingResult.
See also growth_estimator, projected_energy.
Rimu.StatsTools.w_exp — Methodw_exp(shift, h, time_step; E_r = mean(shift), skip = 0)Compute the weights for reweighting over h time steps with reference energy E_r from the exponential formula
\[w_h^{(n)} = \prod_{j=1}^h \exp[-dτ(S^{(q+n-j)}-E_r)] ,\]
where q = skip and $dτ$ is the time_step.
See also w_lin, growth_estimator, mixed_estimator.
Rimu.StatsTools.w_lin — Methodw_lin(shift, h, time_step; E_r = mean(shift), skip = 0)Compute the weights for reweighting over h time steps with reference energy E_r from the linearised formula
\[w_h^{(n)} = \prod_{j=1}^h [1-dτ(S^{(q+n-j)}-E_r)] ,\]
where q = skip and $dτ$ is the time_step.
See also w_exp, growth_estimator, mixed_estimator.
Rimu.StatsTools.replica_fidelity — Methodreplica_fidelity(df::DataFrame; p_field = :hproj, spectral_state = 1, skip = 0)
replica_fidelity(sim::PMCSimulation; kwargs...)Compute the fidelity of the average coefficient vector and the projector defined in p_field from the PMCSimulation or DataFrame returned by solve, using replicas _r1s{i} and _r2s{i}, where i is the spectral_state. Calls ratio_of_means to perform a blocking analysis on a ratio of the means of separate time series and returns a RatioBlockingResult. The first skip steps in the time series are skipped.
The fidelity of states |ψ⟩ and |ϕ⟩ is defined as
\[F(ψ,ϕ) = \frac{|⟨ψ|ϕ⟩|^2}{⟨ψ|ψ⟩⟨ϕ|ϕ⟩} .\]
Specifically, replica_fidelity computes
\[F(\mathbf{v},⟨\mathbf{c}⟩) = \frac{⟨(\mathbf{c}_1⋅\mathbf{v})(\mathbf{v}⋅\mathbf{c}_1)⟩} {⟨\mathbf{c}_1⋅\mathbf{c}_1⟩} ,\]
where v is the projector specified by p_field, which is assumed to be normalised to unity with the two-norm (i.e. v⋅v == 1), and $\mathbf{c}_1$ and $\mathbf{c}_2$ are two replica coefficient vectors.
Rimu.StatsTools.variational_energy_estimator — Methodvariational_energy_estimator(shifts, overlaps; kwargs...)
variational_energy_estimator(df::DataFrame; max_replicas=:all, spectral_state=1, kwargs...)
variational_energy_estimator(sim::PMCSimulation; kwargs...)
-> r::RatioBlockingResultCompute the variational energy estimator from the replica time series of the shifts and coefficient vector overlaps by blocking analysis. The keyword argument max_replicas can be used to constrain the number of replicas processed to be smaller than all available in df. The keyword argument spectral_state determines which spectral state's energy is computed. Other keyword arguments are passed on to ratio_of_means(). Returns a RatioBlockingResult.
An estimator for the variational energy
\[\frac{⟨\mathbf{c}⟩^† \mathbf{H}⟨\mathbf{c}⟩}{⟨\mathbf{c}⟩^†⟨\mathbf{c}⟩}\]
is calculated from
\[Ē_{v} = \frac{\sum_{a<b}^R \overline{(S_a+S_b) \mathbf{c}_a^† \mathbf{c}_b}} {2\sum_{a<b}^R \overline{\mathbf{c}_a^† \mathbf{c}_b}} ,\]
where the sum goes over distinct pairs out of the $R$ replicas. See arXiv:2103.07800. In the case of a non-Hermitian Hamiltonian, $Ē_{v}$ provides an estimator only for the real part of the energy.
The DataFrame and PMCSimulation versions can extract the relevant information from the result of solve. Set up the ProjectorMonteCarloProblem with the keyword argument replica_strategy = AllOverlaps(R) and R ≥ 2. If passing shifts and overlaps, the data has to be arranged in the correct order (as provided in the DataFrame version).
See AllOverlaps.
Additional docstrings
MonteCarloMeasurements.Particles — MethodMonteCarloMeasurements.Particles(r::BlockingResult; mc_samples = 2000)
MonteCarloMeasurements.±(r::BlockingResult)Convert a BlockingResult into a Particles object for nonlinear error propagation with MonteCarloMeasurements.
Rimu.StatsTools.BlockingResult — TypeBlockingResult(mean, err, err_err, p_cov, k, blocks)Result of blocking_analysis.
Fields:
- mean: sample mean
- err: standard error (estimated standard deviation of the mean)
- err_err: estimated uncertainty of- err
- p_cov: estimated pseudo covariance of- mean, relevant for complex time series
- k::Int: k-1 blocking steps were used to uncorrelate time series
- blocks::Int: number of uncorrelated values after blocking
Has methods for NamedTuple, val_and_errs, val, errs, mean_and_se, Measurements.:±, MonteCarloMeasurements.Particles, and Statistics.cov for Complex data.
Example:
julia> blocking_analysis(smoothen(randn(2^10), 2^5))
BlockingResult{Float64}
  mean = -0.026 ± 0.029
  with uncertainty of ± 0.003638545517264226
  from 32 blocks after 5 transformations (k = 6).Measurements.measurement — Methodmeasurement(r::BlockingResult)
Measurements.±(r::BlockingResult)Convert a BlockingResult into a Measurement for linear error propagation with Measurements.
Limitation: Does not account for covariance in complex BlockingResult. Consider using MonteCarloMeasurements.Particles(r)!
Rimu.StatsTools.blocker — Methodblocker(v::Vector) -> new_v::VectorReblock the data by successively taking the mean of two adjacent data points to form a new vector with a half of the length(v). The last data point will be discarded if length(v) is odd.
Rimu.StatsTools.blocks_with_m — Methodblocks_with_m(v; corrected = true) -> (;blocks, mean, std_err, std_err_err, p_cov, mj)Perform the blocking algorithm from Flyvberg and Peterson JCP (1989). Returns named tuple with the results from all blocking steps. See mtest().
Rimu.StatsTools.mtest — Methodmtest(mj::AbstractVector; α = 0.01) -> k
mtest(table::NamedTuple; α = 0.01) -> kHypothesis test for decorrelation of a time series after blocking transformations with significance level $1-α$ after Jonson PRE (2018). mj or table.mj is expected to be a vector with relevant $M_j$ values from a blocking analysis as obtained from blocks_with_m(). Returns the row number k where the M-test is passed. If the M-test has failed mtest() returns the value -1.
Rimu.StatsTools.RatioBlockingResult — TypeRatioBlockingResult(ratio, f, σ_f, δ_y, k, success)Result of ratio_of_means().
Fields:
- ratio::P: ratio with uncertainties propagated by- MonteCarloMeasurements
- f::T: ratio of means
- σ_f::T: std from linear propagation
- δ_y::T: coefficient of variation for denominator (≤ 0.1 for normal approx)
- k::Int: k-1 blocking steps were used to uncorrelate time series
- blocks::Int: number of data values after blocking
- success::Bool: false if any of the blocking steps failed
Has methods for NamedTuple, val_and_errs, val, errs.
Note: to compute statistics on the RatioBlockingResult, use functions pmedian, pquantile, pmiddle, piterate, pextrema, pminimum, pmaximum, pmean, and pcov.
Rimu.StatsTools.particles — Methodparticles(samples, μ, σ)
particles(samples, μ::AbstractVector, Σ::AbstractMatrix)Return Particles object from MonteCarloMeasurements with single- or multivariate normal distribution. Zero variance parameters are supported.
Rimu.StatsTools.particles — Methodparticles(samples, d)
particles(::Nothing, d)
particles(::Val{T}, d) where TReturn Particles object from  MonteCarloMeasurements using  a type-stable constructor if possible. Pass nothing for the default number of particles or Val(1_000) for using 1000 particles in a type-stable manner. If d is a Particles object it is passed through without re-sampling.
Rimu.StatsTools.ratio_estimators — Methodratio_estimators(x, y, [k]; corrected=true, mc_samples=10_000) -> (; r, f, σ_f, δ_y, n)Estimators for the ratio of means mean(x)/mean(y). If k is given, k-1 blocking steps are performed to remove internal correlations in the time series x and y. Otherwise these are assumed to be free of internal correlations. Correlations between x and y may be present and are taken into account.
Return values:
- r::Particlesis the Monte Carlo sampled ratio estimator, see- Particles
- f = mean(x)/mean(y)
- σ_fstandard deviation of- ffrom linear error propagation (normal approximation)
- δ_y = std(y)/mean(y)coefficient of variation; < 0.1 for normal approximation to work
- n: number of uncorrelated data used for uncertainty estimation
Rimu.StatsTools.x_by_y_linear — Methodx_by_y_linear(μ_x,μ_y,σ_x,σ_y,ρ) -> f, σ_fLinear error propagation for ratio f = x/y assuming x and y are correlated normal random variables and assuming the ratio can be approximated as a normal distribution. See wikipedia and Díaz-Francés, Rubio (2013).
\[σ_f = \sqrt{\frac{σ_x}{μ_y}^2 + \frac{μ_x σ_y}{μ_y^2}^2 - \frac{2 ρ μ_x}{μ_y^3}}\]
Core.NamedTuple — MethodNamedTuple(x::BlockingResult; n=1, p=nothing, name=:val)
NamedTuple(x::RatioBlockingResult; n=1, p=nothing, name=:val)Return a named tuple with value and error bars (see val_and_errs) as well as additional numerical fields relevant for x.
Example:
julia> results = [blocking_analysis(i:0.1:2i+20) for i in 1:3]; # mock results
julia> df = NamedTuple.(results, name=:res)|>DataFrame
3×7 DataFrame
 Row │ res      res_l    res_u    res_err_err  res_p_cov  res_k  res_blocks
     │ Float64  Float64  Float64  Float64      Float64    Int64  Int64
─────┼──────────────────────────────────────────────────────────────────────
   1 │    11.5  1.7282   1.7282      0.352767    2.98667      5          13
   2 │    13.0  1.7282   1.7282      0.352767    2.98667      5          13
   3 │    14.5  1.78885  1.78885     0.350823    3.2          5          14julia> rbs = ratio_of_means(1 .+sin.(1:0.1:11),2 .+sin.(2:0.1:12)); # more mock results
julia> [NamedTuple(rbs),]|>DataFrame
1×9 DataFrame
 Row │ val       val_l      val_u      val_f     val_σ_f    val_δ_y    val_k  val_blocks  val_success
     │ Float64   Float64    Float64    Float64   Float64    Float64    Int64  Int64       Bool
─────┼────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 0.581549  0.0925669  0.0812292  0.560532  0.0875548  0.0875548      4          12         true
See val_and_errs, val, errs, BlockingResult, RatioBlockingResult.
Rimu.StatsTools.autocovariance — Methodautocovariance(v::Vector,h::Int; corrected::Bool=true)$\hat{\gamma}(h) =\frac{1}{n}\sum_{t=1}^{n-h}(v_{t+h}-\bar{v})(v_t-\bar{v})^*$ Calculate the autocovariance of dataset v with a delay h. If corrected is true (the default) then the sum is scaled with n-h, whereas the sum is scaled with n if corrected is false where n = length(v).
Rimu.StatsTools.pseudo_cov — Methodpseudo_cov(x, y; xmean = mean(x), ymean = mean(y), corrected = true)Compute the pseudo covariance between collections x and y returning a scalar:
\[\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_{i} - \bar{y})\]
Optionally, precomputed means can be passed as keyword arguments. pseudo_cov(x,y) is functionally equivalent to Statistics.cov(x, conj(y); corrected = false) but it is found to be significantly faster and avoids allocations.
Rimu.StatsTools.determine_constant_time_step — Methoddetermine_constant_time_step(df) -> time_stepGiven a DataFrame df, determine the time step that was used to compute it. Throw an error if time step is not constant.
Index
- Rimu.StatsTools
- Core.NamedTuple
- MonteCarloMeasurements.Particles
- Rimu.StatsTools.BlockingResult
- Rimu.StatsTools.RatioBlockingResult
- Measurements.measurement
- Rimu.StatsTools.autocovariance
- Rimu.StatsTools.blocker
- Rimu.StatsTools.blocking_analysis
- Rimu.StatsTools.blocking_analysis_data
- Rimu.StatsTools.blocks_with_m
- Rimu.StatsTools.determine_constant_time_step
- Rimu.StatsTools.errs
- Rimu.StatsTools.growth_estimator
- Rimu.StatsTools.growth_estimator_analysis
- Rimu.StatsTools.growth_witness
- Rimu.StatsTools.growth_witness
- Rimu.StatsTools.mean_and_se
- Rimu.StatsTools.mixed_estimator
- Rimu.StatsTools.mixed_estimator_analysis
- Rimu.StatsTools.mtest
- Rimu.StatsTools.particles
- Rimu.StatsTools.particles
- Rimu.StatsTools.projected_energy
- Rimu.StatsTools.pseudo_cov
- Rimu.StatsTools.ratio_estimators
- Rimu.StatsTools.ratio_of_means
- Rimu.StatsTools.rayleigh_replica_estimator
- Rimu.StatsTools.rayleigh_replica_estimator_analysis
- Rimu.StatsTools.replica_fidelity
- Rimu.StatsTools.shift_estimator
- Rimu.StatsTools.smoothen
- Rimu.StatsTools.to_measurement
- Rimu.StatsTools.val
- Rimu.StatsTools.val_and_errs
- Rimu.StatsTools.variational_energy_estimator
- Rimu.StatsTools.w_exp
- Rimu.StatsTools.w_lin
- Rimu.StatsTools.x_by_y_linear
- Statistics.cov