mlmc.tool
Contains classes that provide an interface to other resources such as HDF5, Gmsh, PBS, …
Submodules
mlmc.tool.distribution module
- class mlmc.tool.distribution.Distribution(moments_obj, moment_data, domain=None, force_decay=(True, True), monitor=False)[source]
Bases:
objectCalculation of the distribution
- estimate_density_minimize(tol=1e-05, reg_param=0.01)[source]
Optimize density estimation :type tol: :param tol: Tolerance for the nonlinear system residual, after division by std errors for individual moment means, i.e. res = || (F_i - mu_i) / sigma_i ||_2 :return: None
- estimate_density(tol=None)[source]
Run nonlinear iterative solver to estimate density, use previous solution as initial guess. Faster, but worse stability. :return: None
- mlmc.tool.distribution.compute_exact_moments(moments_fn, density, tol=0.0001)[source]
Compute approximation of moments using exact density. :type moments_fn: :param moments_fn: Moments function. :param n_moments: Number of mements to compute. :type density: :param density: Density function (must accept np vectors). :param a, b: Integral bounds, approximate integration over R. :type tol: :param tol: Tolerance of integration. :return: np.array, moment values
mlmc.tool.flow_mc module
- mlmc.tool.flow_mc.create_corr_field(model='gauss', corr_length=0.125, dim=2, log=True, sigma=1, mode_no=1000)[source]
Create correlated random-field provider (cf.Fields) according to selected backend.
- Parameters:
model – One of ‘fourier’, ‘svd’, ‘exp’, ‘TPLgauss’, ‘TPLexp’, ‘TPLStable’, or others (defaults to ‘gauss’).
corr_length – Correlation length (used by GSTools or SVD implementations).
dim – Spatial dimension of the field (1, 2 or 3).
log – If True, generate log-normal field (exponentiate underlying Gaussian field).
sigma – Standard deviation for the generated field.
mode_no – Number of Fourier modes
- Returns:
cf.Fields instance that can generate random field samples.
- mlmc.tool.flow_mc.substitute_placeholders(file_in, file_out, params)[source]
Replace placeholders of form ‘<name>’ in a template file with corresponding values.
- Parameters:
file_in – Path to the template file containing placeholders.
file_out – Path where the substituted output will be written.
params – Dictionary mapping placeholder names to replacement values, e.g. {‘mesh_file’: ‘mesh.msh’}.
- Returns:
List of parameter names that were actually used (replaced) in the template.
- mlmc.tool.flow_mc.force_mkdir(path, force=False)[source]
Create directory tree; optionally remove existing leaf directory first.
- Parameters:
path – Directory path to create (parents created as needed).
force – If True and the directory already exists, remove it (recursively) before creating.
- Returns:
None
- class mlmc.tool.flow_mc.FlowSim(config=None, clean=None)[source]
Bases:
Simulation- total_sim_id = 0
- MESH_FILE_VAR = 'mesh_file'
- TIMESTEP_H1_VAR = 'timestep_h1'
- TIMESTEP_H2_VAR = 'timestep_h2'
- GEO_FILE = 'mesh.geo'
- MESH_FILE = 'mesh.msh'
- YAML_TEMPLATE = 'flow_input.yaml.tmpl'
- YAML_FILE = 'flow_input.yaml'
- FIELDS_FILE = 'fields_sample.msh'
- level_instance(fine_level_params, coarse_level_params)[source]
Create a LevelSimulation object for given fine/coarse steps.
This method is called in the main process (Sampler) and must prepare common files (mesh, YAML) for that level. The returned LevelSimulation is serialized and sent to PBS jobs (PbsJob) for actual execution.
- Parameters:
fine_level_params (
List[float]) – list with single element [fine_step] (mesh step)coarse_level_params (
List[float]) – list with single element [coarse_step] (mesh step) or [0] for one-level MC
- Return type:
- Returns:
LevelSimulation configured with task size and calculate method
- static calculate(config, seed)[source]
Execute one MLMC sample calculation (fine and optional coarse) inside PBS job.
- Parameters:
config – Configuration dict from LevelSimulation.config_dict (contains common_files dirs, steps, fields params)
seed – Random seed for the sample generation (derived from sample id)
- Returns:
Tuple (fine_result_array, coarse_result_array), both numpy arrays (coarse may be zeros for one-level MC)
- static make_fields(fields, fine_mesh_data, coarse_mesh_data)[source]
Assign evaluation points to fields and return the Fields object prepared for sampling.
- Parameters:
fields – correlated_field.Fields instance (with local field definitions)
fine_mesh_data – Dict returned by extract_mesh() for the fine mesh
coarse_mesh_data – Dict returned by extract_mesh() for the coarse mesh (or None for one-level)
- Returns:
the same cf.Fields object with points set for sampling
- static generate_random_sample(fields, coarse_step, n_fine_elements)[source]
Generate random field samples for the fine and (optionally) coarse meshes.
- Parameters:
fields – cf.Fields object (already configured with points)
coarse_step – coarse-level step (0 for no coarse sample)
n_fine_elements – Number of elements that belong to fine mesh (used to split combined sample)
- Returns:
Tuple (fine_input_sample: dict, coarse_input_sample: dict) Each dict maps field name -> array shaped (n_elements, 1).
- static extract_mesh(mesh_file)[source]
Parse a Gmsh mesh file and extract points (element centers), element ids and region mapping.
- Parameters:
mesh_file – Path to .msh file to parse (Gmsh 2/4 depending on GmshIO implementation)
- Returns:
Dict with keys: - ‘points’: np.ndarray of shape (n_elements, dim) with element center coordinates - ‘point_region_ids’: np.ndarray of region id per element - ‘ele_ids’: np.ndarray of original element ids - ‘region_map’: dict mapping region name -> region id
- static result_format()[source]
Describe the simulation output format as a list of QuantitySpec objects.
- Return type:
List[QuantitySpec]- Returns:
List[QuantitySpec] describing each output quantity (name, unit, shape, times, locations)
mlmc.tool.gmsh_io module
Module containing an expanded python gmsh class
- class mlmc.tool.gmsh_io.GmshIO(filename=None)[source]
Bases:
objectThis is a class for storing nodes and elements. Based on Gmsh.py
Members: nodes – A dict of the form { nodeID: [ xcoord, ycoord, zcoord] } elements – A dict of the form { elemID: (type, [tags], [nodeIDs]) } physical – A dict of the form { name: (id, dim) }
Methods: read([file]) – Parse a Gmsh version 1.0 or 2.0 mesh file write_ascii([file]) – Output a Gmsh version 2.0 mesh file (ASCII) write_binary([file]) – Output a Gmsh version 2.0 mesh file (binary) write_element_data(f, ele_ids, name, values) – write $ElementData block write_fields(msh_file, ele_ids, fields) – convenience to write several ElementData blocks
- reset()[source]
Reinitialise internal storage.
Clears nodes, elements, physical names and element_data dictionaries.
- Returns:
None
- read_element_data_head(mshfile)[source]
Read header of a $ElementData block from an open mshfile.
The method expects the lines after ‘$ElementData’ to match the conventional Gmsh textual ElementData header layout:
<nstringtags> “<field name>” <nrealstags> <time> <ninttags> <time_index> <ncomponents> <nentries>
- Parameters:
mshfile – Open file-like object positioned after the ‘$ElementData’ line.
- Returns:
tuple (field, time, t_idx, n_comp, n_elem) - field: string field name - time: float time tag - t_idx: integer time index - n_comp: number of components per element - n_elem: number of element entries following header
- read(mshfile=None)[source]
Read a Gmsh .msh file.
- Supports parsing textual (ASCII) Gmsh files with sections like:
$MeshFormat
$Nodes / $NOD
$Elements / $ELM
$PhysicalNames
$ElementData
- Parsed data is stored in the instance attributes:
self.nodes: dict nodeID -> [x, y, z]
self.elements: dict elemID -> (type, tags_list, nodeIDs_list)
self.physical: dict name -> (id, dim)
self.element_data: dict field_name -> { time_idx: (time, {elemID: component_list}) }
- Parameters:
mshfile – Optional open file-like object or path string; if None uses filename passed to __init__.
- Returns:
None
- write_ascii(mshfile=None)[source]
Dump the mesh out to a Gmsh 2.0 (textual) msh file.
Writes $MeshFormat, $PhysicalNames, $Nodes and $Elements sections according to the current contents of self.physical, self.nodes and self.elements.
- Parameters:
mshfile – Optional open file or filename; if None uses self.filename opened for writing.
- Returns:
None
- write_binary(filename=None)[source]
Dump the mesh out to a Gmsh 2.0 msh file in binary format.
Note: this implementation mirrors the ASCII writer’s structure but writes binary packed integers/doubles. This method attempts to follow the Gmsh 2.2 binary formatting conventions.
- Parameters:
filename – Path to write binary .msh file; if None, uses self.filename.
- Returns:
None
- write_element_data(f, ele_ids, name, values)[source]
Write a single $ElementData block for a field to an open file stream.
The function writes a minimal textual $ElementData header and then one row per element with element ID followed by component values.
- Parameters:
f – Open file-like object opened for writing.
ele_ids – Iterable of element ids corresponding to the rows in ‘values’.
name – String name of the field (will be written as the ElementData field name).
values – numpy array of shape (N, L) where N == len(ele_ids) and L is components per element.
- Returns:
None
- write_fields(msh_file, ele_ids, fields)[source]
Create an MSH file that contains $ElementData blocks for the provided fields.
This is a convenience writer used to generate field input files for models (Flow123d). It writes a MeshFormat header and then for each field calls write_element_data.
- Parameters:
msh_file – Path to output MSH file (string). If falsy, uses self.filename when available.
ele_ids – Iterable of element ids (order must match field value ordering).
fields – Dict mapping field name -> array-like values (one row per element).
- Returns:
None
mlmc.tool.hdf5 module
- class mlmc.tool.hdf5.HDF5(file_path, load_from_file=False)[source]
Bases:
objectHDF5 file is organized into groups (h5py.Group objects) which is somewhat like dictionaries in python terminology - ‘keys’ are names of group members ‘values’ are members (groups (h5py.Group objects) and datasets (h5py.Dataset objects - similar to NumPy arrays)). Each group and dataset (including root group) can store metadata in ‘attributes’ (h5py.AttributeManager objects) HDF5 files (h5py.File) work generally like standard Python file objects
- Our HDF5 file strucutre:
Main Group: Keys:
- Levels: h5py.Group
- Attributes:
level_parameters: [[a], [b], [], …]
- Keys:
- <N>: h5py.Group (N - level id, start with 0)
- Attributes:
id: str n_ops_estimate: float
- Keys:
- scheduled: h5py.Dataset
dtype: S100 shape: (N,), N - number of scheduled values maxshape: (None,) chunks: True
- collected_values: h5py.Dataset
dtype: numpy.float64 shape: (Nc, 2, M) dtype structure is defined in simulation class maxshape: (None, 2, None) chunks: True
- collected_ids: h5py.Dataset
dtype: numpy.int16 index into scheduled shape: (Nc, 1) maxshape: (None, 1) chunks: True
- failed: h5py.Dataset
dtype: (‘S100’, ‘S1000’) shape: (Nf, 1) mashape: (None, 1) chunks: True
- create_file_structure(level_parameters)[source]
Create top-level HDF5 structure for MLMC results or load existing one.
- Parameters:
level_parameters – List[float] of level parameters to store in root attributes when initializing new file.
- Returns:
None
- load_from_file()[source]
Load root group attributes from an existing HDF5 file and set them as instance attributes.
Raises an Exception if required attributes (like ‘level_parameters’) are not present.
- Returns:
None
- clear_groups()[source]
Remove all top-level groups/datasets from the HDF5 file.
Useful when reinitializing a new MLMC run into an existing file.
- Returns:
None
- init_header(level_parameters)[source]
Initialize root attributes and create the top-level ‘Levels’ group.
- Parameters:
level_parameters – Iterable of level parameters to store in root attributes.
- Returns:
None
- add_level_group(level_id)[source]
Create (if necessary) and return a LevelGroup wrapper for a particular level.
- Parameters:
level_id – str, mlmc.Level identifier (e.g. ‘0’, ‘1’, …)
- Returns:
LevelGroup instance bound to the ‘/Levels/{level_id}’ HDF5 group
- property result_format_dset_name
Dataset name used to store the simulation result format (QuantitySpec array).
- Returns:
str dataset name
- save_result_format(result_format, res_dtype)[source]
Save simulation result format into a structured dataset.
The result_format is a list of QuantitySpec objects; res_dtype is a NumPy structured dtype describing how to store the QuantitySpec attributes in the dataset.
- Parameters:
result_format – List[QuantitySpec] (objects describing output fields)
res_dtype – numpy.dtype used for the dataset storage of a single QuantitySpec
- Returns:
None
- class mlmc.tool.hdf5.LevelGroup(file_name, hdf_group_path, level_id, loaded_from_file=False)[source]
Bases:
objectHelper class to manipulate per-level HDF5 group contents.
It provides convenience methods to append scheduled samples, collected results, failed entries and to iterate over collected data in chunks.
- SCHEDULED_DTYPE = {'formats': ['S100'], 'names': ['sample_id']}
- FAILED_DTYPE = {'formats': ('S100', 'S1000'), 'names': ('sample_id', 'message')}
- COLLECTED_ATTRS = {'sample_id': {'default_shape': (0,), 'dtype': {'formats': ['S100'], 'names': ['sample_id']}, 'maxshape': (None,), 'name': 'collected_ids'}}
- property collected_ids_dset
Name of dataset storing collected ids.
- Returns:
str
- property scheduled_dset
Name of dataset storing scheduled sample ids.
- Returns:
str
- property failed_dset
Name of dataset storing failed sample rows.
- Returns:
str
- append_scheduled(scheduled_samples)[source]
Append scheduled sample ids to the scheduled dataset.
- Parameters:
scheduled_samples – iterable of sample-id strings (or bytes-like)
- Returns:
None
- append_successful(samples)[source]
Append successful (collected) samples.
The samples array is expected to have rows of the form [sample_id, result_value]. The method appends sample ids to ‘collected_ids’ and result values to ‘collected_values’.
- Parameters:
samples (
array) – numpy.ndarray where each row is [sample_id, value], value may be array-like itself.- Returns:
None
- append_failed(failed_samples)[source]
Append failed sample rows to the failed dataset.
- Parameters:
failed_samples – iterable of failed sample descriptors (e.g. tuples (sample_id, message))
- Returns:
None
- scheduled()[source]
Read and return the scheduled dataset contents.
- Returns:
numpy.ndarray of scheduled entries (structured dtype)
- chunks(n_samples=None)[source]
Iterate over collected_values dataset chunks and yield ChunkSpec descriptors.
- Parameters:
n_samples – If provided, yield a single ChunkSpec from 0..n_samples instead of iterating actual chunks.
- Yield:
ChunkSpec(chunk_id, chunk_slice, level_id)
- collected(chunk_slice)[source]
Read a slice (chunk) from the collected_values dataset.
- Parameters:
chunk_slice – slice object describing which rows to read
- Returns:
numpy.ndarray with the chunk rows or None if dataset missing
- collected_n_items()[source]
Return the number of collected items (rows) stored for this level.
- Returns:
int number of collected rows
- get_finished_ids()[source]
Return concatenated list of successful and failed sample ids for this level.
- Returns:
numpy.ndarray of sample id strings (successful then failed)
- get_unfinished_ids()[source]
Compute unfinished sample ids = scheduled_ids finished_ids.
- Returns:
list of unfinished sample id strings
- get_failed_ids()[source]
Get list of failed sample ids for this level.
- Returns:
list of failed sample id strings
- clear_failed_dataset()[source]
Remove and recreate the failed dataset (clears all failure records).
- Returns:
None
- property n_ops_estimate
Number of operations estimate stored as a group attribute.
- Returns:
float or object stored under ‘n_ops_estimate’ attribute (if present)
mlmc.tool.pbs_job module
- class mlmc.tool.pbs_job.PbsJob(output_dir, jobs_dir, job_id, level_sim_file, debug)[source]
Bases:
object- SCHEDULED = '{}_scheduled.yaml'
- SUCCESSFUL_RESULTS = '{}_successful_results.yaml'
- FAILED_RESULTS = '{}_failed_results.yaml'
- TIME = '{}_times.yaml'
- PBS_ID = '{}_'
- CLASS_FILE = 'pbs_process_serialized.txt'
- SAMPLE_ID_JOB_ID = 'sample_id_job_id.json'
- classmethod create_job(output_dir, jobs_dir, job_id, level_sim_file, debug)[source]
Create and serialize a PbsJob descriptor for a PBS process to later deserialize.
The created descriptor (CLASS_FILE) is written under output_dir for the PBS worker.
- Parameters:
output_dir – str
jobs_dir – str
job_id – str
level_sim_file – str, format of LevelSimulation serialization filenames
debug – bool
- Returns:
PbsJob instance
- classmethod create_process()[source]
Create PbsJob instance inside PBS worker process.
The worker expects command-line arguments (see command_params) and a serialized CLASS_FILE in output_dir describing jobs_dir and level_sim_file format.
- Returns:
PbsJob instance
- static command_params()[source]
Parse PBS worker command-line parameters. Called inside worker process.
Expects sys.argv[1] = output_dir, sys.argv[2] = job_id
- Returns:
tuple (job_id: str, output_dir: str)
- calculate_samples()[source]
Main worker routine: calculate each scheduled sample, move produced files, and record success/failure.
- This method:
reads the scheduled list,
deserializes LevelSimulation objects on demand,
calls SamplingPool.calculate_sample for each scheduled sample,
moves successful/failed artifacts,
writes partial results to YAML files (successful, failed, times).
- Returns:
None
- save_sample_id_job_id(job_id, sample_ids)[source]
Save mapping of sample ids to this job_id so other tools can query which job handled a sample.
- Parameters:
job_id – str
sample_ids – iterable of sample-identifiers (each sample_id is usually a tuple or list, code expects sample_id[1])
- Returns:
None
- static job_id_from_sample_id(sample_id, jobs_dir)[source]
Lookup job id that processed a given sample id.
- Parameters:
sample_id – str sample identifier
jobs_dir – path to jobs directory where SAMPLE_ID_JOB_ID file is stored
- Returns:
str job id associated with sample_id
- static read_results(job_id, jobs_dir)[source]
Read and aggregate results produced by a PBS job into dictionaries.
The function reads SUCCESSFUL_RESULTS, FAILED_RESULTS and TIME YAML files (if present) and returns aggregated dicts keyed by level_id.
- Parameters:
job_id – str
jobs_dir – path to directory containing job result YAML files
- Returns:
tuple (successful_dict, failed_dict, time_dict) where: - successful_dict[level_id] = [(sample_id, result), …] - failed_dict[level_id] = [(sample_id, error_message), …] - time_dict[level_id] = [(n_samples, cumulative_time), …]
- static get_scheduled_sample_ids(job_id, jobs_dir)[source]
Read the scheduled YAML file and return the list of scheduled (level_id, sample_id, seed) tuples.
- Parameters:
job_id – str
jobs_dir – str
- Returns:
list of tuples (level_id, sample_id, seed)
- write_pbs_id(pbs_job_id)[source]
Write an empty file whose filename encodes the mapping from our internal job id to the external PBS job id.
- Parameters:
pbs_job_id – str (external PBS job identifier)
- Returns:
None
mlmc.tool.process_base module
- class mlmc.tool.process_base.ProcessBase[source]
Bases:
objectParent class for particular simulation processes. Subclasses should implement setup_config.
- static get_arguments(arguments)[source]
Parse command-line arguments.
- Parameters:
arguments – list of arguments (typically sys.argv[1:])
- Returns:
argparse.Namespace with parsed arguments: - command: one of [‘run’, ‘collect’, ‘renew’, ‘process’] - work_dir: str path - clean: bool - debug: bool
- run(renew=True)[source]
High-level entry point to run the MLMC workflow.
Creates the working directory, sets up MLMC configurations for a set of level counts (currently hard-coded to [1]) and schedules/generates jobs. After job creation it triggers collection of results via all_collect.
- Parameters:
renew – bool, if True indicates renewing failed samples (passed down to setup_config in some subclasses)
- Returns:
None
- set_environment_variables()[source]
Determine environment-dependent configuration values (PBS config, executables, timeouts).
The method inspects the work_dir path to decide whether it runs on a cluster or locally and sets attributes used later (pbs_config, sample_sleep, init_sample_timeout, sample_timeout, flow123d, gmsh).
- Returns:
None
- setup_config(n_levels, clean)[source]
Set simulation configuration depending on particular task.
Subclasses must override this method and return a configured mlmc.MLMC object.
- Parameters:
n_levels – int, number of MLMC levels
clean – bool, whether to clean/create new files or use existing ones
- Returns:
mlmc.MLMC instance (implementation dependent)
- Raises:
NotImplementedError – always in base class
- rm_files(output_dir)[source]
Remove (recursively) output_dir and create an empty directory in its place.
- Parameters:
output_dir – str path to remove and recreate
- Returns:
None
- create_pbs_object(output_dir, clean)[source]
Initialize PBS helper object for submitting/executing jobs.
This creates self.pbs_obj and configures it with common PBS settings.
- Parameters:
output_dir – str, directory where PBS scripts and job state will be created
clean – bool, if True remove existing scripts before creating new ones
- Returns:
None
- generate_jobs(mlmc, n_samples=None)[source]
Prepare and kick off sampling jobs for the provided MLMC object.
The method optionally sets the initial n_samples (if provided), refills the sampler queues and triggers the PBS object execution. It then waits for simulations to finish.
- Parameters:
mlmc – mlmc.MLMC instance
n_samples – None or list specifying number of samples to request for each level
- Returns:
None
- set_moments(n_moments, log=False)[source]
Create and store a moments function instance (Legendre polynomial family).
- Parameters:
n_moments – int, number of moments
log – bool, whether to apply log-transform to quantity prior to moment evaluation
- Returns:
Legendre moments instance
- n_sample_estimate(mlmc, target_variance=0.001)[source]
Heuristic routine to estimate a good number of initial samples for MLMC using target variance.
It triggers an initial sampling run, estimates the domain, constructs moments, and requests additional samples using mlmc.target_var_adding_samples.
- Parameters:
mlmc – mlmc.MLMC instance
target_variance – float target variance for moment estimates
- Returns:
None
- all_collect(sampler_list)[source]
Poll samplers to collect running samples until none are left.
Repeatedly asks each sampler for the number of running jobs and keeps polling until all complete.
- Parameters:
sampler_list – list of sampler-like objects providing ask_sampling_pool_for_samples(sleep, timeout)
- Returns:
None
- process_analysis(cl)[source]
Top-level analysis entry point. Calls specific analysis routines (many commented out).
- Parameters:
cl – CompareLevels instance (or equivalent) holding estimation/collected data
- Returns:
None
- analyze_pdf_approx(cl)[source]
Perform PDF approximation experiments and plotting.
- Parameters:
cl – CompareLevels instance
- Returns:
None
- analyze_regression_of_variance(cl, mlmc_level)[source]
Analyze regression of variance for a selected level.
- Parameters:
cl – CompareLevels instance
mlmc_level – int index of method/level to analyze
- Returns:
None
- analyze_error_of_variance(cl, mlmc_level)[source]
Analyze error of variance estimators and plot related diagnostics.
- Parameters:
cl – CompareLevels instance
mlmc_level – int index of method/level to analyze
- Returns:
None
- analyze_error_of_regression_variance(cl, mlmc_level)[source]
Bootstrap-based analysis of regression variance errors.
- Parameters:
cl – CompareLevels instance
mlmc_level – int index of method/level to analyze
- Returns:
None
- analyze_error_of_level_variances(cl, mlmc_level)[source]
Analyze errors in per-level variance estimates and plot results.
- Parameters:
cl – CompareLevels instance
mlmc_level – int index of method/level to analyze
- Returns:
None
mlmc.tool.simple_distribution module
- class mlmc.tool.simple_distribution.SimpleDistribution(moments_obj, moment_data, domain=None, force_decay=(True, True), verbose=False)[source]
Bases:
objectApproximate a probability density function (PDF) from given moments.
The class constructs a parametric PDF using Lagrange multipliers and fits those multipliers by minimizing a functional (or solving a root problem). Numerical integration and adaptive quadrature are used to compute moments, gradients and Jacobians required by the optimizer.
- estimate_density_minimize(tol=1e-05, reg_param=0.01)[source]
Estimate multipliers by minimizing the dual functional.
Uses scipy.optimize.minimize (trust-ncg by default) to minimize the functional _calculate_functional(multipliers). The function sets up quadrature and internal tolerances before solving, and updates the multipliers attribute with the optimizer result.
- Parameters:
tol – Optimization tolerance (used for jacobian/grad stopping).
reg_param – Regularization parameter (not used directly here but kept for API parity).
- Returns:
scipy OptimizeResult with fields: - x: optimized multipliers - success: bool convergence flag (set to True if solver succeeded or residual < tol) - nit: number of iterations (at least 1) - fun_norm: norm of gradient at solution - eigvals: eigenvalues of computed Jacobian (added by this method) - solver_res: raw solver residual information (copy of result.jac)
- Notes:
After optimization the code enforces normalization by subtracting log(moment_0) from multipliers[0] so that the integral of the density is consistent with moment_0.
- density(value)[source]
Evaluate the approximated density at the given point(s).
- Parameters:
value – scalar or numpy array of points
- Returns:
numpy array of density values (same shape as flattened input)
- Notes:
Uses self.multipliers, self._moment_errs and the basis moments returned by eval_moments() to build exponent power = sum_i multipliers_i * moment_i / err_i.
The result is clipped (power limited to [-200, 200]) to avoid overflow.
- cdf(values)[source]
Evaluate the cumulative distribution function (CDF) at the given points.
- Parameters:
values – scalar or array-like points
- Returns:
numpy array of CDF values corresponding to input points
- Notes:
The method integrates the density piecewise between successive query points using fixed_quad with n=10 on subintervals determined by the adaptive quadrature info.
Values outside domain are mapped to 0 (left) or 1 (right).
- eval_moments(x)[source]
Evaluate all basis moment functions at x up to current approximation size.
- Parameters:
x – scalar or array-like points
- Returns:
numpy.ndarray of shape (n_points, approx_size) or similar depending on moments_fn.eval_all
- end_point_derivatives()[source]
Approximate derivatives of all moment basis functions at domain endpoints.
- Returns:
numpy array of shape (2, approx_size) where index 0 = left derivative, index 1 = right derivative. The derivatives are scaled by moment errors.
- Notes:
Uses a tiny eps shift (1e-10) to compute forward/backward differences.
If corresponding decay_penalty is False for a side, that side’s derivative is left as zeros.
- mlmc.tool.simple_distribution.compute_exact_moments(moments_fn, density, tol=1e-10)[source]
Compute moments by integrating moments_fn against provided density.
- Parameters:
moments_fn – object with .domain and .size and .eval(i, x)
density – callable accepting numpy arrays (vectorized) returning density values
tol – absolute tolerance for numerical integration
- Returns:
numpy array of length moments_fn.size containing integrated moments
- mlmc.tool.simple_distribution.compute_semiexact_moments(moments_fn, density, tol=1e-10)[source]
Compute moments using a hybrid approach: use adaptive quad to identify subintervals then apply Gauss-Legendre nodes inside those subintervals for an accurate quadrature.
- Parameters:
moments_fn – moments object with .domain and .size and .eval_all
density – callable density(x)
tol – quad tolerance
- Returns:
vector of integrated moments (length = moments_fn.size)
- mlmc.tool.simple_distribution.compute_exact_cov(moments_fn, density, tol=1e-10)[source]
Compute covariance matrix of moment basis under the provided density.
- Parameters:
moments_fn – moments object
density – callable density(x)
tol – integration tolerance
- Returns:
symmetric matrix (size x size) containing E[m_i * m_j]
- mlmc.tool.simple_distribution.compute_semiexact_cov(moments_fn, density, tol=1e-10)[source]
Compute approximate covariance matrix using quadrature nodes determined by adaptive integration.
- Parameters:
moments_fn – moments object
density – callable density(x)
tol – integration tolerance
- Returns:
Jacobian-like matrix approximating covariance (moments weighted by density)
- mlmc.tool.simple_distribution.KL_divergence(prior_density, posterior_density, a, b)[source]
Compute Kullback-Leibler divergence between two densities over [a,b].
- Using numerically stable integrand:
integrand = p * log(p/q) - p + q
which equals D_KL(P||Q) when both integrate to 1 but remains finite even when Q is not perfectly normalized.
- Parameters:
prior_density – callable P(x)
posterior_density – callable Q(x)
a – left integration bound
b – right integration bound
- Returns:
scalar KL divergence (floored at 1e-10)
- mlmc.tool.simple_distribution.L2_distance(prior_density, posterior_density, a, b)[source]
Compute L2 distance between two densities on [a, b].
- Parameters:
prior_density – callable P(x)
posterior_density – callable Q(x)
a – left bound
b – right bound
- Returns:
scalar L2 norm: sqrt( integral (Q-P)^2 )
- mlmc.tool.simple_distribution.best_fit_all(values, range_a, range_b)[source]
Find the best linear fit across all given index ranges.
The function searches all combinations of indices a and b within range_a and range_b such that a < b and fits a first-degree polynomial to the values between these indices. It then selects the fit with the smallest residual normalized by the square of the interval length.
- Parameters:
values – Array-like sequence of values to fit.
range_a – Iterable of possible starting indices.
range_b – Iterable of possible ending indices.
- Returns:
Tuple (a, b, fit) corresponding to the best fit, where fit is the array of polynomial coefficients.
- mlmc.tool.simple_distribution.best_p1_fit(values)[source]
Recursively find the best linear (P1) fit segment of the sequence.
The method finds indices a < b such that the segment values[a:b] has the smallest residual (least-squares error) normalized by (b - a)**2. If the array is large, it downsamples the data before recursively fitting.
- Parameters:
values – Sequence of numeric values to fit.
- Returns:
Tuple (a, b, fit) representing the best segment and corresponding linear coefficients.
- mlmc.tool.simple_distribution.detect_treshold_slope_change(values, log=True)[source]
Detect the index where the slope of a sequence changes significantly.
This function fits linear segments to the data (optionally in log scale) and detects where the slope begins to deviate, returning both the threshold index and a modified version of the input values where the slope change is extrapolated.
- Parameters:
values – Monotonically increasing numeric sequence.
log – If True, the logarithm of the sequence is used for fitting.
- Returns:
Tuple (i_treshold, mod_vals) - i_treshold: Index where the slope change is detected. - mod_vals: Modified version of values with extrapolated segment.
- mlmc.tool.simple_distribution.lsq_reconstruct(cov, eval, evec, treshold)[source]
Perform least-squares reconstruction of the eigenvectors of a covariance matrix to restore orthogonality.
This method adjusts the eigenvectors using nonlinear least-squares minimization so that the reconstructed eigenvectors are orthogonal and diagonalize the covariance matrix as closely as possible.
- Parameters:
cov – Covariance matrix (2D array).
eval – Eigenvalues (1D array).
evec – Eigenvectors (2D array).
treshold – Number of eigenvectors to fix (use exact values up to this index).
- Returns:
Reconstructed orthogonal eigenvector matrix Q.
- mlmc.tool.simple_distribution.construct_ortogonal_moments(moments, cov, tol=None)[source]
Construct orthogonal statistical moments with respect to the covariance matrix.
This function computes a transformation that makes the given moments orthogonal under the provided covariance matrix. It determines the threshold for significant eigenvalues (either via slope detection or tolerance) and constructs a transformation matrix accordingly.
- Parameters:
moments – Input moments object.
cov – Covariance matrix estimated from samples.
tol – Optional eigenvalue threshold. If None, an automatic slope-change detection is used.
- Returns:
Tuple (ortogonal_moments, info) - ortogonal_moments: Transformed (orthogonalized) moments. - info: Tuple containing (eval, threshold, transformation_matrix).
mlmc.tool.stats_tests module
- mlmc.tool.stats_tests.t_test(mu_0, samples, max_p_val=0.01)[source]
Perform a one-sample two-tailed t-test to check if the sample mean equals mu_0.
This test ensures that false positives (rejecting H0 when true) occur with probability <= max_p_val.
- Parameters:
mu_0 – Expected mean value of the samples.
samples – Array-like of sample values to test.
max_p_val – Maximum allowed p-value for false rejection (significance threshold).
- Raises:
AssertionError – if the p-value is larger than max_p_val, indicating the mean is statistically equal to mu_0.
- mlmc.tool.stats_tests.chi2_test(var_0, samples, max_p_val=0.01, tag='')[source]
Perform a chi-squared test to check if the sample variance equals var_0.
False rejections should occur with probability <= max_p_val.
- Parameters:
var_0 – Expected variance of the samples.
samples – Array-like of sample values.
max_p_val – Maximum allowed p-value for false rejection (significance threshold).
tag – Optional string tag for printing/debugging.
- Raises:
AssertionError – if the p-value indicates variance significantly differs from var_0.
- mlmc.tool.stats_tests.anova(level_moments)[source]
Perform one-way ANOVA (analysis of variance) across multiple levels.
Tests the null hypothesis H0: all levels have the same mean.
- Parameters:
level_moments – List of arrays, each array containing moments/samples for a level.
- Returns:
True if H0 cannot be rejected (means are statistically equal), False if H0 is rejected (at least one mean differs).
Module contents
Contains classes that provide an interface to other resources such as HDF5, Gmsh, PBS, …