statsmodels.base.distributed_estimation.DistributedModel

class statsmodels.base.distributed_estimation.DistributedModel(partitions, model_class=None, init_kwds=None, estimation_method=None, estimation_kwds=None, join_method=None, join_kwds=None, results_class=None, results_kwds=None)[source]

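Distributed model class. The data are split into partitions, each partition is estimated with estimation_method, and the per-partition results are recombined with join_method.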

Parameters:
  • partitions (scalar) – The number of partitions that the data will be split into.

  • model_class (statsmodels model class) – The model class which will be used for estimation. If None this defaults to OLS.

  • init_kwds (dict-like or None) – Keywords needed for initializing the model, in addition to endog and exog.

  • init_kwds_generator (generator or None) – Generator that yields additional model init_kwds, which may vary by data partition. The current use case is WLS and GLS.

  • estimation_method (function or None) – The method that performs the estimation for each partition. If None this defaults to _est_regularized_debiased.

  • estimation_kwds (dict-like or None) – Keywords to be passed to estimation_method.

  • join_method (function or None) – The method used to recombine the results from each partition. If None this defaults to _join_debiased.

  • join_kwds (dict-like or None) – Keywords to be passed to join_method.

  • results_class (results class or None) – The class of results that should be returned. If None this defaults to RegularizedResults.

  • results_kwds (dict-like or None) – Keywords to be passed to results class.
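As a hedged illustration of how these parameters fit together, the sketch below replaces the default debiased procedure with naive averaging of per-partition estimates. It assumes the private helpers `_est_regularized_naive` and `_join_naive` can be imported from `statsmodels.base.distributed_estimation` (they are not named on this page). The simulated data, the seed, and the `alpha` value are arbitrary choices.

```python
import numpy as np
from statsmodels.base.distributed_estimation import (
    DistributedModel,
    _est_regularized_naive,  # assumed private helper, not listed on this page
    _join_naive,             # assumed private helper, not listed on this page
)

# Simulated data; sizes, seed, and coefficients are illustrative only.
rng = np.random.default_rng(0)
n_obs, n_features, n_partitions = 500, 10, 5
X = rng.normal(size=(n_obs, n_features))
beta = np.concatenate([np.ones(3), np.zeros(n_features - 3)])
y = X @ beta + rng.normal(size=n_obs)

# model_class is left as None, so OLS is used for each partition.
naive_mod = DistributedModel(
    n_partitions,
    estimation_method=_est_regularized_naive,
    join_method=_join_naive,
)

# Each item passed to fit is an (endog, exog) pair for one partition.
partitions = zip(
    np.array_split(y, n_partitions), np.array_split(X, n_partitions)
)
naive_res = naive_mod.fit(partitions, fit_kwds={"alpha": 0.2})
print(naive_res.params)
```

Here `fit_kwds` carries the regularization strength through to the per-partition estimation, while `estimation_kwds` and `join_kwds` are left at their defaults.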

Attributes:
  • partitions (scalar) – See Parameters.

  • model_class (statsmodels model class) – See Parameters.

  • init_kwds (dict-like) – See Parameters.

  • init_kwds_generator (generator or None) – See Parameters.

  • estimation_method (function) – See Parameters.

  • estimation_kwds (dict-like) – See Parameters.

  • join_method (function) – See Parameters.

  • join_kwds (dict-like) – See Parameters.

  • results_class (results class) – See Parameters.

  • results_kwds (dict-like) – See Parameters.

Notes

Examples
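A minimal sketch of the default workflow, assuming the data generator yields one (endog, exog) pair per partition and that the default estimation method requires an `alpha` entry in `fit_kwds`. The simulated data, the seed, and the use of `numpy.array_split` to build partitions are illustrative choices, not part of the class.

```python
import numpy as np
from statsmodels.base.distributed_estimation import DistributedModel

# Simulated sparse linear model; all sizes and values are illustrative.
rng = np.random.default_rng(12345)
n_obs, n_features, m = 1000, 10, 5
X = rng.normal(size=(n_obs, n_features))
beta = rng.normal(size=n_features) * rng.integers(0, 2, size=n_features)
y = X @ beta + rng.normal(size=n_obs)

# With only the number of partitions given, every other argument
# falls back to the defaults listed under Parameters.
mod = DistributedModel(m)

# One (endog, exog) pair per partition.
data_generator = zip(np.array_split(y, m), np.array_split(X, m))
res = mod.fit(data_generator, fit_kwds={"alpha": 0.2})

print(type(res).__name__)
print(res.params)
```

The returned object is an instance of results_class, so with the defaults its estimated coefficients are available as `res.params`.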

Methods

fit(data_generator[, fit_kwds, ...])

Performs the distributed estimation using the corresponding DistributedModel

fit_joblib(data_generator, fit_kwds, ...[, ...])

Performs the distributed estimation in parallel using joblib

fit_sequential(data_generator, fit_kwds[, ...])

Sequentially performs the distributed estimation using the corresponding DistributedModel
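The relationship between these methods can be sketched as follows. The sketch assumes that `fit_sequential` returns a list with one entry per partition, and that `fit` with its default settings runs the same sequential estimation and then applies `join_method` with `join_kwds`. Both points are assumptions about behavior not spelled out on this page. The data and the `make_partitions` helper are hypothetical.

```python
import numpy as np
from statsmodels.base.distributed_estimation import DistributedModel

# Simulated data; sizes, seed, and coefficients are illustrative only.
rng = np.random.default_rng(1)
n_obs, n_features, m = 500, 10, 5
X = rng.normal(size=(n_obs, n_features))
y = X[:, 0] - X[:, 1] + rng.normal(size=n_obs)


def make_partitions():
    """Return a fresh iterator of (endog, exog) pairs (iterators are single-use)."""
    return zip(np.array_split(y, m), np.array_split(X, m))


mod = DistributedModel(m)
fit_kwds = {"alpha": 0.2}

# Step 1: estimate each partition in turn.
per_partition = mod.fit_sequential(make_partitions(), fit_kwds)

# Step 2: recombine the per-partition results by hand.
joined_params = mod.join_method(per_partition, **mod.join_kwds)

# For comparison: fit() performs estimation and joining in one call.
full_res = mod.fit(make_partitions(), fit_kwds=fit_kwds)
print(np.allclose(joined_params, full_res.params))
```

Calling `fit_sequential` directly is mainly useful for inspecting per-partition output. For ordinary use, `fit` is the entry point.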