lr_cd.lr_data_generation

Module Contents

Functions

generate_data_lr(n, n_features, theta[, noise, ...])

Generate a number of data points based on the theta coefficients.

lr_cd.lr_data_generation.generate_data_lr(n, n_features, theta, noise=0.2, random_seed=123)[source]

Generate a number of data points based on the theta coefficients.

Parameters:
  • n (integer) – The number of data points.

  • n_features (integer) – The number of features to generate, excluding the intercept.

  • theta (ndarray) – The true parameters: the scalar intercept followed by the coefficient weights, one per feature (length n_features + 1). The first element should always be the intercept.

  • noise (float) – The standard deviation of the normally distributed noise added to the generated target array y.

  • random_seed (integer) – Random seed to ensure reproducibility.

Returns:

  • X (ndarray) – Feature data matrix of shape (n, n_features).

  • y (ndarray) – Response data matrix of shape (n, 1).
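
The parameters and return values above suggest a linear model with Gaussian noise. The following is a minimal sketch of such a generator, not the lr_cd implementation: the name sketch_generate_data_lr is hypothetical, and the distribution of the features (standard normal here) is an assumption, since it is not documented above.

```python
import numpy as np


def sketch_generate_data_lr(n, n_features, theta, noise=0.2, random_seed=123):
    """Hypothetical stand-in for illustration; not the lr_cd implementation."""
    theta = np.asarray(theta, dtype=float).ravel()
    if theta.shape[0] != n_features + 1:
        raise ValueError("theta must hold 1 intercept plus n_features weights")

    # A seeded generator makes repeated calls return identical data.
    rng = np.random.default_rng(random_seed)

    # Assumption: features are drawn from a standard normal distribution.
    X = rng.normal(size=(n, n_features))

    # Gaussian noise with standard deviation `noise`, one value per row.
    eps = rng.normal(scale=noise, size=(n, 1))

    # theta[0] is the intercept; theta[1:] are the coefficient weights.
    y = theta[0] + X @ theta[1:].reshape(-1, 1) + eps
    return X, y


X, y = sketch_generate_data_lr(n=10, n_features=1, theta=np.array([4, 3]))
print(X.shape, y.shape)  # (10, 1) (10, 1)
```

Keeping y as a column with one row per data point matches the documented return shape, and calling the sketch twice with the same random_seed yields identical arrays.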

Examples

>>> from lr_cd.lr_data_generation import generate_data_lr
>>> import numpy as np
>>> theta = np.array([4, 3])
>>> generate_data_lr(n=10, n_features=1, theta=theta)
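
One way to sanity-check generated data is to fit ordinary least squares and compare the estimate with the true theta. The sketch below uses stand-in data built inline with NumPy (features assumed standard normal) so that it runs without lr_cd; with the real function, X and y would be the arrays it returns.

```python
import numpy as np

rng = np.random.default_rng(123)
theta = np.array([4.0, 3.0])   # intercept first, then the single slope
X = rng.normal(size=(200, 1))  # stand-in features (assumed distribution)
y = theta[0] + X @ theta[1:].reshape(-1, 1) + rng.normal(scale=0.2, size=(200, 1))

# Prepend a column of ones so the intercept is estimated with the slope.
design = np.hstack([np.ones((X.shape[0], 1)), X])
theta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)

print(theta_hat.ravel())  # close to [4, 3], up to noise
```

The estimate should approach theta as n grows or noise shrinks, which makes this a quick check that the intercept really is the first element.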