brainsig.model#

A module to fit elastic net logistic regression.

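This module implements ElasticNetClassifier, an elastic net logistic regression classifier with nested cross-validation, and NeuralSignature, which wraps it to discriminate between fMRI task conditions.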

Classes#

`ElasticNetClassifier`	Elastic Net Logistic Regression classifier for neural signature analysis.
`NeuralSignature`	Neural Signature classifier for fMRI task condition discrimination.

Module Contents#

class brainsig.model.ElasticNetClassifier(inner_folds: int = 5, outer_folds: int = 5, inner_scoring: str = 'roc_auc_ovr', outer_scoring: dict | None = None, cs: list | None = None, l1_ratios: list | None = None, max_iter: int = 1000, n_jobs: int = -1, random_state: int = 42)#

Elastic Net Logistic Regression classifier for neural signature analysis.

This classifier performs nested cross-validation with elastic net regularization for binary or multi-class classification tasks.

Parameters:

inner_folds (int, default=5) – Number of folds for inner cross-validation (hyperparameter tuning).
outer_folds (int, default=5) – Number of folds for outer cross-validation (performance evaluation).
inner_scoring (str, default='roc_auc_ovr') – Scoring metric for inner CV hyperparameter selection.
outer_scoring (dict or None, default=None) – Dictionary of scoring metrics for outer CV. If None, uses default metrics.
cs (list or None, default=None) – Candidate values of the regularization parameter C to test. If None, uses [0.001, 0.01, 0.1, 1, 10].
l1_ratios (list or None, default=None) – Candidate L1 penalty ratios for the elastic net. If None, uses [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1.0].
max_iter (int, default=1000) – Maximum number of iterations for solver convergence.
n_jobs (int, default=-1) – Number of parallel jobs. -1 uses all processors.
random_state (int, default=42) – Random seed for reproducibility.

models#

Fitted models for each target variable.

Type: dict

cv_results#

Cross-validation results for each target variable.

Type: dict

target_names#

Names of target variables.

Type: list

inner_scoring = 'roc_auc_ovr'#

outer_scoring#

inner_folds = 5#

outer_folds = 5#

Cs = [0.001, 0.01, 0.1, 1, 10]#

l1_ratios = [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1.0]#

max_iter = 1000#

n_jobs = -1#

random_state = 42#

inner_cv#

outer_cv#

dataset = None#

target_names = []#

define_model(random_state: int = 42) → sklearn.linear_model.LogisticRegressionCV#

Create a LogisticRegressionCV model with elastic net penalty.

Parameters: random_state (int, default=42) – Random seed for model initialization.
Returns: Configured logistic regression model with cross-validation.
Return type: LogisticRegressionCV
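
The return type suggests a configuration along the following lines. This is a minimal sketch, not the brainsig source: `sketch_define_model` is a hypothetical name, and the `saga` solver and the keyword wiring are assumptions (in scikit-learn, `saga` is the `LogisticRegressionCV` solver that supports the elastic net penalty). The demo uses a reduced grid and synthetic data to stay fast.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV


def sketch_define_model(cs, l1_ratios, inner_folds=5, scoring="roc_auc_ovr",
                        max_iter=1000, n_jobs=-1, random_state=42):
    """Hypothetical stand-in for define_model; not the brainsig source."""
    return LogisticRegressionCV(
        Cs=cs,                # candidate C values (inverse regularization strength)
        l1_ratios=l1_ratios,  # 0 = pure L2 penalty, 1 = pure L1 penalty
        penalty="elasticnet",
        solver="saga",        # assumption: solver chosen for elastic net support
        cv=inner_folds,       # inner folds used to pick C and l1_ratio
        scoring=scoring,
        max_iter=max_iter,
        n_jobs=n_jobs,
        random_state=random_state,
    )


# Small synthetic binary problem and a reduced grid to keep the demo fast.
X, y = make_classification(n_samples=120, n_features=6, random_state=0)
model = sketch_define_model(cs=[0.1, 1.0], l1_ratios=[0.5, 1.0], n_jobs=1)
model.fit(X, y)
print(model.coef_.shape)  # one row of feature weights for a binary target
```

After fitting, the selected hyperparameters are available on the scikit-learn estimator as `model.C_` and `model.l1_ratio_`.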

build_cv_scheme(n_splits: int = 5, random_state: int = 42) → sklearn.model_selection.StratifiedKFold#

Build a stratified k-fold cross-validation scheme.

Parameters:

n_splits (int, default=5) – Number of folds for cross-validation.
random_state (int, default=42) – Random seed for reproducible splits.

Returns:

Configured cross-validation splitter.

Return type:

StratifiedKFold
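
As a rough illustration of what a stratified splitter does, the sketch below builds one and checks the class balance of each test fold. `sketch_build_cv_scheme` is a hypothetical name, and `shuffle=True` is an assumption: scikit-learn's `StratifiedKFold` only accepts a `random_state` when shuffling is enabled.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def sketch_build_cv_scheme(n_splits=5, random_state=42):
    """Hypothetical stand-in for build_cv_scheme; shuffle=True is an assumption."""
    return StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=random_state)


# 20 samples with a 3:1 class imbalance.
y = np.array([0] * 15 + [1] * 5)
X = np.zeros((len(y), 1))  # feature values do not affect how the folds are drawn

cv = sketch_build_cv_scheme()
for fold, (train_idx, test_idx) in enumerate(cv.split(X, y)):
    # Every test fold preserves the 3:1 ratio: three samples of class 0, one of class 1.
    print(fold, np.bincount(y[test_idx]))
```

Stratification matters most for imbalanced targets, where plain k-fold splitting can leave a fold with too few samples of the minority class to compute an AUC.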

fit_model(dataset, *, keep_dataset: bool = True) → None#

Fit elastic net models for each target variable.

Parameters:

dataset (Dataset) – Dataset object containing X_train and y_train arrays.
keep_dataset (bool, default=True) – Whether to store the dataset as an instance attribute.

predict(dataset) → dict#

Make predictions on the test set for all targets.

Parameters: dataset (Dataset) – Dataset object containing X_test arrays.
Returns: Dictionary with predictions for each target, containing y_pred, y_pred_proba, and y_true arrays.
Return type: dict

cross_validate(dataset) → dict#

Perform nested cross-validation for each target variable.

Parameters: dataset (Dataset) – Dataset object containing training data.
Returns: Cross-validation results for each target, including scores and fitted estimators for each fold.
Return type: dict
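
Nested cross-validation can be sketched directly in scikit-learn as follows. This is an illustration under stated assumptions, not the brainsig implementation: the metric names in `outer_scoring`, the reduced hyperparameter grid, and the use of three folds (the documented default is five) are all chosen here to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import StratifiedKFold, cross_validate

X, y = make_classification(n_samples=150, n_features=8, random_state=0)

# Inner loop: tunes C and l1_ratio on each outer training split.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
model = LogisticRegressionCV(
    Cs=[0.1, 1.0], l1_ratios=[0.5, 1.0], penalty="elasticnet", solver="saga",
    cv=inner_cv, scoring="roc_auc", max_iter=1000, random_state=42,
)

# Outer loop: scores the tuned model on data the inner loop never saw.
outer_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
outer_scoring = {"accuracy": "accuracy", "f1": "f1", "auc": "roc_auc"}  # assumed metrics
results = cross_validate(
    model, X, y, cv=outer_cv, scoring=outer_scoring,
    return_train_score=True, return_estimator=True,
)

print(sorted(results))      # score arrays plus the fitted estimator per outer fold
print(results["test_auc"])  # one held-out AUC per outer fold
```

Because hyperparameters are chosen inside each outer training split, the outer test scores are not biased by the tuning step.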

get_cv_coefs(dataset, *, exponentiate: bool = False) → pandas.DataFrame#

Extract coefficients from cross-validated models.

Parameters:

dataset (Dataset) – Dataset object with feature names and target labels.
exponentiate (bool, default=False) – If True, exponentiate coefficients to get odds ratios.

Returns:

DataFrame with coefficients indexed by cv_fold, target_variable, and target class.

Return type:

pd.DataFrame
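
To make the `exponentiate` option concrete: a logistic regression coefficient is a change in log-odds, so exponentiating it gives the multiplicative change in odds per unit increase of the feature. The feature names and values below are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical log-odds coefficients for three features.
coefs = pd.Series({"roi_a": np.log(2.0), "roi_b": 0.0, "roi_c": -np.log(2.0)})

# exp(coefficient) is the odds ratio for a one-unit increase in the feature.
odds_ratios = np.exp(coefs)
print(odds_ratios)
```

Here the odds ratios are 2, 1, and 0.5. A coefficient that the elastic net penalty has shrunk to exactly zero maps to an odds ratio of 1, meaning the feature has no effect on the predicted odds.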

get_model_scores(dataset=None) → pandas.DataFrame#

Get fit statistics for the model trained on the full dataset.

Parameters: dataset (Dataset or None, default=None) – Unused. Kept for API consistency.
Returns: DataFrame with accuracy, F1, and AUC scores for the fitted model, with columns: value, partition, metric, target, target_label.
Return type: pd.DataFrame

get_cv_model_scores(dataset=None) → pandas.DataFrame#

Get fit statistics for each validation fold separately.

Parameters: dataset (Dataset or None, default=None) – Dataset object with target labels for multi-class score interpretation.
Returns: DataFrame with accuracy, F1, and AUC scores per CV fold, with columns: cv_fold, value, partition, metric, target, target_label.
Return type: pd.DataFrame

get_roc_curve(dataset=None) → pandas.DataFrame#

Get ROC curve data for the model trained on the full dataset.

Parameters: dataset (Dataset or None, default=None) – Unused. Kept for API consistency.
Returns: Long-format DataFrame with one row per threshold point, with columns: fpr, tpr, threshold, target, target_label.
Return type: pd.DataFrame
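
The long-format layout described above can be reproduced with scikit-learn's `roc_curve`. This sketch only illustrates the shape of the output; the scores, the target name, and the label value are hypothetical, and how brainsig fills `target` and `target_label` is not shown in this reference.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])  # predicted P(y=1) for each sample

# roc_curve returns one (fpr, tpr) pair per decision threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

roc_df = pd.DataFrame({"fpr": fpr, "tpr": tpr, "threshold": thresholds})
roc_df["target"] = "condition"  # hypothetical target name
roc_df["target_label"] = 1      # hypothetical positive-class label
print(roc_df)
```

One row per threshold makes the frame easy to plot or to concatenate across targets and folds.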

get_cv_roc_curves(dataset) → pandas.DataFrame#

Get ROC curve data for each validation fold.

Parameters: dataset (Dataset) – Dataset object providing X_train and y_train for fold reconstruction.
Returns: Long-format DataFrame with one row per threshold point, with columns: cv_fold, partition, fpr, tpr, threshold, target, target_label.
Return type: pd.DataFrame

class brainsig.model.NeuralSignature(inner_folds: int = 5, outer_folds: int = 5, inner_scoring: str = 'roc_auc', outer_scoring: dict | None = None, cs: list | None = None, l1_ratios: list | None = None, max_iter: int = 1000, n_jobs: int = -1, random_state: int = 42)#

Neural Signature classifier for fMRI task condition discrimination.

This class fits an elastic net logistic regression model to discriminate between two fMRI task conditions (labeled 1 and 0) and computes neural signature scores as the difference in predicted probabilities between conditions for each subject.

The neural signature score for a subject is computed as:

score = P(condition=1 | fMRI_condition1) - P(condition=1 | fMRI_condition0)

Parameters:

inner_folds (int, default=5) – Number of folds for inner cross-validation (hyperparameter tuning).
outer_folds (int, default=5) – Number of folds for outer cross-validation (performance evaluation).
inner_scoring (str, default='roc_auc') – Scoring metric for inner CV hyperparameter selection.
outer_scoring (dict or None, default=None) – Dictionary of scoring metrics for outer CV. If None, uses default metrics.
cs (list or None, default=None) – Regularization parameter values to test. If None, uses default values.
l1_ratios (list or None, default=None) – L1 penalty ratios for elastic net. If None, uses default values.
max_iter (int, default=1000) – Maximum number of iterations for solver convergence.
n_jobs (int, default=-1) – Number of parallel jobs. -1 uses all processors.
random_state (int, default=42) – Random seed for reproducibility.

classifier#

Underlying elastic net classifier for condition discrimination.

Type: ElasticNetClassifier

signature_scores#

Computed neural signature scores for each subject.

Type: pd.DataFrame or None

Examples

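>>> from brainsig.model import NeuralSignature
>>> # `dataset` is a prepared Dataset whose binary target uses labels 1 and 0
>>> neural_sig = NeuralSignature(random_state=42)
>>> neural_sig.fit(dataset)
>>> # condition1_data, condition0_data: arrays of shape (n_subjects, n_features)
>>> scores = neural_sig.compute_signature_scores(condition1_data, condition0_data)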

signature_scores = None#

fit(dataset, *, keep_dataset: bool = True) → None#

Fit the neural signature model to discriminate between task conditions.

The dataset should contain a binary target variable where:

- Label 1 represents the first task condition.
- Label 0 represents the second task condition.

Parameters:

dataset (Dataset) – Dataset object with binary condition labels (1 and 0).
keep_dataset (bool, default=True) – Whether to store the dataset in the classifier.

cross_validate(dataset) → dict#

Perform nested cross-validation for the neural signature model.

Parameters: dataset (Dataset) – Dataset object containing training data with binary condition labels.
Returns: Cross-validation results including scores and fitted estimators.
Return type: dict

compute_signature_scores(condition1_data: numpy.ndarray, condition0_data: numpy.ndarray, *, subject_ids: list | None = None) → pandas.DataFrame#

Compute neural signature scores for each subject.

The neural signature score is computed as the difference in predicted probabilities for condition 1 between the two task conditions:

score = P(y=1 | condition1_data) - P(y=1 | condition0_data)

Parameters:

condition1_data (np.ndarray) – Preprocessed fMRI data for condition 1 (shape: n_subjects x n_features).
condition0_data (np.ndarray) – Preprocessed fMRI data for condition 0 (shape: n_subjects x n_features).
subject_ids (list or None, default=None) – Optional list of subject identifiers. If None, uses sequential indices.

Returns:

DataFrame with columns: subject_id, condition1_prob, condition0_prob, signature_score.

Return type:

pd.DataFrame

Raises:

ValueError – If the model hasn’t been fitted yet or if data shapes don’t match.
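
The score formula can be traced through a small, self-contained example. The sketch below is an assumption-laden illustration: a plain scikit-learn `LogisticRegression` stands in for the fitted elastic net model, the data are synthetic, and the model is scored on the same data it was trained on purely to show the arithmetic.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_subjects, n_features = 10, 4

# Synthetic stand-ins for preprocessed fMRI features, one row per subject.
condition1_data = rng.normal(loc=1.0, size=(n_subjects, n_features))
condition0_data = rng.normal(loc=-1.0, size=(n_subjects, n_features))

# Plain logistic regression as a stand-in for the fitted elastic net model.
X = np.vstack([condition1_data, condition0_data])
y = np.array([1] * n_subjects + [0] * n_subjects)
clf = LogisticRegression().fit(X, y)

# Column 1 of predict_proba is P(y=1) because clf.classes_ is [0, 1].
p1 = clf.predict_proba(condition1_data)[:, 1]
p0 = clf.predict_proba(condition0_data)[:, 1]

scores = pd.DataFrame({
    "subject_id": np.arange(n_subjects),  # sequential indices when no IDs are given
    "condition1_prob": p1,
    "condition0_prob": p0,
    "signature_score": p1 - p0,           # row i of both arrays is the same subject
})
print(scores.head())
```

The score lies between -1 and 1. Values near 1 mean the model separates the subject's two conditions strongly, and values near 0 mean it assigns both scans a similar probability. The two input arrays must list subjects in the same order, since the subtraction pairs them row by row.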

get_cv_signature_scores(dataset, condition1_indices: numpy.ndarray, condition0_indices: numpy.ndarray, *, subject_ids: list | None = None) → pandas.DataFrame#

Compute neural signature scores using cross-validated models.

This method computes signature scores for each CV fold, which is useful for estimating the generalizability of the neural signature.

Parameters:

dataset (Dataset) – Dataset object with full data (both conditions for all subjects).
condition1_indices (np.ndarray) – Indices in the dataset corresponding to condition 1 trials.
condition0_indices (np.ndarray) – Indices in the dataset corresponding to condition 0 trials.
subject_ids (list or None, default=None) – Optional list of subject identifiers.

Returns:

DataFrame with signature scores from each CV fold, including columns: cv_fold, subject_id, condition1_prob, condition0_prob, signature_score.

Return type:

pd.DataFrame

Raises:

ValueError – If cross-validation hasn’t been performed yet.

get_coefficients(dataset, *, exponentiate: bool = False) → pandas.DataFrame#

Get model coefficients (feature weights) from cross-validated models.

Parameters:

dataset (Dataset) – Dataset object with feature names.
exponentiate (bool, default=False) – If True, exponentiate coefficients to get odds ratios.

Returns:

DataFrame with coefficients for each feature across CV folds.

Return type:

pd.DataFrame

get_model_scores(dataset=None) → pandas.DataFrame#

Get fit statistics for the model trained on the full dataset.

Parameters: dataset (Dataset or None, default=None) – Unused. Kept for API consistency.
Returns: DataFrame with accuracy, F1, and AUC scores for the fitted model, with columns: value, partition, metric, target, target_label.
Return type: pd.DataFrame

get_cv_model_scores(dataset=None) → pandas.DataFrame#

Get fit statistics for each validation fold separately.

Parameters: dataset (Dataset or None, default=None) – Dataset object with target labels for multi-class score interpretation.
Returns: DataFrame with accuracy, F1, and AUC scores per CV fold, with columns: cv_fold, value, partition, metric, target, target_label.
Return type: pd.DataFrame

get_roc_curve(dataset=None) → pandas.DataFrame#

Get ROC curve data for the model trained on the full dataset.

Parameters: dataset (Dataset or None, default=None) – Unused. Kept for API consistency.
Returns: Long-format DataFrame with one row per threshold point, with columns: fpr, tpr, threshold, target, target_label.
Return type: pd.DataFrame

get_cv_roc_curves(dataset) → pandas.DataFrame#

Get ROC curve data for each validation fold.

Parameters: dataset (Dataset) – Dataset object used during cross-validation.
Returns: Long-format DataFrame with one row per threshold point, with columns: cv_fold, partition, fpr, tpr, threshold, target, target_label.
Return type: pd.DataFrame

save(path) → None#

Save the fitted model to disk using joblib.

Parameters: path (str or Path) – File path to save the model to (e.g. 'model.joblib').

classmethod load(path) → NeuralSignature#

Load a saved model from disk.

Parameters: path (str or Path) – File path of the saved model.
Returns: The loaded model instance.
Return type: NeuralSignature
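
The save and load pair amounts to a joblib round trip. The sketch below shows that pattern with a small scikit-learn estimator standing in for a fitted NeuralSignature. That `load` reads the file back with `joblib.load` is an assumption; only the use of joblib in `save` is stated above.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in for a fitted NeuralSignature instance.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
fitted = LogisticRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.joblib"
    joblib.dump(fitted, path)      # serialize the fitted object to disk
    restored = joblib.load(path)   # rebuild it from the file

# The restored object reproduces the original predictions.
print(np.array_equal(fitted.predict(X), restored.predict(X)))
```

joblib files are pickle-based, so they should only be loaded from trusted sources, and loading works most reliably with the same library versions that were used to save the model.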