brainsig.model#
A module for fitting elastic net logistic regression models.
This module implements ElasticNetClassifier and NeuralSignature for neural signature analysis.
Classes#
ElasticNetClassifier | Elastic Net Logistic Regression classifier for neural signature analysis. |
NeuralSignature | Neural Signature classifier for fMRI task condition discrimination. |
Module Contents#
- class brainsig.model.ElasticNetClassifier(inner_folds: int = 5, outer_folds: int = 5, inner_scoring: str = 'roc_auc_ovr', outer_scoring: dict | None = None, cs: list | None = None, l1_ratios: list | None = None, max_iter: int = 1000, n_jobs: int = -1, random_state: int = 42)#
Elastic Net Logistic Regression classifier for neural signature analysis.
This classifier performs nested cross-validation with elastic net regularization for binary or multi-class classification tasks.
- Parameters:
inner_folds (int, default=5) – Number of folds for inner cross-validation (hyperparameter tuning).
outer_folds (int, default=5) – Number of folds for outer cross-validation (performance evaluation).
inner_scoring (str, default='roc_auc_ovr') – Scoring metric for inner CV hyperparameter selection.
outer_scoring (dict or None, default=None) – Dictionary of scoring metrics for outer CV. If None, uses default metrics.
cs (list or None, default=None) – Inverse regularization strengths (C values) to test. If None, uses [0.001, 0.01, 0.1, 1, 10].
l1_ratios (list or None, default=None) – Elastic net mixing values to test (0 is pure L2, 1 is pure L1). If None, uses [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1.0].
max_iter (int, default=1000) – Maximum number of iterations for solver convergence.
n_jobs (int, default=-1) – Number of parallel jobs. -1 uses all processors.
random_state (int, default=42) – Random seed for reproducibility.
- inner_scoring = 'roc_auc_ovr'#
- outer_scoring#
- inner_folds = 5#
- outer_folds = 5#
- Cs = [0.001, 0.01, 0.1, 1, 10]#
- l1_ratios = [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1.0]#
- max_iter = 1000#
- n_jobs = -1#
- random_state = 42#
- inner_cv#
- outer_cv#
- dataset = None#
- models#
- cv_results#
- target_names = []#
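To give a sense of the search space implied by the default attribute values listed above, the following sketch enumerates the Cs × l1_ratios grid and counts model fits under nested cross-validation. It assumes that every combination is fitted once per inner fold and that the inner search is repeated for each outer fold; the counts ignore any final refits and are not taken from the brainsig source.

```python
from itertools import product

# Default attribute values as listed in this reference.
Cs = [0.001, 0.01, 0.1, 1, 10]
l1_ratios = [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1.0]
inner_folds = 5
outer_folds = 5

# Every (C, l1_ratio) pair is one candidate model configuration.
grid = list(product(Cs, l1_ratios))

# Assumption: each candidate is fitted once per inner fold, and the
# whole inner search is repeated once per outer fold.
fits_per_outer_fold = len(grid) * inner_folds
total_inner_fits = fits_per_outer_fold * outer_folds

print(len(grid))            # 35
print(fits_per_outer_fold)  # 175
print(total_inner_fits)     # 875
```

Passing shorter cs or l1_ratios lists shrinks this grid proportionally, which is usually the quickest way to reduce runtime.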
- define_model(random_state: int = 42) sklearn.linear_model.LogisticRegressionCV#
Create a LogisticRegressionCV model with elastic net penalty.
- Parameters:
random_state (int, default=42) – Random seed for model initialization.
- Returns:
Configured logistic regression model with cross-validation.
- Return type:
LogisticRegressionCV
- build_cv_scheme(n_splits: int = 5, random_state: int = 42) sklearn.model_selection.StratifiedKFold#
Build a stratified k-fold cross-validation scheme.
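The idea behind stratification can be illustrated with a simplified, standard-library stand-in. This is not scikit-learn's StratifiedKFold algorithm and not the brainsig implementation; the function name stratified_folds is hypothetical. It only shows that each fold receives roughly the same share of every class.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits=5, random_state=42):
    """Assign sample indices to folds so class proportions stay balanced."""
    rng = random.Random(random_state)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices to folds round-robin.
    folds = [[] for _ in range(n_splits)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return [sorted(fold) for fold in folds]


# 20 samples of class 0 and 10 of class 1 (made-up labels).
labels = [0] * 20 + [1] * 10
folds = stratified_folds(labels)

for fold in folds:
    n_positive = sum(labels[i] for i in fold)
    print(len(fold), n_positive)  # each fold: 6 samples, 2 of class 1
```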
- fit_model(dataset, *, keep_dataset: bool = True) None#
Fit elastic net models for each target variable.
- get_cv_coefs(dataset, *, exponentiate: bool = False) pandas.DataFrame#
Extract coefficients from cross-validated models.
- get_model_scores(dataset=None) pandas.DataFrame#
Get fit statistics for the model trained on the full dataset.
- Parameters:
dataset (Dataset or None, default=None) – Unused. Kept for API consistency.
- Returns:
DataFrame with accuracy, F1, and AUC scores for the fitted model, with columns: value, partition, metric, target, target_label.
- Return type:
pd.DataFrame
- get_cv_model_scores(dataset=None) pandas.DataFrame#
Get fit statistics for each validation fold separately.
- Parameters:
dataset (Dataset or None, default=None) – Dataset object with target labels for multi-class score interpretation.
- Returns:
DataFrame with accuracy, F1, and AUC scores per CV fold, with columns: cv_fold, value, partition, metric, target, target_label.
- Return type:
pd.DataFrame
- get_roc_curve(dataset=None) pandas.DataFrame#
Get ROC curve data for the model trained on the full dataset.
- Parameters:
dataset (Dataset or None, default=None) – Unused. Kept for API consistency.
- Returns:
Long-format DataFrame with one row per threshold point, with columns: fpr, tpr, threshold, target, target_label.
- Return type:
pd.DataFrame
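As an illustration of how the long-format output might be consumed, the sketch below integrates a ROC curve with the trapezoidal rule. The rows are made-up values in the documented column layout, represented as plain dictionaries rather than a DataFrame; the target and target_label values and the helper auc_from_rows are hypothetical.

```python
# Hypothetical rows in the documented long format (values are made up).
roc_rows = [
    {"fpr": 0.0, "tpr": 0.0, "threshold": float("inf"),
     "target": "condition", "target_label": "1"},
    {"fpr": 0.0, "tpr": 0.5, "threshold": 0.9,
     "target": "condition", "target_label": "1"},
    {"fpr": 0.5, "tpr": 1.0, "threshold": 0.6,
     "target": "condition", "target_label": "1"},
    {"fpr": 1.0, "tpr": 1.0, "threshold": 0.1,
     "target": "condition", "target_label": "1"},
]


def auc_from_rows(rows):
    """Area under the ROC curve via the trapezoidal rule."""
    points = sorted((row["fpr"], row["tpr"]) for row in rows)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


print(auc_from_rows(roc_rows))  # 0.875
```

With a real DataFrame, the same calculation would typically be applied per target after grouping on the target column.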
- get_cv_roc_curves(dataset) pandas.DataFrame#
Get ROC curve data for each validation fold.
- Parameters:
dataset (Dataset) – Dataset object providing X_train and y_train for fold reconstruction.
- Returns:
Long-format DataFrame with one row per threshold point, with columns: cv_fold, partition, fpr, tpr, threshold, target, target_label.
- Return type:
pd.DataFrame
- class brainsig.model.NeuralSignature(inner_folds: int = 5, outer_folds: int = 5, inner_scoring: str = 'roc_auc', outer_scoring: dict | None = None, cs: list | None = None, l1_ratios: list | None = None, max_iter: int = 1000, n_jobs: int = -1, random_state: int = 42)#
Neural Signature classifier for fMRI task condition discrimination.
This class fits an elastic net logistic regression model to discriminate between two fMRI task conditions (labeled 1 and 0) and computes neural signature scores as the difference in predicted probabilities between conditions for each subject.
- The neural signature score for a subject is computed as:
score = P(condition=1 | fMRI_condition1) - P(condition=1 | fMRI_condition0)
- Parameters:
inner_folds (int, default=5) – Number of folds for inner cross-validation (hyperparameter tuning).
outer_folds (int, default=5) – Number of folds for outer cross-validation (performance evaluation).
inner_scoring (str, default='roc_auc') – Scoring metric for inner CV hyperparameter selection.
outer_scoring (dict or None, default=None) – Dictionary of scoring metrics for outer CV. If None, uses default metrics.
cs (list or None, default=None) – Regularization parameter values to test. If None, uses default values.
l1_ratios (list or None, default=None) – L1 penalty ratios for elastic net. If None, uses default values.
max_iter (int, default=1000) – Maximum number of iterations for solver convergence.
n_jobs (int, default=-1) – Number of parallel jobs. -1 uses all processors.
random_state (int, default=42) – Random seed for reproducibility.
- classifier#
Underlying elastic net classifier for condition discrimination.
- Type:
ElasticNetClassifier
- signature_scores#
Computed neural signature scores for each subject.
- Type:
pd.DataFrame or None
Examples
>>> # Prepare data with condition labels (1 and 0)
>>> neural_sig = NeuralSignature(random_state=42)
>>> neural_sig.fit(dataset)
>>> scores = neural_sig.compute_signature_scores(condition1_data, condition0_data)
- classifier#
- signature_scores = None#
- fit(dataset, *, keep_dataset: bool = True) None#
Fit the neural signature model to discriminate between task conditions.
The dataset should contain a binary target variable where:
- Label 1 represents the first task condition
- Label 0 represents the second task condition
- compute_signature_scores(condition1_data: numpy.ndarray, condition0_data: numpy.ndarray, *, subject_ids: list | None = None) pandas.DataFrame#
Compute neural signature scores for each subject.
The neural signature score is computed as the difference in predicted probabilities for condition 1 between the two task conditions:
score = P(y=1 | condition1_data) - P(y=1 | condition0_data)
- Parameters:
condition1_data (np.ndarray) – Preprocessed fMRI data for condition 1 (shape: n_subjects x n_features).
condition0_data (np.ndarray) – Preprocessed fMRI data for condition 0 (shape: n_subjects x n_features).
subject_ids (list or None, default=None) – Optional list of subject identifiers. If None, uses sequential indices.
- Returns:
DataFrame with columns: subject_id, condition1_prob, condition0_prob, signature_score.
- Return type:
pd.DataFrame
- Raises:
ValueError – If the model hasn’t been fitted yet or if data shapes don’t match.
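The score arithmetic can be sketched without a fitted model. The example below takes per-subject probabilities of condition 1 as plain lists (made-up values standing in for model predictions) and returns rows with the documented column names. The helper signature_scores is hypothetical and is not the library's implementation.

```python
def signature_scores(condition1_probs, condition0_probs, subject_ids=None):
    """Return one row per subject with P(y=1) under each condition and their difference."""
    if len(condition1_probs) != len(condition0_probs):
        raise ValueError("Both conditions must cover the same number of subjects.")

    # Fall back to sequential indices, as the documentation describes.
    if subject_ids is None:
        subject_ids = list(range(len(condition1_probs)))

    rows = []
    for subject_id, p1, p0 in zip(subject_ids, condition1_probs, condition0_probs):
        rows.append({
            "subject_id": subject_id,
            "condition1_prob": p1,
            "condition0_prob": p0,
            "signature_score": p1 - p0,
        })
    return rows


# Made-up probabilities of condition 1 for three subjects.
rows = signature_scores([0.75, 0.5, 0.25], [0.25, 0.5, 0.75])
scores = [row["signature_score"] for row in rows]
print(scores)  # [0.5, 0.0, -0.5]
```

Because the score is a difference of two probabilities, it always lies between -1 and 1. Positive values mean the model assigns a higher probability of condition 1 to the condition 1 data than to the condition 0 data.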
- get_cv_signature_scores(dataset, condition1_indices: numpy.ndarray, condition0_indices: numpy.ndarray, *, subject_ids: list | None = None) pandas.DataFrame#
Compute neural signature scores using cross-validated models.
This method computes signature scores for each CV fold, which is useful for estimating the generalizability of the neural signature.
- Parameters:
dataset (Dataset) – Dataset object with full data (both conditions for all subjects).
condition1_indices (np.ndarray) – Indices in the dataset corresponding to condition 1 trials.
condition0_indices (np.ndarray) – Indices in the dataset corresponding to condition 0 trials.
subject_ids (list or None, default=None) – Optional list of subject identifiers.
- Returns:
DataFrame with signature scores from each CV fold, including columns: cv_fold, subject_id, condition1_prob, condition0_prob, signature_score.
- Return type:
pd.DataFrame
- Raises:
ValueError – If cross-validation hasn’t been performed yet.
- get_coefficients(dataset, *, exponentiate: bool = False) pandas.DataFrame#
Get model coefficients (feature weights) from cross-validated models.
- get_model_scores(dataset=None) pandas.DataFrame#
Get fit statistics for the model trained on the full dataset.
- Parameters:
dataset (Dataset or None, default=None) – Unused. Kept for API consistency.
- Returns:
DataFrame with accuracy, F1, and AUC scores for the fitted model, with columns: value, partition, metric, target, target_label.
- Return type:
pd.DataFrame
- get_cv_model_scores(dataset=None) pandas.DataFrame#
Get fit statistics for each validation fold separately.
- Parameters:
dataset (Dataset or None, default=None) – Dataset object with target labels for multi-class score interpretation.
- Returns:
DataFrame with accuracy, F1, and AUC scores per CV fold, with columns: cv_fold, value, partition, metric, target, target_label.
- Return type:
pd.DataFrame
- get_roc_curve(dataset=None) pandas.DataFrame#
Get ROC curve data for the model trained on the full dataset.
- Parameters:
dataset (Dataset or None, default=None) – Unused. Kept for API consistency.
- Returns:
Long-format DataFrame with one row per threshold point, with columns: fpr, tpr, threshold, target, target_label.
- Return type:
pd.DataFrame
- get_cv_roc_curves(dataset) pandas.DataFrame#
Get ROC curve data for each validation fold.
- Parameters:
dataset (Dataset) – Dataset object used during cross-validation.
- Returns:
Long-format DataFrame with one row per threshold point, with columns: cv_fold, partition, fpr, tpr, threshold, target, target_label.
- Return type:
pd.DataFrame
- save(path) None#
Save the fitted model to disk using joblib.
- Parameters:
path (str or Path) – File path to save the model to (e.g. ‘model.joblib’).
- classmethod load(path) NeuralSignature#
Load a saved model from disk.
- Parameters:
path (str or Path) – File path of the saved model.
- Returns:
The loaded model instance.
- Return type:
NeuralSignature