brainsig.model
==============

.. py:module:: brainsig.model

.. autoapi-nested-parse::

   A module to fit elastic net logistic regression.

   This module implements the ElasticNetClassifier for neural signature analysis.


Classes
-------

.. autoapisummary::

   brainsig.model.ElasticNetClassifier
   brainsig.model.NeuralSignature


Module Contents
---------------

.. py:class:: ElasticNetClassifier(inner_folds: int = 5, outer_folds: int = 5, inner_scoring: str = 'roc_auc_ovr', outer_scoring: dict | None = None, cs: list | None = None, l1_ratios: list | None = None, max_iter: int = 1000, n_jobs: int = -1, random_state: int = 42)

   Elastic Net Logistic Regression classifier for neural signature analysis.

   This classifier performs nested cross-validation with elastic net regularization
   for binary or multi-class classification tasks.

   :param inner_folds: Number of folds for inner cross-validation (hyperparameter tuning).
   :type inner_folds: int, default=5
   :param outer_folds: Number of folds for outer cross-validation (performance evaluation).
   :type outer_folds: int, default=5
   :param inner_scoring: Scoring metric for inner CV hyperparameter selection.
   :type inner_scoring: str, default='roc_auc_ovr'
   :param outer_scoring: Dictionary of scoring metrics for outer CV. If None, uses default metrics.
   :type outer_scoring: dict or None, default=None
   :param cs: Regularization parameter values to test. If None, uses default values.
   :type cs: list or None, default=None
   :param l1_ratios: L1 penalty ratios for elastic net. If None, uses default values.
   :type l1_ratios: list or None, default=None
   :param max_iter: Maximum number of iterations for solver convergence.
   :type max_iter: int, default=1000
   :param n_jobs: Number of parallel jobs. -1 uses all processors.
   :type n_jobs: int, default=-1
   :param random_state: Random seed for reproducibility.
   :type random_state: int, default=42

   .. attribute:: models

      Fitted models for each target variable.

      :type: dict

   .. attribute:: cv_results

      Cross-validation results for each target variable.

      :type: dict

   .. attribute:: target_names

      Names of target variables.

      :type: list


   .. py:attribute:: inner_scoring
      :value: 'roc_auc_ovr'


   .. py:attribute:: outer_scoring


   .. py:attribute:: inner_folds
      :value: 5


   .. py:attribute:: outer_folds
      :value: 5


   .. py:attribute:: Cs
      :value: [0.001, 0.01, 0.1, 1, 10]


   .. py:attribute:: l1_ratios
      :value: [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1.0]


   .. py:attribute:: max_iter
      :value: 1000


   .. py:attribute:: n_jobs
      :value: -1


   .. py:attribute:: random_state
      :value: 42


   .. py:attribute:: inner_cv


   .. py:attribute:: outer_cv


   .. py:attribute:: dataset
      :value: None


   .. py:attribute:: models


   .. py:attribute:: cv_results


   .. py:attribute:: target_names
      :value: []


   .. py:method:: define_model(random_state: int = 42) -> sklearn.linear_model.LogisticRegressionCV

      Create a LogisticRegressionCV model with elastic net penalty.

      :param random_state: Random seed for model initialization.
      :type random_state: int, default=42

      :returns: Configured logistic regression model with cross-validation.
      :rtype: LogisticRegressionCV


   .. py:method:: build_cv_scheme(n_splits: int = 5, random_state: int = 42) -> sklearn.model_selection.StratifiedKFold

      Build a stratified k-fold cross-validation scheme.

      :param n_splits: Number of folds for cross-validation.
      :type n_splits: int, default=5
      :param random_state: Random seed for reproducible splits.
      :type random_state: int, default=42

      :returns: Configured cross-validation splitter.
      :rtype: StratifiedKFold


   .. py:method:: fit_model(dataset, *, keep_dataset: bool = True) -> None

      Fit elastic net models for each target variable.

      :param dataset: Dataset object containing X_train and y_train arrays.
      :type dataset: Dataset
      :param keep_dataset: Whether to store the dataset as an instance attribute.
      :type keep_dataset: bool, default=True


   .. py:method:: predict(dataset) -> dict

      Make predictions on test set for all targets.

      :param dataset: Dataset object containing X_test arrays.
      :type dataset: Dataset

      :returns: Dictionary with predictions for each target, containing
                y_pred, y_pred_proba, and y_true arrays.
      :rtype: dict


   .. py:method:: cross_validate(dataset) -> dict

      Perform nested cross-validation for each target variable.

      :param dataset: Dataset object containing training data.
      :type dataset: Dataset

      :returns: Cross-validation results for each target, including scores
                and fitted estimators for each fold.
      :rtype: dict


   .. py:method:: get_cv_coefs(dataset, *, exponentiate: bool = False) -> pandas.DataFrame

      Extract coefficients from cross-validated models.

      :param dataset: Dataset object with feature names and target labels.
      :type dataset: Dataset
      :param exponentiate: If True, exponentiate coefficients to get odds ratios.
      :type exponentiate: bool, default=False

      :returns: DataFrame with coefficients indexed by cv_fold, target_variable,
                and target class.
      :rtype: pd.DataFrame


   .. py:method:: get_model_scores(dataset=None) -> pandas.DataFrame

      Get fit statistics for the model trained on the full dataset.

      :param dataset: Unused. Kept for API consistency.
      :type dataset: Dataset or None, default=None

      :returns: DataFrame with accuracy, F1, and AUC scores for the fitted model,
                with columns: value, partition, metric, target, target_label.
      :rtype: pd.DataFrame


   .. py:method:: get_cv_model_scores(dataset=None) -> pandas.DataFrame

      Get fit statistics for each validation fold separately.

      :param dataset: Dataset object with target labels for multi-class score interpretation.
      :type dataset: Dataset or None, default=None

      :returns: DataFrame with accuracy, F1, and AUC scores per CV fold,
                with columns: cv_fold, value, partition, metric, target, target_label.
      :rtype: pd.DataFrame


   .. py:method:: get_roc_curve(dataset=None) -> pandas.DataFrame

      Get ROC curve data for the model trained on the full dataset.

      :param dataset: Unused. Kept for API consistency.
      :type dataset: Dataset or None, default=None

      :returns: Long-format DataFrame with one row per threshold point,
                with columns: fpr, tpr, threshold, target, target_label.
      :rtype: pd.DataFrame


   .. py:method:: get_cv_roc_curves(dataset) -> pandas.DataFrame

      Get ROC curve data for each validation fold.

      :param dataset: Dataset object providing X_train and y_train for fold reconstruction.
      :type dataset: Dataset

      :returns: Long-format DataFrame with one row per threshold point,
                with columns: cv_fold, partition, fpr, tpr, threshold, target, target_label.
      :rtype: pd.DataFrame


.. py:class:: NeuralSignature(inner_folds: int = 5, outer_folds: int = 5, inner_scoring: str = 'roc_auc', outer_scoring: dict | None = None, cs: list | None = None, l1_ratios: list | None = None, max_iter: int = 1000, n_jobs: int = -1, random_state: int = 42)

   Neural Signature classifier for fMRI task condition discrimination.

   This class fits an elastic net logistic regression model to discriminate between
   two fMRI task conditions (labeled 1 and 0) and computes neural signature scores
   as the difference in predicted probabilities between conditions for each subject.

   The neural signature score for a subject is computed as::

      score = P(condition=1 | fMRI_condition1) - P(condition=1 | fMRI_condition0)

   :param inner_folds: Number of folds for inner cross-validation (hyperparameter tuning).
   :type inner_folds: int, default=5
   :param outer_folds: Number of folds for outer cross-validation (performance evaluation).
   :type outer_folds: int, default=5
   :param inner_scoring: Scoring metric for inner CV hyperparameter selection.
   :type inner_scoring: str, default='roc_auc'
   :param outer_scoring: Dictionary of scoring metrics for outer CV. If None, uses default metrics.
   :type outer_scoring: dict or None, default=None
   :param cs: Regularization parameter values to test. If None, uses default values.
   :type cs: list or None, default=None
   :param l1_ratios: L1 penalty ratios for elastic net. If None, uses default values.
   :type l1_ratios: list or None, default=None
   :param max_iter: Maximum number of iterations for solver convergence.
   :type max_iter: int, default=1000
   :param n_jobs: Number of parallel jobs. -1 uses all processors.
   :type n_jobs: int, default=-1
   :param random_state: Random seed for reproducibility.
   :type random_state: int, default=42

   .. attribute:: classifier

      Underlying elastic net classifier for condition discrimination.

      :type: ElasticNetClassifier

   .. attribute:: signature_scores

      Computed neural signature scores for each subject.

      :type: pd.DataFrame or None

   .. rubric:: Examples

   >>> # Prepare data with condition labels (1 and 0)
   >>> neural_sig = NeuralSignature(random_state=42)
   >>> neural_sig.fit(dataset)
   >>> scores = neural_sig.compute_signature_scores(condition1_data, condition0_data)


   .. py:attribute:: classifier


   .. py:attribute:: signature_scores
      :value: None


   .. py:method:: fit(dataset, *, keep_dataset: bool = True) -> None

      Fit the neural signature model to discriminate between task conditions.

      The dataset should contain a binary target variable where:

      - Label 1 represents the first task condition
      - Label 0 represents the second task condition

      :param dataset: Dataset object with binary condition labels (1 and 0).
      :type dataset: Dataset
      :param keep_dataset: Whether to store the dataset in the classifier.
      :type keep_dataset: bool, default=True


   .. py:method:: cross_validate(dataset) -> dict

      Perform nested cross-validation for the neural signature model.

      :param dataset: Dataset object containing training data with binary condition labels.
      :type dataset: Dataset

      :returns: Cross-validation results including scores and fitted estimators.
      :rtype: dict


   .. py:method:: compute_signature_scores(condition1_data: numpy.ndarray, condition0_data: numpy.ndarray, *, subject_ids: list | None = None) -> pandas.DataFrame

      Compute neural signature scores for each subject.

      The neural signature score is computed as the difference in predicted
      probabilities for condition 1 between the two task conditions::

         score = P(y=1 | condition1_data) - P(y=1 | condition0_data)

      :param condition1_data: Preprocessed fMRI data for condition 1 (shape: n_subjects x n_features).
      :type condition1_data: np.ndarray
      :param condition0_data: Preprocessed fMRI data for condition 0 (shape: n_subjects x n_features).
      :type condition0_data: np.ndarray
      :param subject_ids: Optional list of subject identifiers. If None, uses sequential indices.
      :type subject_ids: list or None, default=None

      :returns: DataFrame with columns: subject_id, condition1_prob, condition0_prob,
                signature_score.
      :rtype: pd.DataFrame

      :raises ValueError: If the model hasn't been fitted yet or if data shapes don't match.


   .. py:method:: get_cv_signature_scores(dataset, condition1_indices: numpy.ndarray, condition0_indices: numpy.ndarray, *, subject_ids: list | None = None) -> pandas.DataFrame

      Compute neural signature scores using cross-validated models.

      This method computes signature scores for each CV fold, which is useful for
      estimating the generalizability of the neural signature.

      :param dataset: Dataset object with full data (both conditions for all subjects).
      :type dataset: Dataset
      :param condition1_indices: Indices in the dataset corresponding to condition 1 trials.
      :type condition1_indices: np.ndarray
      :param condition0_indices: Indices in the dataset corresponding to condition 0 trials.
      :type condition0_indices: np.ndarray
      :param subject_ids: Optional list of subject identifiers.
      :type subject_ids: list or None, default=None

      :returns: DataFrame with signature scores from each CV fold, including columns:
                cv_fold, subject_id, condition1_prob, condition0_prob, signature_score.
      :rtype: pd.DataFrame

      :raises ValueError: If cross-validation hasn't been performed yet.


   .. py:method:: get_coefficients(dataset, *, exponentiate: bool = False) -> pandas.DataFrame

      Get model coefficients (feature weights) from cross-validated models.

      :param dataset: Dataset object with feature names.
      :type dataset: Dataset
      :param exponentiate: If True, exponentiate coefficients to get odds ratios.
      :type exponentiate: bool, default=False

      :returns: DataFrame with coefficients for each feature across CV folds.
      :rtype: pd.DataFrame


   .. py:method:: get_model_scores(dataset=None) -> pandas.DataFrame

      Get fit statistics for the model trained on the full dataset.

      :param dataset: Unused. Kept for API consistency.
      :type dataset: Dataset or None, default=None

      :returns: DataFrame with accuracy, F1, and AUC scores for the fitted model,
                with columns: value, partition, metric, target, target_label.
      :rtype: pd.DataFrame


   .. py:method:: get_cv_model_scores(dataset=None) -> pandas.DataFrame

      Get fit statistics for each validation fold separately.

      :param dataset: Dataset object with target labels for multi-class score interpretation.
      :type dataset: Dataset or None, default=None

      :returns: DataFrame with accuracy, F1, and AUC scores per CV fold,
                with columns: cv_fold, value, partition, metric, target, target_label.
      :rtype: pd.DataFrame


   .. py:method:: get_roc_curve(dataset=None) -> pandas.DataFrame

      Get ROC curve data for the model trained on the full dataset.

      :param dataset: Unused. Kept for API consistency.
      :type dataset: Dataset or None, default=None

      :returns: Long-format DataFrame with one row per threshold point,
                with columns: fpr, tpr, threshold, target, target_label.
      :rtype: pd.DataFrame


   .. py:method:: get_cv_roc_curves(dataset) -> pandas.DataFrame

      Get ROC curve data for each validation fold.

      :param dataset: Dataset object used during cross-validation.
      :type dataset: Dataset

      :returns: Long-format DataFrame with one row per threshold point,
                with columns: cv_fold, partition, fpr, tpr, threshold, target, target_label.
      :rtype: pd.DataFrame


   .. py:method:: save(path) -> None

      Save the fitted model to disk using joblib.

      :param path: File path to save the model to (e.g. 'model.joblib').
      :type path: str or Path


   .. py:method:: load(path) -> NeuralSignature
      :classmethod:

      Load a saved model from disk.

      :param path: File path of the saved model.
      :type path: str or Path

      :returns: The loaded model instance.
      :rtype: NeuralSignature
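
The signature score formula and the ``exponentiate`` option documented above can be illustrated with a small standalone sketch. This is not the ``brainsig`` implementation: the helpers ``sigmoid``, ``predict_proba``, and ``signature_scores``, as well as the coefficient and feature values, are hypothetical and chosen only to show the arithmetic. It assumes a plain linear logistic model and uses Python lists in place of the NumPy arrays and DataFrames the real methods work with.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, coefs, intercept):
    """P(y=1 | features) for one subject under a linear logistic model."""
    z = intercept + sum(w * x for w, x in zip(coefs, features))
    return sigmoid(z)


def signature_scores(condition1_data, condition0_data, coefs, intercept,
                     subject_ids=None):
    """score = P(y=1 | condition1_data) - P(y=1 | condition0_data), per subject."""
    if len(condition1_data) != len(condition0_data):
        raise ValueError("Both conditions need one row per subject.")
    if subject_ids is None:
        # Fall back to sequential indices, as the documented method does.
        subject_ids = list(range(len(condition1_data)))
    rows = []
    for sid, x1, x0 in zip(subject_ids, condition1_data, condition0_data):
        p1 = predict_proba(x1, coefs, intercept)
        p0 = predict_proba(x0, coefs, intercept)
        rows.append({
            "subject_id": sid,
            "condition1_prob": p1,
            "condition0_prob": p0,
            "signature_score": p1 - p0,
        })
    return rows


# Hypothetical fitted weights for two features (illustration only).
coefs = [1.0, -1.0]
intercept = 0.0

# Two subjects, one feature row per subject in each condition.
cond1 = [[1.0, 0.0], [0.0, 0.0]]
cond0 = [[0.0, 1.0], [0.0, 0.0]]

scores = signature_scores(cond1, cond0, coefs, intercept)

# Exponentiating a logistic coefficient gives an odds ratio.
odds_ratios = [math.exp(w) for w in coefs]

for row in scores:
    print(row)
print(odds_ratios)
```

In this toy setup the first subject's features differ between conditions, so the score is positive, while the second subject has identical features in both conditions and scores exactly zero. A score near zero therefore means the model cannot tell that subject's two conditions apart.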