pyplt.fsmethods package

Submodules

pyplt.fsmethods.base module

class pyplt.fsmethods.base.FeatureSelectionMethod(description='A feature selection method.', name='', **kwargs)

Bases: object

Base class for all feature selection methods.

Initializes the FeatureSelectionMethod object.

Parameters:
  • description (str, optional) – a description of the feature selection method (default “A feature selection method.”).
  • name (str, optional) – the name of the feature selection method (default “”).
  • kwargs – any additional parameters for the feature selection method.
get_description()

Get the description of the feature selection method.

Returns: the description of the feature selection method.
Return type: str
get_name()

Get the name of the feature selection method.

Returns: the name of the feature selection method.
Return type: str
get_params()

Return all additional parameters of the feature selection method (if applicable).

Returns: a dict mapping the name of each additional parameter of the feature selection method to its value (if applicable).
Return type: dict
get_params_string()

Return a string representation of all additional parameters of the feature selection method (if applicable).

Returns: the string representation of all additional parameters of the feature selection method (if applicable).
Return type: str
get_selected_features()

Get the subset of selected features.

Returns: the subset of selected features.
Return type: list of str
select(objects, ranks, algorithm, test_objects=None, test_ranks=None, preprocessed_folds=None, progress_window=None, exec_stopper=None)

Abstract method for running the feature selection process.

All child classes must implement this method.

Parameters:
  • objects (pandas.DataFrame or None) – the objects data to be used during the feature selection process. If None, the data is obtained via the preprocessed_folds parameter instead.
  • ranks (pandas.DataFrame or None) – the pairwise rank data to be used during the feature selection process. If None, the data is obtained via the preprocessed_folds parameter instead.
  • algorithm (pyplt.plalgorithms.base.PLAlgorithm) – the algorithm to be used for feature selection (if applicable).
  • test_objects (pandas.DataFrame or None, optional) – optional test objects data to be used during the feature selection process (default None).
  • test_ranks (pandas.DataFrame or None, optional) – optional test pairwise rank data to be used during the feature selection process (default None).
  • preprocessed_folds (pyplt.evaluation.cross_validation.PreprocessedFolds or None, optional) – the data used to evaluate the feature set, in the form of pre-processed folds (default None). This is an alternative way to pass the data and is only considered if either the objects or the ranks parameter is None.
  • progress_window (pyplt.gui.experiment.progresswindow.ProgressWindow, optional) – a GUI object (extending the tkinter.Toplevel widget) used to display a progress log and progress bar during the experiment execution (default None).
  • exec_stopper (pyplt.util.AbortFlag, optional) – an abort flag object used to abort the execution before completion (default None).
Returns:

  • the subset of selected features – if execution is completed successfully.
  • None – if aborted before completion by exec_stopper.

Return type:

list of str
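
The interface documented above can be illustrated with a small, self-contained sketch. The class FeatureSelectionMethodSketch below is a stand-in that only mirrors the documented method names and signatures; it is not pyplt's own implementation, and the internal attribute names are assumptions. FirstKFeatures is a hypothetical toy method invented for illustration, and toy_objects replaces a pandas.DataFrame so that the sketch runs without pandas.

```python
from types import SimpleNamespace


class FeatureSelectionMethodSketch:
    """Stand-in mirroring the documented interface (not pyplt's own code)."""

    def __init__(self, description="A feature selection method.", name="", **kwargs):
        self._description = description
        self._name = name
        self._params = dict(kwargs)  # additional parameters, see get_params()
        self._selected = []          # filled in by select()

    def get_description(self):
        return self._description

    def get_name(self):
        return self._name

    def get_params(self):
        return dict(self._params)

    def get_selected_features(self):
        return list(self._selected)

    def select(self, objects, ranks, algorithm, test_objects=None, test_ranks=None,
               preprocessed_folds=None, progress_window=None, exec_stopper=None):
        # Abstract in spirit: every child class is expected to override this.
        raise NotImplementedError("child classes must implement select()")


class FirstKFeatures(FeatureSelectionMethodSketch):
    """Hypothetical toy method: keep the first k feature columns."""

    def __init__(self, k=2):
        super().__init__(description="Keeps the first k features.",
                         name="FirstK", k=k)

    def select(self, objects, ranks, algorithm, test_objects=None, test_ranks=None,
               preprocessed_folds=None, progress_window=None, exec_stopper=None):
        # Only `objects` is used here; it just needs a `columns` attribute,
        # as a pandas.DataFrame has. All other arguments are ignored.
        k = self._params["k"]
        self._selected = [str(c) for c in list(objects.columns)[:k]]
        return self.get_selected_features()


# A tiny stand-in for a DataFrame so the sketch runs without pandas.
toy_objects = SimpleNamespace(columns=["f1", "f2", "f3"])
method = FirstKFeatures(k=2)
selected = method.select(toy_objects, ranks=None, algorithm=None)
```

After the call, get_selected_features() returns the same list that select() returned, and get_params() exposes the extra keyword argument passed to the base constructor.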

pyplt.fsmethods.sfs module

class pyplt.fsmethods.sfs.SFS(verbose=True)

Bases: pyplt.fsmethods.wrappers.WrapperFSMethod

Sequential Forward Selection (SFS) method.

SFS is a bottom-up hill-climbing algorithm in which one feature is added at a time to the current feature set. The feature added is the one, among the remaining features, whose addition yields the highest value of the performance function. The selection procedure begins with an empty feature set and terminates when adding a feature yields performance equal to or lower than that obtained without it. The performance of each subset of features considered is computed as the prediction accuracy of a model trained using that subset of features as input. Any of the preference learning algorithms implemented in the tool (i.e., RankSVM and Backpropagation) can be used to train this model.
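
The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration of the greedy loop and not pyplt's implementation: the function name sequential_forward_selection, the evaluate callback, and the toy scoring function are all assumptions made for the example. In the tool, the score would be the prediction accuracy of a model trained on the candidate subset. The sketch also assumes that the empty feature set scores below any real subset, so the first feature is always added.

```python
def sequential_forward_selection(features, evaluate):
    """Greedy SFS sketch.

    `evaluate` maps a list of feature names to a score (higher is better).
    """
    selected = []
    remaining = list(features)
    best_score = float("-inf")  # assumption: the empty set scores below anything
    while remaining:
        # Score every candidate set obtained by adding one remaining feature.
        scored = [(evaluate(selected + [f]), f) for f in remaining]
        cand_score, cand = max(scored, key=lambda pair: pair[0])
        # Terminate when the best addition gives equal or lower performance.
        if cand_score <= best_score:
            break
        selected.append(cand)
        remaining.remove(cand)
        best_score = cand_score
    return selected


def toy_score(subset):
    """Hypothetical performance function: a fixed gain per feature."""
    gains = {"f1": 3, "f2": 1, "f3": 0}
    return sum(gains[f] for f in subset)


chosen = sequential_forward_selection(["f1", "f2", "f3"], toy_score)
```

With this toy scoring function, f1 is added first, then f2; adding f3 leaves the score unchanged, so the loop stops and f3 is not selected.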

Extends the pyplt.fsmethods.wrappers.WrapperFSMethod class which, in turn, extends the pyplt.fsmethods.base.FeatureSelectionMethod class.

Initializes the feature selection method with the appropriate name and description.

Parameters: verbose (bool) – whether to display detailed progress information in the progress_window, if one is used (default True).
select(objects, ranks, algorithm, test_objects=None, test_ranks=None, preprocessed_folds=None, progress_window=None, exec_stopper=None)

Carry out the feature selection process according to the SFS algorithm.

Parameters:
  • objects (pandas.DataFrame or None) – the objects data used to train the models used to evaluate and select features. If None, the data is obtained via the preprocessed_folds parameter instead.
  • ranks (pandas.DataFrame or None) – the pairwise rank data used to train the models used to evaluate and select features. If None, the data is obtained via the preprocessed_folds parameter instead.
  • algorithm (pyplt.plalgorithms.base.PLAlgorithm) – the algorithm used to train the models with which the features are evaluated (via the training accuracy).
  • test_objects (pandas.DataFrame or None, optional) – optional objects data used to test the models used to evaluate and select features (default None).
  • test_ranks (pandas.DataFrame or None, optional) – optional pairwise rank data used to test the models used to evaluate and select features (default None).
  • preprocessed_folds (pyplt.evaluation.cross_validation.PreprocessedFolds or None, optional) – the data used to evaluate the feature set, in the form of pre-processed folds (default None). This is an alternative way to pass the data and is only considered if either the objects or the ranks parameter is None.
  • progress_window (pyplt.gui.experiment.progresswindow.ProgressWindow, optional) – a GUI object (extending the tkinter.Toplevel widget) used to display a progress log and progress bar during the experiment execution (default None).
  • exec_stopper (pyplt.util.AbortFlag, optional) – an abort flag object used to abort the execution before completion (default None).
Returns:

  • the subset of features selected by SFS – if execution is completed successfully.
  • None – if aborted before completion by exec_stopper.

Return type:

list of str
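
Because select() returns None when it is aborted, callers need to check the result before using it. The sketch below illustrates that contract with a toy loop. AbortFlagSketch is a hypothetical stand-in for pyplt.util.AbortFlag; its stop() and stopped() method names are assumptions, not the documented API, and select_with_abort is likewise invented for illustration.

```python
import threading


class AbortFlagSketch:
    """Hypothetical stand-in for an abort flag; method names are assumptions."""

    def __init__(self):
        self._event = threading.Event()

    def stop(self):
        self._event.set()

    def stopped(self):
        return self._event.is_set()


def select_with_abort(candidates, exec_stopper=None):
    """Toy selection loop: returns a list of str, or None if aborted."""
    selected = []
    for name in candidates:
        # Check the flag before each unit of work.
        if exec_stopper is not None and exec_stopper.stopped():
            return None  # aborted before completion
        selected.append(str(name))
    return selected


flag = AbortFlagSketch()
completed = select_with_abort(["f1", "f2"], exec_stopper=flag)
flag.stop()
aborted = select_with_abort(["f1", "f2"], exec_stopper=flag)

# Callers should distinguish the two outcomes explicitly.
status = "aborted" if aborted is None else "completed"
```

An explicit `is None` check is used rather than a truthiness test, since an empty list would also be falsy but is a different outcome from an abort.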

pyplt.fsmethods.wrappers module

class pyplt.fsmethods.wrappers.WrapperFSMethod(description='A wrapper feature selection method.', **kwargs)

Bases: pyplt.fsmethods.base.FeatureSelectionMethod

Parent class for all wrapper-type feature selection methods.

Initializes the wrapper-type feature selection method.

Parameters:
  • description (str, optional) – a description of the feature selection method (default “A wrapper feature selection method.”).
  • kwargs – any additional parameters for the feature selection method.

Module contents

This package contains backend modules that manage the feature selection step of an experiment.