preprocessing package

Submodules

preprocessing.PCABinning module

class preprocessing.PCABinning.PCABinning(file_path, root, new_dataset_name_prefix, metadata_columns=2, min_features=8, step_size=0.005, overlap=0.0025, n_components=2)

Bases: object

apply_dim_reduction()

Applies PCA dimensionality reduction to the dataset by binning based on the specified ppm range, step size, and overlap. Saves the PCA-transformed data along with metadata.
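The binning step can be illustrated with a minimal NumPy sketch. It is not the package's implementation: it assumes each window is step_size wide, that consecutive windows start (step_size - overlap) apart, that windows holding fewer than min_features columns are skipped, and that PCA is applied to each window separately. The helper names window_starts, pca_scores, and binned_pca are hypothetical.

```python
import numpy as np


def window_starts(ppm_min, ppm_max, step_size, overlap):
    """Left edges of overlapping windows (assumed layout, see lead-in)."""
    stride = step_size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than step_size")
    # Number of windows that fit entirely inside [ppm_min, ppm_max].
    n = int(np.floor((ppm_max - ppm_min - step_size) / stride + 1e-9)) + 1
    return ppm_min + stride * np.arange(max(n, 0))


def pca_scores(block, n_components=2):
    """PCA scores of one window, computed from the SVD of the centered data."""
    centered = block - block.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def binned_pca(X, ppm, step_size=0.005, overlap=0.0025,
               min_features=8, n_components=2):
    """Stack per-window PCA scores side by side (samples in rows)."""
    ppm = np.asarray(ppm, dtype=float)
    blocks = []
    for start in window_starts(ppm.min(), ppm.max(), step_size, overlap):
        mask = (ppm >= start) & (ppm < start + step_size)
        if mask.sum() < min_features:
            continue  # assumed meaning of min_features: skip sparse windows
        blocks.append(pca_scores(X[:, mask], n_components))
    if not blocks:
        return np.empty((X.shape[0], 0))
    return np.hstack(blocks)
```

Under these assumptions, the default values step_size=0.005 and overlap=0.0025 give windows that start every 0.0025 ppm, so each interior point falls into two windows.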

preprocessing.WaveletDenoise module

class preprocessing.WaveletDenoise.WaveletDenoise(df, metadata_columns, unique_id_col, class_label_col, save_path, wavelets, level=1, save_data=True, show_graph=True)

Bases: object

static madev(d, axis=None)

Calculate the mean absolute deviation of a dataset.

Parameters:
- d (np.array): The dataset for which the mean absolute deviation is calculated.
- axis (int, optional): The axis along which the mean is computed. The default is to compute the mean of the flattened array.

Returns:
- The mean absolute deviation of the dataset.
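As a worked illustration of the quantity described above, the following standalone NumPy sketch computes the mean of the absolute deviations from the mean. It is written from the docstring alone and is not taken from the package source. Note that this is distinct from the median absolute deviation, which is sometimes abbreviated the same way.

```python
import numpy as np


def madev(d, axis=None):
    """Mean absolute deviation: mean(|d - mean(d)|) along the given axis."""
    d = np.asarray(d, dtype=float)
    # keepdims lets the mean broadcast back against d for any axis.
    center = np.mean(d, axis=axis, keepdims=True)
    return np.mean(np.abs(d - center), axis=axis)
```

For the values 1, 2, 3, 4 the mean is 2.5, the absolute deviations are 1.5, 0.5, 0.5, 1.5, and their mean is 1.0.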

plot_data(filtered_data, wavelet_name)

Generates and saves a plot comparing the original and denoised data, including scaling, for each sample.

Parameters:
- filtered_data (np.array): The denoised data to be plotted.
- wavelet_name (str): The name of the wavelet used for denoising, included in the plot title.

process_and_visualize()

Processes the dataset through wavelet denoising for each specified wavelet type and optionally saves the results and generates visualizations.

For each wavelet in the wavelets list, this method denoises the data with that wavelet, then optionally saves the denoised data and optionally plots the original signals against the denoised ones.

save_denoised_data(filtered_data, wavelet_name)

Saves the denoised data to a CSV file, preserving the initial metadata columns.

Parameters:
- filtered_data (np.array): The denoised data to be saved.
- wavelet_name (str): The name of the wavelet used for denoising, used in the filename.
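A rough sketch of how denoised values might be written back next to the metadata columns is given below. It assumes that df is a pandas DataFrame, that metadata_columns is the number of leading metadata columns, and that filtered_data has one row per sample and one column per signal column. The function names and the denoised_<wavelet>.csv filename pattern are hypothetical and are not confirmed by the documentation above.

```python
from pathlib import Path

import numpy as np
import pandas as pd


def attach_metadata(df, filtered_data, metadata_columns):
    """Rebuild a frame: original metadata columns plus denoised signal columns."""
    meta = df.iloc[:, :metadata_columns].reset_index(drop=True)
    signal_cols = df.columns[metadata_columns:]
    denoised = pd.DataFrame(np.asarray(filtered_data), columns=signal_cols)
    return pd.concat([meta, denoised], axis=1)


def save_denoised(df, filtered_data, metadata_columns, save_path, wavelet_name):
    """Write the combined frame to CSV and return the path that was written."""
    out = attach_metadata(df, filtered_data, metadata_columns)
    # Hypothetical filename pattern; the real one is not documented here.
    target = Path(save_path) / f"denoised_{wavelet_name}.csv"
    out.to_csv(target, index=False)
    return target
```

Keeping the column order of the input frame means the saved file can be read back by the same code that consumed the original dataset.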

wavelet_denoising(x, wavelet='db1')

Performs wavelet denoising on a given signal using the specified wavelet type.

Parameters:
- x (np.array): The signal to be denoised.
- wavelet (str, optional): The type of wavelet to use for denoising. Defaults to 'db1'.

Returns:
- The denoised signal as a numpy array.
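To show the general idea of threshold-based wavelet denoising, here is a self-contained sketch that uses a hand-written single-level Haar transform (the 'db1' wavelet) in NumPy instead of a wavelet library. The noise estimate (mean absolute deviation of the detail coefficients divided by 0.6745), the universal threshold sigma * sqrt(2 ln n), and the choice of hard thresholding are assumptions; the documentation above does not state which rule the method uses. The name haar_denoise is hypothetical.

```python
import numpy as np


def haar_denoise(x):
    """Single-level Haar denoising with a hard universal threshold (sketch)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Pad odd-length signals by repeating the last sample; trimmed at the end.
    padded = np.append(x, x[-1]) if n % 2 else x
    even, odd = padded[0::2], padded[1::2]

    # Forward Haar transform: pairwise averages and differences.
    approx = (even + odd) / np.sqrt(2.0)
    detail = (even - odd) / np.sqrt(2.0)

    # Assumed noise estimate and universal threshold.
    sigma = np.mean(np.abs(detail - detail.mean())) / 0.6745
    thresh = sigma * np.sqrt(2.0 * np.log(n))

    # Hard thresholding: zero the detail coefficients below the threshold.
    detail = np.where(np.abs(detail) >= thresh, detail, 0.0)

    # Inverse Haar transform.
    out = np.empty_like(padded)
    out[0::2] = (approx + detail) / np.sqrt(2.0)
    out[1::2] = (approx - detail) / np.sqrt(2.0)
    return out[:n]
```

With hard thresholding, each pair of neighbouring samples is either left unchanged or replaced by its average, so pairwise sums of the signal are preserved. A multi-level decomposition, as suggested by the class's level parameter, would repeat the forward step on the approximation coefficients.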

preprocessing.preprocess_control module

Module contents