pysdkit.utils#

Created on 2025/02/06 10:30:01 @author: Whenxuan Wang @email: wwhenxuan@gmail.com

`pysdkit.utils._fft`
`pysdkit.utils._mirror`	Created on 2024/5/18 22:15 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
`pysdkit.utils._hilbert`	Created on Sat Mar 7 12:09:42 2024 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
`pysdkit.utils._process`	Created on Sat Mar 5 21:57:53 2024 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
`pysdkit.utils._differ`	Created on Sat Mar 18 22:05:02 2024 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
`pysdkit.utils._smooth1d`	Created on 2024/6/3 15:31 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
`pysdkit.utils._function`	Created on 2024/6/3 15:31 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
`pysdkit.utils._instantaneous`	Created on 2025/02/02 00:13:28 @author: Whenxuan Wang @email: wwhenxuan@gmail.com This part of the code is awaiting optimization.
`pysdkit.utils._kernel_matrix`
`pysdkit.utils._diagnalization`

utils.fft#

pysdkit.utils._fft.fft(ts: ndarray) → ndarray[source]#: Fast Fourier Transform

pysdkit.utils._fft.fft2d(img: ndarray) → ndarray[source]#: Fast Fourier Transform for 2D Images

pysdkit.utils._fft.fftshift(ts: ndarray) → ndarray[source]#: Fast Fourier Transform Shift

pysdkit.utils._fft.ifft(ts: ndarray) → ndarray[source]#: Inverse Fast Fourier Transform

pysdkit.utils._fft.ifft2d(img: ndarray) → ndarray[source]#: Inverse Fast Fourier Transform for 2D Images

pysdkit.utils._fft.ifftshift(ts: ndarray) → ndarray[source]#: Inverse Fast Fourier Transform Shift
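The transform and shift helpers are usually combined in a round trip. The sketch below uses `numpy.fft` directly and assumes that the `pysdkit.utils._fft` wrappers behave like their NumPy counterparts, which this page does not confirm; the signal and sampling rate are illustrative values.

```python
import numpy as np

# A 1 Hz cosine sampled at 8 Hz for one second (illustrative values).
fs = 8
t = np.arange(fs) / fs
ts = np.cos(2 * np.pi * 1.0 * t)

# Forward transform, then move the zero-frequency bin to the centre.
spectrum = np.fft.fft(ts)
centered = np.fft.fftshift(spectrum)

# ifftshift undoes fftshift; ifft then recovers the original samples.
restored = np.fft.ifft(np.fft.ifftshift(centered)).real

# Frequency axis (in Hz) that matches the centred spectrum.
freqs = np.fft.fftshift(np.fft.fftfreq(ts.size, d=1 / fs))
```

The centred layout is mainly convenient for plotting; `ifftshift` has to be applied before `ifft` so that the zero-frequency bin is back at index 0.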

utils.mirror#

Created on 2024/5/18 22:15 @author: Whenxuan Wang @email: wwhenxuan@gmail.com

pysdkit.utils._mirror.fmirror(ts: ndarray, sym: int) → ndarray[source]#

Implements a signal mirroring expansion function.

This function mirrors ‘sym’ elements at both the beginning and the end of the given array ‘ts’, to create a new extended array.

Parameters:

ts – The one-dimensional numpy array to be mirrored.
sym – The number of elements to mirror from both the start and the end of the array ‘ts’. This value must be less than or equal to half the length of the array.

Returns:

The array after mirror expansion, which will have a length equal to the original array length plus twice the ‘sym’.

Examples:

>>> import numpy as np
>>> from pysdkit.utils._mirror import fmirror
>>> array = np.array([1, 2, 3, 4, 5])
>>> fmirror(array, 2)
array([2, 1, 1, 2, 3, 4, 5, 5, 4])

Note:

If ‘sym’ exceeds half the length of the array, the function may not work as expected, so it’s recommended to check the value of ‘sym’ beforehand.
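The mirroring rule described above can be reproduced with plain NumPy. This is a minimal sketch, not the library's implementation: `mirror_extend` is a hypothetical name, and the explicit range check on `sym` is an assumption added to make the recommended precondition visible.

```python
import numpy as np

def mirror_extend(ts: np.ndarray, sym: int) -> np.ndarray:
    """Hypothetical re-implementation of the mirroring described above."""
    ts = np.asarray(ts)
    # Assumed guard: at least one sample, at most half of the array.
    if not 0 < sym <= ts.size // 2:
        raise ValueError("sym must be in the range 1..len(ts)//2")
    head = ts[:sym][::-1]   # first `sym` samples, reversed
    tail = ts[-sym:][::-1]  # last `sym` samples, reversed
    return np.concatenate([head, ts, tail])

extended = mirror_extend(np.array([1, 2, 3, 4, 5]), 2)
```

For the array `[1, 2, 3, 4, 5]` and `sym=2` this yields the same nine values as the documented example.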

utils.hilbert#

Created on Sat Mar 7 12:09:42 2024 @author: Whenxuan Wang @email: wwhenxuan@gmail.com

pysdkit.utils._hilbert.hilbert_imaginary(signal: ndarray) → ndarray[source]#: Get the imaginary part of the Hilbert transformed signal

pysdkit.utils._hilbert.hilbert_real(signal: ndarray) → ndarray[source]#: Get the real part of the Hilbert transformed signal

pysdkit.utils._hilbert.hilbert_spectrum(imfs_env: ndarray, imfs_freq: ndarray, fs: int, freq_lim: Tuple[float, float] | None = None, freq_res: float | None = None, time_range: Tuple[float, float] | None = None, time_scale: int | None = 1) → Tuple[ndarray, ndarray, ndarray][source]#

Compute the Hilbert spectrum H(t, f) using numpy.

Parameters:

imfs_env – The envelope functions of all IMFs.
imfs_freq – The instantaneous frequency functions.
fs – Sampling frequency in Hz.
freq_lim – Frequency range (min, max). Defaults to (0, fs/2).
freq_res – Frequency resolution. Defaults to (freq_max - freq_min)/200.
time_range – Time range (start, end) in seconds.
time_scale – Temporal scaling factor (Default: 1).

Returns:

(spectrum, time_axis, freq_axis):
- spectrum: ndarray of shape (…, time_bins, freq_bins), the Hilbert spectrum matrix.
- time_axis: 1D ndarray, the time axis.
- freq_axis: 1D ndarray, the frequency axis.

pysdkit.utils._hilbert.hilbert_transform(signal: ndarray) → ndarray[source]#

Apply the Hilbert transform to a given numpy signal.

Parameters: signal – NumPy array containing the input signal.
Returns: A NumPy array containing the analytic signal obtained from the Hilbert transform.
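For readers unfamiliar with the analytic signal, the following sketch builds one with the standard FFT method: negative frequencies are zeroed and positive ones doubled. It is an illustration under that assumption, not pysdkit's code, and `analytic_signal` is a hypothetical name.

```python
import numpy as np

def analytic_signal(x: np.ndarray) -> np.ndarray:
    """FFT-based analytic signal (illustrative, not pysdkit's code)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double the positive frequencies,
    # and zero out the negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

t = np.arange(64) / 64
z = analytic_signal(np.cos(2 * np.pi * 4 * t))
envelope = np.abs(z)  # instantaneous amplitude
```

The real part of `z` is the original cosine and the imaginary part is the matching sine, which is what `hilbert_real` and `hilbert_imaginary` would be expected to expose separately.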

pysdkit.utils._hilbert.plot_hilbert(signal: ndarray, analytical_signal: ndarray | None = None, return_figure: bool = False) → figure | None[source]#

Plot the Hilbert transform of a signal

Parameters:

signal – Original NumPy signal.
analytical_signal – A NumPy array containing the analytic signal obtained from the Hilbert transform.
return_figure – Whether to return the figure.

Returns:

The plot figure or None.

pysdkit.utils._hilbert.plot_hilbert_complex_plane(analytical_signal: ndarray, return_figure: bool = False) → figure[source]#

Plot the Hilbert transform of a signal on the complex plane.

Parameters:

analytical_signal – NumPy array containing the analytic signal (obtained from the Hilbert transform of the original signal).
return_figure – Whether to return the figure.

Returns:

The plot figure or None.

utils.process#

Created on Sat Mar 5 21:57:53 2024 @author: Whenxuan Wang @email: wwhenxuan@gmail.com

The following code is mainly used to find extreme points in the EMD algorithm

Code taken from laszukdawid/PyEMD

pysdkit.utils._process.common_dtype(x: ndarray, y: ndarray) → Tuple[ndarray, ndarray][source]#

Casts inputs (x, y) into a common numpy DTYPE.

Parameters:

x – Input 1D Signal 1 - Numpy Array
y – Input 1D Signal 2 - Numpy Array

Returns:

The two input arrays cast to a common dtype (NumPy arrays).

pysdkit.utils._process.find_zero_crossings(signal: ndarray) → ndarray[source]#

Detects zero crossings in a given signal. A zero crossing occurs when two consecutive signal points have opposite signs, indicating a transition from positive to negative values or vice versa. This function also considers signal points that are exactly zero as zero crossings.

Parameters: signal – The input signal as a NumPy array.
Returns: An array of indices where zero crossings occur.
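The detection rule can be sketched in a few lines of NumPy. Which index of a sign-changing pair the library reports is not stated here, so the sketch assumes the first sample of the pair; `zero_crossings` is a hypothetical name.

```python
import numpy as np

def zero_crossings(signal: np.ndarray) -> np.ndarray:
    signal = np.asarray(signal)
    # Assumption: index i is reported when signal[i] and signal[i + 1]
    # have opposite signs.
    sign_change = np.nonzero(signal[:-1] * signal[1:] < 0)[0]
    # Samples that are exactly zero also count as crossings.
    exact_zero = np.nonzero(signal == 0)[0]
    return np.union1d(sign_change, exact_zero)

idx = zero_crossings(np.array([1.0, -1.0, -2.0, 0.0, 3.0, 2.0, -1.0]))
```

Here the sign changes start at indices 0 and 5, and the exact zero sits at index 3.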

pysdkit.utils._process.get_timeline(range_max: int, dtype: dtype | None = None) → ndarray[source]#

Generates a numeric sequence representing a timeline for a signal. This sequence can be specified with a data type to ensure adequate representation of the data range.

Parameters:

range_max – The largest value in the range, equivalent to range(range_max), typically representing the length of the signal.
dtype – The minimum required data type. The returned timeline has this dtype or one with a larger byte size.

Returns:

The timeline array.

pysdkit.utils._process.normalize_signal(t: ndarray) → ndarray[source]#

Normalize time array so that it doesn’t explode on tiny values.

Returned array starts with 0 and the smallest increase is by 1.

Parameters: t – Input 1D time array (NumPy array)
Returns: The normalized 1D time array (NumPy array)
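One way to obtain an array that "starts with 0 and the smallest increase is by 1" is to shift by the first sample and divide by the smallest spacing. The sketch assumes a strictly increasing time array; `normalize_time` is a hypothetical name and the library may differ in detail.

```python
import numpy as np

def normalize_time(t: np.ndarray) -> np.ndarray:
    t = np.asarray(t, dtype=float)
    step = np.min(np.diff(t))   # smallest spacing between samples
    return (t - t[0]) / step    # first value 0, smallest increment 1

scaled = normalize_time(np.array([0.001, 0.002, 0.004, 0.005]))
```

The millisecond-scale input becomes approximately `[0, 1, 3, 4]`, which keeps later spline computations away from very small denominators.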

pysdkit.utils._process.not_duplicate(ts: ndarray) → ndarray[source]#

Returns the indices of non-repeating values, skipping points inside runs of repeated values, where there is no extremum.

This feature is particularly important for extreme value detection and data simplification in signal processing, and can help avoid double calculations of consecutive repeated values in extreme value detection and other analyses. For example, when determining which points should be used to calculate the envelope in the EMD algorithm, continuously repeated data points can be excluded, thereby improving calculation efficiency and accuracy.

Parameters: ts – Input 1D signal (NumPy array)
Returns: Indices of the distinct values in the array
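The idea of dropping points inside flat runs can be sketched as follows. It assumes the behaviour of the corresponding PyEMD helper (an interior sample is dropped when it equals both neighbours, and the end points are always kept); `non_repeating_indices` is a hypothetical name.

```python
import numpy as np

def non_repeating_indices(ts: np.ndarray) -> np.ndarray:
    ts = np.asarray(ts)  # assumed to hold at least two samples
    # An interior sample is a duplicate when it equals both neighbours.
    interior_dup = (ts[1:-1] == ts[:-2]) & (ts[1:-1] == ts[2:])
    interior_idx = np.arange(1, ts.size - 1)[~interior_dup]
    # The first and last samples are always kept.
    return np.concatenate([[0], interior_idx, [ts.size - 1]])

idx = non_repeating_indices(np.array([0, 1, 1, 1, 2, 3]))
```

Only the middle sample of the run `1, 1, 1` is removed, so the edges of the plateau remain available for extremum detection.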

utils.differ#

Created on Sat Mar 18 22:05:02 2024 @author: Whenxuan Wang @email: wwhenxuan@gmail.com

pysdkit.utils._differ.differ(y: ~numpy.ndarray, delta: float, dtype: ~numpy.dtype = <class 'numpy.float64'>) → ndarray[source]#

Compute the derivative of a discrete time series y.

Parameters:

y – The input time series.
delta – The sampling time interval of y.
dtype – The data type of the returned NumPy array.

Returns:

numpy.ndarray: The derivative of the time series.
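The differencing scheme used by `differ` is not documented on this page. As a point of comparison, the sketch below uses `numpy.gradient`, which applies central differences in the interior and one-sided differences at the ends; treating this as equivalent to `differ` is an assumption.

```python
import numpy as np

delta = 0.01                  # sampling interval (illustrative)
t = np.arange(0, 1, delta)
y = t ** 2                    # known derivative: 2 * t

# Central differences in the interior, one-sided at the two ends.
dy = np.gradient(y, delta).astype(np.float64)
```

For a quadratic the central difference is exact, so the interior of `dy` matches `2 * t` up to rounding.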

utils.smooth1d#

Created on 2024/6/3 15:31 @author: Whenxuan Wang @email: wwhenxuan@gmail.com

pysdkit.utils._smooth1d.exponential_smoothing(signal: ndarray, alpha: float = 0.4) → ndarray[source]#

Exponential Smoothing (Single Exponential Smoothing)

Parameters:

signal – Input signal (numpy array)
alpha – Smoothing factor, range from 0 to 1 (default is 0.4)

Returns:

Smoothed signal (numpy array)
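Single exponential smoothing follows the textbook recursion s[i] = alpha * x[i] + (1 - alpha) * s[i-1]. The sketch below assumes the first output equals the first input, which this page does not specify; `single_exponential_smoothing` is a hypothetical name.

```python
import numpy as np

def single_exponential_smoothing(signal, alpha=0.4):
    signal = np.asarray(signal, dtype=float)
    smoothed = np.empty_like(signal)
    smoothed[0] = signal[0]  # assumed initialisation
    for i in range(1, signal.size):
        # Blend the new sample with the previous smoothed value.
        smoothed[i] = alpha * signal[i] + (1 - alpha) * smoothed[i - 1]
    return smoothed

out = single_exponential_smoothing(np.array([1.0, 2.0, 3.0]), alpha=0.5)
```

A larger `alpha` tracks the input more closely, while a smaller one smooths more strongly.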

pysdkit.utils._smooth1d.gaussian_smoothing(signal: ndarray, sigma: int = 2) → ndarray[source]#

Gaussian Filtering Smoothing

Parameters:

signal – Input signal (numpy array)
sigma – Standard deviation for Gaussian kernel (default is 2)

Returns:

Smoothed signal (numpy array)

pysdkit.utils._smooth1d.savgol_smoothing(signal: ndarray, window_length: int = 11, poly_order: int = 2) → ndarray[source]#

Savitzky-Golay Filtering Smoothing

Parameters:

signal – Input signal (numpy array)
window_length – Length of the filter window (default is 11, must be odd)
poly_order – Order of the polynomial used to fit the samples (default is 2)

Returns:

Smoothed signal (numpy array)

utils.function#

Created on 2024/6/3 15:31 @author: Whenxuan Wang @email: wwhenxuan@gmail.com

This module includes a series of utility functions.

This function calculates the covariance matrix of the input signal’s lag matrix. It generates a specific lag matrix based on the input signal x and the specified mode, and then calculates the covariance matrix of that matrix.

The covariance matrix is very important in signal processing, time series analysis, statistical modeling, and other fields, as it can describe the correlation of the signal at different lags.

Parameters:

x – the input signal of 1d ndarray
mode –
Specifies the type of lag matrix to be generated. The supported modes include:
- mode = ‘full’:
  lags_matrix is the full Toeplitz convolutional matrix with dimensions [lags+N-1,lags],
  
  out = [ [x,0..0]^T, [0,x,0..0]^T, …, [0,..0,x]^T ], where N is the size of x.
- mode = ‘prew’:
  lags_matrix is the prewindowed matrix with the first N columns of the full matrix, and dimension = [N,lags];
- mode = ‘postw’:
  lags_matrix is the postwindowed matrix with the last N columns of the full matrix, and dimension = [N,lags];
- mode = ‘covar’ or ‘valid’:
  lags_matrix is the trimmed full matrix with the first and last m columns cut off
  
  (out = full[lags:N-lags,:]), with dimension = [N-lags+1,lags];
- mode = ‘same’:
  conv_matrix is the trimmed full matrix with the first and last m columns cut off
  
  (out = full[(lags-1)//2:N+(lags-1)//2,:]), with dimension = [N,lags];
- mode = ‘traj’:
  lags_matrix is the trajectory or so-called caterpillar matrix with dimension = [N,lags];
- mode = ‘hankel’:
  lags_matrix is the Hankel matrix with dimension = [N,N];
- mode = ‘toeplitz’:
  lags_matrix is the symmetric Toeplitz matrix, with dimension = [N,N].
lags – An integer or None, representing the number of columns in the lag matrix (default is N // 2, where N is the length of the input signal).
ret_base – if true, then the lag matrix will also be returned
dtype – The numpy data type used, None means using the data type of the input signal

Returns:

ret_base is False:
- matrix: 2d ndarray.
ret_base is True:
- matrix: 2d ndarray, covariance matrix.
- lags_matrix: lag matrix.

Note:

Lag matrices of different modes have different shapes and uses, and the choice of mode depends on the specific application scenario. The calculation of the covariance matrix is based on the dot product of the lag matrix, so its result reflects the correlation of the input signal at different lags. If the input signal is short, the value of lags may need to be adjusted to avoid generating an overly large lag matrix.
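Since the covariance matrix is described as a dot product of the lag matrix, the 'full' mode case can be sketched directly. The helper below is hypothetical and assumes no mean removal or scaling, which this page does not specify.

```python
import numpy as np

def full_lags_matrix(x, lags):
    """Hypothetical 'full' mode builder with shape (lags + N - 1, lags)."""
    x = np.asarray(x)
    n = x.size
    out = np.zeros((n + lags - 1, lags), dtype=x.dtype)
    for k in range(lags):
        out[k:k + n, k] = x   # column k is x delayed by k samples
    return out

x = np.array([1.0, 2.0, 3.0])
base = full_lags_matrix(x, lags=2)
# Dot product of the lag matrix with itself: a lags x lags matrix.
cov = base.conj().T @ base
```

The diagonal of `cov` holds the signal energy and the off-diagonal entries hold the lag-1 correlation, which is the sense in which the result reflects correlation at different lags.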

pysdkit.utils._function.decimal_scaling_normalization(x: ndarray) → ndarray[source]#

Perform Decimal Scaling normalization on the input signal

Parameters:: x – Input 1D sequence
Returns:: Normalized sequence

pysdkit.utils._function.index_of_orthogonality(signal: ndarray, IMFs: ndarray) → float[source]#

Any pair of IMFs is locally orthogonal. To evaluate EMD performance, an Index of Orthogonality (IO) was proposed: the closer it is to zero, the more effective the decomposition.

Parameters:

signal – the raw input signal.
IMFs – Intrinsic Mode Function after decomposition.

Returns:

the value of index of orthogonality

The Index of Orthogonality is proposed by Uender Barbosa de Souza et al. in "A survey on Hilbert-Huang transform: Evolution, challenges and solutions", Digital Signal Processing 120 (2022) 103292.
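The exact formula used by `index_of_orthogonality` is not given here. One common form sums the absolute cross inner products of all distinct IMF pairs and divides by the signal energy; the sketch below implements that form as an assumption, with `orthogonality_index` as a hypothetical name.

```python
import numpy as np

def orthogonality_index(signal, imfs):
    signal = np.asarray(signal, dtype=float)
    imfs = np.asarray(imfs, dtype=float)   # shape (n_imfs, n_samples)
    gram = imfs @ imfs.T                   # all pairwise inner products
    # Keep only the cross terms (j != k) by removing the diagonal.
    cross = np.abs(gram).sum() - np.abs(np.diag(gram)).sum()
    return cross / np.sum(signal ** 2)

# Two sines with whole numbers of cycles are orthogonal over the window.
t = np.arange(256) / 256
imfs = np.vstack([np.sin(2 * np.pi * 4 * t), np.sin(2 * np.pi * 16 * t)])
io = orthogonality_index(imfs.sum(axis=0), imfs)
```

For these two components the index is zero up to rounding, which is the ideal case described above.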

pysdkit.utils._function.is_1d(x: ndarray) → bool[source]#: Check if the input sequence is one-dimensional

pysdkit.utils._function.is_complex(x: ndarray) → bool[source]#: Check if the input sequence is a complex sequence

pysdkit.utils._function.lags_matrix(x: ndarray, mode: str | None = 'full', lags: int | None = None, dtype: dtype | None = None) → ndarray[source]#

This function generates the lag matrix of a signal, also known as the data matrix or correlation matrix

This type of matrix is very common in signal processing, time series analysis, adaptive filter design, system identification, and other fields. It can generate various types of matrices based on different modes, such as Toeplitz matrices, Hankel matrices, convolution matrices, etc.

Parameters:

x – 1D numpy ndarray signal
mode –
Specifies the type of lag matrix to be generated. The supported modes include:
- mode = ‘full’ :
  lags_matrix is the full Toeplitz convolutional matrix with dimensions [lags+N-1,lags],
  
  out = [ [x,0..0]^T, [0,x,0..0]^T, …, [0,..0,x]^T ], where N is the size of x.
- mode = ‘prew’:
  lags_matrix is the prewindowed matrix with the first N columns of the full matrix, and dimension = [N,lags];
- mode = ‘postw’:
  lags_matrix is the postwindowed matrix with the last N columns of the full matrix, and dimension = [N,lags];
- mode = ‘covar’ or ‘valid’:
  lags_matrix is the trimmed full matrix with the first and last m columns cut off
  
  (out = full[lags:N-lags,:]), with dimension = [N-lags+1,lags];
- mode = ‘same’:
  conv_matrix is the trimmed full matrix with the first and last m columns cut off
  
  (out = full[(lags-1)//2:N+(lags-1)//2,:]), with dimension = [N,lags];
- mode = ‘traj’:
  lags_matrix is the trajectory or so-called caterpillar matrix with dimension = [N,lags];
- mode = ‘hankel’:
  lags_matrix is the Hankel matrix with dimension = [N,N];
- mode = ‘toeplitz’:
  lags_matrix is the symmetric Toeplitz matrix, with dimension = [N,N].
lags – An integer or None, representing the number of columns in the lag matrix (default is N // 2, where N is the length of the input signal).
dtype – The numpy data type used, None means using the data type of the input signal

Returns:

A 2D array representing the generated lag matrix.

The generation method for each mode is mainly based on the arrangement and combination of different lagged versions of the input signal.

By choosing the appropriate mode, matrices suitable for different signal processing and time series analysis tasks can be generated.
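The 'full' mode is called a convolutional matrix because multiplying it by a coefficient vector gives the same result as a full convolution. The sketch below builds the matrix by hand in NumPy rather than calling `lags_matrix`, so the construction is an assumption based on the description above.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
lags = 3
n = x.size

# Column k holds x delayed by k samples, padded with zeros.
full = np.zeros((n + lags - 1, lags))
for k in range(lags):
    full[k:k + n, k] = x

h = np.array([0.5, -1.0, 2.0])      # illustrative filter coefficients
via_matrix = full @ h
via_convolve = np.convolve(x, h)    # full convolution of x and h
```

Both results have length `lags + N - 1`, matching the documented row count of the 'full' matrix.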

pysdkit.utils._function.log_transformation(x: ndarray) → ndarray[source]#

Perform log transformation on the input signal

Parameters: x – Input 1D sequence
Returns: Transformed sequence

pysdkit.utils._function.max_absolute_normalization(x: ndarray) → ndarray[source]#

Perform Max Absolute normalization on the input signal

Parameters: x – Input 1D sequence
Returns: Normalized sequence

pysdkit.utils._function.max_min_normalization(x: ndarray) → ndarray[source]#

Perform min-max normalization on the input signal

Parameters: x – Input 1D sequence
Returns: Normalized sequence

pysdkit.utils._function.to_1d(data: ndarray) → ndarray[source]#

Transform any data to a 1D numpy ndarray

Parameters: data – None, float, int or ndarray of any data type
Returns: the transformed 1D numpy ndarray

pysdkit.utils._function.to_2d(data: ndarray, column: bool | None = False) → ndarray[source]#

Transform any data to a 2D numpy ndarray

Parameters:

data – None, float, int or ndarray of any data type
column – Whether to output a row vector or a column vector. Determines where the new dimension is added.

Returns:

the transformed 2D numpy ndarray

pysdkit.utils._function.z_score_normalization(x: ndarray) → ndarray[source]#

Perform Z-score normalization on the input signal

Parameters: x – Input 1D sequence
Returns: Normalized sequence
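The normalization helpers in this module (`decimal_scaling_normalization`, `max_absolute_normalization`, `max_min_normalization`, `z_score_normalization`) are documented without formulas. The sketch below shows the textbook definitions in NumPy; whether the library uses exactly these forms (for instance the population standard deviation) is an assumption.

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])

z_score = (x - x.mean()) / x.std()              # zero mean, unit variance
min_max = (x - x.min()) / (x.max() - x.min())   # rescaled to [0, 1]
max_abs = x / np.max(np.abs(x))                 # rescaled to [-1, 1]

# Decimal scaling: divide by the smallest power of ten that brings
# every absolute value below 1.
j = int(np.floor(np.log10(np.max(np.abs(x))))) + 1
decimal_scaled = x / 10 ** j
```

Constant inputs make the z-score and min-max denominators zero, so real code would need a guard for that case.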

utils.instantaneous#

Created on 2025/02/02 00:13:28 @author: Whenxuan Wang @email: wwhenxuan@gmail.com This part of the code is awaiting optimization.

pysdkit.utils._instantaneous.find_extrema(signal: ndarray) → Tuple[ndarray, ndarray][source]#: Get the extreme points of the input signal

pysdkit.utils._instantaneous.inst_freq_local(data: ndarray) → Tuple[ndarray, ndarray][source]#: Perform the Hilbert-Huang transform and obtain the spectral distribution

utils.kernel_matrix#

pysdkit.utils._kernel_matrix.euclidian_matrix(X, Y, inner=False, square=True, normalize=False) → ndarray[source]#

Matrix of Euclidean distances, i.e. the pairwise distance matrix.

Parameters:

X – 2d or 1d input ndarray
Y – 2d or 1d input ndarray
inner – inner or outer dimensions
square – if False, the square root is taken
normalize – if true, distance will be normalized as d = d/(std(x)*std(y))

Returns:

2d ndarray, pairwise distance matrix
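A pairwise distance matrix can be computed with NumPy broadcasting. The sketch treats each row as one point and ignores the `inner` and `normalize` options, so it only illustrates the basic quantity; `pairwise_sq_dist` is a hypothetical name.

```python
import numpy as np

def pairwise_sq_dist(X, Y):
    X = np.atleast_2d(np.asarray(X, dtype=float))
    Y = np.atleast_2d(np.asarray(Y, dtype=float))
    # Broadcast to shape (n_x, n_y, n_features) and sum squared differences.
    diff = X[:, None, :] - Y[None, :, :]
    return np.sum(diff ** 2, axis=-1)

X = np.array([[0.0, 0.0], [3.0, 4.0]])
D2 = pairwise_sq_dist(X, X)   # squared distances
D = np.sqrt(D2)               # plain distances, as with square=False above
```

The two points are 5 units apart, so the off-diagonal entries of `D2` are 25 and those of `D` are 5.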

This function is used to generate the kernel matrix of the input signal

It is similar to the calculation of the covariance matrix, but the kernel function is used to measure the similarity between data points. The kernel matrix is very important in fields such as machine learning, signal processing, and time series analysis, especially when dealing with nonlinear relationships.

The kernel matrix calculates the similarity between different lagged versions of the input signal through the kernel function. The choice of kernel function and parameter settings can be adjusted according to the specific application scenario.

Parameters:

x – 1D numpy ndarray signal
mode –
Specifies the type of lag matrix to be generated. The supported modes include:
- mode = ‘full’:
  lags_matrix is the full Toeplitz convolutional matrix with dimensions [lags+N-1,lags],
  
  out = [ [x,0..0]^T, [0,x,0..0]^T, …, [0,..0,x]^T ], where N is the size of x.
- mode = ‘prew’:
  lags_matrix is the prewindowed matrix with the first N columns of the full matrix, and dimension = [N,lags];
- mode = ‘postw’:
  lags_matrix is the postwindowed matrix with the last N columns of the full matrix, and dimension = [N,lags];
- mode = ‘covar’ or ‘valid’:
  lags_matrix is the trimmed full matrix with the first and last m columns cut off
  
  (out = full[lags:N-lags,:]), with dimension = [N-lags+1,lags];
- mode = ‘same’:
  conv_matrix is the trimmed full matrix with the first and last m columns cut off
  
  (out = full[(lags-1)//2:N+(lags-1)//2,:]), with dimension = [N,lags];
- mode = ‘traj’:
  lags_matrix is the trajectory or so-called caterpillar matrix with dimension = [N,lags];
- mode = ‘hankel’:
  lags_matrix is the Hankel matrix with dimension = [N,N];
- mode = ‘toeplitz’:
  lags_matrix is the symmetric Toeplitz matrix, with dimension = [N,N].
kernel – kernel = {exp, rbf, polynomial, sigmoid, linear, euclid, minkowsky, thin_plate, bump, polymorph}
kpar – kernel parameter, depends on the kernel type
lags – number of lags (N//2 by default if None)
return_base – if True, the lags matrix is also returned
normalization – if True, the matrix mean is subtracted

Returns:

ret_base is False:
- kernel matrix: 2d ndarray.
ret_base is True:
- matrix: 2d ndarray, kernel matrix.
- lags_matrix: lags matrix.

Note:

- Selection of kernel function: Different kernel functions are suitable for different application scenarios. For example, the RBF kernel is suitable for processing Gaussian distributed data, while the polynomial kernel is suitable for data with polynomial relationships.
- Normalization: Normalization can reduce the scale difference of the kernel matrix, making it more suitable for subsequent analysis.
- Adjustment of the number of lags: If the input signal is short, you may need to adjust the value of lags to avoid generating an overly large lag matrix.

utils.diagnalization#

pysdkit.utils._diagnalization.diagonal_average(matrix: ndarray, reverse: bool | None = True, samesize: bool | None = False, averaging: bool | None = True) → ndarray[source]#

Perform Hankel averaging (or diagonal averaging) on the input matrix

This function averages or sums the diagonals of the input matrix. Depending on the parameters, the diagonals are extracted in the forward or reverse direction, and either the mean or the sum of each diagonal is computed. The function returns a one-dimensional array holding the result.

Parameters:

matrix (2D_ndarray) – The input matrix to be averaged.
reverse (bool) – If True, diagonals are taken in reverse order (from the bottom-right to the top-left).
samesize (bool) – If True, only diagonals from the main to the leftmost are taken.
averaging (bool) – If True, the mean value of each diagonal is taken; otherwise, the sum is taken.

Returns:

1D ndarray, The resulting vector after diagonal averaging.

Note:

If samesize = False:
- If reverse = False, diagonals from the bottom-left to the top-right are taken.
- If reverse = True, diagonals from the bottom-right to the top-left are taken.
If samesize = True:
- If reverse = False, diagonals from the bottom-left to the main diagonal are taken.
- If reverse = True, diagonals from the bottom-right to the main diagonal are taken.
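Hankel averaging is most often used to turn a trajectory matrix back into a series by averaging its anti-diagonals. The sketch below shows that operation in NumPy; how it maps onto the `reverse` and `samesize` flags of `diagonal_average` is not asserted, and `antidiagonal_average` is a hypothetical name.

```python
import numpy as np

def antidiagonal_average(matrix):
    matrix = np.asarray(matrix, dtype=float)
    rows, cols = matrix.shape
    flipped = matrix[:, ::-1]  # anti-diagonals become ordinary diagonals
    # Offsets run from the top-left element to the bottom-right element.
    return np.array([
        flipped.diagonal(offset).mean()
        for offset in range(cols - 1, -rows, -1)
    ])

# Hankel (trajectory) matrix of the series 1..5 with window length 3.
series = np.arange(1.0, 6.0)
hankel = np.array([series[i:i + 3] for i in range(3)])
recovered = antidiagonal_average(hankel)
```

Because every anti-diagonal of a true Hankel matrix is constant, the averaging returns the original series; for a perturbed matrix it returns the closest series in the least-squares sense.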

pysdkit.utils._diagnalization.get_diagonal(matrix: ndarray, idx: int, reverse: bool | None = False) → ndarray[source]#

Extract the specified diagonal from a matrix

This function extracts the diagonal with the specified index from the input matrix. The index is counted from the main diagonal (index 0): a positive index selects a diagonal to the right of the main diagonal, and a negative index selects one to the left. The function also supports reverse extraction, that is, starting from the lower right corner of the matrix.

Parameters:

matrix (2D_ndarray) – The input matrix from which the diagonal is extracted.
idx (int) – The index of the diagonal relative to the main diagonal (zero diagonal). Positive indices are to the right of the main diagonal, and negative indices are to the left.
reverse (bool) – If True, extract the diagonal in reverse order (from the bottom-right to the top-left).

Returns:

1D ndarray, The extracted diagonal elements.

Notes:

If reverse = False:
- idx = 0: main diagonal
- idx > 0: diagonals to the right of the main diagonal
- idx < 0: diagonals to the left of the main diagonal
If reverse = True:
- idx = 0: main backward diagonal
- idx > 0: diagonals to the right of the main backward diagonal
- idx < 0: diagonals to the left of the main backward diagonal

Example:

>>> # Assumption: scipy.linalg.toeplitz stands in for the undefined `signals.matrix.toeplitz`.
>>> from scipy.linalg import toeplitz
>>> from pysdkit.utils._diagnalization import get_diagonal
>>> a = [1, 2, 3, 4, 5]
>>> b = toeplitz(a)[:3, :]
>>> print(b)
>>> print(get_diagonal(b, 0))   # zero diagonal
>>> print(get_diagonal(b, -2))  # 2 diagonals to the left
>>> print(get_diagonal(b, 3))   # 3 diagonals to the right
>>> print(get_diagonal(b, 0, reverse=True))   # zero backward diagonal
>>> print(get_diagonal(b, -1, reverse=True))  # 1 right backward diagonal
>>> print(get_diagonal(b, 1, reverse=True))   # 1 left backward diagonal
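For comparison, NumPy's own diagonal extraction uses a positive offset for diagonals above (to the right of) the main diagonal. The sketch below uses only NumPy on the same 3 x 5 matrix; whether `get_diagonal` follows the same sign convention, especially with `reverse=True`, is an assumption.

```python
import numpy as np

# First three rows of the symmetric Toeplitz matrix of [1, 2, 3, 4, 5].
b = np.array([[1, 2, 3, 4, 5],
              [2, 1, 2, 3, 4],
              [3, 2, 1, 2, 3]])

main = b.diagonal(0)       # main diagonal: [1, 1, 1]
right_2 = b.diagonal(2)    # two steps to the right of the main diagonal
left_1 = b.diagonal(-1)    # one step to the left of (below) the main diagonal

# Flipping the columns turns the anti-diagonal that starts at the
# top-right corner into an ordinary main diagonal.
backward = np.fliplr(b).diagonal(0)
```

On a non-square matrix the diagonals have different lengths, which is why `left_1` has only two elements here.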