pysdkit.utils#
Created on 2025/02/06 10:30:01 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
utils.fft#
utils.mirror#
Created on 2024/5/18 22:15 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
- pysdkit.utils._mirror.fmirror(ts: ndarray, sym: int) ndarray[source]#
Implements a signal mirroring expansion function.
This function mirrors ‘sym’ elements at both the beginning and the end of the given array ‘ts’, to create a new extended array.
- Parameters:
ts – The one-dimensional numpy array to be mirrored.
sym – The number of elements to mirror from both the start and the end of the array ‘ts’. This value must be less than or equal to half the length of the array.
- Returns:
The array after mirror expansion, which will have a length equal to the original array length plus twice the ‘sym’.
- Examples:
>>> import numpy as np
>>> from pysdkit.utils._mirror import fmirror
>>> array = np.array([1, 2, 3, 4, 5])
>>> fmirror(array, 2)
array([2, 1, 1, 2, 3, 4, 5, 5, 4])
- Note:
If ‘sym’ exceeds half the length of the array, the function may not work as expected, so it’s recommended to check the value of ‘sym’ beforehand.
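The mirroring rule described above can be sketched in plain NumPy. This is a minimal illustration, not the library's implementation; `mirror_extend` is a hypothetical name, and the explicit range check on `sym` is an assumption added for the sketch.

```python
import numpy as np


def mirror_extend(ts: np.ndarray, sym: int) -> np.ndarray:
    """Hypothetical helper: reflect `sym` samples at each end of `ts`."""
    ts = np.asarray(ts)
    # Assumed guard, mirroring the documented constraint on `sym`.
    if not 0 < sym <= ts.size // 2:
        raise ValueError("sym must be between 1 and len(ts) // 2")
    head = ts[:sym][::-1]   # first `sym` samples, reversed
    tail = ts[-sym:][::-1]  # last `sym` samples, reversed
    return np.concatenate([head, ts, tail])


extended = mirror_extend(np.array([1, 2, 3, 4, 5]), 2)
```

The result has `len(ts) + 2 * sym` elements, matching the documented return length.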
utils.hilbert#
Created on Sat Mar 7 12:09:42 2024 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
- pysdkit.utils._hilbert.hilbert_imaginary(signal: ndarray) ndarray[source]#
Get the imaginary part of the Hilbert transformed signal
- pysdkit.utils._hilbert.hilbert_real(signal: ndarray) ndarray[source]#
Get the real part of the Hilbert transformed signal
- pysdkit.utils._hilbert.hilbert_spectrum(imfs_env: ndarray, imfs_freq: ndarray, fs: int, freq_lim: tuple[float, float] | None = None, freq_res: float | None = None, time_range: tuple[float, float] | None = None, time_scale: int | None = 1) Tuple[ndarray, ndarray, ndarray][source]#
Compute the Hilbert spectrum H(t, f) using numpy.
- Parameters:
imfs_env – The envelope functions of all IMFs.
imfs_freq – The instantaneous frequency functions.
fs – Sampling frequency in Hz.
freq_lim – Frequency range (min, max). Defaults to (0, fs/2).
freq_res – Frequency resolution. Defaults to (freq_max - freq_min)/200.
time_range – Time range (start, end) in seconds.
time_scale – Temporal scaling factor (Default: 1).
- Returns:
(spectrum, time_axis, freq_axis), where:
- spectrum: ndarray of shape (…, time_bins, freq_bins), the Hilbert spectrum matrix.
- time_axis: 1D ndarray, the time axis.
- freq_axis: 1D ndarray, the frequency axis.
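As a rough illustration of how a Hilbert spectrum can be assembled, the sketch below accumulates each IMF's envelope into the frequency bin nearest to its instantaneous frequency. The function name `hilbert_spectrum_sketch`, the `n_freq_bins` parameter, and the nearest-bin rule are assumptions made for this example; the library's binning, `time_range`, and `time_scale` handling may differ.

```python
import numpy as np


def hilbert_spectrum_sketch(env, freq, fs, n_freq_bins=200):
    """Hypothetical sketch: bin envelopes onto a (time, frequency) grid."""
    env = np.atleast_2d(np.asarray(env, dtype=float))    # (n_imfs, n_time)
    freq = np.atleast_2d(np.asarray(freq, dtype=float))  # (n_imfs, n_time)
    n_time = env.shape[1]

    freq_axis = np.linspace(0.0, fs / 2, n_freq_bins)
    time_axis = np.arange(n_time) / fs

    # Nearest frequency bin per sample; out-of-range values are clipped (assumption).
    bin_idx = np.round(freq / (fs / 2) * (n_freq_bins - 1)).astype(int)
    bin_idx = np.clip(bin_idx, 0, n_freq_bins - 1)
    t_idx = np.broadcast_to(np.arange(n_time), env.shape)

    spectrum = np.zeros((n_time, n_freq_bins))
    np.add.at(spectrum, (t_idx, bin_idx), env)  # sum contributions of all IMFs
    return spectrum, time_axis, freq_axis


env = np.ones((1, 4))            # one IMF with unit envelope
freq = np.full((1, 4), 25.0)     # constant 25 Hz instantaneous frequency
spec, t_axis, f_axis = hilbert_spectrum_sketch(env, freq, fs=100, n_freq_bins=5)
```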
- pysdkit.utils._hilbert.hilbert_transform(signal: ndarray) ndarray[source]#
Apply the Hilbert transform to a given numpy signal.
- Parameters:
signal – NumPy array containing the input signal.
- Returns:
A NumPy array containing the analytic signal obtained from the Hilbert transform.
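For readers unfamiliar with the analytic signal, here is a minimal FFT-based sketch of how it is commonly computed: negative frequencies are zeroed and positive frequencies doubled. `analytic_signal` is a hypothetical name, and whether the library uses this exact route is not confirmed by the documentation. The real part of the result is the original signal and the imaginary part is its Hilbert transform.

```python
import numpy as np


def analytic_signal(x):
    """Hypothetical sketch: analytic signal via a one-sided spectrum."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)

    # Weights: keep DC (and Nyquist for even n), double positive frequencies,
    # zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


t = np.arange(64) / 64
x = np.cos(2 * np.pi * 4 * t)      # four full cycles in the window
z = analytic_signal(x)
envelope = np.abs(z)               # instantaneous amplitude
```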
- pysdkit.utils._hilbert.plot_hilbert(signal: ndarray, analytical_signal: ndarray | None = None, return_figure: bool = False) figure | None[source]#
Plot the Hilbert transform of a signal
- Parameters:
signal – Original NumPy signal.
analytical_signal – A NumPy array containing the analytic signal obtained from the Hilbert transform.
return_figure – Whether to return the figure.
- Returns:
The plot figure or None.
- pysdkit.utils._hilbert.plot_hilbert_complex_plane(analytical_signal: ndarray, return_figure: bool = False) figure[source]#
Plot the Hilbert transform of a signal on the complex plane.
- Parameters:
analytical_signal – NumPy array containing the analytic signal of the original signal.
return_figure – Whether to return the figure.
- Returns:
The plot figure or None.
utils.process#
Created on Sat Mar 5 21:57:53 2024 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
The following code is mainly used to find extreme points in the EMD algorithm
Code taken from laszukdawid/PyEMD
- pysdkit.utils._process.common_dtype(x: ndarray, y: ndarray) Tuple[ndarray, ndarray][source]#
Casts inputs (x, y) into a common numpy DTYPE.
- Parameters:
x – Input 1D Signal 1 - Numpy Array
y – Input 1D Signal 2 - Numpy Array
- Returns:
The two input arrays, cast to a common dtype (NumPy arrays).
- pysdkit.utils._process.find_zero_crossings(signal: ndarray) ndarray[source]#
Detects zero crossings in a given signal. A zero crossing occurs when two consecutive signal points have opposite signs, indicating a transition from positive to negative values or vice versa. This function also considers signal points that are exactly zero as zero crossings.
- Parameters:
signal – The input signal as a NumPy array.
- Returns:
An array of indices where zero crossings occur.
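The detection rule described above can be sketched as follows. The name `zero_crossings` is hypothetical, and reporting the left-hand index of each sign change is an assumption of this sketch; the library may report a different index.

```python
import numpy as np


def zero_crossings(signal):
    """Hypothetical sketch: indices of sign changes and exact zeros."""
    s = np.asarray(signal)
    # A negative product of neighbours means the sign flipped between them.
    sign_change = np.flatnonzero(s[:-1] * s[1:] < 0)
    # Samples that are exactly zero also count as crossings.
    exact_zero = np.flatnonzero(s == 0)
    return np.unique(np.concatenate([sign_change, exact_zero]))


crossings = zero_crossings(np.array([1, -1, 0, 2, -3]))
```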
- pysdkit.utils._process.get_timeline(range_max: int, dtype: dtype | None = None) ndarray[source]#
Generates a numeric sequence representing a timeline for a signal. This sequence can be specified with a data type to ensure adequate representation of the data range.
- Parameters:
range_max – The largest value in the range, equivalent to range(range_max), typically representing the length of the signal.
dtype – The minimum definition type. The returned timeline will have a dtype that is the same or with a higher byte size.
- Returns:
The timeline array.
- pysdkit.utils._process.normalize_signal(t: ndarray) ndarray[source]#
Normalize time array so that it doesn’t explode on tiny values.
Returned array starts with 0 and the smallest increase is by 1.
- Parameters:
t – Input 1D Signal - Numpy Array
- Returns:
The normalized 1D signal (NumPy array).
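One way to obtain an array that "starts with 0 and whose smallest increase is 1" is to shift by the first sample and divide by the smallest step. The sketch below assumes a strictly increasing time array; `normalize_time` is a hypothetical name and not necessarily how the library does it.

```python
import numpy as np


def normalize_time(t):
    """Hypothetical sketch: shift to zero and rescale by the smallest step."""
    t = np.asarray(t, dtype=float)
    steps = np.diff(t)                    # assumed strictly positive
    return (t - t[0]) / np.min(steps)


normalized = normalize_time([0.001, 0.002, 0.004])
```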
- pysdkit.utils._process.not_duplicate(ts: ndarray) ndarray[source]#
Returns indices for not repeating values, where there is no extremum.
This feature is particularly important for extreme value detection and data simplification in signal processing, and can help avoid double calculations of consecutive repeated values in extreme value detection and other analyses. For example, when determining which points should be used to calculate the envelope in the EMD algorithm, continuously repeated data points can be excluded, thereby improving calculation efficiency and accuracy.
- Parameters:
ts – Input 1D signal (NumPy array).
- Returns:
Indices of the non-repeating values in the array.
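One plausible reading of this rule is that a sample is dropped when it equals both of its neighbours, so the interior of a flat plateau is skipped while its edges and the array end points are kept. The sketch below implements that reading; `non_repeating_indices` is a hypothetical name and the exact rule used by the library is an assumption.

```python
import numpy as np


def non_repeating_indices(ts):
    """Hypothetical sketch: drop samples equal to both neighbours."""
    ts = np.asarray(ts)
    keep = np.ones(ts.size, dtype=bool)   # end points are always kept
    same_as_left = ts[1:-1] == ts[:-2]
    same_as_right = ts[1:-1] == ts[2:]
    keep[1:-1] = ~(same_as_left & same_as_right)
    return np.flatnonzero(keep)


idx = non_repeating_indices(np.array([1, 1, 1, 2, 3, 3, 3, 3]))
```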
utils.differ#
Created on Sat Mar 18 22:05:02 2024 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
- pysdkit.utils._differ.differ(y: ~numpy.ndarray, delta: float, dtype: ~numpy.dtype = <class 'numpy.float64'>) ndarray[source]#
Compute the derivative of a discrete time series y.
- Parameters:
y – The input time series.
delta – The sampling time interval of y.
dtype – The data type of numpy array
- Returns:
numpy.ndarray: The derivative of the time series.
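For comparison, a discrete derivative with a uniform sampling interval can be computed with `numpy.gradient`, which uses central differences in the interior and one-sided differences at the edges. Which difference scheme `differ` itself uses is not stated here, so treat this only as a reference point.

```python
import numpy as np

delta = 0.5                          # sampling interval
t = np.arange(5) * delta
y = t ** 2                           # exact derivative is 2 * t

# Central differences inside, one-sided differences at both ends.
dy = np.gradient(y.astype(np.float64), delta)
```

For a quadratic signal the interior values of `dy` equal `2 * t` exactly; only the two edge values deviate.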
utils.smooth1d#
Created on 2024/6/3 15:31 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
- pysdkit.utils._smooth1d.exponential_smoothing(signal: ndarray, alpha: float = 0.4) ndarray[source]#
Exponential Smoothing (Single Exponential Smoothing)
- Parameters:
signal – Input signal (numpy array)
alpha – Smoothing factor, range from 0 to 1 (default is 0.4)
- Returns:
Smoothed signal (numpy array)
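Single exponential smoothing follows the recursion s[0] = x[0], s[t] = alpha * x[t] + (1 - alpha) * s[t-1]. A minimal sketch, assuming the first sample is used as the initial state (the library's initialization is not documented here); `exp_smooth` is a hypothetical name.

```python
import numpy as np


def exp_smooth(signal, alpha=0.4):
    """Hypothetical sketch of single exponential smoothing."""
    x = np.asarray(signal, dtype=float)
    out = np.empty_like(x)
    out[0] = x[0]                                  # assumed initial state
    for i in range(1, x.size):
        out[i] = alpha * x[i] + (1.0 - alpha) * out[i - 1]
    return out


smoothed = exp_smooth([1.0, 2.0, 3.0], alpha=0.5)
```

A larger `alpha` tracks the input more closely; a smaller one smooths more strongly.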
- pysdkit.utils._smooth1d.gaussian_smoothing(signal: ndarray, sigma: int = 2) ndarray[source]#
Gaussian Filtering Smoothing
- Parameters:
signal – Input signal (numpy array)
sigma – Standard deviation for Gaussian kernel (default is 2)
- Returns:
Smoothed signal (numpy array)
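Gaussian smoothing amounts to convolving the signal with a normalized Gaussian kernel. The NumPy-only sketch below truncates the kernel at four standard deviations and pads by reflection; both choices, and the name `gaussian_smooth`, are assumptions of this example rather than documented library behavior.

```python
import numpy as np


def gaussian_smooth(signal, sigma=2.0, truncate=4.0):
    """Hypothetical sketch: convolution with a truncated Gaussian kernel."""
    x = np.asarray(signal, dtype=float)
    radius = int(truncate * sigma + 0.5)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()                         # unit gain
    padded = np.pad(x, radius, mode="reflect")     # assumed boundary handling
    return np.convolve(padded, kernel, mode="valid")


flat = gaussian_smooth(np.ones(50), sigma=2)
```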
- pysdkit.utils._smooth1d.savgol_smoothing(signal: ndarray, window_length: int = 11, poly_order: int = 2) ndarray[source]#
Savitzky-Golay Filtering Smoothing
- Parameters:
signal – Input signal (numpy array)
window_length – Length of the filter window (default is 11, must be odd)
poly_order – Order of the polynomial used to fit the samples (default is 2)
- Returns:
Smoothed signal (numpy array)
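The idea behind Savitzky-Golay filtering is to fit a low-order polynomial to each window by least squares and keep the fitted value at the window centre. The sketch below derives the filter weights from a Vandermonde matrix; `savgol_sketch` and the edge-padding strategy are assumptions for illustration only.

```python
import numpy as np


def savgol_sketch(signal, window_length=11, poly_order=2):
    """Hypothetical sketch of Savitzky-Golay smoothing."""
    x = np.asarray(signal, dtype=float)
    if window_length % 2 == 0 or poly_order >= window_length:
        raise ValueError("window_length must be odd and exceed poly_order")
    half = window_length // 2
    offsets = np.arange(-half, half + 1)
    # Least-squares polynomial fit: the first row of the pseudo-inverse gives
    # the weights that produce the fitted value at offset 0.
    design = np.vander(offsets, poly_order + 1, increasing=True)
    weights = np.linalg.pinv(design)[0]
    padded = np.pad(x, half, mode="edge")          # assumed boundary handling
    return np.correlate(padded, weights, mode="valid")


t = np.arange(30.0)
quadratic = 0.5 * t ** 2 - 3 * t + 1
filtered = savgol_sketch(quadratic, window_length=11, poly_order=2)
```

Away from the boundaries, a polynomial of degree at most `poly_order` passes through the filter unchanged.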
utils.function#
Created on 2024/6/3 15:31 @author: Whenxuan Wang @email: wwhenxuan@gmail.com
This module provides a collection of general-purpose helper functions.
- pysdkit.utils._function.covariance_matrix(x: ndarray, mode: str | None = 'full', lags: int | None = None, ret_base: bool | None = False, dtype: dtype | None = None) tuple[ndarray, ndarray] | ndarray[source]#
This function calculates the covariance matrix of the input signal’s lag matrix. It generates a specific lag matrix based on the input signal x and the specified mode, and then calculates the covariance matrix of that matrix.
The covariance matrix is very important in signal processing, time series analysis, statistical modeling, and other fields, as it can describe the correlation of the signal at different lags.
- Parameters:
x – the input signal of 1d ndarray
mode –
Specifies the type of lag matrix to be generated. The supported modes include:
- mode = ‘full’:
lags_matrix is the full Toeplitz convolution matrix with dimensions [lags+N-1,lags],
out = [ [x,0..0]^T,[0,x,0..0]^T,…,[0,..0,x]^T ], where N is the size of x.
- mode = ‘prew’:
lags_matrix is the prewindowed matrix formed by the first N rows of the full matrix, with dimension = [N,lags];
- mode = ‘postw’:
lags_matrix is the postwindowed matrix formed by the last N rows of the full matrix, with dimension = [N,lags];
- mode = ‘covar’ or ‘valid’:
lags_matrix is the full matrix with its leading and trailing rows trimmed
(out = full[lags:N-lags,:]), with dimension = [N-lags+1,lags];
- mode = ‘same’:
lags_matrix is the full matrix with its leading and trailing rows trimmed
(out = full[(lags-1)//2:N+(lags-1)//2,:]), with dimension = [N,lags];
- mode = ‘traj’:
lags_matrix is the trajectory or so-called caterpillar matrix with dimension = [N,lags];
- mode = ‘hankel’:
lags_matrix is the Hankel matrix with dimension = [N,N];
- mode = ‘toeplitz’:
lags_matrix is the symmetric Toeplitz matrix, with dimension = [N,N].
lags – An integer or None, representing the number of columns in the lag matrix (default is N // 2, where N is the length of the input signal).
ret_base – if true, then the lag matrix will also be returned
dtype – The numpy data type used, None means using the data type of the input signal
- Returns:
- ret_base is False:
matrix: 2d ndarray.
- ret_base is True:
matrix: 2d ndarray, covariance matrix.
lags_matrix: lag matrix.
- Note:
Lag matrices of different modes have different shapes and uses, and the choice of mode depends on the specific application scenario. The calculation of the covariance matrix is based on the dot product of the lag matrix, so its result reflects the correlation of the input signal at different lags. If the input signal is short, the value of lags may need to be adjusted to avoid generating an overly large lag matrix.
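The note above says the covariance matrix is based on the dot product of the lag matrix. A minimal sketch for the ‘full’ mode follows; the helper names are hypothetical, and whether the library applies any scaling (for example dividing by the number of rows) is not stated, so none is applied here.

```python
import numpy as np


def full_lag_matrix(x, lags):
    """Hypothetical helper: column j holds x delayed by j samples, zero padded."""
    x = np.asarray(x, dtype=float)
    n = x.size
    out = np.zeros((n + lags - 1, lags))
    for j in range(lags):
        out[j:j + n, j] = x
    return out


def lag_covariance(x, lags):
    """Hypothetical sketch: Gram matrix of the lag matrix (no scaling)."""
    base = full_lag_matrix(x, lags)
    return base.conj().T @ base, base


cov, base = lag_covariance([1.0, 2.0, 3.0], lags=2)
```

Entry (i, j) of `cov` is the inner product of the signal delayed by i samples with the signal delayed by j samples, which is why the result describes correlation across lags.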
- pysdkit.utils._function.decimal_scaling_normalization(x: ndarray) ndarray[source]#
Perform Decimal Scaling normalization on the input signal
- Parameters:
x – Input 1D sequence
- Returns:
Normalized sequence
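Decimal scaling divides the signal by the smallest power of ten that brings every absolute value below 1. A minimal sketch, assuming that convention; `decimal_scaling` is a hypothetical name and the handling of an all-zero input is a choice made for this example.

```python
import numpy as np


def decimal_scaling(x):
    """Hypothetical sketch: divide by 10**j so that max(|x|) < 1."""
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x))
    if peak == 0:
        return x.copy()                    # nothing to scale (assumed behavior)
    digits = int(np.floor(np.log10(peak))) + 1
    return x / 10.0 ** digits


scaled = decimal_scaling([-250.0, 40.0, 999.0])
```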
- pysdkit.utils._function.index_of_orthogonality(signal: ndarray, IMFs: ndarray) float[source]#
Ideally, any pair of IMFs is locally orthogonal. The Index of Orthogonality (IO) was proposed to evaluate EMD performance: the closer it is to zero, the more effective the decomposition.
- Parameters:
signal – the raw input signal.
IMFs – the Intrinsic Mode Functions obtained from the decomposition.
- Returns:
the value of the index of orthogonality
The Index of Orthogonality is described by Uender Barbosa de Souza et al. in "A survey on Hilbert-Huang transform: Evolution, challenges and solutions", Digital Signal Processing 120 (2022) 103292.
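A commonly used form of the index sums the cross products of all distinct IMF pairs over time and divides by the signal energy. The sketch below follows that form; the exact normalization used by the library is an assumption, and `orthogonality_index` is a hypothetical name.

```python
import numpy as np


def orthogonality_index(signal, imfs):
    """Hypothetical sketch: cross-term energy relative to signal energy."""
    signal = np.asarray(signal, dtype=float)
    imfs = np.asarray(imfs, dtype=float)           # shape (n_imfs, n_samples)
    total = imfs.sum(axis=0)
    # sum_t (sum_i c_i)^2 - sum_t sum_i c_i^2 equals the sum over all i != j
    # of sum_t c_i(t) * c_j(t).
    cross = np.sum(total ** 2) - np.sum(imfs ** 2)
    return cross / np.sum(signal ** 2)


t = np.arange(200) / 200
components = np.stack([np.sin(2 * np.pi * 3 * t), np.cos(2 * np.pi * 7 * t)])
io_orthogonal = orthogonality_index(components.sum(axis=0), components)

duplicated = np.stack([components[0], components[0]])
io_overlapping = orthogonality_index(duplicated.sum(axis=0), duplicated)
```

Two orthogonal sinusoids give a value near zero, while two identical components give 0.5.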
- pysdkit.utils._function.is_1d(x: ndarray) bool[source]#
Check if the input sequence is one-dimensional
- pysdkit.utils._function.is_complex(x: ndarray) bool[source]#
Check if the input sequence is a complex sequence
- pysdkit.utils._function.lags_matrix(x: ndarray, mode: str | None = 'full', lags: int | None = None, dtype: dtype | None = None) ndarray[source]#
This function generates the lag matrix of a signal, also known as the data matrix or correlation matrix.
This type of matrix is very common in signal processing, time series analysis, adaptive filter design, system identification, and other fields. It can generate various types of matrices based on different modes, such as Toeplitz matrices, Hankel matrices, convolution matrices, etc.
- Parameters:
x – 1D numpy ndarray signal
mode –
Specifies the type of lag matrix to be generated. The supported modes include:
- mode = ‘full’:
lags_matrix is the full Toeplitz convolution matrix with dimensions [lags+N-1,lags],
out = [ [x,0..0]^T,[0,x,0..0]^T,…,[0,..0,x]^T ], where N is the size of x.
- mode = ‘prew’:
lags_matrix is the prewindowed matrix formed by the first N rows of the full matrix, with dimension = [N,lags];
- mode = ‘postw’:
lags_matrix is the postwindowed matrix formed by the last N rows of the full matrix, with dimension = [N,lags];
- mode = ‘covar’ or ‘valid’:
lags_matrix is the full matrix with its leading and trailing rows trimmed
(out = full[lags:N-lags,:]), with dimension = [N-lags+1,lags];
- mode = ‘same’:
lags_matrix is the full matrix with its leading and trailing rows trimmed
(out = full[(lags-1)//2:N+(lags-1)//2,:]), with dimension = [N,lags];
- mode = ‘traj’:
lags_matrix is the trajectory or so-called caterpillar matrix with dimension = [N,lags];
- mode = ‘hankel’:
lags_matrix is the Hankel matrix with dimension = [N,N];
- mode = ‘toeplitz’:
lags_matrix is the symmetric Toeplitz matrix, with dimension = [N,N].
lags – An integer or None, representing the number of columns in the lag matrix (default is N // 2, where N is the length of the input signal).
dtype – The numpy data type used, None means using the data type of the input signal
- Returns:
A 2D array representing the generated lag matrix.
The generation method for each mode is mainly based on the arrangement and combination of different lagged versions of the input signal.
By choosing the appropriate mode, matrices suitable for different signal processing and time series analysis tasks can be generated.
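To make the ‘full’ mode concrete, the sketch below builds the [lags+N-1, lags] Toeplitz convolution matrix with index arithmetic and checks that multiplying it by a filter of length `lags` reproduces `numpy.convolve`. The function name `full_lags_matrix` is hypothetical; the prewindowed slice follows the documented [N, lags] dimension.

```python
import numpy as np


def full_lags_matrix(x, lags):
    """Hypothetical sketch of the 'full' Toeplitz convolution matrix."""
    x = np.asarray(x, dtype=float)
    n = x.size
    rows = np.arange(n + lags - 1)[:, None]
    cols = np.arange(lags)[None, :]
    idx = rows - cols                          # entry (i, j) holds x[i - j]
    valid = (idx >= 0) & (idx < n)
    return np.where(valid, x[np.clip(idx, 0, n - 1)], 0.0)


x = np.array([1.0, 2.0, 3.0])
h = np.array([1.0, -1.0])
full = full_lags_matrix(x, lags=h.size)
prewindowed = full[:x.size]                    # first N rows of the full matrix
filtered = full @ h                            # same as np.convolve(x, h)
```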
- pysdkit.utils._function.log_transformation(x: ndarray) ndarray[source]#
Perform log transformation on the input signal
- Parameters:
x – Input 1D sequence
- Returns:
Transformed sequence
- pysdkit.utils._function.max_absolute_normalization(x: ndarray) ndarray[source]#
Perform Max Absolute normalization on the input signal
- Parameters:
x – Input 1D sequence
- Returns:
Normalized sequence
- pysdkit.utils._function.max_min_normalization(x: ndarray) ndarray[source]#
Perform min-max normalization on the input signal
- Parameters:
x – Input 1D sequence
- Returns:
Normalized sequence
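The max-absolute and min-max normalizations follow the usual formulas x / max(|x|) and (x - min) / (max - min). A minimal sketch, with hypothetical names and an assumed fallback for constant or all-zero input:

```python
import numpy as np


def max_abs(x):
    """Hypothetical sketch: scale into [-1, 1] by the largest magnitude."""
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x))
    return x.copy() if peak == 0 else x / peak


def min_max(x):
    """Hypothetical sketch: scale into [0, 1]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    # A constant signal has no range; returning zeros is an assumed convention.
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span


scaled_abs = max_abs([-4.0, 2.0])
scaled_range = min_max([2.0, 4.0, 6.0])
```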
- pysdkit.utils._function.to_1d(data: ndarray) ndarray[source]#
Transform any data to a 1D numpy ndarray
- Parameters:
data – None, float, int or ndarray of any data type
- Returns:
the transformed 1D numpy ndarray
- pysdkit.utils._function.to_2d(data: ndarray, column: bool | None = False) ndarray[source]#
Transform any data to a 2D numpy ndarray
- Parameters:
data – None, float, int or ndarray of any data type
column – Whether to output a row vector or a column vector. Determines where the new dimension is added.
- Returns:
the transformed 2D numpy ndarray
utils.instantaneous#
Created on 2025/02/02 00:13:28 @author: Whenxuan Wang @email: wwhenxuan@gmail.com This part of the code is awaiting optimization.
utils.kernel_matrix#
- pysdkit.utils._kernel_matrix.euclidian_matrix(X, Y, inner=False, square=True, normalize=False) ndarray[source]#
Matrix of Euclidean distances, i.e. the pairwise distance matrix.
- Parameters:
X – 2d or 1d input ndarray
Y – 2d or 1d input ndarray
inner – inner or outer dimensions
square – if False, the square root is taken
normalize – if True, the distance is normalized as d = d/(std(x)*std(y))
- Returns:
2d ndarray, pairwise distance matrix
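A pairwise distance matrix can be computed with broadcasting. In this sketch rows are treated as points, `square=True` returns squared distances, and `square=False` takes the square root, matching the parameter description above; `pairwise_distances` is a hypothetical name and the `inner` and `normalize` options are not reproduced.

```python
import numpy as np


def pairwise_distances(X, Y, square=True):
    """Hypothetical sketch: distances between rows of X and rows of Y."""
    X = np.atleast_2d(np.asarray(X, dtype=float))   # (n, d)
    Y = np.atleast_2d(np.asarray(Y, dtype=float))   # (m, d)
    diff = X[:, None, :] - Y[None, :, :]            # (n, m, d) via broadcasting
    squared = np.sum(diff ** 2, axis=-1)
    return squared if square else np.sqrt(squared)


points = np.array([[0.0, 0.0], [3.0, 4.0]])
origin = np.array([[0.0, 0.0]])
d_squared = pairwise_distances(points, origin)
d_plain = pairwise_distances(points, origin, square=False)
```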
- pysdkit.utils._kernel_matrix.kernel_matrix(x: ndarray, mode: str | None = 'full', kernel: str | None = 'linear', kpar: int = 1, lags: int | None = None, return_base: bool | None = False, normalization: bool | None = True) ndarray | Tuple[ndarray, ndarray][source]#
This function is used to generate the kernel matrix of the input signal
It is similar to the calculation of the covariance matrix, but the kernel function is used to measure the similarity between data points. The kernel matrix is very important in fields such as machine learning, signal processing, and time series analysis, especially when dealing with nonlinear relationships.
The kernel matrix calculates the similarity between different lagged versions of the input signal through the kernel function. The choice of kernel function and parameter settings can be adjusted according to the specific application scenario.
- Parameters:
x – 1D numpy ndarray signal
mode –
Specifies the type of lag matrix to be generated. The supported modes include:
- mode = ‘full’:
lags_matrix is the full Toeplitz convolution matrix with dimensions [lags+N-1,lags],
out = [ [x,0..0]^T,[0,x,0..0]^T,…,[0,..0,x]^T ], where N is the size of x.
- mode = ‘prew’:
lags_matrix is the prewindowed matrix formed by the first N rows of the full matrix, with dimension = [N,lags];
- mode = ‘postw’:
lags_matrix is the postwindowed matrix formed by the last N rows of the full matrix, with dimension = [N,lags];
- mode = ‘covar’ or ‘valid’:
lags_matrix is the full matrix with its leading and trailing rows trimmed
(out = full[lags:N-lags,:]), with dimension = [N-lags+1,lags];
- mode = ‘same’:
lags_matrix is the full matrix with its leading and trailing rows trimmed
(out = full[(lags-1)//2:N+(lags-1)//2,:]), with dimension = [N,lags];
- mode = ‘traj’:
lags_matrix is the trajectory or so-called caterpillar matrix with dimension = [N,lags];
- mode = ‘hankel’:
lags_matrix is the Hankel matrix with dimension = [N,N];
- mode = ‘toeplitz’:
lags_matrix is the symmetric Toeplitz matrix, with dimension = [N,N].
kernel – kernel = {exp, rbf, polynomial, sigmoid, linear, euclid, minkowsky, thin_plate, bump, polymorph}
kpar – kernel parameter, depends on the kernel type
lags – number of lags (defaults to N//2 if None)
return_base – if True, the lags matrix is also returned
normalization – if True, the matrix mean is subtracted
- Returns:
- return_base is False:
kernel matrix: 2d ndarray.
- return_base is True:
matrix: 2d ndarray, kernel matrix.
lags_matrix: lags matrix.
- Note:
Selection of kernel function: Different kernel functions are suitable for different application scenarios. For example, the RBF kernel is suitable for processing Gaussian distributed data, while the polynomial kernel is suitable for data with polynomial relationships.
Normalization: Normalization can reduce the scale difference of the kernel matrix, making it more suitable for subsequent analysis.
Adjustment of the number of lags: If the input signal is short, you may need to adjust the value of lags to avoid generating an overly large lag matrix.
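To illustrate the idea of a kernel matrix over lagged copies of a signal, the sketch below applies an RBF kernel to the columns of a trajectory matrix and optionally subtracts the matrix mean. The function name, the `sigma` parameterization of the kernel, and the use of a sliding-window trajectory matrix are all assumptions for this example; the library's `kpar` semantics may differ.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rbf_kernel_matrix(x, lags, sigma=1.0, center=True):
    """Hypothetical sketch: RBF similarity between lagged copies of x."""
    x = np.asarray(x, dtype=float)
    windows = sliding_window_view(x, lags)       # (N - lags + 1, lags)
    lagged = windows.T                           # row j is x delayed by j samples
    diff = lagged[:, None, :] - lagged[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)         # (lags, lags)
    kernel = np.exp(-sq_dist / (2.0 * sigma ** 2))
    if center:
        kernel = kernel - kernel.mean()          # "matrix mean is subtracted"
    return kernel


x = np.sin(0.3 * np.arange(20))
k_raw = rbf_kernel_matrix(x, lags=4, center=False)
k_centered = rbf_kernel_matrix(x, lags=4, center=True)
```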
utils.diagnalization#
- pysdkit.utils._diagnalization.diagonal_average(matrix: ndarray, reverse: bool | None = True, samesize: bool | None = False, averaging: bool | None = True) ndarray[source]#
Perform Hankel averaging (or diagonal averaging) on the input matrix
The main function of this function is to average or sum the diagonals of the input matrix. According to the parameter settings, you can choose to extract the diagonals in the forward or reverse direction, and you can choose whether to calculate the average value of the diagonal elements. The function returns a one-dimensional array representing the processed result.
- Parameters:
matrix (2D_ndarray) – The input matrix to be averaged.
reverse (bool) – If True, diagonals are taken in reverse order (from the bottom-right to the top-left).
samesize (bool) – If True, only diagonals from the main to the leftmost are taken.
averaging (bool) – If True, the mean value of each diagonal is taken; otherwise, the sum is taken.
- Returns:
1D ndarray, the resulting vector after diagonal averaging.
- Note:
- If samesize = False:
If reverse = False, diagonals from the bottom-left to the top-right are taken.
If reverse = True, diagonals from the bottom-right to the top-left are taken.
- If samesize = True:
If reverse = False, diagonals from the bottom-left to the main diagonal are taken.
If reverse = True, diagonals from the bottom-right to the main diagonal are taken.
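Hankel (anti-diagonal) averaging is the step that turns a trajectory matrix back into a series, for example in singular spectrum analysis. A minimal NumPy sketch, covering only the case where every anti-diagonal is averaged; `hankel_average` is a hypothetical name and the `reverse` and `samesize` options are not reproduced.

```python
import numpy as np


def hankel_average(matrix):
    """Hypothetical sketch: mean of every anti-diagonal of a 2D array."""
    m = np.asarray(matrix, dtype=float)
    n_rows, n_cols = m.shape
    flipped = m[::-1]            # anti-diagonals of m become diagonals here
    return np.array([
        flipped.diagonal(k).mean()
        for k in range(-(n_rows - 1), n_cols)
    ])


# Hankel matrix built from the series [1, 2, 3, 4] with two rows.
hankel = np.array([[1, 2, 3],
                   [2, 3, 4]])
series = hankel_average(hankel)
```

Averaging the anti-diagonals of an exact Hankel matrix recovers the series it was built from.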
- pysdkit.utils._diagnalization.get_diagonal(matrix: ndarray, idx: int, reverse: bool | None = False) ndarray[source]#
Extract the specified diagonal from a matrix
The main function of this function is to extract the diagonal elements of the specified index from the input matrix. The index of the diagonal is calculated from the main diagonal (index is 0), the positive index represents the diagonal to the right of the main diagonal, and the negative index represents the diagonal to the left of the main diagonal. The function also supports reverse extraction of diagonals, that is, starting from the lower right corner of the matrix.
- Parameters:
matrix (2D_ndarray) – The input matrix from which the diagonal is extracted.
idx (int) – The index of the diagonal relative to the main diagonal (zero diagonal). Positive indices are to the right of the main diagonal, and negative indices are to the left.
reverse (bool) – If True, extract the diagonal in reverse order (from the bottom-right to the top-left).
- Returns:
1D ndarray, the extracted diagonal elements.
- Notes:
- If reverse = False:
idx = 0: main diagonal
idx > 0: diagonals to the right of the main diagonal
idx < 0: diagonals to the left of the main diagonal
- If reverse = True:
idx = 0: main backward diagonal
idx > 0: diagonals to the left of the main backward diagonal
idx < 0: diagonals to the right of the main backward diagonal
- Example:
>>> a = [1, 2, 3, 4, 5]
>>> b = signals.matrix.toeplitz(a)[:3, :]
>>> print(b)
>>> print(get_diagonal(b,0)) # zero diagonal
>>> print(get_diagonal(b,-2)) # 2 diagonals to the left
>>> print(get_diagonal(b,3)) # 3 diagonals to the right
>>> print(get_diagonal(b,0,reverse=True)) # zero backward diagonal
>>> print(get_diagonal(b,-1,reverse=True)) # 1 right backward diagonal
>>> print(get_diagonal(b,1,reverse=True)) # 1 left backward diagonal
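For orientation, NumPy's own `numpy.diagonal` uses the convention that a positive offset selects a diagonal above (to the right of) the main diagonal and a negative offset one below (to the left). The sketch below shows that convention and one way to read backward diagonals by flipping the columns first; whether `get_diagonal` is implemented this way is an assumption.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

main = np.diagonal(m, offset=0)      # main diagonal
right = np.diagonal(m, offset=1)     # one step to the right of the main diagonal
left = np.diagonal(m, offset=-1)     # one step to the left of the main diagonal

# Flipping the columns turns backward (anti-) diagonals into ordinary ones.
backward_main = np.diagonal(np.fliplr(m), offset=0)
```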