Kaggle Projects References

Posted on September 26, 2023 | ANN, Seismic&DeepL, Statistics

Standing on the shoulders of giants.

Laboratory Earthquake Time Prediction

Predict time-to-failure from acoustic signals recorded in double-direct-shear lab-scale earthquake experiments. Could be similar enough to Bing’s requirement.

One thing I notice is that they prefer to train on statistical features extracted from the time series, such as skewness and kurtosis, rather than feeding raw data to a deep network. See {A notebook for data exploration} and {Second place}, which ranks the contribution of their extracted features.
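To make the feature-extraction idea concrete, here is a minimal sketch in plain Python. The window size, the function names and the synthetic signal are my own assumptions for illustration, not taken from any of the linked notebooks.

```python
import math
import random


def summary_features(window):
    """Return mean, std, skewness and excess kurtosis of one signal window."""
    n = len(window)
    mean = sum(window) / n
    # Central moments in population form (divided by n).
    m2 = sum((x - mean) ** 2 for x in window) / n
    m3 = sum((x - mean) ** 3 for x in window) / n
    m4 = sum((x - mean) ** 4 for x in window) / n
    std = math.sqrt(m2)
    return {
        "mean": mean,
        "std": std,
        "skew": m3 / std ** 3,
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis: 0 for a Gaussian
    }


def windowed_features(signal, window_size):
    """Split a 1-D signal into non-overlapping windows and featurize each."""
    return [
        summary_features(signal[start:start + window_size])
        for start in range(0, len(signal) - window_size + 1, window_size)
    ]


# Synthetic stand-in for an acoustic trace (not the competition data).
random.seed(0)
signal = [random.gauss(0.0, 1.0) for _ in range(4000)]
rows = windowed_features(signal, window_size=1000)
print(len(rows), rows[0])
```

Each row can then be fed to a tree model or a small network in place of the raw samples. A constant window has zero variance and would need a guard before dividing.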

{First place notebook} uses:

An LSTM + Conv + Linear NN with extracted rolling-statistics features.
A joint loss: MAE on time to failure, MAE on time since failure, and BCE on whether time to failure is below 0.5 s.
The Nadam optimizer with lr=0.008.
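As a rough sketch of what such a joint loss could look like, the snippet below sums the three terms in plain Python. The equal weights, the argument names and the toy numbers are assumptions on my part; the notebook may weight or combine the terms differently.

```python
import math


def mae(pred, target):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def bce(prob, label, eps=1e-7):
    """Binary cross-entropy for predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(prob, label):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(label)


def joint_loss(pred_ttf, pred_tsf, pred_prob, ttf, tsf,
               threshold=0.5, weights=(1.0, 1.0, 1.0)):
    """MAE(time to failure) + MAE(time since failure) + BCE(ttf < threshold).

    The weights are hypothetical; equal weighting is only a starting point.
    """
    near_failure = [1.0 if t < threshold else 0.0 for t in ttf]
    w_ttf, w_tsf, w_bce = weights
    return (w_ttf * mae(pred_ttf, ttf)
            + w_tsf * mae(pred_tsf, tsf)
            + w_bce * bce(pred_prob, near_failure))


# Toy batch of two samples: one close to failure, one far from it.
loss = joint_loss(
    pred_ttf=[0.4, 2.5], pred_tsf=[4.0, 2.0], pred_prob=[0.5, 0.5],
    ttf=[0.2, 3.0], tsf=[5.0, 2.0],
)
print(loss)
```

The classification term presumably pushes the network to separate the last half second before failure, where the regression targets are small.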

They parallelize the parameter tuning, too. Must learn how to run a brute-force search for the optimal parameters in parallel!
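One way to do this with only the standard library is sketched below. The objective function is a hypothetical stand-in for "train a model and return the validation error", and the grid values are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def validation_error(params):
    """Hypothetical objective standing in for a real train/validate run."""
    lr, hidden = params
    return (lr - 0.008) ** 2 * 1e4 + (hidden - 64) ** 2 * 1e-4


def grid_search(objective, grid, max_workers=4):
    """Evaluate every combination in `grid` concurrently; return the best."""
    names = list(grid)
    candidates = list(product(*(grid[name] for name in names)))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(objective, candidates))  # input order is kept
    best_score, best_params = min(zip(scores, candidates))
    return dict(zip(names, best_params)), best_score


grid = {"lr": [0.001, 0.004, 0.008, 0.016], "hidden": [32, 64, 128]}
best, score = grid_search(validation_error, grid)
print(best, score)
```

Threads keep the sketch simple. For CPU-bound training in pure Python, `ProcessPoolExecutor` offers the same `map` interface, but the objective must be picklable and the call should sit under an `if __name__ == "__main__":` guard.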

Optimizers

{This post} summarizes the main optimizer types as
['Adadelta', 'Adagrad', 'Adam', 'RMSprop', 'SGD']
The key features of each are:

Adadelta
Scales each step using an exponentially decaying average of the squared gradients (the second moment), and replaces the global learning rate with the running RMS of past updates.
{StackOverflow} gives the update as
$$x_{t+1}=x_{t}+\Delta x_{t}$$
$$\Delta x_{t}=-\frac{\text{RMS}\left[\Delta x\right]_{t-1}}{\text{RMS}\left[g\right]_{t}}g_{t}$$
where $$\text{RMS}\left[ \Delta x \right] _ {t-1} = \sqrt{ E\left[ \Delta x^2\right]_{t-1}+\varepsilon}$$
RMSprop
Divides the learning rate by the square root of an exponentially decaying mean of the squared gradients.
Adam
$$
\begin{aligned}
&g_t=\frac{\partial L}{\partial w_t};\ w_{t+1}=w_t-\widehat{m_t}\left(\frac{\alpha}{\sqrt{\widehat{v_t}}+\varepsilon}\right)\\
&\widehat{m_t}=\frac{m_t}{1-\beta_1^t};\ \widehat{v_t}=\frac{v_t}{1-\beta_2^t}\\
&m_t=\beta_1m_{t-1}+\left(1-\beta_1\right)g_t;\ v_t=\beta_2v_{t-1}+\left(1-\beta_2\right){g_t}^2
\end{aligned}
$$
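Setting $\beta_1=0$, Adam -> RMSProp (with bias correction).
Setting $\beta_2=0$ does not give SGD+Momentum: it makes $v_t={g_t}^2$, so the step becomes $\alpha\,\widehat{m_t}/(|g_t|+\varepsilon)$. Adam -> SGD+Momentum only when the denominator is effectively constant, e.g. when $\varepsilon\gg\sqrt{\widehat{v_t}}$, where the step is $(\alpha/\varepsilon)\,\widehat{m_t}$.
Setting $\beta_1=0$, $\beta_2\to 1$ and annealing the step size as $\alpha_t=\alpha\,t^{-1/2}$, Adam -> AdaGrad.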

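The amsgrad param makes Adam keep a running maximum of the second-moment estimate, $v_t^{\max} = \max(v_{t-1}^{\max}, v_t)$, and use it in the denominator, so the effective step size never grows. This loosely resembles the max-based update of AdaMax.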
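The Adam update above fits in a few lines of plain Python. This is a scalar sketch under my own naming; the `state` dictionary and the toy objective $f(w)=w^2$ are assumptions, and the amsgrad branch follows one common formulation (maximum taken on the uncorrected $v_t$, then bias-corrected).

```python
import math


def adam_step(w, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8,
              amsgrad=False):
    """One Adam update for a scalar parameter; `state` is mutated in place."""
    state["t"] += 1
    t = state["t"]
    # Exponential moving averages of the gradient and the squared gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    v = state["v"]
    if amsgrad:
        # Keep the largest second-moment estimate seen so far.
        state["v_max"] = max(state["v_max"], v)
        v = state["v_max"]
    # Bias correction for the zero initialization of m and v.
    m_hat = state["m"] / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)


# Minimize f(w) = w^2, whose gradient is 2w.
state = {"t": 0, "m": 0.0, "v": 0.0, "v_max": 0.0}
w = 1.0
for _ in range(200):
    w = adam_step(w, 2.0 * w, state, lr=0.01)
print(w)
```

Calling `adam_step` with `beta1=0.0` drops the momentum term, so the step depends only on the current gradient divided by the running RMS, which is the RMSProp case.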

I am still confused about the difference between AdaDelta and RMSprop; this blog may clarify it: {Link}
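Writing the two updates side by side helps. The sketch below is a scalar version with my own function names and default hyperparameters, so treat it as an illustration rather than any library's implementation. Both keep a decaying average of squared gradients; RMSprop still needs a global learning rate, while Adadelta replaces it with the RMS of its own past updates.

```python
import math


def rmsprop_step(x, grad, state, lr=0.01, rho=0.9, eps=1e-8):
    """RMSprop: a fixed learning rate divided by the running RMS of gradients."""
    state["Eg2"] = rho * state["Eg2"] + (1 - rho) * grad ** 2
    return x - lr * grad / (math.sqrt(state["Eg2"]) + eps)


def adadelta_step(x, grad, state, rho=0.95, eps=1e-6):
    """Adadelta: the numerator is the running RMS of previous updates."""
    state["Eg2"] = rho * state["Eg2"] + (1 - rho) * grad ** 2
    rms_g = math.sqrt(state["Eg2"] + eps)
    rms_dx = math.sqrt(state["Edx2"] + eps)  # built from E[dx^2] at step t-1
    dx = -(rms_dx / rms_g) * grad
    state["Edx2"] = rho * state["Edx2"] + (1 - rho) * dx ** 2
    return x + dx


# Minimize f(x) = x^2 (gradient 2x) with both rules from the same start.
x_rms, s_rms = 1.0, {"Eg2": 0.0}
x_ada, s_ada = 1.0, {"Eg2": 0.0, "Edx2": 0.0}
for _ in range(200):
    x_rms = rmsprop_step(x_rms, 2.0 * x_rms, s_rms)
    x_ada = adadelta_step(x_ada, 2.0 * x_ada, s_ada)
print(x_rms, x_ada)
```

Note that `adadelta_step` has no `lr` argument at all, which is the main practical difference.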

Similarity Metrics

Start from {this twdDtSc post}

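Lp Norm:
Besides the popular L2 (Euclidean distance) and L1 (Manhattan distance, the basis of MAE), there is the Chebyshev distance $L_{\infty}$, which takes the maximum absolute difference over all dimensions. Because a single large coordinate difference determines it entirely, it is more sensitive to outliers than L1 or L2, not more robust.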
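A small sketch of how the three norms respond when one coordinate becomes an outlier. The vectors are made up for the example.

```python
def minkowski(a, b, p):
    """Lp distance between two equal-length vectors (p=1 Manhattan, p=2 Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def chebyshev(a, b):
    """L-infinity distance: the largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


a = [0.0, 0.0, 0.0, 0.0]
b = [1.0, 1.0, 1.0, 1.0]
b_outlier = [1.0, 1.0, 1.0, 10.0]  # same as b, but the last coordinate is an outlier

for name, dist in [("L1", lambda u, v: minkowski(u, v, 1)),
                   ("L2", lambda u, v: minkowski(u, v, 2)),
                   ("Linf", chebyshev)]:
    clean, noisy = dist(a, b), dist(a, b_outlier)
    print(name, clean, noisy, noisy / clean)
```

The ratio in the last column grows from L1 to L2 to $L_{\infty}$: the higher the order of the norm, the more a single outlying coordinate dominates the distance.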
Pearson & Spearman correlation distance:
$$
\text{corr\_dist} = 1- \frac{\text{cov}(A, B)}{\sqrt{\text{var}(A) \cdot \text{var}(B)}}
$$
Spearman correlation applies the same formula to the ranks of the variables instead of their raw values, which makes it a non-parametric measure. For details see { biological explanation on type of variables}. Ideologically, ranked variables underestimate late
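The relation between the two can be sketched as follows: Spearman is Pearson applied to ranks. The helper names and the cubic toy data are my own choices for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(a), ranks(b))


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [x ** 3 for x in a]  # monotonic but non-linear in a
print("Pearson distance :", 1.0 - pearson(a, b))
print("Spearman distance:", 1.0 - spearman(a, b))
```

For a monotonic but non-linear relation the Spearman distance is zero while the Pearson distance is not, since ranking discards everything except order.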
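Standardized Euclidean distance
This is commonly used together with preprocessing by standard normalization. Each dimension is divided by its standard deviation $\sigma_i$ before the Euclidean distance is taken:
$$
d_{P, Q} = \sqrt{\sum_{i=1}^n {\left(\frac{p_i - q_i}{\sigma_i}\right)^2}}
$$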
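A short sketch of the formula, assuming the per-dimension $\sigma_i$ is estimated from the data set itself with the population standard deviation. That choice and the toy data are assumptions for illustration.

```python
import math


def column_std(rows):
    """Population standard deviation of each column of a list of vectors."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(dims)]
    return [
        math.sqrt(sum((r[i] - means[i]) ** 2 for r in rows) / n)
        for i in range(dims)
    ]


def standardized_euclidean(p, q, sigma):
    """Euclidean distance after scaling each dimension by its std."""
    return math.sqrt(
        sum(((pi - qi) / si) ** 2 for pi, qi, si in zip(p, q, sigma))
    )


# Two features on very different scales.
data = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
sigma = column_std(data)
print(standardized_euclidean(data[0], data[2], sigma))
```

Without the scaling, the second feature would dominate the distance simply because its numbers are larger; after scaling, both features contribute equally here.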
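Chi-square distance
In face recognition, it is used for histogram matching:
$$
d(P, Q) = \sum_i{ \frac{(P_i - Q_i)^2}{P_i + Q_i} }
$$

This differs from the Pearson statistic $\chi^2 = \sum_i{\frac{(O_i - E_i)^2}{E_i}}$, which normalizes by the expected counts only and is therefore not symmetric in its two arguments.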
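The two formulas can be compared directly in a short sketch. The histograms are invented for the example, and skipping bins that are empty in both histograms is my own convention to avoid dividing by zero.

```python
def chi_square_distance(p, q):
    """Symmetric chi-square distance between two histograms."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi + qi > 0:  # skip bins that are empty in both histograms
            total += (pi - qi) ** 2 / (pi + qi)
    return total


def pearson_chi_square(observed, expected):
    """Pearson chi-square statistic; assumes every expected value is > 0."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


P = [0.5, 0.3, 0.2]
Q = [0.25, 0.25, 0.5]
print(chi_square_distance(P, Q), chi_square_distance(Q, P))
print(pearson_chi_square(P, Q), pearson_chi_square(Q, P))
```

Swapping the arguments leaves the histogram distance unchanged but changes the Pearson statistic. Some references also put a factor of 1/2 in front of the distance; the formula above follows the text.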
Etc. Other similarity metrics include Hamming distance and cosine similarity; please refer to the {original post}.

Top 10 time series comp

List from {Medium}. It may not fit my purpose, since most of them are oriented toward customer behavior.

The Rossmann Sales Forecasting competition: https://www.kaggle.com/c/rossmann-store-sales
The M5 Forecasting competition: https://www.kaggle.com/c/m5-forecasting-accuracy
The top solution uses a gradient-boosted tree algorithm, LightGBM: winner {github repo}, {lightgbm_doc}
The Global Energy Forecasting Competition 2014: https://www.kaggle.com/c/global-energy-forecasting-competition-2014-load-forecasting
The Santa Time Series Forecasting competition: https://www.kaggle.com/c/santa-time-series
The Mercari Price Suggestion competition: https://www.kaggle.com/c/mercari-price-suggestion-challenge
The Corporación Favorita Grocery Sales Forecasting competition: https://www.kaggle.com/c/favorita-grocery-sales-forecasting
The GE Flight Quest II — Turbulence Prediction competition: https://www.kaggle.com/c/turbulence-forecasting-challenge-ii
The Bike Sharing Demand competition: https://www.kaggle.com/c/bike-sharing-demand
The Web Traffic Time Series Forecasting competition: https://www.kaggle.com/c/web-traffic-time-series-forecasting
The Energy Forecasting competition: https://www.kaggle.com/c/ashrae-energy-prediction
