
Kaggle Projects References

Standing on the shoulders of giants.

Laboratory Earthquake Time Prediction

LANL Earthquake Prediction | Kaggle

Predict time-to-failure from acoustic signals recorded in double-direct-shear lab-scale earthquake experiments. Could be similar enough to Bing’s requirement.

One thing I notice is that they prefer to extract statistical features from the time series, such as skewness and kurtosis, for model training, instead of feeding raw data to a deep learning model. See {A notebook for data exploration}; {Second place} reports the contribution ranking of their extracted features.
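As a minimal sketch of this kind of feature extraction (my own illustration, not from the winning notebooks; the series name `acoustic_data` and the 150,000-sample window are assumptions), summary statistics can be computed per window of the raw signal roughly like this:

```python
import numpy as np
import pandas as pd
from scipy.stats import skew, kurtosis

def extract_window_features(signal: pd.Series, window: int = 150_000) -> pd.DataFrame:
    """Split the raw signal into fixed-size windows and compute summary stats per window."""
    rows = []
    for start in range(0, len(signal) - window + 1, window):
        seg = signal.iloc[start:start + window].to_numpy()
        rows.append({
            "mean": seg.mean(),
            "std": seg.std(),
            "min": seg.min(),
            "max": seg.max(),
            "skew": skew(seg),          # asymmetry of the amplitude distribution
            "kurtosis": kurtosis(seg),  # tail heaviness / spikiness
            "q95": np.quantile(seg, 0.95),
        })
    return pd.DataFrame(rows)

# features = extract_window_features(acoustic_data)  # acoustic_data: hypothetical raw series
```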

{First place notebook} uses:

  • an LSTM+Conv+Linear NN on extracted rolling-statistics features;
  • a joint loss: MAE on time-to-failure and time-since-failure, plus BCE on whether the time-to-failure is < 0.5 s (a sketch of such a loss follows this list);
  • the Nadam optimizer with lr=0.008.
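A minimal PyTorch-style sketch of such a joint loss; the tensor names and the equal weighting are my own assumptions, not the winners' code:

```python
import torch
import torch.nn.functional as F

def joint_loss(pred_ttf, pred_tsf, pred_near_failure, true_ttf, true_tsf):
    """Combine MAE on the two regression targets with BCE on a 'failure within 0.5 s' target."""
    mae_ttf = F.l1_loss(pred_ttf, true_ttf)             # time to failure
    mae_tsf = F.l1_loss(pred_tsf, true_tsf)             # time since failure
    near_failure = (true_ttf < 0.5).float()             # binary label derived from time to failure
    bce = F.binary_cross_entropy_with_logits(pred_near_failure, near_failure)
    return mae_ttf + mae_tsf + bce                       # equal weights assumed here
```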

They also parallelize the hyperparameter tuning. I must learn how to parallelize a brute-force search for optimal parameters! A rough sketch is below.
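A minimal sketch of a parallel brute-force (grid) search using joblib; the model, parameter grid, and scoring choice here are placeholders of my own, not theirs:

```python
from itertools import product
from joblib import Parallel, delayed
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

param_grid = {"n_estimators": [100, 300], "max_depth": [4, 8, None]}  # hypothetical grid

def evaluate(params, X, y):
    """Fit and score one parameter combination with cross-validated MAE."""
    model = RandomForestRegressor(**params, n_jobs=1, random_state=0)
    score = cross_val_score(model, X, y, scoring="neg_mean_absolute_error", cv=3).mean()
    return params, score

def grid_search(X, y, n_jobs=-1):
    combos = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]
    results = Parallel(n_jobs=n_jobs)(delayed(evaluate)(p, X, y) for p in combos)
    return max(results, key=lambda r: r[1])  # best (params, score) by negative MAE

# best_params, best_score = grid_search(features, targets)  # features/targets are hypothetical
```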

Optimizers

{This post} summarizes the main optimizer types:
['Adadelta', 'Adagrad', 'Adam', 'RMSprop', 'SGD']
The key features of each are:

  • AdaDelta adapts the learning rate using exponentially decaying second-order moments (of the squared gradients and squared updates).
    {StackOverflow} gives the update rule as
    $$x_{t+1}=x_{t}+\Delta x_{t}$$
    $$\Delta x_{t}=-\frac{\text{RMS}\left[\Delta x\right]_{t-1}}{\text{RMS}\left[g\right]_{t}}g_{t}$$
    where $$\text{RMS}\left[\Delta x\right]_{t-1} = \sqrt{E\left[\Delta x^2\right]_{t-1}+\varepsilon}$$
  • RMSprop
    Divides the gradient by the square root of an exponentially decaying mean of squared gradients.
  • Adam
    $$
    g_t=\frac{\partial L}{\partial w_t};\ w_{t+1}=w_t-\widehat{m_t}\left(\frac{\alpha}{\sqrt{\widehat{v_t}}+\varepsilon}\right)\\
    \widehat{m_t}=\frac{m_t}{1-\beta_1^t};\ \widehat{v_t}=\frac{v_t}{1-\beta_2^t}\\
    m_t=\beta_1m_{t-1}+\left(1-\beta_1\right)g_t;\ v_t=\beta_2v_{t-1}+\left(1-\beta_2\right){g_t}^2\\
    $$
    Setting \(\beta_1=0\): Adam → RMSprop.
    Setting \(\beta_2=0\): Adam → SGD+Momentum.
    As \(\beta_2 \to 1\): Adam → AdaGrad.

    The amsgrad param makes Adam use \(\hat{v}_t = \max(\hat{v}_{t-1}, v_t)\), which resembles AdaMax.

    I am still confused about the difference between AdaDelta and RMSprop; this blog may clarify it: {Link}
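As a worked sketch of the Adam equations above (plain NumPy, my own illustration rather than any library's internals):

```python
import numpy as np

def adam_step(w, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update, following m_t, v_t, and the bias-corrected m_hat, v_hat above."""
    m = beta1 * m + (1 - beta1) * grad            # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2       # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Setting beta1=0 removes the momentum term (RMSprop-like behaviour);
# the amsgrad variant additionally keeps v_hat = max(previous v_hat, current v_hat).
```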

Similarity Metrics

Start from {this Towards Data Science post}.

  • Lp norm:
    Besides the popular L2 (Euclidean distance) and L1 (Manhattan distance, which is more robust to outliers than L2), there is the Chebyshev distance \(L_{\infty}\), which takes the maximum distance along all candidate dimensions.
  • Pearson & Spearman correlation distance:
    $$
    \text{corr\_dist} = 1 - \frac{\operatorname{cov}(A, B)}{\sqrt{\operatorname{var}(A) \cdot \operatorname{var}(B)}}
    $$
    Spearman correlation is instead computed on ranked variables, which makes it a non-parametric measure. For details see {biological explanation on types of variables}. Intuitively, ranked variables underestimate late
  • Standardized Euclidean distance
    This is Euclidean distance after the per-dimension standardization commonly used in preprocessing (standard normalization), with equation
    $$
    d_{P, Q} = \sqrt{\sum_{i=1}^n \left(\frac{p_i - q_i}{\sigma_i}\right)^2}
    $$
  • Chi-square distance
    In face recognition it is used for histogram matching:
    $$
    d(P, Q) = \sum_i \frac{(P_i - Q_i)^2}{P_i + Q_i}
    $$

    which differs from the Pearson test statistic \(\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}\).
  • Etc. Other similarity metrics, such as Hamming distance and cosine similarity, are covered in the {original post}.
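A small NumPy sketch of the distances above, written directly from the formulas (my own illustration, not code from the post):

```python
import numpy as np

def chebyshev(p, q):
    """L-infinity: maximum coordinate-wise absolute difference."""
    return np.max(np.abs(p - q))

def correlation_distance(a, b):
    """1 - Pearson correlation; rank the values first to get the Spearman version."""
    a_c, b_c = a - a.mean(), b - b.mean()
    return 1.0 - (a_c @ b_c) / np.sqrt((a_c @ a_c) * (b_c @ b_c))

def standardized_euclidean(p, q, sigma):
    """Euclidean distance after dividing each dimension by its standard deviation sigma_i."""
    return np.sqrt(np.sum(((p - q) / sigma) ** 2))

def chi_square_distance(P, Q, eps=1e-12):
    """Histogram-matching chi-square distance; eps guards against empty bins."""
    return np.sum((P - Q) ** 2 / (P + Q + eps))

# Example (hypothetical vectors):
# p, q = np.array([1.0, 2.0, 3.0]), np.array([2.0, 2.0, 5.0])
# chebyshev(p, q)  -> 2.0
```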

Top 10 time series competitions

List from {Medium}. These may not fit my purpose, as most of them are customer-behavior oriented.

  1. The Rossmann Sales Forecasting competition: https://www.kaggle.com/c/rossmann-store-sales
  2. The M5 Forecasting competition: https://www.kaggle.com/c/m5-forecasting-accuracy
    The top solution uses a gradient-boosted tree algorithm (LightGBM); see the winner's {github repo} and {lightgbm_doc}. A minimal LightGBM sketch follows this list.
  3. The Global Energy Forecasting Competition 2014: https://www.kaggle.com/c/global-energy-forecasting-competition-2014-load-forecasting
  4. The Santa Time Series Forecasting competition: https://www.kaggle.com/c/santa-time-series
  5. The Mercari Price Suggestion competition: https://www.kaggle.com/c/mercari-price-suggestion-challenge
  6. The Corporación Favorita Grocery Sales Forecasting competition: https://www.kaggle.com/c/favorita-grocery-sales-forecasting
  7. The GE Flight Quest II — Turbulence Prediction competition: https://www.kaggle.com/c/turbulence-forecasting-challenge-ii
  8. The Bike Sharing Demand competition: https://www.kaggle.com/c/bike-sharing-demand
  9. The Web Traffic Time Series Forecasting competition: https://www.kaggle.com/c/web-traffic-time-series-forecasting
  10. The Energy Forecasting competition (ASHRAE): https://www.kaggle.com/c/ashrae-energy-prediction
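A minimal LightGBM sketch for a tabular time-series setup; the DataFrame `df`, the target column `sales`, the lag choices, and the split are my own assumptions, not the winner's pipeline:

```python
import lightgbm as lgb
import pandas as pd

def make_lag_features(df: pd.DataFrame, target: str = "sales", lags=(1, 7, 28)) -> pd.DataFrame:
    """Add simple lagged copies of the target as features (assumes numeric columns)."""
    out = df.copy()
    for lag in lags:
        out[f"{target}_lag_{lag}"] = out[target].shift(lag)
    return out.dropna()

def train_forecaster(df: pd.DataFrame, target: str = "sales") -> lgb.LGBMRegressor:
    data = make_lag_features(df, target)
    X, y = data.drop(columns=[target]), data[target]
    split = int(len(data) * 0.8)       # time-ordered split: train on the past, validate on recent rows
    model = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.05)
    model.fit(X.iloc[:split], y.iloc[:split],
              eval_set=[(X.iloc[split:], y.iloc[split:])])
    return model

# model = train_forecaster(df)  # df is a hypothetical sales DataFrame
# preds = model.predict(make_lag_features(df).drop(columns=["sales"]))
```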
