NeurIPS Paper Reviews 2023 #2

Sharpness-Aware Minimization Leads to Low-Rank Features

Maksym Andriushchenko, Dara Bahri, Hossein Mobahi, Nicolas Flammarion

In overparametrised neural networks, sharper minima have been observed to correlate with higher generalisation error. Sharpness-aware minimisation (SAM) is a recent algorithm that adds an explicit sharpness penalty to the optimisation objective and has been shown to improve model performance.
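
As a rough illustration of the idea, the sketch below implements the commonly used two-step form of a SAM update in NumPy: first move a distance `rho` along the normalised gradient to an approximate worst-case point, then apply the gradient computed there to the original weights. The function name `sam_step`, the toy quadratic loss, and the values of `lr` and `rho` are assumptions chosen for illustration, not taken from the paper.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM-style update (sketch): ascend to a nearby point, then descend."""
    g = grad_fn(w)
    # First-order approximation of the worst-case perturbation within radius rho.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Gradient evaluated at the perturbed weights, applied to the original weights.
    g_sam = grad_fn(w + eps)
    return w - lr * g_sam

# Toy quadratic loss L(w) = 0.5 * w^T A w (an assumption, for illustration only).
A = np.diag([10.0, 0.1])

def loss(w):
    return 0.5 * w @ A @ w

def grad(w):
    return A @ w

w = np.array([1.0, 1.0])
initial_loss = loss(w)
for _ in range(100):
    w = sam_step(w, grad, lr=0.05, rho=0.05)
final_loss = loss(w)
```

With `rho=0` the update reduces to plain gradient descent, which makes the extra gradient evaluation the only difference between the two methods in this sketch.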

In this paper, the authors investigate the effect that SAM has on the features of the model. They demonstrate that, compared to networks trained using standard minimisation algorithms, SAM reduces the feature rank at different layers, as measured by the number of principal components needed to capture 99% of the variance. This can, for instance, be used to reduce the dimensionality of the feature space, improving performance on downstream tasks. In contrast, the authors found that directly imposing a lower feature rank on the model itself did not lead to improved generalisation. This suggests that the low rank is a useful side effect but not a full explanation of the benefits of SAM.
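
The feature-rank measure described above can be sketched as follows. This is a minimal NumPy version, assuming features are stored as a matrix with one row per input and one column per feature dimension. The function name `feature_rank` and its `variance_threshold` parameter are hypothetical, and the authors' own implementation may differ in detail.

```python
import numpy as np

def feature_rank(features, variance_threshold=0.99):
    """Smallest number of principal components whose cumulative explained
    variance reaches `variance_threshold` (sketch)."""
    features = np.asarray(features, dtype=float)
    centred = features - features.mean(axis=0, keepdims=True)
    # Squared singular values of the centred matrix are proportional
    # to the variances along the principal components.
    singular_values = np.linalg.svd(centred, compute_uv=False)
    variances = singular_values ** 2
    total = variances.sum()
    if total == 0:
        return 0  # all rows identical: no variance to explain
    cumulative = np.cumsum(variances) / total
    # Index of the first component at which the threshold is reached.
    k = int(np.searchsorted(cumulative, variance_threshold)) + 1
    return min(k, len(variances))

# Usage: features generated from 3 latent directions embedded in 64 dimensions.
rng = np.random.default_rng(0)
low_rank = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 64))
rank_estimate = feature_rank(low_rank)
```

Because the synthetic features here are built from three latent directions, the estimate cannot exceed three regardless of the ambient dimension.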

To further understand the mechanism behind this effect, the authors study a two-layer ReLU network. They show, both experimentally and theoretically, that SAM decreases pre-activation values within the network. This, in turn, reduces the number of non-zero activations and results in the observed low rank of the features.
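
The link between smaller pre-activations, fewer non-zero activations, and lower rank can be illustrated with a toy example. The sketch below does not train anything with SAM: it takes a random ReLU layer and lowers the pre-activations by hand through a hypothetical bias `shift`, then counts how many hidden units still fire on at least one input. Units that never fire produce all-zero feature columns, so the rank of the feature matrix is bounded by the number of active units. All names and sizes are assumptions for illustration.

```python
import numpy as np

def relu_features(X, W, b):
    """Hidden-layer features of a ReLU layer: max(XW^T + b, 0)."""
    return np.maximum(X @ W.T + b, 0.0)

def count_active_units(features):
    """Number of hidden units that are non-zero on at least one input."""
    return int((features > 0).any(axis=0).sum())

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm inputs
W = rng.normal(size=(50, 10))
W /= np.linalg.norm(W, axis=1, keepdims=True)  # unit-norm weights, so pre-activations lie in [-1, 1]
b = np.zeros(50)

results = []
for shift in (0.0, 0.5, 0.9, 1.0):
    # Lowering every bias by `shift` lowers every pre-activation by the same amount.
    F = relu_features(X, W, b - shift)
    results.append((shift, count_active_units(F), int(np.linalg.matrix_rank(F))))

for shift, active, rank in results:
    print(f"shift={shift:.1f}  active units={active}  feature rank={rank}")
```

As the shift grows, the set of active units can only shrink, and the rank of the feature matrix never exceeds the number of units that remain active.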

When Do Neural Nets Outperform Boosted Trees on Tabular Data?

Duncan C. McElfresh, Sujay Khandagale, Jonathan Valverde, Vishak Prasad C, Ganesh Ramakrishnan, Micah Goldblum, Colin White

This paper presents a comprehensive study comparing the performance of neural networks (NNs), gradient-boosted decision trees (GBDTs), and baseline algorithms such as linear or k-nearest-neighbour models on a large number of tabular datasets. It also introduces a benchmark suite of challenging tabular datasets to accelerate research in this area.

The study shows that no single algorithm dominates on all datasets: nearly all algorithms examined ranked first on at least one dataset. When the algorithms are aggregated by family, GBDTs are high-performing on slightly more datasets than NNs and baseline methods, while also being faster than NNs. However, in many cases either the difference in performance between NNs and GBDTs is negligible or, at a fixed budget, tuning the hyperparameters of a GBDT is more useful than trying out different methods.

Additionally, the authors present a metafeature analysis to identify dataset properties that correlate with superior performance of certain techniques, which is helpful for practitioners selecting suitable methods for their respective datasets. For example, the authors demonstrate that GBDTs tend to outperform NNs on datasets with heavy-tailed, skewed, or high-variance features.
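
To make the dataset properties mentioned above concrete, the sketch below computes a few simple distributional statistics for a numeric feature matrix: average absolute skewness, average excess kurtosis (a proxy for heavy tails), and the largest feature variance. This is only an illustration of the kind of statistic a metafeature analysis might use. The function name `distribution_metafeatures` and the choice of statistics are assumptions, not the paper's metafeature set.

```python
import numpy as np

def distribution_metafeatures(X):
    """Simple per-dataset statistics over the non-constant numeric features (sketch)."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    X = X[:, std > 0]  # constant columns carry no distributional information
    if X.shape[1] == 0:
        raise ValueError("dataset has no non-constant features")
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    z = (X - mean) / std
    skewness = (z ** 3).mean(axis=0)               # third standardised moment
    excess_kurtosis = (z ** 4).mean(axis=0) - 3.0  # zero for a Gaussian
    return {
        "mean_abs_skewness": float(np.abs(skewness).mean()),
        "mean_excess_kurtosis": float(excess_kurtosis.mean()),
        "max_feature_variance": float((std ** 2).max()),
    }

# Usage on synthetic data: Gaussian features versus skewed, heavy-tailed ones.
rng = np.random.default_rng(0)
gaussian_stats = distribution_metafeatures(rng.normal(size=(5000, 5)))
lognormal_stats = distribution_metafeatures(rng.lognormal(size=(5000, 5)))
```

On the synthetic data, the log-normal features score higher on both skewness and excess kurtosis than the Gaussian ones, which is the type of irregularity the analysis associates with GBDTs outperforming NNs.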
