
NeurIPS Paper Reviews 2024 #6

7 February 2025
  • News
  • Quantitative Research

Georg, Quant Research Manager

In this paper review series, our team of researchers and machine learning practitioners discuss the papers they found most interesting at NeurIPS 2024.

Here, discover the perspectives of Quant Research Manager, Georg.

Optimal Parallelization of Boosting

Arthur da Cunha, Mikael Møller Høgsgaard, Kasper Green Larsen

The main contribution of this paper is an algorithm that parallelizes boosting in a reasonably efficient way.

The scenario used by the authors is quite restrictive, but the general idea should be easily extendable:

Consider a binary classification problem (i.e. all responses are either +1 or -1) on a sample set X of size m. A γ-weak-learner W is a learning algorithm (e.g. fitting a tree) that for any distribution D over X, when given at least some constant number of samples from D, produces with constant probability a prediction h that is wrong with probability less than 1/2 – γ under D.

In many cases, computing W is not parallelizable, and classical boosting frameworks (e.g. AdaBoost) need to call into W sequentially, so no parallel computation is possible.

The main point of the paper is an algorithm that leverages parallel calls into W in an essentially optimal way. The total amount of work increases exponentially in this setting, but wall-clock time can still be reduced if enough computational capacity is available.

The main idea of the algorithm is as follows:

For p rounds: subsample t sample sets from the current distribution in parallel and fit a weak learner on each. Then group those t weak learners into a number of groups and (sequentially) apply a normal boosting update step using the best (or least-bad) weak learner of each group, provided it improves the overall loss.
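To make the round structure concrete, here is a minimal Python sketch. The helper `fit_weak_learner`, the equal-size grouping and the AdaBoost-style reweighting are my own illustrative stand-ins under simplifying assumptions; the paper's actual algorithm uses more careful grouping and acceptance rules.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def parallel_boosting_round(X, y, weights, t, n_groups, fit_weak_learner, rng):
    """One round of the parallel scheme described above (illustrative only).

    fit_weak_learner(X_sub, y_sub) should return a model with a .predict(X)
    method giving predictions in {-1, +1}; any gamma-weak-learner (e.g. a
    depth-1 decision tree) would do. `rng` is a numpy Generator.
    """
    # Draw t sample sets from the current boosting distribution and fit one
    # weak learner on each -- these t fits are independent, which is the part
    # that can run in parallel.
    subsets = [rng.choice(len(X), size=len(X), p=weights) for _ in range(t)]
    with ThreadPoolExecutor() as pool:
        learners = list(pool.map(lambda idx: fit_weak_learner(X[idx], y[idx]),
                                 subsets))

    updates = []
    # Split the t learners into groups and, sequentially, keep the best
    # learner of each group, applying a standard AdaBoost-style update only
    # when it beats random guessing on the current distribution.
    for group in np.array_split(np.array(learners, dtype=object), n_groups):
        errors = [np.average(h.predict(X) != y, weights=weights) for h in group]
        best = group[int(np.argmin(errors))]
        eps = float(np.clip(min(errors), 1e-12, None))
        if eps < 0.5:
            alpha = 0.5 * np.log((1 - eps) / eps)
            weights = weights * np.exp(-alpha * y * best.predict(X))
            weights = weights / weights.sum()
            updates.append((alpha, best))
    return updates, weights
```

Running this round p times, with p and t chosen as the paper suggests, gives the overall shape of the procedure.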

The paper then suggests how to choose p and t optimally for a desired degree of parallelism. Here t = 1 is roughly equivalent to AdaBoost, and p log t should remain approximately constant for optimal performance.

The paper also proves that this bound is optimal up to log-factors, by constructing a slightly pathological weak learner for the biased-coin problem. I am not completely convinced that such weak learners occur often in practice, so further improvements may well be possible in many practical instances.


Learning Formal Mathematics From Intrinsic Motivation

Gabriel Poesia, David Broman, Nick Haber, Noah D. Goodman

This paper introduces an AI agent (MINIMO) that independently learns to perform formal mathematical reasoning starting from axioms alone, without relying on human-written proofs or data, unlike many other approaches that need such data for training.

MINIMO generates its own conjectures through constrained decoding and type-directed synthesis, ensuring that they are valid and well-formed, even when starting from an untrained model.

MINIMO uses a Transformer language model both for generating these conjectures and for theorem proving, utilizing the model as both a policy and value function to guide Monte Carlo Tree Search (MCTS) during proof searches.

The key innovation, in my view, is the self-improvement loop that mimics how humans learn mathematics: asking ever harder questions and trying to prove the hardest ones currently attackable. MINIMO also uses hindsight relabelling, which extracts training signal from both successful and failed proof attempts, adding many more successful proofs to the agent’s learning dynamics.
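The loop itself is simple to write down. Below is a rough schematic in Python of how such a self-improvement loop with hindsight relabelling might be organised; every callable it takes (`generate_conjecture`, `mcts_prove`, `relabel_from_trace`, `train_on`) is a hypothetical stand-in, not the authors' actual implementation.

```python
# Schematic sketch of a MINIMO-style self-improvement loop with hindsight
# relabelling. The callables passed in are hypothetical stand-ins for
# illustration only.

def self_improvement_loop(model, axioms, generate_conjecture, mcts_prove,
                          relabel_from_trace, train_on,
                          n_iterations=10, conjectures_per_iteration=100):
    for _ in range(n_iterations):
        examples = []
        for _ in range(conjectures_per_iteration):
            # 1. The model conjectures a statement; constrained, type-directed
            #    decoding keeps it well-formed even for an untrained model.
            conjecture = generate_conjecture(model, axioms)

            # 2. The same model acts as policy and value function inside MCTS
            #    while searching for a proof of that conjecture.
            proof, search_trace = mcts_prove(model, axioms, conjecture)
            if proof is not None:
                examples.append((conjecture, proof))

            # 3. Hindsight relabelling: even a failed search proved *something*
            #    along the way, so relabel intermediate results as proofs of
            #    the statements that were actually reached.
            examples.extend(relabel_from_trace(search_trace))

        # 4. Train on the harvested (conjecture, proof) pairs so that the next
        #    round can pose, and attack, harder conjectures.
        model = train_on(model, examples)
    return model
```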

The authors present experiments across three domains: propositional logic, arithmetic, and group theory. MINIMO successfully bootstraps from the axioms in each of them, progressively tackling more complex problems. The agent is also capable of solving human-written theorems (e.g. from the Natural Numbers Game) that were not part of its training but are considered at least “interesting exercises” by real mathematicians.

A limitation of the current model is its inability to accumulate learned theorems as reusable lemmas, which will hinder its scalability to deeper theories. In my view, this is probably the biggest unsolved challenge in making this approach capable of tackling real mathematical problems.

 


Learning on Large Graphs using Intersecting Communities

Ben Finkelshtein, Ismail Ilkan Ceylan, Michael M. Bronstein, Ron Levie

The Szemerédi regularity lemma is one of the fundamental results of extremal graph theory and has sparked a whole new area of research within mathematics.

Roughly speaking, it states that all sufficiently large, dense graphs look alike and are characterised only by a partition of their vertices into a bounded number of parts and the edge densities between those parts. The Weak Regularity Lemma of Frieze and Kannan extends this to an algorithmically feasible version for finding such partitions.
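For reference, the Frieze–Kannan weak regularity lemma can be stated roughly as follows (my paraphrase of the standard statement, not notation taken from the paper): for every ε > 0, the vertices of any graph on n vertices can be partitioned into parts V_1, …, V_k, with k depending only on ε, such that

```latex
\left|\, e(S,T) \;-\; \sum_{i,j} d(V_i,V_j)\,\lvert S \cap V_i\rvert\,\lvert T \cap V_j\rvert \,\right|
\;\le\; \varepsilon\, n^2
\qquad \text{for all } S,\, T \subseteq V,
```

where e(S, T) counts the edges between S and T and d(V_i, V_j) is the edge density between parts. In other words, the pairwise densities between the parts already determine all edge counts up to an ε n² error.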

This paper presents a new method for efficient learning on large, non-sparse graphs by approximating them with Intersecting Community Graphs (ICGs) – combinations of intersecting cliques. The authors prove that such ICG approximations exist (a mathematical result probably interesting in its own right), describe how to construct them efficiently, and then provide an algorithm to learn on the ICG instead of on the graph directly.

Traditional Message Passing Neural Networks face computational challenges when dealing with large, dense graphs as their complexity is linear in the number of edges. This restricts scalability, especially in applications involving massive networks like social media platforms.

The newly proposed ICG approach is mostly relevant for reasonably dense graphs (at least on the order of n^(3/2) edges for n vertices), and training on the ICG has complexity linear only in the number of vertices – a significant improvement.
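To see where the linear-in-n cost comes from, here is a toy sketch. It assumes (my simplification, not necessarily the paper's exact parametrisation) that the ICG yields an affiliation-style factorisation of the dense adjacency, A ≈ F diag(r) Fᵀ with k intersecting communities, so a propagation step never materialises the adjacency matrix.

```python
import numpy as np

# Toy illustration of why learning on an ICG can be linear in the number of
# vertices. F, r and the single propagation step below are simplifying
# assumptions for illustration, not the paper's exact parametrisation.

def icg_propagate(F, r, H):
    """Compute (F diag(r) F^T) @ H in O(n * k * d) time, never forming A."""
    return F @ (r[:, None] * (F.T @ H))  # evaluate right-to-left: two thin matmuls

n, k, d = 10_000, 16, 32
rng = np.random.default_rng(0)
F = rng.random((n, k))            # soft community affiliations (n x k)
r = rng.random(k)                 # per-community weights / densities
H = rng.standard_normal((n, d))   # node features

out = icg_propagate(F, r, H)      # O(n k d) instead of O(|E| d) or O(n^2 d)
print(out.shape)                  # (10000, 32)
```

A message-passing-style layer built on this kind of propagation scales with the number of vertices and communities rather than with the number of edges, which is what makes the approach attractive for dense graphs.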

The paper also presents experimental results on tasks such as node classification and spatio-temporal data processing on real-world datasets, showing on-par performance for the ICG approach while using fewer computational resources.


Read more paper reviews

NeurIPS 2024: Paper Review #1

Discover the perspectives of Casey, one of our Machine Learning Engineers, on the following papers:

  • Towards scalable and stable parallelization of nonlinear RNNs
  • Logarithmic Math in Accurate and Efficient AI Inference Accelerators
Read now
NeurIPS 2024: Paper Review #2

Discover the perspectives of Trenton, one of our Software Engineers, on the following papers:

  • FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
  • Parallelizing Linear Transformers with the Delta Rule over Sequence Length
  • RL-GPT: Integrating Reinforcement Learning and Code-as-policy
Read now
NeurIPS 2024: Paper Review #3

Discover the perspectives of Mark, one of our Senior Quantitative Researchers, on the following papers:

  • Why Transformers Need Adam: A Hessian Perspective
  • Poisson Variational Autoencoder
  • Noether’s Razor: Learning Conserved Quantities
Read now
NeurIPS 2024: Paper Review #4

Discover the perspectives of Angus, one of our Machine Learning Engineers, on the following papers:

  • einspace: Searching for Neural Architectures from Fundamental Operations
  • SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization
Read now
NeurIPS 2024: Paper Review #5

Discover the perspectives of Dustin, one of our Scientific Directors, on the following papers:

  • QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
  • An Image is Worth 32 Tokens for Reconstruction and Generation
  • Dimension-free deterministic equivalents and scaling laws for random feature regression
Read now
NeurIPS 2024: Paper Review #7

Discover the perspectives of Cedric, one of our Quantitative Researchers, on the following papers:

  • Preference Alignment with Flow Matching
  • A Generative Model of Symmetry Transformations
Read now
NeurIPS 2024: Paper Review #8

Discover the perspectives of Hugh, one of our Scientific Directors, on the following papers:

  • Better by default: Strong pre-tuned MLPs and boosted trees on tabular data
  • Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data
NeurIPS 2024: Paper Review #9

Discover the perspectives of Andrew, one of our Quant Research Managers, on the following papers:

  • Algorithmic Capabilities of Random Transformers
  • The Road Less Scheduled
  • Time Series in the Age of Large Models
