Optimal Parallelization of Boosting
Arthur da Cunha, Mikael Møller Høgsgaard, Kasper Green Larsen
The main contribution of this paper is an algorithm that parallelizes boosting in a somewhat efficient way.
The scenario used by the authors is quite restrictive but the general idea should be easily extendable:
Consider a binary classification problem (i.e. all labels are either +1 or -1) on a sample set X of size m. A γ-weak-learner W is a learning algorithm (e.g. fitting a decision tree) that, for any distribution D over X, when given at least some constant number of samples drawn from D, produces with constant probability a hypothesis h whose error under D is at most 1/2 - γ.
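To make this contract concrete, here is a minimal sketch of a γ-weak-learner as a randomized decision stump. This is my own illustration, not code from the paper; the function name, the choice of stump, and the parameter n_samples are assumptions.

```python
import numpy as np

def decision_stump_weak_learner(X, y, D, n_samples=100, rng=None):
    """Hypothetical gamma-weak-learner: fit a decision stump on points
    drawn i.i.d. from the distribution D over the rows of X. With constant
    probability its error under D should be at most 1/2 - gamma
    (for a suitable gamma depending on the data)."""
    rng = np.random.default_rng(rng)
    idx = rng.choice(len(X), size=n_samples, p=D)   # sample from D
    Xs, ys = X[idx], y[idx]

    best, best_err = None, np.inf
    for j in range(X.shape[1]):               # feature to split on
        for thr in np.unique(Xs[:, j]):       # candidate threshold
            for sign in (+1, -1):
                pred = sign * np.where(Xs[:, j] >= thr, 1, -1)
                err = np.mean(pred != ys)
                if err < best_err:
                    best_err, best = err, (j, thr, sign)

    j, thr, sign = best
    # Return the hypothesis h: a function mapping rows of X to {-1, +1}.
    return lambda Z: sign * np.where(Z[:, j] >= thr, 1, -1)
```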
In many cases, computing W itself is not parallelizable, and classical boosting frameworks (e.g. AdaBoost) must call W sequentially, so no parallel computation is possible.
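The sequential bottleneck is visible in a standard AdaBoost loop: the distribution used in round k+1 depends on the hypothesis returned in round k, so the calls to W form a strict chain. The sketch below is the textbook procedure, shown only to illustrate this dependence.

```python
import numpy as np

def adaboost(X, y, weak_learner, rounds):
    m = len(X)
    D = np.full(m, 1.0 / m)             # uniform initial distribution over X
    hypotheses, alphas = [], []
    for _ in range(rounds):
        h = weak_learner(X, y, D)       # must wait for the previous round
        pred = h(X)
        err = np.sum(D[pred != y])      # weighted error under D
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        D = D * np.exp(-alpha * y * pred)
        D /= D.sum()                    # next call to W depends on this h
        hypotheses.append(h)
        alphas.append(alpha)
    return lambda Z: np.sign(sum(a * h(Z) for a, h in zip(alphas, hypotheses)))
```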
The main point of the paper is an algorithm that makes its calls to W in parallel in an optimal way. The total amount of work increases exponentially in this regime, but wall-clock time may still decrease if enough computational capacity is available.
The main idea of the algorithm is as follows:
For p rounds: subsample t sample sets from the current distribution in parallel and train a weak learner on each. Group the resulting t weak learners into a number of groups and (sequentially) apply a standard update step with the best (or the negation of the worst) weak learner in each group, but only if it improves the overall loss.
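The following is a rough sketch of one such round as I read the description above, not the authors' exact procedure. The function name, the loss callback, and the use of a thread pool are my own assumptions; the subsampling from D happens inside each call to the weak learner.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_boosting_round(X, y, D, weak_learner, t, n_groups, ensemble, loss):
    """One hypothetical round: t parallel calls to W, then a sequential
    pass over the groups, accepting the best learner per group only if it
    lowers the overall loss of the ensemble."""
    with ThreadPoolExecutor() as pool:
        # t independent (randomized) calls to W on subsamples drawn from D
        learners = list(pool.map(weak_learner, [X] * t, [y] * t, [D] * t))

    groups = np.array_split(np.arange(t), n_groups)
    for group in groups:
        # pick the best learner in this group w.r.t. weighted error under D
        # (the negated worst learner could be used instead; omitted here)
        errs = [np.sum(D[learners[i](X) != y]) for i in group]
        h = learners[group[int(np.argmin(errs))]]
        candidate = ensemble + [h]
        if loss(candidate, X, y) < loss(ensemble, X, y):   # accept only if it helps
            ensemble = candidate
        # a standard AdaBoost-style reweighting of D would follow each
        # accepted learner; omitted here for brevity
    return ensemble
```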
The paper then suggests how to choose the number of rounds p optimally given a desired degree of parallelism t. Here t = 1 roughly recovers AdaBoost, and along the optimal trade-off curve p · log t stays approximately constant.
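A back-of-the-envelope illustration of this trade-off (the constant C below is an arbitrary placeholder, not a value from the paper): holding p · log t fixed, increasing the parallelism t shrinks the number of sequential rounds only logarithmically, while the total number of calls to W grows rapidly.

```python
import math

C = 64.0                        # placeholder for the problem-dependent constant
for t in (2, 16, 256, 65536):
    p = C / math.log2(t)        # rounds needed at parallelism t, per p*log t ~ C
    print(f"t = {t:>6}  ->  p ~ {p:.1f} rounds, total work ~ {p * t:,.0f} calls to W")
```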
The paper also proves that this bound is optimal up to log factors by constructing a slightly pathological weak learner for the biased-coin problem. I am not completely convinced that such weak learners arise often in practice, so further improvements may be possible in many practical instances.