hackernews client

Recursive Language Models (RLMs)

82 pointsposted 10 hours ago

(alexzhang13.github.io)

24 Comments

cs702

6 hours ago

Briefly, an RLM wraps an existing language model (LM) together with an environment that can dynamically manipulate the prompt that will be fed into the LM.

The authors use as an environment a Python REPL that itself can call other instances of the LM. The prompt is programmatically manipulated as a Python variable on the REPL.

The motivation is for the LM to use Python commands, including commands that call other LM instances, to figure out how best to modify the context at inference time.

The results from early testing look impressive at a first glance: An RLM wrapping GPT-5-mini outperforms GPT-5 by a wide margin on long-context tasks, at significant lower cost.

I've added this to my reading list.

This reminded me of ViperGPT[1] from a couple of years ago, which is similar but specific to vision language models. Both of them have a root llm which given a query produces a python program to decompose the query into separate steps, with the generated python program calling a sub model. One difference is this model has a mutable environment in the notebook, but I'm not sure how much of a meaningful difference that is.

[1] https://viper.cs.columbia.edu/static/viper_paper.pdf

ttul

2 hours ago

This is what Codex is doing. The LM has been trained to work well with the kinds of tools that a solid developer would use to navigate and search around a code repository and then to reason about what it finds. It’s also really competent at breaking down a task into steps. But I think the real magic - watching this thing for at least 40 of the last 50 working hours - is how it uses command line tools to dig through code quickly and accurately.

It’s not relying on the LM context much. You can generally code away for an hour before you run out of context and have to run a compression step or just start fresh.

nowittyusername

5 hours ago

My existing project is very similar to this with some other goodies. I agree with the author that focus on systems versus LLM's is the proper next move. Orchestrating systems that manage multiple different llms and other scripts together can accomplish a lot more then a simple ping pong type of behavior. Though I suspect most people who work with agentic solutions are already quite aware of this. What most in that space haven't cracked yet is the dynamic self modifying and improving system, that should be the ultimate goal for these types of systems.

jgbuddy

8 hours ago

This is old news! Agent-loops are not a model architechture

adastra22

5 hours ago

I’m confused over your definition of model architecture.

layer8

6 hours ago

Loops aren’t recursion?

antonvs

2 hours ago

Loops and recursion are fundamentally equivalent.

See e.g. https://textbooks.cs.ksu.edu/cc210/16-recursion/08-recursion...

laughingcurve

7 hours ago

Everything old is new again when you are in academia

hodgehog11

6 hours ago

This feels primarily like an issue with machine learning, at least among mathematical subdisciplines. As new people continue to be drawn into the field, they rarely bother to read what has come even a few years prior (nevermind a few decades prior).

behnamoh

4 hours ago

in today's news: MIT researchers found out about AI agents and rebranded it as RLM for karma.

quibit

6 hours ago

> Lastly, in our experiments we only consider a recursive depth of 1 — i.e. the root LM can only call LMs, not other RLMs. It is a relatively easy change to allow the REPL environment to call RLMs instead of LMs, but we felt that for most modern “long context” benchmarks, a recursive depth of 1 was sufficient to handle most problems. However, for future work and investigation into RLMs, enabling larger recursive depth will naturally lead to stronger and more interesting systems.

It feels a little disingenuous to call it a Recursive Language Model when the recursive depth of the study was only 1.

fizx

4 hours ago

I read the article, and I'm struggling to see what ideas it brings beyond CodeAct (tool use is python) or the "task" tool in Claude code (spinning off sub-agents to preserve context).

lukebechtel

an hour ago

this doesn't appear to bring anything new to the table.

please correct me if I'm wrong..this is just subagent architecture?

It was even before the rise of LLMs

The authors may want to consider a more specific name

halfmatthalfcat

8 hours ago

It broke new ground!

Recursive Language Models (RLMs)

24 Comments

cs702

nathanwh

ttul

nowittyusername

jgbuddy

adastra22

layer8

antonvs

laughingcurve

hodgehog11

behnamoh

quibit

fizx

lukebechtel

yandie

UltraSane

ipnon

wild_egg

rancar2

ayazhan

yorwba

foolswisdom

gdiamos

halfmatthalfcat