Dissociating language and thought in large language models

23 points | posted 4 hours ago by rntn | 1 comment

prolyxis

an hour ago

The human brain, the authors argue, in fact uses multiple networks when interpreting and producing language. These include:

- the language network, which delivers formal linguistic competence
- the multiple demand network, which provides reasoning ability
- the default network, which tracks narratives above the clause level
- the theory of mind network, which infers the mental state of another entity

This leads them to argue that a modular architecture would improve an LLM's ability to be both formally and functionally competent. (While LLMs currently exhibit human-level formal linguistic competence, their functional competence, the ability to navigate the real world through language, still has room for improvement.)
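For intuition, here's a toy sketch (my own, not the paper's proposal) of what explicit modularity could look like in code: a learned gate soft-routes each token representation between a "language" expert and a "reasoning" expert, mixture-of-experts style. The two-module split and the module names are illustrative assumptions.

    import torch
    import torch.nn as nn

    class TwoModuleLayer(nn.Module):
        """Toy modular layer: a gate mixes a 'language' expert and a
        'reasoning' expert per token. Illustration only."""
        def __init__(self, d_model: int):
            super().__init__()
            self.language_module = nn.Linear(d_model, d_model)   # stand-in for formal competence
            self.reasoning_module = nn.Linear(d_model, d_model)  # stand-in for functional competence
            self.gate = nn.Linear(d_model, 2)                    # per-token soft routing weights

        def forward(self, x):                                    # x: (batch, seq, d_model)
            w = torch.softmax(self.gate(x), dim=-1)              # (batch, seq, 2)
            return (w[..., :1] * self.language_module(x)
                    + w[..., 1:] * self.reasoning_module(x))

Real mixture-of-experts transformers do something similar at scale, just with many experts and hard top-k routing rather than a two-way soft mix.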

Transformer models, they note, have a degree of emergent modularity through "allowing different attention heads to attend to different input features."
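To make "different heads attend to different features" concrete, here's a quick probe (my sketch, not from the paper) that pulls per-head attention maps out of GPT-2 via Hugging Face transformers and scores how focused each head is using entropy. The model choice and the entropy metric are assumptions; low entropy means a sharply focused head, high entropy a diffuse one.

    import torch
    from transformers import GPT2Tokenizer, GPT2Model

    tok = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2Model.from_pretrained("gpt2", output_attentions=True)
    model.eval()

    inputs = tok("The cat sat on the mat because it was tired.", return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)

    # out.attentions: tuple of 12 layers, each (batch, n_heads, seq, seq)
    for layer, attn in enumerate(out.attentions):
        probs = attn[0]                                      # (n_heads, seq, seq)
        # entropy of each head's attention, averaged over query positions
        ent = -(probs * (probs + 1e-9).log()).sum(-1).mean(-1)
        print(f"layer {layer:2d} head entropies:",
              ["%.2f" % e for e in ent.tolist()])

Heads with very different entropy profiles within the same layer are at least weak evidence of the kind of specialization the authors describe.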

I was wondering: is it possible to characterize the degree of emergent modularity in current systems?
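Partly answering my own question: one possible operationalization (my assumption, not something the paper proposes) is to treat heads as nodes in a similarity graph, connecting heads whose behavior correlates, and then compute a Newman modularity score over a community partition. Here `head_vectors` is a hypothetical (n_heads_total, d) array of per-head features, e.g. the attention maps from the snippet above flattened over a probe corpus.

    import numpy as np
    import networkx as nx
    from networkx.algorithms import community

    def modularity_score(head_vectors: np.ndarray, threshold: float = 0.5) -> float:
        # pairwise correlation between heads; similar heads get an edge
        corr = np.corrcoef(head_vectors)
        n = corr.shape[0]
        G = nx.Graph()
        G.add_nodes_from(range(n))
        for i in range(n):
            for j in range(i + 1, n):
                if corr[i, j] > threshold:
                    G.add_edge(i, j, weight=corr[i, j])
        if G.number_of_edges() == 0:
            return 0.0                                   # no structure at all
        # Newman modularity of the best greedy partition: ~0 means no
        # modular structure; ~0.3+ usually indicates real communities.
        parts = community.greedy_modularity_communities(G, weight="weight")
        return community.modularity(G, parts, weight="weight")

Whether that captures the kind of modularity the authors care about (functional separation of linguistic vs. reasoning machinery) is a separate question; interpretability work on attention-head clustering seems like the place to look.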