lsy
a year ago
Note that for the purposes of this paper a “problem” just means a formally decidable problem or a formal language, and the proof is that by creatively arranging transformers you can make individual transformer runs behave like individual Boolean circuits. However, this is a long way from any practical application of transformers: for one thing, most problems we care about are not stated as formal languages, and we already have an exceptionally more efficient way to implement Boolean circuits.
shawntan
a year ago
If a "problem we care about" is not stated as a formal language, does it mean it does not exist in the hierarchy of formal languages? Or is it just as yet unclassified?
tsimionescu
a year ago
It means that there are two problems: one, to formalize the problem as stated while capturing all relevant details, and two, solving the resulting formal problem. Until you solve problem one, you can't use formal methods to say anything about the problem (it's not even clear a priori that a problem is even solvable).
Unfortunately, the task of a formalizing an informal problem is itself an informal problem that we don't know how to formalize, so we can't say much about it. So overall, we can't say much about how hard the general problem "given a problem statement from a human, solve that problem" is, whether any particular system (including a human!) can solve it and how long that might take with what resources.
viraptor
a year ago
> task of a formalizing an informal problem is itself an informal problem
I couldn't find details about this - do you know of a paper or some resource which digs into that idea?
tsimionescu
a year ago
No, but it's pretty obvious, isn't it? If you have an informal problem statement, say "I want this button to be bigger", formalizing it can't be a formal process.
naasking
a year ago
> "I want this button to be bigger", formalizing it can't be a formal process.
while (!is_button_big_enough()) {
button.scaleUp(1.1);
}
This is one trivial way to do it, and seems like it would be formalizable. is_button_big_enough is simply an input to whatever process is responsible for judging such a thing, whether that be a design specification or perhaps input from a person.tsimionescu
a year ago
You've translated my informal problem statement into a quasi-formal process, using your inherent natural language processing skills, and your knowledge of general human concepts like size. But you haven't explained the formal process you followed to go from my problem statement to this pseudocode.
And your pseudocode template only works for one particular kind of informal problem statement. If I instead have the problem "how much money do I need to buy this house and this chair?", or "does this byte fit in my mouth?", your general form will not work.
And what's more, you haven't actually produced a formally solvable problem definition, that we could analyze for complexity and computability, because you rely on two completely unspecified functions. Where is the formal defintion of a button? Is it a physical push button or a UI control or a clothing button? What does it mean that it is bigger or smaller? When do we know it's big enough, is that computable? And how do we scale it up? Do we increase its volume? Its surface area? One of its sides? Or maybe the radius? And how do we go about doing that? All of these, and many more, need to be explicitly defined in order to apply any kind of formal analysis to this problem. And there is no formal way to do so in a way that matches the intent of whoever posed the problem.
naasking
a year ago
> And what's more, you haven't actually produced a formally solvable problem definition, that we could analyze for complexity and computability, because you rely on two completely unspecified functions. Where is the formal defintion of a button?
Well your statement was underspecified. You said "I want this button bigger". There are procedures to translate informal statements to formal ones, but one basic step is underspecified referents are delegated to abstractions that encapsulate those details, so "this button" designates some kind of model of a button, and "I" refers to a subject outside the system thereby implying some kind of interactive process to query the subject whether the model is satisfactory, eg. a dialog prompt asking "Is this button big enough now?"
You call these skills "inherent", but humans are not magical. We employ bug riddled poorly specified procedures for doing this kind of interpretive work, and LLMs have already started to do this too, and they'll only get better. Is asking a deterministic LLM to generate a formal specification or program to achieve some result a formal process? I don't think these lines are as clear as many think, not anymore.
tsimionescu
a year ago
I think we're mostly agreed actually. I'm not trying to claim that this is an unsolvable problem, just that it's a difficult problem that we don't have a solution for yet. And yes, LLMs are probably our best tool so far. And asking for clarifying questions is clearly a part of the process.
I will say that there is also a possibility the general form of the formal problem is in fact uncomputable. It seems possible to me it might be related to the halting problem. But, until we have a formal specification of it, we won't know, of course.
freejazz
a year ago
What is the repeatable method by which you came to that conclusion? That is what needs to be formalized for your response to make sense.
naasking
a year ago
There are procedures for translating informal statements to formal ones. If I submit such informal statements to an LLM and ask it to generate a spec or program to achieve some result, that can be made repeatable. There are various arrangements to make this more robust, like having another LLM generate test cases to check the work of the other. Does this qualify?
viraptor
a year ago
It's... "knee-jerk obvious". But is it actually true? People seem to be interested in the concept in formal logic arguments for example https://www.researchgate.net/publication/346658578_How_to_Fo... (which uses formal process for part of formalization), so maybe it's not as simple as it seems initially. I mean, if we're already talking about formal problems, it could use a stronger proof ;)
tsimionescu
a year ago
At best, this is a formal process for manipulating certain kinds of statements. But the general problem, "take a human's statement of a problem and translate it into a formal statement of a problem that, if solved, will address what the human was asking for" is far harder and more nebulous. Ultimately, it's exactly the problem that LLMs have been invented for, so it has been studied in that sense (and there is a broad literature in AI for NLP, algorithm finding, expert systems, etc). But no one would claim that they are even close to having a formal specification of this problem that they could analyze the complexity of.
moi2388
a year ago
Why not? Bigger is a measure of size and ought to be easy enough to formalise.
Apply a transformation to B which increases its area and leaves the proportion of its sides equal.
Why would this statement not be formalisable?
tsimionescu
a year ago
I'm not saying that the statement "I want this button to be bigger" can't be formalized. I'm saying that there is no formal process you can follow to get from this problem to a formal problem that is equivalent. There isn't even a formal process you can use to check if a formal definition is equivalent to this problem.
Consider that if someone asked you solve this problem for them with just this statement, either of the following could be a sketch of a more formal statement of what they actually want:
1. In a given web page, the css class used for a particular <button> element should be changed to make the button's height larger by 10%, without changing any other <button> element on the page, or any other dimension.
2. For a particular piece of garment that you are given, the top most button must be replaced with a different button that appears to have the same color and finish to a human eye, and that has the same 3D shape up to human observational precision, but that has a radius large enough to not slip through the opposing hole under certain forces that are commonly encountered, but not so large that it doesn't fit in the hole when pushed with certain forces that are comfortable for humans.
I think you would agree that (a) someone who intended you to solve either of these problems might reasonably describe them with the statement I suggested, and (b), that it would be very hard to devise a formal mathematical process to go from that statement to exactly one of these statements.
moi2388
a year ago
Ah, gotcha. I agree it would be difficult. I’m still not convinced it would be impossible though.
LLMs could even formalise what you want in the context, even now.
Or do you mean that you can’t formalise every statement when given incomplete information about the context of the statement, since then we have a single word pointing to multiple different contexts?
tsimionescu
a year ago
Oh yes, it's not impossible, I'm just saying we don't know how to do it yet. LLMs themselves are probably our best attempt so far.
Zhyl
a year ago
But here's the thing, it's not that the statement isn't formalisable, it's the method that you used to formalise it isn't formalisable.
qwertytyyuu
a year ago
Yeah you could make it one pixel bigger but if someone asked you that, is that what they actually want?
esjeon
a year ago
Ah, you are informally inquiring about a formal description concerning the informal nature of formalization of informal questions.
Joke aside, this is about the nature of the formalization process itself. If the process of formalizing informal problems were fully formalized, it would be possible to algorithmically compute the solution and even optimize it mathematically. However, since this is obviously impossible (e.g. vague human language), it suggests that the formalization process can't be fully formalized.
wslh
a year ago
My 2 cents: Since LLMs (Large Language Models) operate as at least a subset of Turing machines (which recognize recursively enumerable languages), the chain of thought (CoT) approach could be equivalent to or even more expressive than that subset. In fact, CoT could perfectly be a Turing machine.
If we leave CoT aside for a moment, it's worth exploring the work discussed in the paper "Neural Networks and the Chomsky Hierarchy"[1], which analyzes how neural networks (including LLMs) map onto different levels of the Chomsky hierarchy, with a particular focus on their ability to recognize formal languages across varying complexity.
flir
a year ago
> In fact, CoT could perfectly be a Turing machine.
Are we going to need an infinite number of LLMs, arranged on a tape?
julienreszka
a year ago
> most problems we care about are not stated as formal languages
then a way would be to translate them to formal language