Chain of Thought prompting reminds me of Facilitated Communication:
https://en.wikipedia.org/wiki/Facilitated_communication
A long-discredited intervention in which a "facilitator" guides the hand of a non-verbal person to help them write down their thoughts and experiences. Experiments that blinded the facilitator to what the subject had observed, and found that the written messages matched the facilitator's observations rather than the subject's, convincingly proved it was so much bunkum. It's the Clever Hans effect by another name, with non-verbal humans rather than horses.
Chain of Thought works like that: without hand-holding by a human who understands how to answer a question, the LLM's performance drops, or even falls off a cliff. Of course, this is much harder to prove for LLMs than it was for facilitated communication, because LLMs don't do anything without a prompt in the first place. Which should be a very big hint about what's really going on with CoT.
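For anyone who hasn't seen it up close, here is a minimal sketch of what that hand-holding looks like in practice. `call_llm` is a hypothetical stand-in for whatever model client you actually use, and the prompts are invented for illustration; the only point is the difference between the two prompts.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for an actual model call (swap in your own client)."""
    raise NotImplementedError("wire this up to your own model")


QUESTION = (
    "A bat and a ball cost $1.10 together. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)

# Bare prompt: the model is left to answer on its own.
bare_prompt = QUESTION

# Chain-of-Thought prompt: a human-authored worked example shows the model
# exactly what shape of reasoning to reproduce before it sees the question.
cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: Roger starts with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
    f"Q: {QUESTION}\n"
    "A: Let's think step by step."
)

# answer_bare = call_llm(bare_prompt)
# answer_cot = call_llm(cot_prompt)
```

The reported gains come from the second prompt, i.e. from a human who already knows how this kind of question should be worked through.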
People out there are trying to build some semblance of AI out of an LLM, using larger and larger networks of "agents" that generate, classify, revise and verify data with the very same LLM they're building those larger and larger networks of agents on top of, to try and build some semblance of AI.
The end game is a brain-sized network where each neuron is an agent sending a 1M token prompt to a 10T parameter model to update its "weights".
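To make the circularity concrete, here's a toy sketch of that pattern (not any particular framework's API): every "agent" is just a differently-worded prompt to the same hypothetical `call_llm`, including the one that decides whether the others did a good job.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for whatever model client you actually use."""
    raise NotImplementedError


def generate(task: str) -> str:
    return call_llm(f"Produce a draft answer for: {task}")


def classify(draft: str) -> str:
    return call_llm(f"Critique this draft and say what needs work:\n{draft}")


def revise(draft: str, critique: str) -> str:
    return call_llm(
        f"Revise the draft given the critique.\nDraft: {draft}\nCritique: {critique}"
    )


def verify(draft: str) -> bool:
    # The same model grades its own homework.
    reply = call_llm(f"Answer yes or no: is this draft correct?\n{draft}")
    return reply.strip().lower().startswith("yes")


def agent_pipeline(task: str, max_rounds: int = 3) -> str:
    draft = generate(task)
    for _ in range(max_rounds):
        if verify(draft):
            return draft
        draft = revise(draft, classify(draft))
    return draft
```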
Sometimes it looks like the computationalists are trying to sneak back into the room while no-one is looking.
There do seem to be quite a lot of independent, ad-hoc efforts at making custom notations for CoT. I feel like we're in a period similar to just after the first programming languages and compilers were invented, but before regular expressions had come along. In a way that's quite exciting; it's another little Cambrian explosion.
I don't think it will be a panacea though. In my observations of reasoning failures in LLMs, a lot of the problem isn't that they fail to follow logical steps, but that they fail to notice implied premises at all. Chain of Thought is good for spotting wrong reasoning, but not for spotting that the problem is not the one it appears to be at first glance.