LLMs Aren't Thinking

I haven’t had any argument for my stance that LLMs aren’t thinking (at least not the way humans do), other than that they are constructed to just guess the most likely output.

Someone on Twitter was complaining that one of the coding benchmarks for LLMs tests only Python:

swe-bench is 100% python based?? we’re evaluating model coding abilities based on one language? and it’s python?!

(from: https://x.com/adamdotdev/status/1907824331695988824)

It would be better if this benchmark covered a wider range of languages. However, as some of the replies say, the programming language used is incidental to the process of solving an issue with computers. So, at the end of the day, the language shouldn’t matter so much.

It shouldn’t matter… right?

But what about languages that are less popular, for which the base models have had fewer resources to train on?

I’ve heard Zig developers say that they feel LLMs do not perform as well writing Zig as they do writing more popular languages. Of course, some of this has to do with Zig still being a pre-1.0 language. But then how come human programmers are able to write Zig productively?

I think the difference is the ability to think. From what I hear, a human programmer with at least some proficiency in a similar language such as C or C++ can pick up Zig quite easily. Humans are capable of carrying that thinking over.

I’m not discrediting the power of LLMs. In the latter half of this post, I’ve been comparing their reasoning abilities to those of humans. Isn’t that already incredible? I didn’t think I’d be able to see something like this until much further in the future, and I’m still amazed by what they do for me in my day-to-day.

But I think we’re still fairly far from AGI, and more than just a few years away, unless another breakthrough happens in that short period. I’m doubtful that it’ll come from just throwing more data at the models, and besides, they’re apparently running out of quality data to train on anyway.

So I say LLMs aren’t thinking (at least not like humans (at least not yet)).