Awareness is just continuous propagation of the neural network, be that artificial or biological. The reason thoughts just "appear" is because the brain is continuously propagating signal through the neural network. LLMs also do this during their decoding phase, where they reason continuously with every token that they generate. There is no difference here.
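To make that concrete, here is a minimal sketch of the decoding phase (using the Hugging Face transformers API, with gpt2 purely as a stand-in model): the whole network is propagated end to end for every single token that gets generated.

```python
# Minimal sketch of the decoding phase: one full forward pass through
# every layer of the network for each generated token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tok("Thoughts appear because signal keeps flowing", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(20):
        logits = model(ids).logits                     # signal propagates through all layers
        next_id = logits[0, -1].argmax()               # greedy choice of the next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

print(tok.decode(ids[0]))
```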
Then you say "we don't think most of the times using language exclusively", but neither do LLMs. What most people fail to realise is that in between each token being generated, black magic is happening in between the transformer layers. The same type of magic you describe. High dimensional. Based on complex concepts. Merging of ideas. Fusion of vectors to form a combined concept. Smart compression. Application of abstract rules. An LLM does all of these things, and more, and you can prove this by how complex their output is. Or you can read Anthropic's studies on interpretability, on how LLMs do math underneath the transformer layers and how they manipulate information.
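And here is a toy sketch of the "fusion of vectors" part: attention builds each new representation as a weighted sum of value vectors, so the result is a blend of several concepts rather than any single token. The dimensions and numbers below are made up purely for illustration.

```python
# Toy single-head attention: each output row is a weighted blend of the
# value vectors, i.e. a fused representation of several "concepts".
import torch
import torch.nn.functional as F

d = 16                                    # toy hidden size
x = torch.randn(5, d)                     # 5 token representations

Wq, Wk, Wv = torch.randn(d, d), torch.randn(d, d), torch.randn(d, d)
q, k, v = x @ Wq, x @ Wk, x @ Wv

scores = (q @ k.T) / d ** 0.5             # pairwise relevance between tokens
weights = F.softmax(scores, dim=-1)       # attention weights
fused = weights @ v                       # each row is a weighted sum of value vectors

print(weights[4])                         # how much the last token draws from the others
print(fused.shape)                        # torch.Size([5, 16]) -- blended, non-token vectors
```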
AGI is not here with LLMs, but it's not because they lack reasoning ability. It's due to something different. Here is what I think is truly missing: continuous learning, long-term memory, and infinite, efficient context/operation. All of these are deeply tied together, and thus I believe we are but a single breakthrough away from AGI.
There are very significant differences between biological and artificial neural networks. Artificial neural networks are mathematical attempts at replicating how the brain's neurons work. They are not, and were never meant to be, 1-to-1 replications. There is the difference in scale, where the "parameters" of human neural networks absolutely dwarf the LLMs we have today. There is also the fact that they are materially different. The underlying biology and cell structure affects biological neural networks in ways that artificial neural networks simply don't have access to.
The idea of awareness being propagation through the NN is an interesting concept, though. I wonder if this idea could be tested by monitoring the electrical signals within the brain.
People like to focus on the differences between the brain and artificial neural networks. I myself believe the only thing that truly matters is that you can form complex functions with the common neuron element. This is achieved by linking lots of them together, and by each having a property known as non-linearity. These two things ensure that with neurons you can approximate just about any linear or non-linear function or behaviour. This means you can simulate inside your network pretty much any reality within this universe, its causes and its effects.
The deeper your network, the more complex the reality you can "understand". Understanding here just means simulating: running inputs to get outputs in a way that matches the real phenomenon. When someone is said to be "smart", it means they possess a set of rules and functions that can very accurately predict a reality.
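As a toy illustration of why non-linearity plus depth matters (the layer sizes and training settings here are arbitrary), a small ReLU network can fit a non-linear function like sin(x), while a stack of purely linear layers cannot, no matter how many you add, because it collapses to a single linear map.

```python
# Toy illustration: a small ReLU network fits sin(x); a stack of purely
# linear layers cannot, because it collapses to a single linear map.
import torch
import torch.nn as nn

x = torch.linspace(-3, 3, 256).unsqueeze(1)
y = torch.sin(x)

relu_net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(),
                         nn.Linear(64, 64), nn.ReLU(),
                         nn.Linear(64, 1))
linear_net = nn.Sequential(nn.Linear(1, 64), nn.Linear(64, 1))  # no non-linearity

for name, net in [("relu", relu_net), ("linear", linear_net)]:
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(2000):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(x), y)
        loss.backward()
        opt.step()
    print(name, round(loss.item(), 4))    # relu -> near zero, linear -> stuck much higher
```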
You mention scale, and while it's true that the brain has more neuron elements than any LLM, it's also true that the brain is more sparse, meaning far fewer of its neurons are active at the same time. For a fairer comparison, you can also remove the motor cortex from the discussion and talk only about the networks that reason. I believe the scale is comparable.
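For what it's worth, the back-of-envelope arithmetic behind that claim looks something like the sketch below. Every number in it is a rough published estimate or an outright assumption (the sparsity figure, the reasoning fraction, and the LLM parameter count in particular), so treat it as an illustration of the reasoning, not a measurement.

```python
# Back-of-envelope only: every number is a rough estimate or an assumption.
brain_synapses     = 1e14    # ~100 trillion synapses, a commonly cited estimate
active_fraction    = 0.02    # assumed sparsity: only a few percent active at once
reasoning_fraction = 0.5     # assumption: exclude motor/sensory machinery

effective_brain = brain_synapses * active_fraction * reasoning_fraction
llm_params = 1e12            # hypothetical frontier-scale parameter count

print(f"effective brain 'parameters': {effective_brain:.1e}")   # 1.0e+12
print(f"assumed LLM parameters:       {llm_params:.1e}")        # 1.0e+12
```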
In essence, I don't think it matters that the brain has a whole bunch of chemistry added into it that artificial neural networks don't. The underlying deep non-linear function-mapping capability is the same, and I believe this depth is comparable in both cases.
While it might be right to think that current mathematical models accurately replicate a fundamental aspect of biological neural networks, that doesn't mean nothing is missing on the way to the stated goal of true intelligence.
Maybe we've just reached the ability to replicate the function of an artificially powered dead brain that is randomly stimulated and nothing more. Is this really a path to intelligence?
Isn't the brain randomly stimulated already, even when it isn't dead? Don't you think the complex reasoning comes from the neurons themselves and not from the stimulation?
Animals are alive and are not nearly as smart. It's because their neural networks are not as deep, not because they lack the proper chemistry or stimulation.
Why would it have to be a 1-to-1 replication? Isn't that a strawman argument? NNs can store basically the collective knowledge of humanity in that minuscule number of neurons. NNs also run at a much, much higher frequency than human brains. Does that make human brains inferior and not worthy of being called aware, by the same line of argumentation? Why do these differences even matter? I can imagine a form of awareness vastly different from a human's just fine. They can both be aware without being that similar.
> Awareness is just continuous propagation of the neural network, be that artificial or biological. The reason thoughts just "appear" is because the brain is continuously propagating signal through the neural network.
This is just a claim you are making, without evidence.
The way you understand awareness is not through "this is like that" comparisons. These comparisons fall over almost immediately as soon as you turn your attention to the mind itself, by observing it for any length of time. Try it. Go observe your mind in silence for months. You will observe for yourself it is not what you've declared it to be.
> An LLM does all of these things, and more, and you can prove this by how complex their output is.
Complex output does not prove anything. You are again just making claims.
It is astoundingly easy to push an LLM over to collapse into ungrounded nonsense. Humans don't function this way because the two modes of reasoning are not alike. It's up to those making extraordinary claims to prove otherwise. As it is, the evidence does not exist that they behave comparably.
What makes you think you can understand the subjective experience of LLMs then? It's out of reach of science, so the only way is to ask them? How can you be sure they don't have subjective experience? Remember that you forbade yourself from using science to answer it.
Also, if subjective experience has any effect on the material world, then we can measure it and test it, putting it in reach of science. If it doesn't, why does it even matter in this discussion? By definition it has no relation to the AGI discussion since that's an empirical matter.
Haha, well, I would appreciate it if comments included more substantive evidence when they make claims like that.
I see a lot of "AGI boosters/doomers" comfortable making big claims without providing research to back them up; when challenged, those claims prove to be just their model or feeling of how things function.
Oh I agree with you, I was just underscoring that.
It seems to be a case of people looking at a problem they have little knowledge or understanding of and thinking "how hard can it be"? In this case, the answer is "so hard that philosophers have dubbed it 'the hard problem'".
Philosophers are mostly unaware of artificial neural networks. The game has changed, you can understand a lot about the human mind if you understand AI. Don't get too stuck in the past.
How about an objection to what I said? A case where someone is conscious but without continuous propagation of neural signals? Or something along those lines.
You didn’t provide enough of a hypothesis to seriously discuss.
> A case where someone is conscious but without continuous propagation of neural signals?
That would be irrelevant. All known conscious beings are made up of biological cells, but that doesn’t prove that all conscious beings must be made of biological cells, or that biological cells are the key causative factor in consciousness. The same goes for “continuous propagation of neural signals.”
You described a personal conjecture as though it solved a known hard problem, even throwing in the word “just” as though the solution is really simple. This is a lot like the Feynman quote about quantum mechanics: if you think you understand it, you almost certainly don’t. You may not even have recognized the core problem yet. The original Chalmers paper is a good place to start: https://consc.net/papers/facing.pdf
But coming at it from a computational perspective, in some ways it’s even easier to see the problem. We don’t generally assume that a deterministic, non-neural net program written in say Python has a conscious subjective experience. To use Nagel’s terminology, there is “nothing it is like” to be that program. But, an LLM or any other computational neural net is no different from a program like that. It’s executing deterministic instructions, like a machine, because it is a machine. We can speculate about consciousness being some sort of emergent property that arises in such systems given the right conditions, but that’s all it is: speculation.
And it’s completely unclear what those right conditions might be, or how those conditions could possibly give rise to conscious experience. Why aren’t humans philosophical zombies with no conscious experience, just reacting to input like machines? No-one has succeeded in getting past the conjecture stage in answering that question.
I am prepared to, and want to, seriously discuss every one of my viewpoints. The initial comment was just the abstract. I am extremely confident in my world view about Deep Learning and cognitive ability, and the reason is that I generally try to avoid doing what you just did, that is, reading what other people think about this subject.
I instead choose to ground my views in real-world experiments and in information I have gathered and experienced. This consists primarily of an enormous amount of experimentation with Deep Learning models, both inference and training. My views come mostly from that. I don't recite Andrej Karpathy or Ilya Sutskever; I don't even care about their opinions for the most part. I experiment with the models to such an extreme degree that I understand very well how they behave and what their limitations are. And I believe that if you are going to create a breakthrough, this is the only way to do so.
> an LLM or any other computational neural net is no different from a program like that
I don't think so. A program doesn't exhibit highly complex abstract thought in a very high-dimensional space.
> It’s executing deterministic instructions, like a machine, because it is a machine
It's true that LLMs are deterministic. But do you really think that the magic behind the brain is only due to temperature and randomness? Do you really think that non-deterministic behaviour is the magic ingredient that makes up what we are referring to as consciousness?
I could inject noise into an LLM at every parameter dynamically during inference, and the output would come out just fine. After all, LLMs are high dimensional and can handle a little noise. Would the model really be more conscious after that? You can find experiments where people remove entire layers of an LLM and it still works fine; a little noise would be even less harmful than that.
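If anyone wants to try a version of that noise experiment, here is a rough sketch of a simplified variant (noise added once to every parameter rather than re-sampled at every step, gpt2 as a stand-in model, and the noise scale picked arbitrarily):

```python
# Rough sketch of the noise experiment: perturb every parameter with
# Gaussian noise, then generate and see whether the output stays coherent.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

with torch.no_grad():
    for p in model.parameters():
        p.add_(torch.randn_like(p) * 0.01 * p.std())   # small noise, relative to each tensor's scale

ids = tok("The capital of France is", return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=20, do_sample=False)
print(tok.decode(out[0]))
```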
You see, when I'm arguing, I'm not citing what some other person said. At most I will cite other people's experiments and their results. When I contradict your arguments, I present you with a reality you can go and verify in the real world. You can verify for yourself that LLMs exhibit complex high-dimensional thought. You can verify for yourself that if you inject noise dynamically through inference on every parameter, you still get coherent output from the LLM.
So, if you are willing to continue this discussion, I ask that you present as arguments some sort of "probing" of the real world and the real world's corresponding "reaction". That is what finding the truth means.
And lastly: I am presenting a Theory. This means I believe my points form a foundation that makes my theory stronger than yours, and that I have better evidence backing it up. It doesn't mean I have proved what consciousness is. It primarily means I can make more accurate predictions with my theory in real-world scenarios involving artificial and biological neural networks, and my personal experience shows me that this is true.
The sentence "It is astoundingly easy to push an LLM over to collapse into ungrounded nonsense" makes me wonder.
How easy? What specific methods accomplish this? Are these methods fundamentally different from those that mislead humans?
How is this different from exploiting cognitive limitations in any reasoning system—whether a developing child's incomplete knowledge or an adult's reliance on heuristics?
How is it different from Fake News and adults taking Fake News for granted and replicating bullshit?
Complex output can sometimes give you the wrong idea, I agree. For instance, a study Anthropic did a while back showed that, when an LLM was asked HOW it performed a mathematical computation (35 + 59), the answer it gave differed from the mechanistic interpretation of the layers [1]. This showed LLMs can be deceptive. But they are also trained to be deceptive: supervised fine-tuning is imitation learning, which leads the model to give the usual human explanation, such as "first I sum 5+9, then add the carry... etc.", rather than actually examining its past keys and values. That does not mean it can't examine them. They encode the intermediate results of each layer and can be examined to identify patterns.

What the Anthropic researchers did was examine how the tokens for 35 and 59 were fused together across the layers. They compared these tokens to other tokens, such as 3, 5, and 9. For an LLM, tokens are high-dimensional concepts, which is why you can compare the vectors to each other, figure out their similarity, and thereby break down the thought process.

This is exactly what I have been describing above. Underneath each token prediction, this black magic is happening: the model fuses concepts through weighted sums of vectors (attention), then the merged representations are parsed by the MLPs to generate a refined fused idea, often adding new knowledge stored inside the network. And this continues layer after layer, a repeated combination of concepts that starts with understanding the structure and order of the language itself and ends with the manipulation of complex mathematical concepts, almost detached from the original tokens themselves.
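For anyone who wants to poke at this kind of thing themselves, a crude "logit lens"-style probe is one way in: pull the hidden state out of each layer and project it back onto the vocabulary to see which tokens it sits closest to. This sketch uses gpt2 as a stand-in and is far simpler than what the Anthropic researchers actually did, but it shows the flavour of the analysis.

```python
# Crude "logit lens"-style probe: take the hidden state at each layer,
# project it onto the vocabulary, and see which tokens it sits closest to.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tok("35 + 59 =", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(ids, output_hidden_states=True)

for layer, h in enumerate(out.hidden_states):          # embeddings + every block
    vec = model.transformer.ln_f(h[0, -1])             # last position, final layer norm
    top = model.lm_head(vec).topk(3).indices.tolist()  # nearest vocabulary directions
    print(layer, [tok.decode([t]) for t in top])
```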
Even though complex output can be deceptive about the underlying mental model that produced it, in my personal experience LLMs have produced output for me that must imply extremely complex internal behaviour, with all the characteristics I mentioned before. Namely, I frequently program with LLMs, and there is simply zero percent probability that their output tokens exist WITHOUT first having thought at a very deep level about the unique problem I presented to them. And I think anyone who has used the models to the level I have, and interacted with them this extensively, knows that behind each token there is this black magic.
To summarize, I am not being naive and saying I believe everything my LLM tells me. Rather, I know very intimately when the LLM is deceiving me and when it is producing output that its mental model must have been very advanced to produce. And this comes from personal experience playing with this technology, both inference and training.
> What most people fail to realise is that in between each token being generated, black magic is happening in between the transformer layers.
Thank you for saying that. I think most people have an incomplete mental model of how LLMs work, and it's very misleading for understanding what they really do and can achieve. "Next token prediction" happens only at the output layer; it's not what really happens internally. The secret sauce is in the hidden layers of a very deep neural network. There are no words or tokens inside the network. A transformer is not the simple token estimator that most people imagine.
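To underline that point with a sketch (gpt2 as a stand-in): the intermediate layers only ever hold continuous vectors; the mapping back to a distribution over tokens happens once, at the output layer.

```python
# Inside the network there are only continuous vectors; the mapping back
# to a distribution over tokens happens once, at the output layer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("The hidden layers hold", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(ids, output_hidden_states=True)

print(len(out.hidden_states), out.hidden_states[6].shape)  # 13 stacks of [1, seq, 768] vectors, no tokens
print(out.logits.shape)                                    # only here: [1, seq, 50257] scores over tokens
```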