I wonder how concepts are "learned" or how they evolve.
In the 1990s I studied "Women, Fire, and Dangerous Things" https://scholar.google.com/scholar?cluster=11854940822538766...
Among other things, it observed and pointed out how different concepts are between cultures.
Neuroscientists suggest Hebbian learning. Better known as "neurons that fire together, wire together," which can be roughly rephrased as "neurons identify correlations in the input." They do this in a quasi-layered environment, differentially modulated by waves of the materials used to alter their connections (i.e., trophic factors).
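A minimal sketch of that idea in Python, assuming a single linear rate neuron and a Hebbian update (Oja's variant, so the weights stay bounded); the neuron ends up weighting the correlated inputs most strongly, i.e. it "identifies correlations in the input":

    import numpy as np

    rng = np.random.default_rng(0)

    def sample():
        # inputs 0 and 1 are driven by a shared source; 2 and 3 are independent noise
        shared = rng.normal()
        return np.array([shared + 0.1 * rng.normal(),
                         shared + 0.1 * rng.normal(),
                         rng.normal(),
                         rng.normal()])

    w = rng.normal(0, 0.1, 4)        # synaptic weights
    eta = 0.01                       # learning rate

    for _ in range(20000):
        x = sample()
        y = w @ x                    # post-synaptic activity
        w += eta * y * (x - y * w)   # Hebbian term y*x, plus Oja's decay for stability

    print(np.round(w, 2))            # weights concentrate on the correlated pair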
I believe it's like a tree that's sprouting.
Imagine the sensory pathway from sense organ to brain. As information travels, it preferentially goes in one direction as opposed to another. When it reaches a dead end, it splits and forms its own branch. Then the next bit of information will reach this new dead end and split again. And so on until everything is processed and categorized. Basically, it's a graph search problem in a self-modifying graph where you can have cycles and feedback loops and layers and many other structures that are not yet captured by the state of the art in ML, and where each node is specialized for a specific task.
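A toy sketch of that branching picture, flattened to the simplest possible self-modifying graph: a prefix tree that sprouts a new branch whenever an input runs past an existing dead end. Real pathways add the cycles, feedback, and layers that this deliberately ignores:

    class Node:
        def __init__(self, label):
            self.label = label
            self.children = {}            # outgoing "preferred directions"

        def route(self, signal):
            """Follow the signal along existing branches; sprout a new one at each dead end."""
            node = self
            for symbol in signal:
                if symbol not in node.children:           # dead end reached
                    node.children[symbol] = Node(symbol)  # split and grow a branch
                node = node.children[symbol]
            return node

    root = Node("sense organ")
    root.route(("visual", "red", "circle"))
    root.route(("visual", "red", "square"))   # reuses the visual->red path, branches at the end
    root.route(("auditory", "loud", "bang"))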
Somewhat Analogous to tokenization?
An interesting piece featured in the article: “Concept and Location Neurons in the Human Brain Provide the ‘What’ and ‘Where’ in Memory Formation”, Nature Communications 2024 (https://doi.org/10.1038/s41467-024-52295-5)
This wasn’t in the article, but I feel it makes for good background reading: “Universal Principles Justify the Existence of Concept Cells”, Scientific Reports 2020 (https://doi.org/10.1038/s41598-020-64466-7)
This is really cool - it's a relatively well known idea, but it's great to see it get refined and better understood. It's amazing how sparse the brain is; a single neuron can trigger a profound change in contextual relations, and play a critical role in how things get interpreted, remembered, predicted, or otherwise processed.
That single cell can have up to 10,000 features, and those features are implicitly processed; they're only activated if the semantic relevance of a particular feature surpasses some threshold of contribution to whatever you're thinking at a given moment. Each of those features is binary, either off or on at a given time t in processing. Compare this to artificial neural networks, where a particular notion or concept or idea is an embedding; if you had 10,000 features, each of those is activated and processed on every pass. Attention and gating and routing and MoE get into sparsity and start moving artificial networks in the right direction, but they're still enormously clunky and inefficient compared to biological brains.
Implicit sparse distributed representation is how the brain gets to ~2% sparse activations, with rapid, precise, and deep learning of features in real time, where learning one new thing can recontextualize huge swathes of knowledge.
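A back-of-the-envelope sketch of the contrast being drawn; the 10,000-feature figure and the ~2% sparsity come from the comment above, and the relevance threshold is purely illustrative, not a model of real neurons:

    import numpy as np

    rng = np.random.default_rng(0)
    n_features = 10_000

    # Dense embedding (typical ANN): every dimension participates in every pass.
    dense = rng.normal(size=n_features)
    dense_ops = dense.size                      # all 10,000 values touched

    # Sparse binary code (~2% active): only features whose relevance clears a
    # threshold are "on" at time t; everything else contributes nothing.
    relevance = rng.random(n_features)
    threshold = np.quantile(relevance, 0.98)    # keep roughly the top 2%
    active = relevance > threshold              # binary on/off pattern
    sparse_ops = int(active.sum())              # ~200 active features

    print(dense_ops, sparse_ops)                # 10000 vs ~200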
These neurons also enable feats of memory, like learning the order of 10 decks of cards in 5 minutes, or reciting tens of thousands of digits of pi, or cabbies learning "The Knowledge", memorizing every street, road, sidewalk, alley, bridge, and other feature in London and traversing the terrain in their minds. It's wonderful that this knowledge is available to us, that the workings of our minds are becoming unveiled.
I am always skeptical of neural correlate studies. The conclusions that are drawn are often bogus and lacking in rigor. The free will experiments are notorious in this regard.
That some part of the brain lights up when some concept is entertained is not surprising to me. We often imagine instances of things when entertaining a concept. However, I would reject the interpretation that the concept is reducible to this brain activity or the neuron(s) involved. Concepts are, by definition, abstract, which is to say, entities that do not exist "in the real" as abstract entities, only as instantiations. For while Alice and Bob may be concrete persons in the real, the concept of "Humanity" does not exist in the real as a concretum. Only Alice and Bob as instances of Humanity do. Thus, Humanity as such only exists in the mind as an abstracted concept.
That's important, because it leads to the question of why abstract concepts can exist in the mind as abstract concepts, and not as concrete instances. The answer involves the presupposition that matter is the instantiating principle. From there, it follows that the mind is only able to entertain so-called "universals" because it is itself not entirely a material faculty. Concepts are not images (as images are concrete and exclude all but the instance they express), and if we want to avoid incoherence and retorsion arguments, the concept must be understood to be a sign whose content is not mere representation. It must be intentional, or else knowledge becomes impossible. Indeed, conventional and iconic signs become impossible without a final intentional terminus.
This isn't to say that I think brain studies are worthless. On the contrary. I simply caution against allowing sloppy materialistic metaphysics to corrupt properly empirical conclusions.
Same concept in LLMs as referenced in this video by Chris Olah at Anthropic:
https://www.reddit.com/r/OpenAI/comments/1grxo1c/anthropics_...
also see: https://distill.pub/2021/multimodal-neurons/
The authors of the second piece specifically said this was not the same thing: the fact that they weakly fire for loosely-associated concepts is very different from (and ultimately shallower than) concept neurons:
> Looking to neuroscience, they might sound like “grandmother neurons,” but their associative nature distinguishes them from how many neuroscientists interpret that term. The term “concept neurons” has sometimes been used to describe biological neurons with similar properties, but this framing might encourage people to overinterpret these artificial neurons. Instead, the authors generally think of these neurons as being something like the visual version of a topic feature, activating for features we might expect to be similar in a word embedding.
The "turtle+PhD" artificial neuron is a good example of this distinction: it just pulls together loosely-related notions of turtles and academia into one loose neuron without actually forming a coherent concept.
I wonder how this maps to symbolic AI.
LLMs can have concept cells as well. Anthropic artificially amplified the neuron responsible for the Golden Gate Bridge in Claude; the resulting LLM, called 'Golden Gate Claude', would mention the Golden Gate Bridge whenever it could.
https://www.anthropic.com/news/golden-gate-claude
It wasn't a neuron but a cluster of neurons, according to the article, and I believe this kind of stuff is generally best done and talked about at the level of latent space, not individual neurons. It's already been shown that LLMs encode concepts along the dimensions of the (extremely high-dimensional) latent space. "King - Man + Woman = Queen" is old school; there have been demos showing you can average out a bunch of texts to identify vectors for concepts like, say, "funny" or "academic style", and then, say, have the LLM rewrite some text while you do the equivalent of "- a*<Academic style> + b*<Funny>" during inference to turn a piece of scientific writing into more of a joke.
I'm surprised we don't hear more about this (last mention I remember was in terms of suppressing "undesirable" vectors in the name of "alignment"). I'd love to get my hands on a tool that makes it easy to do this on some of the SOTA OSS models.
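A rough sketch of the steering trick described above. Everything here is a placeholder: `encode` stands in for grabbing a hidden state from a real model, the example texts and coefficients are made up, and a real implementation would hook a transformer layer's residual stream rather than work on random vectors:

    import numpy as np

    def encode(text, dim=768):
        # stand-in for "run the model and grab a hidden state for this text"
        rng = np.random.default_rng(abs(hash(text)) % 2**32)
        return rng.normal(size=dim)

    def concept_vector(texts):
        # average the hidden states of texts sharing a property -> a "concept" direction
        return np.mean([encode(t) for t in texts], axis=0)

    academic = concept_vector(["We posit that...", "The results demonstrate..."])
    funny    = concept_vector(["Why did the chicken cross the road?", "Take my wife, please."])

    a, b = 1.0, 2.0
    hidden = encode("The mitochondria is the powerhouse of the cell.")
    steered = hidden - a * academic + b * funny   # the "- a*<Academic style> + b*<Funny>" step
    # `steered` would then be fed into the remaining layers in place of `hidden`.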
skos:Concept RDFS Class: https://www.w3.org/TR/skos-reference/#concepts
schema:Thing: https://schema.org/Thing
atomspace:ConceptNode: https://wiki.opencog.org/w/Atom_types .. https://github.com/opencog/atomspace#examples-documentation-...
SKOS Simple Knowledge Organization System > Concepts, ConceptScheme: https://en.wikipedia.org/wiki/Simple_Knowledge_Organization_...
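For anyone who hasn't used those vocabularies, a minimal rdflib sketch of declaring a skos:Concept; the ex:ConceptCell node and its labels are made up purely for illustration:

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, SKOS

    g = Graph()
    ex = Namespace("http://example.org/")
    g.bind("skos", SKOS)

    g.add((ex.ConceptCell, RDF.type, SKOS.Concept))
    g.add((ex.ConceptCell, SKOS.prefLabel, Literal("concept cell", lang="en")))
    g.add((ex.ConceptCell, SKOS.broader, ex.Neuron))

    print(g.serialize(format="turtle"))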
But temporal instability observed in repeat functional imaging studies indicates that functional localization is not constant: the regions of the brain that activate for a given cue vary over time.
From https://news.ycombinator.com/item?id=42091934 :
> "Representational drift: Emerging theories for continual learning and experimental future directions" (2022) https://www.sciencedirect.com/science/article/pii/S095943882... :
>> Future work should characterize drift across brain regions, cell types, and learning.
The important part of the statements in the drift paper is the qualifiers:
> Cells whose activity was previously correlated with environmental and behavioral variables are most frequently no longer active in response to the same variables weeks later. At the same time, a mostly new pool of neurons develops activity patterns correlated with these variables.
“Most frequently” and “mostly new”: this means that some neurons still fire across the weeks-long periods for the same activities, leaving plenty of potential space for concept cells.
This doesn’t necessarily mean concept cells exist, but it does allow for the possibility of their existence.
I also didn’t check which regions of the brain were evaluated in each study, as it is likely they have somewhat different characteristics at the neuron level.
So there's more stability in the brain's electrical wave activity than in the cellular activation pathways?