User:Graeme E. Smith/Modularity Revisited

Modularity Revisited
Three Phases of Modularity in the Cerebral Cortex
Graeme E. Smith, GreySmith Institute of Advanced Studies
http://en.wikiversity.org/wiki/Portal:GreySmith_Institute
http://en.wikiversity.org/wiki/User:GreySmith_Institute
grysmith@telus.net

In his book The Modularity of Mind, Jerry Fodor introduced the idea of Modules in the Cerebral Cortex that act like databases specialized for a particular domain. His conception of these Modules was quite strict, and suggested an innate knowledge of the subject matter of the domain. Annette Karmiloff-Smith, in her book Beyond Modularity: A Developmental Perspective on Cognitive Science, rebutted the innate nature of the Modules, showing that some of the data in the modules is generated over the early life of the individual during developmental stages. Since then, there has been considerable discussion about some of the other limits that Fodor placed on the modules. In this article I explore the nature of the Attention System, and posit a theory of three separate modular Phases that are all to be found within the Cerebral Cortex.

Modularity in the mind is a seductive concept that seems to favor the implementation of Functional Consciousness, by dividing the functionality of the mind into byte-sized pieces that can be analyzed and transferred to a computer once they are fully understood. The originator of this concept is Jerry A. Fodor, who wrote about it in The Modularity of Mind: An Essay on Faculty Psychology, now in its 11th edition. He believed that such modules started with innate knowledge, were relatively lightweight with shallow processing, were not selected by attention but competed for output, were defined by data encapsulation, and so on. Proponents of Massive Modularity chafe under the restrictions he placed on his modules, since they consider even high-end processing to operate in the same manner.
While I am not a proponent of Massive Modularity, I find that I must also chafe under the restriction that the modules are not selected by attention. In my attention model, I have room for three phases of modularity, one of which is passively selected by attention, and one of which is actively selected by attention. Fodor's Modularity captures only the first phase of modules in this theory, because of the limitations he places on the definition of modules.
It is interesting that my reason for deciding that Fodor's rules of modularity don't fit came partly from Fodor's own work in a later book, The Mind Doesn't Work That Way: The Scope and Limits of Computational Psychology, in which he discusses the limitations of computational approaches to phenomenal psychology and of the Computational Theory of Mind as defined by Alan Turing. Personally I lean more toward David Chalmers's point of view in his article A Computational Foundation for the Study of Cognition, in which he indicates that computation today is not limited to Alan Turing's concept of "Truth Preserving Functions", but includes many other forms of mathematical calculation, because it is a general term.
At the time that Fodor wrote his book on the limits of Computational Psychology, scientists were just beginning to deal with the need to convert from a phenomenal form of storage, such as neural networks might create, to an explicitly addressable form of storage, such as might be needed to isolate specific memories. It is because of the work of David LaBerge that I have determined the nature of the process needed to make such a conversion. However, in doing so I have learned of caveats on the nature of the element being converted that cause me to hew closer to Qualia Theory than I would normally have considered, given my computer background. The problem, as Jerry Fodor pointed out in his 2000 book, is that neural networks are phenomenal in their output, and as such you cannot separate out specific memories from a pure neural network model. In fact, if David Marr was correct, the natural mode of the Cerebral Neocortex is a content addressable memory. Marr's comments and my own thought experiments quickly indicated to me that the output of such a memory had to be a redundant field of data, rather than a differentiable memory as such. The only way to narrow the field, it seemed, is to reduce the number of outputs by suppression at the pyramidal neuron level, or to filter the outputs in some manner so that only outputs with a specific tag characteristic are allowed. Because neural networks are non-local in their storage, predicting the location of a memory within the memory matrix is problematic. However, Jerry Fodor, in his 2000 book, suggested that the column architecture of the brain was thought to have something to do with locating memories. In fact the theory of Neural Groups is that groups of a hundred or so neurons get together in a cluster to promote a single neuron as the output for the whole group.
The Center Surround Theory suggests that the 6th layer pyramidal neuron acts to localize these neural groups by suppressing neurons that are not as strongly activated, and promoting the centroid of the set as the strongest neuron in the neural group. Evidence shows that the neuron being promoted to fire often changes over time, suggesting that individual neurons might be overcome with fatigue while the neural group continues to output a value.
In Neural Darwinism: The Theory of Neuronal Group Selection, Dr. Edelman suggests that neural groups are interchangeable. This means that at the neural group level, no common mapping can be found even for individuals of the same species. However, in Self-Organizing Maps, Teuvo Kohonen discusses attempts to map motor signals to arm movements that have successfully used Neural Probe Arrays and Artificial Neural Network SOMs to detect and map which neural groups activated which arm muscles, well enough for a monkey to guide a robotic arm in order to give himself a treat. The main requirement seems to be that the SOM has to learn the characteristic activation patterns of a particular individual monkey before it can do the job. This interchangeability of neural groups means that there is no species-specific mapping of neural groups that could be used to access a specific memory; the organism has to learn its own unique pattern of neural groups in order to access memories. This in turn means that the conversion from content addressable memory to demand memory is not simple or straightforward. In a computer the situation is reversed: each memory element already has its own unique address, so we would implement the content addressable memory on top of a memory that is already place-code addressable. In essence this means that even if we wanted to, we cannot demand a memory from the content addressable memory of the Cerebral Cortex until we have done some mapping.
This is a tricky concept for some, because it seems so obvious that all we need to do is activate a specific column and we will get a specific memory. There is no doubt that this is true, because brain surgeons, during open brain surgery, have shown that whenever a column is stimulated, a specific memory is triggered. Columns invoke specific memories. The problem is in knowing which column to activate. This is where the Apical Dendrites from Dr. LaBerge's article Attentional Control: Brief and Prolonged come into play. They are a way of activating a sub-unit of a column that he calls a mini-column. Mini-columns in turn cause the pre-activation of content addressable neurons in Layer 2/3 of the cerebral neocortex, and these in turn activate their local neural groups. In other words, pre-activation by a Cortico-Thalamic Connection to the 5th layer pyramidal neurons that are the hub of the mini-column causes the content addressable memory to output a field of data. We don't know exactly what that data will be at this point, but we have a method of addressing the content addressable memory at a level below the neural group level. Furthermore, we know that this method should be capable of addressing specific memories, if only we knew which were which. Obviously, to make a demand memory work, we need an index to the mini-columns so we know which mini-columns to activate for which memories. If we could memorize the specific mini-column addresses for a particular field of data, we could simulate that field of data by pre-activating the selected mini-column addresses. Now I am going to stick my neck out and say that the main problem in mapping a field of data to a particular set of mini-column addresses is isolating the specific field of data. Once you have done that, you can simply compare each mini-column address with the field of data, and if both are active, you have found a mini-column address that is part of the specific field of data.
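The comparison step described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: mini-column addresses are modeled as plain integers, the isolated field of data is modeled as the set of addresses whose output survives isolation, and the names `map_clump` and `QualeIndex` are hypothetical, not drawn from LaBerge or any other source.

```python
from typing import Dict, FrozenSet, Set


def map_clump(preactivated: Set[int], field_active: Set[int]) -> FrozenSet[int]:
    """Keep the mini-column addresses that are pre-activated AND active
    in the isolated field of data (assumption: both are sets of ints)."""
    return frozenset(preactivated & field_active)


class QualeIndex:
    """Hypothetical index from a label to a clump of mini-column addresses."""

    def __init__(self) -> None:
        self._clumps: Dict[str, FrozenSet[int]] = {}

    def store(self, label: str, clump: FrozenSet[int]) -> None:
        # Memorize which addresses belong to this field of data.
        self._clumps[label] = clump

    def rehearse(self, label: str) -> FrozenSet[int]:
        # Return the addresses to pre-activate to simulate the field again.
        return self._clumps[label]


# Toy usage: four addresses are driven, three are active in the isolated field.
index = QualeIndex()
index.store("example", map_clump({3, 7, 12, 20}, {7, 12, 31}))
addresses_to_preactivate = index.rehearse("example")
```

The design choice here is that the index stores only addresses, not content: rehearsal means re-driving the stored addresses and letting the content addressable memory produce whatever field it now holds.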
The problem here is simply that neural networks do not lend themselves to isolation of specific memories, so the field of data you must work with is whatever is left after you shut off everything else at the Layer 2/3 pyramidal neurons and filter for some tag element. I call this highly redundant field of data a quale, because it is phenomenally implicit memory and there is no fine control over what you get out of it. The problem, I suggest, is not how you end up with such a rich form of data, but how you filter out any significant portion of it, or make sense of all the redundant data that comes with it. This inverts the philosophical quandary of Qualia, which is a good thing. By inversion the problem becomes how to make sense of a Quale, which is a lot more approachable than asking what the color red feels like. Once you have mapped a clump of mini-column addresses to a specific Quale (redundant field of data), the problem of how to make sense of it still rears its ugly head. If you rehearse the whole clump, you are going to get something a lot like the original quale. Now if Jerry Fodor's Module theory is at all correct, the output of the quale includes the outputs of all the modules that have been applied automatically to the data, unless somehow they were suppressed by suppression of the individual pyramidal neurons in Layer 2/3. In "A Theory for Cerebral Neocortex", David Marr notes that a certain type of neuron in the 4th layer, called a basket neuron, arbors on the soma of the pyramidal neuron. At the time he wrote his theory, it was considered de rigueur that neurons were doing numerical calculation on the data. Since then we have found that there is too much chaos in the connections for such a prosaic function to work, so I suggest that instead of performing a division step, the basket neurons are likely soma shunts capable of turning off the pyramidal neurons without affecting their inputs.
If this is true, we have a mechanism for shutting off portions of the quale; we just need a method of activating these neurons.
In "Attention, Awareness, and the Triangular Circuit", David LaBerge draws a circuit of three areas of the brain that are interconnected. We have already heard how the Thalamus might be involved in addressing the mini-columns, and we already have an idea of how the cerebral cortex connects the mini-column address to the content addressable memory, so the missing element seems to be the third area of the brain involved in LaBerge's Triangle, the Prefrontal Cortex. The Prefrontal Cortex connects to the cerebral neocortex and, according to Dr. LaBerge, seems to have a suppression mechanism that is used for selection.
One of the problems that scientists have with Dr. LaBerge's theory is that he holds it out as the theory of attention, yet it fails to capture nearly half the functions that are thought to belong to attention. In other words, it is not a complete model, but it is a useful model, especially for understanding explicit memory. Explicit memory, in my lexicon, is memory that can be addressed using place-code addressing involving the Apical Dendrites of the 5th layer pyramidal neurons activated by cerebro-thalamic connections. One of the problems with Dr. LaBerge's theory is that it doesn't explain how the selection is accomplished or what its effect on the data field is. Another thing that Dr. LaBerge's Triangular Theory of attention fails to deal with is the binding of cross-modal sensory data. Evidence of the encapsulation of the Fodorian Modules seems to suggest that sensory data is processed in a stove-pipe fashion, where each sensory modality is processed separately. Dr. Edelman, in The Remembered Present: A Biological Theory of Consciousness, notes that the binding effect probably has to do with the creation of Functional Clusters, where clusters of neurons distributed across the cortex resonate in synchrony at about 40 hertz. These Gamma Synchronized Oscillations (GSOs) seem to link across modal stove-pipes, which would suggest a binding effect. In Wider than the Sky: The Phenomenal Gift of Consciousness, Dr. Edelman notes a similar oscillation in the Thalamus, suggesting that either the Thalamus is influencing the Cerebral Cortex GSOs or the Cerebral Cortex is influencing the Thalamus GSOs. Because the Thalamus is part of the Reticular Activating System, which has been implicated in distributing brain waves, it makes sense that the Thalamus would be influencing the Cerebral Cortex. This of course does not explain how the synchrony is coordinated, since the same signals must pass through networks of different lengths to reach different areas of the cerebral cortex, nor does it explain what function the GSOs might have other than binding cross-modal data. But if we look at our model, two possible roles might be found for GSOs: they might tag data, or filter it. A hypothesis that I have been looking at is that the GSOs tag data so that different functional clusters can be separated by filtering for a particular GSO frequency. If so, it is probably the Anterior Cingulate Cortex, in the Ventro-lateral PreFrontal Cortex, that uses the GSO frequency to suppress all but a single GSO. This tagged and filtered functional cluster then defines the Quale that I have already discussed. This would not work exactly like a band-pass filter: any signal that synchronized with a particular reference frequency would be passed, rather than only those frequencies carried on a particular band.
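To make the difference between a synchrony filter and a band-pass filter concrete, here is a minimal sketch. It assumes, purely for illustration, that each neural group can be summarized by one oscillation frequency and a phase offset, and it uses a phase-locking value (the magnitude of the averaged unit phasor of the phase difference) as the synchrony test. That measure is swapped in as a stand-in; nothing in the hypothesis above specifies it. The names `phase_locking` and `filter_by_synchrony`, the group labels, and the 0.9 threshold are all hypothetical.

```python
import cmath
import math
from typing import Dict, Set, Tuple


def phase_locking(freq_hz: float, ref_hz: float, phase_offset: float = 0.0,
                  duration_s: float = 1.0, rate_hz: int = 1000) -> float:
    """Phase-locking value in [0, 1] between a signal and a reference.

    Near 1 when the phase difference stays constant over the window,
    lower when it drifts because the frequencies differ.
    """
    n = int(duration_s * rate_hz)
    total = 0j
    for k in range(n):
        t = k / rate_hz
        diff = 2 * math.pi * (freq_hz - ref_hz) * t + phase_offset
        total += cmath.exp(1j * diff)
    return abs(total) / n


def filter_by_synchrony(groups: Dict[str, Tuple[float, float]],
                        ref_hz: float, threshold: float = 0.9) -> Set[str]:
    """Pass only the groups that stay synchronized with the reference."""
    return {name for name, (freq, phase) in groups.items()
            if phase_locking(freq, ref_hz, phase) >= threshold}


# Toy data: (frequency in Hz, phase offset in radians) per neural group.
groups = {
    "visual": (40.0, 0.3),
    "auditory": (40.0, 1.2),
    "background": (37.0, 0.0),
}
tagged_cluster = filter_by_synchrony(groups, ref_hz=40.0)
```

Note that the two 40 Hz groups pass even though their phase offsets differ, because what is tested is a stable phase relationship with the reference rather than membership in a frequency band.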
There are two problems with this hypothesis. First, there is no frequency loss associated with the signal, as might be expected in a radio-like band-pass filter; second, it has been shown that there is no separation of signals between foreground and background at this level. If we expected the foreground to be separated from the background at this level, we would quite rightly doubt the role of the Anterior Cingulate Cortex in filtering the Quale. However, the nature of the processing before this level seems to suggest that what we are filtering for is a consistent Point of View, rather than any detail in the output. Further, we don't necessarily expect a radio-type band-pass filter, if only because the signal is more of a tag added to the Layer 2/3 activations than a modulation of the signals from those layers. In fact, we would expect in this model that pre-activating the mini-columns with a frequency would simply add a resonance to the already existing data frequencies, which is more like the effect that has been observed. This model also works for the matching role of determining a clump of mini-column addresses: if the mini-column connection resonates at the same speed as the Quale, it must be part of the Quale.
As I like to say, the functional cluster creates a quale that is information rich but organizationally poor. The problem with this type of signal is that it has no meaning, since the information cannot be easily factored out until we have an index. But how to build such an index? Aye, there's the rub. If we rehearse our clump of mini-column addresses we get something similar to the original quale, but because the network has changed in the meantime, we don't get the exact same quale; we get a second quale that resembles the first but includes new information.
In fact this is exactly the nature of implicit memory: memories gain added information when they are rehearsed. In The Remembered Present, Dr. Edelman called this re-categorical memory. To get some meaning out of the Quale, we need to reduce it to smaller clumps of mini-column addresses. One way we might do this is to edit the set of mini-column addresses before we rehearse the quale. At first we would have to do this at random, if only because the individual mini-columns have no significance yet. One way we could do this is to suppress some of the mini-columns. Dr. LaBerge has noted a connection between the PreFrontal Cortex and the Thalamus, and other researchers have suggested that it flows through the striatum, and a special element of the striatum called the Nucleus Accumbens. Then it flows into a sort of inverter sheath around the thalamus called the Nucleus Reticularis Thalami. What this suggests is that the PreFrontal Cortex can influence which mini-column addresses are pre-activated by actively suppressing a subset of them. All we would need is some inference engine that could sense whether a particular sub-clump arrangement of mini-column addresses was more or less significant than the original Quale, and we would be on the way to mapping significant areas of the Cerebral Cortex. Now where the original Fodorian Modules would be limited to the unselected qualar output, and what they can analyze from that, a secondary set of Modules could monitor the qualar output of these sub-clump arrangements and analyze them for even more information. Further, one of the things we would want to do is map the significant sub-clump arrangements so that we can find them again in a similar quale. This seems to be a role that is made for the striatum, where the prefrontal cortex signals and the thalamic signals are both available.
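The random editing described above amounts to a search over subsets of the clump, which the following sketch tries to illustrate. It rests on assumptions that go beyond the text: the "inference engine" is reduced to a caller-supplied scoring function, significance is a single number, and a fixed fraction of the addresses is kept on each trial. The names `random_subclump`, `search_subclumps`, and `toy_score` are hypothetical.

```python
import random
from typing import Callable, Dict, FrozenSet, Iterable


def random_subclump(clump: Iterable[int], keep_fraction: float,
                    rng: random.Random) -> FrozenSet[int]:
    """Suppress a random subset of addresses, keeping roughly keep_fraction."""
    addresses = sorted(clump)
    k = max(1, round(len(addresses) * keep_fraction))
    return frozenset(rng.sample(addresses, k))


def search_subclumps(clump: Iterable[int],
                     score: Callable[[FrozenSet[int]], float],
                     trials: int = 200, keep_fraction: float = 0.5,
                     seed: int = 0) -> Dict[FrozenSet[int], float]:
    """Record every random sub-clump that scores above the whole clump.

    `score` stands in for the inference engine (an assumption of this
    sketch); the returned dict is the map of significant sub-clumps.
    """
    rng = random.Random(seed)
    whole = frozenset(clump)
    baseline = score(whole)
    significant: Dict[FrozenSet[int], float] = {}
    for _ in range(trials):
        sub = random_subclump(whole, keep_fraction, rng)
        value = score(sub)
        if value > baseline:
            significant[sub] = value
    return significant


# Toy scoring rule: the share of a sub-clump lying in a made-up target set.
TARGET = {1, 2, 3}


def toy_score(sub: FrozenSet[int]) -> float:
    return len(sub & TARGET) / len(sub)


subclump_map = search_subclumps(range(1, 9), toy_score)
```

Keeping the scorer as a parameter reflects the open question in the text: the search procedure can be stated without knowing what actually judges significance.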
I have recently heard of connections from the subiculum, the hippocampus and the entorhinal cortex, and possibly the parahippocampus, all also dumping into this area, and have hypothesized that their roles involve similar types of connections. Of course the science does not reach that far into this area of the brain yet, if only because the tools we have for studying it are so expensive. As we learn more about the brain, we will be able to confirm or deny many different hypotheses, and of course this is just one.
The last modular elements I would like to discuss are those that are recruited by the Dorso-Lateral PFC. This area of the cortex has the ability to selectively activate areas of the cerebral cortex that are not usually active. An example of this is the Fusiform Face Area, an area on the Fusiform Gyrus that lights up when something that might be a face is detected. Another example is a motor neuron that is only recruited when you want to point your finger at something. These Modules are actively selected by the Attention System, and recruited to do a specific action or function with a specific piece of data. Like the Fodorian Modules, these other two layers of modules I have described are relatively shallow and encapsulated. However, both of these arrays of modules are to some extent directed by the Attention System: the second array by selection of sub-clump arrangements of mini-column addresses, and the third array by specific selection via the Attention System. In fact the third type, because it is process oriented, operates relative to the data produced by the second type and the sub-clump arrangements. This forms a strange type of attention I call Complicit Attention, because it depends on another form of attention for the selection of its data.
In fact, I suspect that Complicit Attention defines a value-function tuple that in aggregate defines a processing language of some sort.
It is therefore the nature of this third type of module that defines the content of high-order processing. While I cannot speak to the functions that might be practical, what would be needed in order to define such a value-function tuple is two generators that generate the clump address, first for the data element and then for the function, to form the value-function tuple. While I do not know personally of the existence of these generators, I am assured that short term memory is dependent on centers both in the frontal lobes and in the hippocampal area. Either center can be removed or silenced chemically with only a short term loss of short term memory, while the brain relearns to use the other center to accommodate. However, if both centers are lost or silenced, short term memory is also lost. While I can't reference the actual documents in which these experiments were written up, I thank Dr. Clayton Dickson of the University of Alberta in Edmonton, Alberta for bringing them to my attention.