Psycholinguistics/The Mental Lexicon

Introduction: What is the Mental Lexicon?
Psycholinguistics is about how language works in the brain. A specific question that one might ask on this topic is, "How are the words we use connected to the thoughts they serve to express?" The answer to such a seemingly simple question is actually quite complicated and far from conclusive. In order for one to transform his or her abstract thoughts into physical words (spoken, written, or signed), these words must first be mentally represented and organized in a systematic, easily accessible way. We call this systematic organization of the words represented in our minds the mental lexicon. The mental lexicon is necessary because without it, linguistic production would be long and laborious and would not accurately represent one's thoughts. An analogy that is often used to illustrate the concept of the mental lexicon is that of a printed dictionary, which is similar to a lexicon. This analogy breaks down very quickly, however, in that the use of language in humans is multifaceted and does not occur in a robotic, dictionary-like fashion. Dictionaries only allow one to access words by their alphabetically ordered spelling, which is often accidental in a language, and do not allow words to be accessed by any of their other properties (e.g., their meaning) (Fellbaum, 1998). What the more flexible models of the mental lexicon try to do is explain the patterns and regularities that underlie people's knowledge and (sometimes irregular) use of words. Since these rules are not always explicit, a plethora of different models and approaches have been created over the years to try to account for them. We will discuss some of these models and their associated issues here.

Developing a Model of the Mental Lexicon
One of the central issues in developing a model of the mental lexicon is whether the form of a word in the lexicon (e.g., phonological, /kat/, or orthographic, “cat”) is represented with its meaning (e.g., the idea or concept of a cat) in the same lexical entry or if they occupy separate entries (Rapp & Goldrick, 2006). A related question is, “How are lexical entries organized and connected to each other in the lexicon?”  The answer to this question might be fairly straightforward if there is only one type or level of lexical entry, but it gets more complicated if there are many layers of lexical entries for a word (e.g., its meaning, its form, its morphology etc.). Connections within and between layers would have to be considered in this case.

Another key issue when it comes to developing models of the mental lexicon is which experimental task to use in order to test the predictions of the model. Two types of tasks that have often been used to do this are semantic categorization tasks and lexical decision tasks (Sánchez-Casas, Davis, & García-Albea, 1992). A semantic categorization task involves presenting two words to a participant, who must make a “yes or no” decision about their semantic similarity. For example, “Is a bird an animal?” A lexical decision task involves deciding whether or not a presented “target” is actually a word, and this target is usually preceded by a “prime” word, about which no decision is made. (This priming paradigm can also be used in semantic categorization tasks.) Reaction times to decisions are then used to infer how closely two words are organized in the mental lexicon, with faster reaction times indicating a closer organization. Other types of tasks used for developing models can be found at the “Words in the Mind, Words in the Brain Project” website.

A final issue to think about is whether studies concerning how words in the mental lexicon are accessed can be used to infer how they are organized in the mental lexicon (Marslen-Wilson, Tyler, Waksler, & Older, 1994). Indeed, the experimental tasks described in the previous paragraph all concern lexical access, the results of which are typically used to map the mental lexicon. It is possible that there may be more than one path to access the words stored in the lexicon, and that the way they are stored may not necessarily match up perfectly with the routes by which they are accessed. For example, one can take the highway to go from one major city to another, but that does not mean that any smaller cities, connected by smaller roads, are not organized closer to the big cities than they are to each other. These are important issues to keep in mind as we look at some of the models of the lexicon.

The Hierarchical Network Model
Early researchers of the mental lexicon viewed it within the greater framework of the organization of semantic memory. According to this approach, the forms of words and their meanings do share the same lexical entry, with emphasis placed on their meaning rather than their form. The foundational example is Collins and Quillian’s hierarchical network model (Collins & Quillian, 1969). According to this model, all concepts are organized in a pyramid of interconnected “nodes,” or lexical entries. The most general concepts are found at the top of the pyramid, with specific instances of each concept found one level below it on the pyramid. For example, the concept “bulldog,” and any other instance of “dog,” would be found in a distinct group of nodes on a lower level of the pyramid. Each node in this group is directly connected to the node for the more general concept “dog” on the level above (see Figure 1). “Dog” would be found under the more general concept “Mammal,” which would be found under the even more general concept “Animal.” Furthermore, the attribute that distinguishes each concept from the concept above (or concepts beside) it is also noted under its node. Collins and Quillian thought this to be more cognitively efficient because the attribute “has legs,” for example, would not have to be represented under each level of the hierarchy. The central principle is that the more links that must be traversed to get from one concept to another, the longer it takes to make decisions about the relationship between them.

This model breaks down, however, in many ways. First, studies with semantic categorization tasks have shown that it takes longer for people to decide whether or not a “dog” is a “mammal” (lower level) than it does for them to decide that it is an “animal” (higher level) (Smith, Shoben, & Rips, 1974), which is the opposite of what the hierarchy predicts. Also, an attribute like “wing” would have to be stored under two nodes, “bird” and “bat,” and there is no organization for “wing” itself (e.g., Appendages > Wings); Collins and Loftus (1975) rectify this. Finally, the task itself used to test this model, a decision about the meanings of words rather than the words themselves, is biased towards supporting a meaning-based model (if there is more than one layer of lexical entry).

Figure 1 – Hierarchical Model (adapted from Collins and Quillian (1969))
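The hierarchy described in this section can be sketched as a small data structure. This is only an illustration, not Collins and Quillian's own implementation: the `NETWORK` dictionary, the attribute values, and the function names are all assumptions made for the example. Each node stores only the attributes that distinguish it from the level above, and the number of links climbed stands in for predicted decision time.

```python
# Toy hierarchical network: each node stores its parent and only the
# attributes that distinguish it from the level above (assumed values).
NETWORK = {
    "animal":  {"parent": None,     "attributes": {"breathes", "eats"}},
    "mammal":  {"parent": "animal", "attributes": {"has fur", "has legs"}},
    "dog":     {"parent": "mammal", "attributes": {"barks"}},
    "bulldog": {"parent": "dog",    "attributes": {"has a flat face"}},
}


def path_to_root(concept):
    """Return the chain of nodes from a concept up to the top of the pyramid."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = NETWORK[concept]["parent"]
    return chain


def links_between(instance, category):
    """Count the links climbed from instance to category, or None if unrelated."""
    chain = path_to_root(instance)
    return chain.index(category) if category in chain else None


def has_attribute(concept, attribute):
    """Look for an attribute at the node itself, then inherit from nodes above."""
    return any(attribute in NETWORK[node]["attributes"]
               for node in path_to_root(concept))


print(links_between("dog", "mammal"))        # 1 link
print(links_between("dog", "animal"))        # 2 links
print(has_attribute("bulldog", "has legs"))  # True, inherited from "mammal"
```

In this sketch, “dog” is one link from “mammal” and two links from “animal,” so the model would predict a faster decision for “mammal.” That is exactly the prediction the categorization findings reported by Smith, Shoben, and Rips (1974) failed to support.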

The Semantic Feature Model
To address some of the shortcomings of models like Collins and Quillian’s (1969), Smith and colleagues developed a model that viewed the meanings of words as sets of semantic features or attributes (Smith et al., 1974). These features can be broken down into two types: characteristic and defining. Defining features are ones that are essential to distinguishing a concept from others (e.g., their most salient feature), while characteristic features are ones that are not essential to this. For example, a defining feature of “robin” is that it is “red-breasted,” while a characteristic feature is that it is “small.” The more defining features concepts share, the closer together they are organized in the mental lexicon. Thus, looking at the defining features shared between “bird,” “robin,” and “ostrich,” we see that “robin” and “bird” share 3, while “ostrich” and “bird” share 2, meaning that “robin” would be grouped closer to “bird” than “ostrich” would (see Figure 2). A hierarchical model, by contrast, would organize “ostrich” and “robin” equally close to “bird.” Thus, this model allows for more flexibility and levels in connections between nodes (lexical entries). Another key aspect of this model is that the more concrete a concept is, the more defining features it has, and the easier it is to make a semantic decision about it when compared with another concept. For example, when performing a semantic categorization task, it is easier to make a decision about the question, “Is a dog a toaster?” than, “Is an animal a thinker?”

Figure 2 – Semantic Features Model (adapted from Smith et al., 1974)

When making semantic decisions about words, Smith and colleagues (1974) suggested that the brain first compares general lists of their features (both defining and characteristic). If these lists are ambiguous in their similarity, making a “yes or no” decision difficult, then only defining features are used to make a decision. Participant reaction times in a semantic categorization task did support this model; however, there were some problems with it. For example, certain category words showed inconsistent reaction times when paired with other words, and for very large categories, reaction times tended to be longer than one would predict. An example that fits both of these is the word “animal.”
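The two-stage comparison can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the feature lists in `CONCEPTS`, the `low` and `high` similarity thresholds, and the majority rule used in the second stage are not taken from Smith et al. (1974). The feature lists were chosen so that “robin” and “bird” share 3 defining features while “ostrich” and “bird” share 2, as in the example above.

```python
# Hypothetical feature lists, for illustration only.
CONCEPTS = {
    "bird":    {"defining": {"has feathers", "has wings", "flies"},
                "characteristic": {"small", "sings"}},
    "robin":   {"defining": {"has feathers", "has wings", "flies", "red-breasted"},
                "characteristic": {"small", "sings"}},
    "ostrich": {"defining": {"has feathers", "has wings", "runs fast"},
                "characteristic": {"large", "long neck"}},
    "toaster": {"defining": {"heats bread", "has a plug"},
                "characteristic": {"metal"}},
}


def all_features(concept):
    """Combine defining and characteristic features into one list."""
    entry = CONCEPTS[concept]
    return entry["defining"] | entry["characteristic"]


def shared_defining(a, b):
    """Count the defining features two concepts have in common."""
    return len(CONCEPTS[a]["defining"] & CONCEPTS[b]["defining"])


def categorize(instance, category, low=0.2, high=0.7):
    """Return (answer, stage): stage 1 uses all features, stage 2 only defining ones."""
    a, b = all_features(instance), all_features(category)
    similarity = len(a & b) / len(a | b)
    if similarity >= high:   # clearly similar: fast "yes"
        return True, 1
    if similarity <= low:    # clearly dissimilar: fast "no"
        return False, 1
    # Ambiguous: fall back on defining features (assumed majority rule).
    needed = len(CONCEPTS[category]["defining"])
    return shared_defining(instance, category) * 2 > needed, 2


print(shared_defining("robin", "bird"), shared_defining("ostrich", "bird"))
print(categorize("robin", "bird"))
print(categorize("ostrich", "bird"))
print(categorize("toaster", "bird"))
```

With these assumed features, “robin” and “toaster” are settled in the first stage, while “ostrich” needs the second stage. Needing that extra stage is how the model would explain slower reaction times for less typical category members.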

The Spreading Activation Model
Collins and Loftus also addressed some of the shortcomings of Collins and Quillian’s (1969) earlier hierarchical model (Collins & Loftus, 1975). They explained some of the misconceptions people had about this model and adapted it to make it more flexible. This involved breaking down the rigid hierarchy so that direct connections could be formed between any two nodes (lexical entries). While doing this had the effect of making the model look more like the Semantic Features Model of Smith et al. (1974), it was different in that it did not rely solely on semantic feature comparison. In this revised model, objects (e.g., fire truck), features (e.g., red), verbs (e.g., eat) and even the links between all of these are treated as concepts, with distinct nodes. Any two words can be linked together, without any intermediate nodes, and the thickness (or sometimes length) of the link determines how closely organized together those concepts are (see Figure 3).

Figure 3 – Spreading Activation Model (adapted from Collins and Loftus (1975))

The principal idea behind spreading activation network (as opposed to feature) models is that when the node for one word is activated (i.e., when one hears or sees the word), a “pulse” of activation spreads out along its links to other nodes, which are then activated, sending out the “pulse” through their own links, and so on. Activation weakens over the length of the links and at each node it passes through until it completely dissipates. This model is very useful in describing how “priming” works in general for many models of the lexicon, with concepts linked to each other (based on various criteria) priming each other through spreading activation. One problem with it, however, is that the ordering of the mental lexicon becomes very idiosyncratic from person to person. Another key issue with using this model, and all of the models mentioned thus far, to illustrate lexical organization is that it fails to take into account aspects other than the meaning of words. No separate lexical entries, nodes, or representations are allowed for aspects of words such as their phonology, grammatical class (syntax), or morphology. To account for these factors, Bock and Levelt (1994) proposed a revised spreading activation model that had separate levels of lexical entries for them. This model did, however, continue to emphasize the semantic aspects of words (see Figure 4) (Bock & Levelt, 1994).

Figure 4 – Revised SAM (adapted from Bock and Levelt (1994))
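The “pulse” of activation described in this section can be sketched as a walk over a weighted graph. This is a minimal illustration rather than Collins and Loftus's own formulation: the `LINKS` table, its strength values, the `threshold` cut-off, and the rule of multiplying activation by link strength are all assumptions. Thicker links are represented by strengths closer to 1.

```python
# Hypothetical link strengths between concept nodes (closer to 1 = thicker link).
LINKS = {
    "fire truck": {"red": 0.8, "truck": 0.7},
    "red":        {"fire truck": 0.8, "apple": 0.6, "rose": 0.5},
    "truck":      {"fire truck": 0.7, "car": 0.6},
    "apple":      {"red": 0.6},
    "rose":       {"red": 0.5},
    "car":        {"truck": 0.6},
}


def spread(source, start=1.0, threshold=0.1):
    """Spread activation from source, weakening it across every link crossed.

    Each node keeps the strongest activation that reaches it; pulses that
    fall below the threshold are treated as having dissipated.
    """
    activation = {source: start}
    frontier = [source]
    while frontier:
        node = frontier.pop()
        for neighbour, strength in LINKS[node].items():
            incoming = activation[node] * strength
            if incoming >= threshold and incoming > activation.get(neighbour, 0.0):
                activation[neighbour] = incoming
                frontier.append(neighbour)
    return activation


for concept, level in sorted(spread("fire truck").items(), key=lambda kv: -kv[1]):
    print(f"{concept}: {level:.2f}")
```

Hearing “fire truck” in this toy network leaves “red” more active than “apple,” which is reached only through “red.” On this account, a more active node is one that has been primed more strongly.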

The ACT and WordNet Models
There are several more recent models of the mental lexicon that are computationally based. Like the previous models, the main contention of these models is that words are primarily organized based on semantics in the lexicon. These models are different, however, in that words and their meanings (concepts) are considered to be separate. This is because the words we know are constrained by the concepts we have in our minds, and it is possible to have a concept without a word, but not a word without a concept (Fellbaum, 1998, p.8). Furthermore, knowledge about a concept and how it relates to other concepts is constrained by the environments and contexts in which it occurs most.

The Adaptive Character of Thought (ACT) Model differs from the previously mentioned semantic models in that it does not solely use associations in the declarative knowledge, or factual meaning, of words to organize them. In addition to this, the ACT Model uses procedural knowledge, or how words relate to each other in terms of their function, to organize them. A computer using the ACT Model learns how likely words are to occur together and then activates “chunk” structures based on their shared functional context. For example, the word “game” is organized closer to “play” because it often appears in the same context as “play.” Thus, the ACT Model organizes words together based on their real-world, practical relationships with each other, not just abstract meanings (Anderson, 1996).

The WordNet electronic lexical database organizes words into “synsets,” which are further organized into a hierarchical network model. A synset is basically a list of all the lexical entries, or synonyms, that can be used to articulate a particular concept in a particular context. For example, the word “shot” can be used to articulate the concepts “drink,” “injection,” and “pellet,” all in different contexts. There would be no single synset, however, for “shot” that would include all the other synonyms at once (Fellbaum, 1998). Instead, this lexical entry would be included once in each synset, so that the defining feature of a synset is one specific concept, or “gloss.” Not all lexical entries have exact synonyms, however, and this causes one to ask how synsets could be linked together to form a complete hierarchy for all words. In the case of nouns, this problem is overcome using the notions of hyponymy and hypernymy, types of non-exact synonymy. In the word pair “robin/bird,” for example, “bird” would be a hypernym of its hyponym “robin.” Hyponyms are grouped under hypernyms, similar to Collins and Quillian’s (1969) hierarchical model, with the hypernyms at the top of noun hierarchies called “unique beginners.” This model has a similar problem to that of Collins and Quillian’s, however, in that concepts that are functionally related in certain contexts, such as “net,” “racquet,” and “ball,” cannot be organized together. This model would not predict semantic priming between these words, which is not what occurs in reality, and this is known as “the Tennis problem” (Fellbaum, 1998). Thus, these “discourse semantics” cannot be accounted for as effectively by this model as they can be by ACT and spreading activation models.
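The synset idea can be illustrated with a toy data structure. The entries below are made up for the example and are not copied from the WordNet database; `SYNSETS`, `HYPERNYM`, and the function names are likewise hypothetical. The sketch shows “shot” belonging to three separate synsets, the “robin/bird” hypernym chain, and why a hierarchy built only on hypernymy runs into the Tennis problem.

```python
# Toy synsets: one entry per concept, each with its own gloss (assumed data).
SYNSETS = {
    "shot.drink":     {"words": {"shot", "nip"},
                       "gloss": "a small drink of liquor"},
    "shot.injection": {"words": {"shot", "injection"},
                       "gloss": "a dose given by syringe"},
    "shot.pellet":    {"words": {"shot", "pellet"},
                       "gloss": "a small metal ball fired from a gun"},
}

# Toy hypernym links (hyponym -> hypernym); "entity" plays the unique beginner.
HYPERNYM = {
    "robin": "bird", "bird": "animal", "animal": "entity",
    "racquet": "implement", "implement": "artifact",
    "ball": "game equipment", "game equipment": "artifact",
    "net": "fabric", "fabric": "artifact",
    "artifact": "entity",
}


def synsets_for(word):
    """List every synset a word form belongs to."""
    return sorted(name for name, s in SYNSETS.items() if word in s["words"])


def hypernym_chain(word):
    """Climb from a word to the top of its noun hierarchy."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def lowest_common_hypernym(a, b):
    """Find the first node shared by the hypernym chains of two words."""
    chain_b = hypernym_chain(b)
    for node in hypernym_chain(a):
        if node in chain_b:
            return node
    return None


print(synsets_for("shot"))
print(hypernym_chain("robin"))
print(lowest_common_hypernym("racquet", "net"))
```

In this toy hierarchy, “racquet” and “net” meet only at the very general node “artifact,” and nothing records that both belong to tennis. That gap is the Tennis problem in miniature.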

The Logogen and Autonomous Search Models
Though primarily concerned with lexical access, the autonomous search model proposed by Forster (1976, 1989) and the logogen model proposed by Morton (1969, 1982) both incorporate distinct levels of representation in the mental lexicon for the form (phonology and orthography, aural and visual) and meaning of a word (Marslen-Wilson et al., 1994). See the section on lexical access for more information on these models.

The Connectionist (Associative) Approach
The models presented thus far assume that words are organized together in the mental lexicon based on their shared meanings alone. It is important to consider that there may be no localized “mental lexicon” per se and that knowledge about words may be treated like any other type of knowledge, as Seidenberg and McClelland (1989) suggest. Their connectionist model is a spreading activation model (or Parallel Distributed Processing model), in which a word’s lexical representation and information is not localized in any one node (as in Collins and Loftus, 1975), but distributed across many nodes. As they put it:

“Knowledge of words is embedded in a set of weights on connections between processing units encoding orthographic, phonological, and semantic properties of words, and the correlations between these properties” (Seidenberg & McClelland, 1989, p. 560).

Thus, when any of the properties they mention (phonology, orthography and meaning) are activated at the same time, the connections between them become stronger, like neurons firing and wiring together in the brain (R. E. Brown & Milner, 2003). These connections are mediated in a bottom-up process via a small number of “hidden units,” which are connected to the much more numerous “input units” that represent orthography, phonology and meaning. The hidden units cluster together inputs that co-occur (or “fire”) together (see Figure 5). When connectionist models are tested on the computer, as the weights between units are refined over time, they tend to group words based on categories such as “noun,” “verb,” “animal” etc. (Elman, 2004). This logical grouping of words that occurs via a completely bottom-up process flies in the face of the models discussed previously, which tend to operate in a more top-down manner. This approach, therefore, suggests that words are organized purely by associations between words as they are encountered in the world, with no "hard-wired" rules for organizing them in the brain, as suggested by the next approach we will discuss.

Figure 5 – Connectionist Model (adapted from Seidenberg and McClelland (1989))
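The “firing and wiring together” idea can be sketched as a simple weight update. This is deliberately much simpler than Seidenberg and McClelland's (1989) network: it has no hidden units and does not use their training procedure. The unit names, the `rate` parameter, and the `train` function are all assumptions made for illustration. The sketch only shows how connections between units that are active at the same time grow stronger with exposure.

```python
from itertools import combinations

# Each exposure to a word activates a set of orthographic, phonological,
# and semantic units at the same time (hypothetical unit names).
CAT = {"orth:cat", "phon:/kat/", "sem:feline"}
DOG = {"orth:dog", "phon:/dog/", "sem:canine"}


def train(exposures, rate=0.1):
    """Strengthen the connection between every pair of co-active units."""
    weights = {}
    for active_units in exposures:
        for pair in combinations(sorted(active_units), 2):
            weights[pair] = weights.get(pair, 0.0) + rate
    return weights


# "cat" is encountered three times, "dog" only once.
weights = train([CAT, CAT, CAT, DOG])

for pair, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(pair, round(weight, 2))
```

No unit in this sketch stands for the word “cat” itself. What the system “knows” about the word is spread over the weights linking its spelling, sound, and meaning, and units that never co-occur (such as the spelling of “cat” and the meaning of “dog”) never become connected.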

The Morphological (Rule-based) Approach
One aspect of words that is often neglected by models of the mental lexicon is morphology. Some models seem to simply acknowledge that morphology is there, lurking in the background, while others (e.g., connectionist models) discount the need for such a rule-based system for organizing words (Pinker & Prince, 1988). Though Bock and Levelt (1994) did include morphology in their model, they did not elaborate on how distinct a level of representation it can be.

Morphemes are the smallest (formal) units of meaning in a word. For example, the word “cleaner” is composed of two morphemes, “clean” (verb) and “er” (one who performs the verb). These form-meaning overlaps are consistent and generally operate in a rule-like way, causing some to suggest that the mental lexicon may be organized (at least in part) by these rules. The first study to support this idea was by Stanners and colleagues (Stanners, Neiser, Hernon, & Hall, 1979). They reported priming, or faster reaction times, for words that were preceded by their inflected past-tense (“ed”) forms compared to those that were preceded by their irregular past-tense forms. Then, in a series of lexical decision, masked priming tasks (Forster & Davis, 1984), Marslen-Wilson and colleagues (1994) found aural-visual priming for root words preceded by one of their (morphologically) derived forms compared to orthographically and semantically related pairs. The conclusion from studies such as these is that words that share a morphological root may be organized under that root in the mental lexicon, at least for monolinguals (see Figure 6). Some researchers are even beginning to ask whether this morphological mental organization holds true across languages for bilinguals, but results are still far from conclusive (Voga & Grainger, 2007). As for words that appear to share a root but actually do not (e.g., “corn” and “corner”), it is suggested that these are organized in their own, separate lexical groups. Finally, irregular past-tense forms of verbs are also supposed to be organized separately, not under the present-tense form of the verb, meaning that irregular words must be memorized as whole words (Pinker, 1991).

Figure 6 – Morphological Model (adapted from Voga and Grainger, 2007)

As with the semantic models, which are supported by semantic categorization tasks, the tasks used to provide support for the morphological organization of the mental lexicon may be biased towards this conclusion. Seidenberg and McClelland argue that lexical decision tasks do not adequately tap the semantic components of words, meaning that other aspects, such as morphology, are favoured (Seidenberg & McClelland, 1989). Furthermore, Seidenberg and other connectionists believe that morphology should not have a distinct input unit or layer in the model, and that, like words in general, it is represented in a distributed fashion in the mental lexicon. They describe morphology as “the consequence of the interactions in a dynamic system that maps meanings onto forms and vice versa” (Gonnerman, Seidenberg, & Andersen, 2007, p. 341), or simply put, the overlap in form and meaning that is an inherent component of morphology.

A number of recent studies have suggested that this is not the case. These studies have shown that morphological priming effects can be greater than the summed effects of orthography (or phonology) and semantics (Marslen-Wilson, Bozic, & Randall, 2008; Feldman, 2000; Rastle, Davis, Marslen-Wilson, & Tyler, 2000). There is something about the morphology of words that goes beyond the simple form-meaning overlap inherent in morphologically related words, suggesting that it is distinctly represented in the lexicon. In sum, whether morphology can be said to be the most basic level of organization of the mental lexicon is still hotly debated, but there at least seems to be solid support for it having a distinct representation in the lexicon.

Neuroimaging and the Mental Lexicon
Up to this point, we have been discussing theoretical models of the mental lexicon that focus on connections between nodes and inferring the “spatial” organization of the lexicon through these connections. While these models may look good as neatly drawn diagrams on paper, is there any basis for them in the actual physiological structure of the brain? Several neuroimaging studies suggest that the various components of the lexicon are distributed widely throughout the language areas of the brain. They also suggest that properties by which words are grouped in the brain are somewhat different than the models mentioned above. For example, using fMRI and syntactic and semantic violation tasks, Newman and colleagues found differences in brain activation during syntactic and semantic processing (Newman, Pancheva, Ozawa, Neville, & Ullman, 2001). They found that certain areas of the frontal lobes were more active during syntactic processing and that certain areas of the temporal and parietal lobes were more active during semantic processing.

There is also electrophysiological evidence for a hemispheric distinction between open-class or content words, such as nouns and verbs, and closed-class or function words, such as conjunctions and prepositions. Specifically, a larger N400 wave is elicited over the (frontal) left hemisphere for closed-class words (C. M. Brown, Hagoort, & ter Keurs, 1999). This suggests that perhaps closed-class words are organized in the left hemisphere, while open-class words are more predominant in the right hemisphere.

Learning Exercise, Part A – Develop Your Own Model
One concept that is hard for many to grasp is that of what exactly a lexical entry in the mental lexicon is. Understanding this concept is crucial to understanding the mental lexicon because, simply put, the lexicon is made up of lexical entries. Furthermore, the precise criteria used to define a lexical entry have a direct impact on how they can be organized. For example, if you think that a change in word meaning is all that is necessary to create a new lexical entry, it follows that its phonological properties cannot be used to organize it with other words in the lexicon. Think of the analogy of a printed dictionary previously introduced to illustrate this point, and answer the following questions on a piece of paper. Click on the link below each question to find sample answers.

Q1. How are lexical entries organized within a dictionary and what sort of information do they contain?

- Mental Lexicon/Learning Exercise A/Answers

Q2. Are these properties organized similarly in any of the models discussed so far?

- Mental Lexicon/Learning Exercise A/Answers

Q3. As mentioned in the introduction, the analogy of a printed dictionary breaks down very quickly for many reasons. Try to name a few.

- Mental Lexicon/Learning Exercise A/Answers

The weakness of the analogy of a printed dictionary, which is orthographically based, caused early researchers to propose semantically based models.

Q4. Which of the following analogies would you use to describe the semantic models of the mental lexicon? Why?

a.	Dictionary (printed)

b.	Lexicon (printed)

c.	Thesaurus (printed)

- Mental Lexicon/Learning Exercise A/Answers

Q5. What about the morphological account? Which analogy works best for it?

- Mental Lexicon/Learning Exercise A/Answers

Q6. Is it appropriate to use the analogy of a printed book for connectionist models? What about the neuroimaging evidence?

- Mental Lexicon/Learning Exercise A/Answers

Q7. Think about these questions: Are these models missing anything? Do you think other aspects of words should be accounted for in the mental lexicon? What would it look like if each aspect of a word were represented in separate entries as opposed to a single lexical entry? What do you think the most important features of a word are (which ones occupy their own node)? Would models based on these features be more or less efficient than other models? What are the implications for how these aspects are connected and organized? Draw a diagram of how you envision the lexicon being organized, keeping these questions in mind. Also, include the predictions your model makes about words, their access, and their organization.

- Mental Lexicon/Learning Exercise A/Answers

Learning Exercise, Part B – Create Your Own Task
Now that you have developed your own model of the mental lexicon, it is time to test its practicality. There are a number of things you need to do and questions you need to ask to do this. To illustrate this point, we will create a task to evaluate the morphological account of the mental lexicon. Sample answers to questions can be found in the following sub-section.

Q1. First, come up with some hypotheses or predictions that would arise from this model if it were valid. This involves creating a list of factors and associated variables that can be manipulated to distinguish participants’ responses during your task as supporting or refuting your hypothesis (see example below for answers).

Q2. Next, you choose a task to test your hypotheses. Should you use a lexical decision task, a semantic categorization task, an L1/L2 translation task etc.?

Q3. Now, create the specific items to be used for eliciting responses from participants in your task. These responses will be used to evaluate your model.

Q4. Finally, think about what types of responses you want to elicit from participants in your task and what your results might look like if they were to either support or refute your hypotheses.

Example Task: Lexical Decision
A1. For example, if you believe that the mental lexicon is organized morphologically (by root word), you might predict that people find it easier to make decisions about words that share the same root-word than words that do not. Therefore, one factor that you will manipulate is the morphological relatedness between words.

A2. You could use a variety of tasks to test your model’s hypotheses, such as a Picture-Naming Task, whereby a word (prime) is presented on a computer screen followed by the picture of a second word (target) that shares the same root as the previous word. The participant’s task is to name the picture as quickly as possible. Reaction times to morphologically related word-picture pairs could then be compared to those to non-related pairs, with faster reaction times indicating closer organization between the two items in the lexicon. You could also use a lexical decision task or semantic relatedness task in a similar fashion, all with the aim of eliciting a response from a participant by asking them to make a decision about words.

A3. For your items for a picture naming task, you could create twenty sets of word-picture pairs that are morphologically related and twenty sets that are not (for balance). You could also have the same primes, but use words rather than pictures for targets for a lexical decision task. In creating item lists, you would also need to consider the other (competing) ways that the lexicon could be organized and create items for them as well for comparison. For example, the lexicon could be organized phonologically, semantically, or both! Equal numbers of items that represent all of these possibilities would need to be created for balance.

Below is a mock wordlist for a lexical decision task. Notice how equal numbers of items have been created for each type of relationship, and identical categories and numbers of items have been created for both the word and non-word categories. Also, semantically related non-words are impossible to create; thus, filler words have been created for consistency. Finally, in reality, more words than these would be needed to create an average reaction time for each participant, which is then further averaged across participants to create the final results. Again, faster reaction times would indicate closer organization in the lexicon.

Prime-Target Pair List, by Relationship Type

The Mental Lexicon, Learning Exercise, Lexical Decision Task

WORDS

NON-WORDS

A4. Here is an example of a mock lexical decision task used to test the hypotheses from a morphological account of lexical organization. Here, participants must decide as quickly and as accurately as possible whether the target word (in caps) is a real word or not. This task has been adapted and simplified from the ones created by Marslen-Wilson and colleagues (1994) and Voga and Grainger (2007). The word list presented above has been used to create the stimuli for this task, and Excel was used to randomize the order of all the item-pairs and put them into the PowerPoint file for the task. Click on the link, Example Lexical Decision Task, and follow these instructions:

1. Click in the lower right-hand corner to maximize the slideshow. Follow the instructions in the first slide.

2. Rather than providing input in the manner described, you can just click the "play" button as though you were making a lexical decision to advance through the slides.

3. You should be able to download the actual file for yourself.

4. You can use PowerPoint to create your own task, setting the timings and user inputs for each slide as desired.

(In reality, specialized software other than PowerPoint would be used to present items and capture input from participants. Typically, significant differences in presentation speed and reaction times are in the hundredths of a second, and PowerPoint is not designed to operate at such high speeds.)
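The randomization step that was done in Excel could also be scripted. The sketch below is one possible way to do it, not the procedure used for the example task. The item pairs are hypothetical stand-ins (only “cleaner”/“clean” and “corner”/“corn” are borrowed from the examples in this chapter), and the field names and `seed` parameter are assumptions.

```python
import random

# Hypothetical prime-target pairs; targets are shown in caps, as in the task.
ITEMS = [
    {"prime": "cleaner", "target": "CLEAN", "relation": "morphological", "is_word": True},
    {"prime": "corner",  "target": "CORN",  "relation": "orthographic",  "is_word": True},
    {"prime": "doctor",  "target": "NURSE", "relation": "semantic",      "is_word": True},
    {"prime": "table",   "target": "MOON",  "relation": "unrelated",     "is_word": True},
    {"prime": "teacher", "target": "TEACB", "relation": "morphological", "is_word": False},
    {"prime": "window",  "target": "WINP",  "relation": "orthographic",  "is_word": False},
    {"prime": "garden",  "target": "FLOMER", "relation": "filler",       "is_word": False},
    {"prime": "pencil",  "target": "RIVUD", "relation": "unrelated",     "is_word": False},
]


def randomized_trials(items, seed):
    """Return a shuffled copy of the item list; the same seed gives the same order."""
    rng = random.Random(seed)
    trials = list(items)  # copy, so the master list keeps its original order
    rng.shuffle(trials)
    return trials


# One order per participant, using the participant number as the seed.
for trial in randomized_trials(ITEMS, seed=1):
    print(trial["prime"], "->", trial["target"])
```

Using a separate seed for each participant gives everyone a different order while keeping each order reproducible, which makes it possible to check later exactly what a participant saw.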

Results:

For the morphological account to be supported, reaction times would need to be significantly faster for morphologically related word pairs than for other types of pairs. When reaction times to word-pairs in one group are faster than to word-pairs in another group, as in this case, this is called “priming” (the opposite of this is “inhibition”). Morphological priming supports our model. Incorrect and non-word responses would not be used in the analysis, and typically many more items than the ones used here are needed to reveal these differences.
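The comparison of reaction times can be sketched as follows. The trial data are invented purely to show the calculation, and the tuple layout and function name are assumptions. The sketch drops incorrect responses and non-word targets, averages the remaining reaction times for each relationship type, and expresses priming as the difference from the unrelated condition.

```python
from statistics import mean

# Invented trials: (relation, target_is_word, response_correct, reaction_time_ms)
TRIALS = [
    ("morphological", True,  True,  520),
    ("morphological", True,  True,  540),
    ("morphological", True,  False, 700),   # incorrect response: excluded
    ("semantic",      True,  True,  580),
    ("semantic",      True,  True,  600),
    ("orthographic",  True,  True,  610),
    ("orthographic",  True,  True,  630),
    ("unrelated",     True,  True,  640),
    ("unrelated",     True,  True,  660),
    ("unrelated",     False, True,  720),   # non-word target: excluded
]


def mean_rt_by_relation(trials):
    """Average reaction time per relation, using correct word trials only."""
    kept = {}
    for relation, is_word, correct, rt in trials:
        if is_word and correct:
            kept.setdefault(relation, []).append(rt)
    return {relation: mean(rts) for relation, rts in kept.items()}


means = mean_rt_by_relation(TRIALS)

# Priming effect: how much faster each condition is than the unrelated baseline.
priming = {relation: means["unrelated"] - rt
           for relation, rt in means.items() if relation != "unrelated"}

print(means)
print(priming)
```

A larger positive difference means more priming. In a real study these differences would still need a statistical test across participants and items before any of them could be called significant.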

Conclusion
The mental lexicon is a fascinating topic of study, an understanding of which is very helpful for language researchers because it touches on all aspects of psycholinguistics. There are many angles from which to study the mental lexicon and it is certainly much more complicated and flexible than any printed dictionary, for it is the interface between what we think and what we say. Accordingly, there is still a lot of debate about which model is best and whether it is even possible to create a universally applicable model. It is important to remember, however, that no model is without its merits and each model discussed here has contributed to our understanding of the mental lexicon in a unique and invaluable way.