User:Atcovi/Spring2024/Psyc 410 Human Cognition/Ch. 3

Chapter 3 will be discussing perception and mental imagery.

https://quizlet.com/878716703/ch-3-human-cognition-flash-cards/?new

3.1 - Perception as a Construction of the Mind

 * Aim: Explain why visual perception is a construction and why it is so challenging. Construction does not refer to just accessing or receiving data [that is only the 1st stage]. We take this info and use all of our knowledge to construct the situation that is causing the sensation. The cause is NOT in the data; it comes from us figuring things out and reaching our own conclusions.

Perception is an information process/cognitive ability by which we recognize and interpret information from the senses (the senses send in info from our environment). We take that data, match it against our long-term memory, and figure out what the data could be. We will discuss perception using vision as the example because vision is:
 * 1) Easier to demonstrate.
 * 2) Easier to study in the lab (present stimuli, measure reaction times).

Sensation is simply light reflecting off an object, hitting your eye, getting inverted, and landing on your retina, where the light waves cause different receptors to start firing action potentials. You basically figure out the color, then the space, then the shape, and then ask: have I seen this object before? Only after figuring this out are you ready to interact with your world. This is the 'detection' phase: essentially the eyes [for example] sending action potentials back to the brain.

Transduction is the conversion of physical stimuli into electrochemical signals used by neurons. What is the actual chemical process in the sensory nerve cells that turns environmental signals into action potential firing?

An example of perception vs. sensation is our optic nerve (or blind spot). There is a spot in each eye where we receive no stimuli at all, yet our perceptual processes fill in that gap before the scene reaches consciousness.

Visual illusions are useful because they show where our mind is doing detective work and filling in gaps for us.

Bottom-Up Influences on Perception

 * Perception depends on what information the mind chooses to use from our five senses.
 * Bottom-up information (data-driven information: info driven by the environment/stimuli, like inputs to a computer system) --> sensory information flowing from the receptors to the primary sensory cortex for that sensory system. Any part of a perceptual system that is driven by the data = bottom-up information.

Figuring out how those sense organs work tells us what type of data is coming in. We take in the information and process the stimuli according to our existing knowledge.

Stages (according to the diagram)


 * 1) Electromagnetic waves come from a light source and hit, say, red ink, which absorbs most wavelengths in the visible spectrum — all wavelengths except red.
 * 2) Light gets collected on the light-sensing region of the eye, aka the retina (where all the sense organs that react to light are located; the rods and cones are at the back of the retina). The retina takes a 2-D, or flat, image of the world. Most of our color sensors are located in the fovea (whatever is right in front of us gets processed in the fovea). When light passes through the lens it gets inverted, so all the images projected onto the retina are upside down! Our optic nerve is also oddly arranged: the neurons' axons sit closer to the pupil than to the brain.
 * 3) Signals are sent to the brain, but split by side. Everything on the left side of both eyes' view is the left visual field; all input from the left visual field is processed in the right hemisphere, and vice versa. The image is split in half and processed in independent regions of the brain.
 * 4) The signals go to our occipital lobe, which is dedicated to visual processing and perception. The occipital lobe is home to the primary processing of visual information (feature processing). For example, if the stimulus is a line, the line-detecting neurons in the occipital lobe send that information on to other parts of the brain.

Top-Down Influences
Top-down processing is when the process is being driven by knowledge and experience. A lot of the processing we do IS top-down processing. It is not conscious; these are unconscious inferences, more like a decision-making process. A Necker cube is a flat line drawing that we perceive as a three-dimensional transparent cube viewed at a specific angle.
 * Knowledge and expectations that influence and enhance our interpretation of sensory info.
 * Feedback connections may resolve the ambiguity of a coarse feedforward signal that is subject to multiple interpretations.


 * 1) The lines and angles are the result of bottom-up processing. We can feel the faces of the cube switching back and forth; incorporating top-down processing solidifies one interpretation. The front face is easier to see when lighting is incorporated.
 * 2) Is the image a duck or a rabbit? If you saw the beak first, you'd think of it as a duck. If you saw the eyes, you would've thought of it as a rabbit.
 * 3) It's sort of like the so-called "inappropriate pictures", where an image looks sexually explicit but is not. Your bottom-up processing system sees the image for what it is: its light/dark patches, edges, and color transitions (and the arrangement of all of these!). Then the knowledge-driven, top-down part of your information-processing system calls you out for having a "dirty mind" ("an image of a sexual act"); we are applying knowledge of... well... what we may have seen or known before. In fact, two people can view the same stimulus and see different things (a man reading a book vs. oral sex) — and seeing a book there is a top-down process where you rely on your previous knowledge of books.

Context really affects what we see. What environment are we in? What are we doing? For example, at school you can recognize Mrs. Sims — but at the gun range, you're probably not going to recognize Mrs. Sims. Top-down information facilitates object recognition.

Experience shapes perception. Experience tells us objects are illuminated from above. Shadows are powerful cues for depth and spatial relationships. Unconscious inference refers to educated guesses based on visual cues; we don't reach these conclusions scientifically, we make them in milliseconds. Problem-solving works with what we perceive is out there, not necessarily what's actually in the environment.

For example (Figure 3.4), all the circles are 2-D, yet they look like they are jutting out at us or caving inward. This is because of light. Light comes from above, so when the shadow falls on the bottom portion of a circle, the circle looks like it bulges outward (you don't put lights in the floor instead of the ceiling, do you?). When the light falls on the bottom portion and the shadow sits on the top, the circle looks inset, giving the inward appearance.

Actually... the images are exactly the same; the circles have just been rotated 180 degrees. The retina is sending the exact same bottom-up information; it's the top-down experience/perceptual system (the "knowledge-driven process") that makes the circles look 3-D.

Predictive Coding
The 'visual brain' makes predictions. Predictions are made faster when people are cued into what they will perceive. AI predictions are hard because AIs lack the needed context. For example, if you hear a language you've NEVER heard before, you won't be able to discern individual words; the sentence sounds like one fully continuous stream (hence why people new to Arabic cannot listen to the Qur'an and know when the next verse will start).

Predictive coding predicts what input the eyes are about to receive. Predictions and expectations guide the mind to the most likely interpretation. Context from prior experiences is crucial for perceptual predictions (a toy sketch of the idea follows below).
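A minimal sketch of the predictive-coding loop in Python — my own illustration, not the textbook's model; the numbers and the fixed learning rate are assumptions. The mind keeps a running prediction of the input and corrects it only by the prediction error, so expected input produces little surprise while a change of context produces a burst of error.

<syntaxhighlight lang="python">
import numpy as np

def track(inputs, learning_rate=0.3):
    """Follow a stream of sensory inputs by repeatedly correcting a prediction."""
    prediction = 0.0
    for x in inputs:
        error = x - prediction               # prediction error: the 'surprise' signal
        prediction += learning_rate * error  # nudge the prediction toward the input
        print(f"input={x:.1f}  prediction={prediction:.2f}  error={error:+.2f}")

# Steady, expected input -> errors shrink toward zero; a sudden change
# (a new context) -> a burst of error until the prediction catches up.
track(np.concatenate([np.full(5, 1.0), np.full(5, 4.0)]))
</syntaxhighlight>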

'''Is Perception 'Cognitively Penetrable'? --> How far does this top-down inference go?'''


 * Some researchers believe that beliefs, knowledge, and motivations can change perception.
 * Some researchers believe that some studies confuse perception with people's post-perceptual judgments or memory fragments about what they have seen.

Cognitively impenetrable perception is NOT influenced by beliefs, knowledge, or motivation. According to this view, the apparent effects of belief are really post-perceptual judgments, not perception itself.

Top-down processing is very much involved in perception (how else do we discern a blurred object as a hairdryer or screwdriver? Context!) - but is there a limit?

This topic digs into consciousness. Are we using our conscious beliefs to guide our perception? It could be both!
 * 1) For example, if you really believe in ghosts, do you actually perceive ghosts? Do you perceive information differently? Was the light a ghost or just light from the moon? This is the 'limit' thing we are looking at!


 * Rotating snakes = perception is NOT based on our beliefs, knowledge, or motivation. This image is NOT an animation, it's a static image — yet it still appears to move even once you know that.
 * Size of 2 otters = perspective/relationship of the otters to the columns. The image is 2-D, so the depth comes from the angles of the hallway. Our perception says the center of the picture is farther away, so the otter placed 'farthest' looks massive while the otter placed 'closer' in the image looks smaller.

3.2 - Challenges of Perception: It's Not Easy Being Seen

 * Aim: Describe how perception is a combination of sensory stimulation and the mind's detective work. The brain makes assumptions, fills in the gaps, and provides us data; in fact, we may not be working with reality but with our brain's interpretation!

''The feeling on your fingers, the taste on your tongue --> neural firing --> the brain takes that info and turns it into sounds/pictures --> what decision do we make based on this? [VERY COMPLEX]'' Object segmentation is visually assigning the elements of a scene to separate objects and backgrounds.


 * To recognize objects and their locations, you need to separate them from the background.
 * Bottom-up cues can be insufficient for a computer to distinguish objects.
 * Animals use camouflage to take advantage of the lack of clear boundaries between objects.

[[File:Rubin's Vase.png|thumb|Rubin's Vase: Is it a vase or two faces?

It is hard to discern between background and object, and our perceptual system jumps back and forth between two faces and one vase.|236x236px]] Our perceptual system needs to figure out the figure-ground organization.


 * An aspect of object segmentation: it's not always evident which side of a boundary belongs to the object (or figure) and which side belongs to the background (or ground).
 * This ambiguity is demonstrated by the Rubin vase, where you can see either two faces looking at each other or a single vase.

What cues do we use to assign figure and background?

See Figure 3.11 image for added detail.


 * 1) Enclosure - If an object has a fully enclosed area and it is contrasted with an area that is a different hue/lighting/texture, the enclosed area will be seen as an object.
 * 2) Symmetry - Many things in nature are symmetrical, so regions that lack symmetry are more likely to be assigned as the background of an image. The black objects in the picture are symmetrical: if you cut them in half, the left and right halves are identical — whereas the white region is NOT symmetrical.
 * 3) Convexity - A region curved outward tends to be seen as the object, while a concave region (curved inward) tends to be seen as the background.

Of course, none of these drawings truly has an object or a background; figure and ground are assignments our perceptual system makes.

Another challenge is seeing objects within their context despite not seeing all of them: this is occlusion. For example, when someone's arm extends off the edge of a picture, we don't think the arm magically got 'sawed off'. Boundary extension — remembering pictures as extending beyond their actual edges — shows the same tendency. The brain 'fills in' the missing pieces, which is known as amodal completion.

Perceiving a 3D World
Inverse projection problem


 * Input to our [flat] retina is 2-D and must be converted to a 3-D representation.
 * Multiple 3-D scenes can project to a single 2-D image. An example is a rectangular piece of paper held at an angle, where our top-down processing has to infer that one edge is farther away than the other.
 * Top-down information is needed to disambiguate, or differentiate, the possibilities.

Visual cues underlying perception of a 3D world

Monocular depth cues: cues available to a single eye that your mind uses to construct a 3-D understanding of the 2-D image cast on your retina. They relate to object segmentation and figure-ground organization.
 * Binocular disparity - the closer something is to you, the greater the difference between what your two eyes see. For example, hold one finger right in front of you, close one eye, then switch which eye is open and closed: the finger appears to jump back and forth. The distance between these two images is the cue to depth (see the sketch after this list). The disparity becomes negligible for objects 20+ ft away from your eyes.
 * Binocular depth cues like this one require both eyes to be effective.
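A small sketch of why disparity signals depth, assuming an idealized pinhole stereo model (the eyes as two parallel cameras); the baseline and focal-length numbers are illustrative assumptions, not values from the textbook.

<syntaxhighlight lang="python">
# Idealized stereo geometry gives: depth = focal_length * baseline / disparity,
# so nearby points produce large disparities and distant points tiny ones.

def depth_from_disparity(disparity_px, baseline_m=0.065, focal_px=1000.0):
    """Depth in meters; 6.5 cm is a typical distance between human eyes."""
    return focal_px * baseline_m / disparity_px

for d in (100.0, 10.0, 1.0):  # disparity in pixels
    print(f"disparity {d:5.1f} px -> depth {depth_from_disparity(d):6.2f} m")
# As disparity falls toward zero, the computed depth blows up, which is why
# the cue stops being informative for objects more than ~20 ft away.
</syntaxhighlight>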

The reason we see the sky as blue is that blue light has a higher frequency/shorter wavelength, so it is much more likely to scatter off the molecules in the air. For the same reason, the farther away something is, the more scattered blue light lies between it and us, so we see more blue and less red/yellow light [artists exploit this as the "atmospheric perspective"].

Perceiving a 3-D world is based on the principle that objects and their sizes are consistent. Object constancy is the ability to recognize objects despite different orientations. Size constancy is the stability of the perceived size of objects despite radical differences in their image size on the retina. The ant in the image is so finely detailed that it does not appear close; it appears massive.

We use the apparent size as a cue to help determine its distance.

Other assumptions include color constancy, when our visual system factors in differences in illumination when shaping our color perception, and lightness constancy, when we factor in illumination conditions when perceiving the brightness of things.

When light shines on different objects, it reflects different wavelengths of light. Think of the green-in-the-light vs. green-in-the-shade example (light reflecting more off the top of a shirt than the bottom). Factoring out illumination is crucial for parsing what an object is and telling it's the same object in different lighting conditions (a toy sketch of this idea appears below).

If I shine a flashlight in a black room, it's going to look MUCH brighter than the same flashlight in a white room.
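A sketch of the 'gray-world' algorithm, a classic computational analogy to color constancy — an illustration I'm supplying, not the textbook's model: estimate the illuminant as the image's average color and divide it out, so the same shirt comes out the same green in sun or shade.

<syntaxhighlight lang="python">
import numpy as np

def gray_world(image):
    """image: H x W x 3 float array in [0, 1]; returns an illumination-corrected copy."""
    mean_rgb = image.reshape(-1, 3).mean(axis=0)     # estimated illuminant color
    balanced = image * (mean_rgb.mean() / mean_rgb)  # rescale channels to equalize them
    return np.clip(balanced, 0.0, 1.0)

# A scene under a yellowish light: the correction pulls the color cast back out.
scene = np.random.rand(4, 4, 3) * np.array([1.0, 0.9, 0.6])  # yellow-tinted illumination
corrected = gray_world(scene)
print(corrected.reshape(-1, 3).mean(axis=0))  # channel means are now roughly equal
</syntaxhighlight>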

Object Recognition
Object recognition is difficult because of variation in shape, orientation, and lighting conditions. Agnosia is trouble recognizing objects and matching them to the correct categories and labels ("that's a chair", "that's a bird"): trouble with bottom-up processing.


 * Apperceptive agnosia - impaired early vision, where people cannot perform even the simplest visual feature tasks. They cannot combine the features of an object together; they cannot detect shape.
 * Associative agnosia - impaired late vision, where people cannot name or categorize objects. They can't match the object to a label; they cannot name.

Perception is a long-term process: we take cues from our environment and match them to the knowledge stored in our long-term memory. Our ability to recognize perceptual images depends on matching them with stored representations.


 * Trying to decide if two images are of the same thing depends on familiarity.
 * For facial recognition: familiarity allows recognition despite differences in photo quality, lighting conditions, and facial expressions. The low-level processing would be: yes, this person has eyes and a mouth - then the last part is, this is Mary, right?

'''...what is the cognitive process? How do the end results of the basic perceptual processes of object segmentation and figure-ground organization get matched to an identification (this is a cup)?''' Two answers: the view-based and structural approaches to object recognition.


 * View-based approach claims that we match images to representations that are like two-dimensional pictures or 'templates'. The focus is on the overlap between the incoming image and long-term memory (stimulus & stored template); if enough overlap is present, the identification neurons fire.
 * Template matching - a representation that fully describes the shape of an object. Natural images (all the various shoe types) may not fit; an example that WORKS is the printed digits on a mail envelope. Anything much more complex than digits and lines is WAY too hard for pure template matching: for the shoe example, you would need a billion shoe templates to recognize any random shoe as a shoe! (See the toy sketch after this list.)
 * Structural descriptions - turning 2-D images into 3-D models in our long-term memory. It's these 3-D models we try to match, mentally moving them around to any angle: does the 2-D view match? We are trying to perceive a 3-D world even though what we receive is 2-D information. Items in nature are made up of a limited number of 3-D shapes (cones, cylinders, blocks, etc.). In the structural approach, we recognize geons (the primitive components of 3-D shape), since all shapes are stored as sets of geons. These can be viewed and matched from any angle because they are 3-D representations. For example, a coffee mug is just a couple of cylinders; anything that matches that arrangement will be recognized as a coffee mug.
 * Holistic perception: Do we recognize objects as wholes, or do we process parts, as in the geon (structural) approach? The answer is mixed. For example, people can process parts of cars, but they have trouble judging parts of faces (the eyes, for example) as same or different in isolation.
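A toy sketch of view-based template matching — my illustration under simplifying assumptions, not the textbook's implementation: slide a stored template over an image and 'fire' wherever the overlap score is high enough, which is exactly why the approach works for printed digits but breaks down for objects seen at new angles or scales.

<syntaxhighlight lang="python">
import numpy as np

def match_template(image, template, threshold=0.9):
    """Return (row, col) positions where the normalized overlap beats threshold."""
    th, tw = template.shape
    t = (template - template.mean()) / (template.std() + 1e-9)
    hits = []
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            patch = image[r:r + th, c:c + tw]
            p = (patch - patch.mean()) / (patch.std() + 1e-9)
            score = (p * t).mean()          # normalized cross-correlation
            if score >= threshold:
                hits.append((r, c))
    return hits

# A '1'-like vertical stroke is found by its own template...
image = np.zeros((8, 8)); image[2:6, 3] = 1.0
template = np.zeros((4, 3)); template[:, 1] = 1.0
print(match_template(image, template))   # -> [(2, 2)]
# ...but rotate or rescale the stroke and this template no longer matches,
# which is the argument for structural (geon-based) descriptions above.
</syntaxhighlight>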

Birding
Experts can discern between members of a specific category (bird watchers, sports car fanatics, etc.). The fusiform gyrus is activated by faces and by each person's area of expertise. So the fusiform gyrus is not just for faces, but for anything we process with fine-grained, holistic expertise. Instead of a face, that's Mary; instead of a bird, that's a woodpecker; instead of a car, that's a 2006 Nissan sedan.

Look at Figure A in Research Focus 3.1: ''Is the fusiform gyrus just a face area that gets co-opted by other categories as we learn them? Or is facial, holistic expertise simply the one expertise all humans share?''

This ability to use our knowledge to drive perception is not only key to understanding how humans and animals overcome the challenges of perception, in vision and other domains; it also guides our search to build machines that can perceive (AI).


 * Deep learning - a form of AI that uses deep neural nets, brain-like algorithms that analyze, process, categorize, and label natural images (text-to-speech software uses deep learning too). How does it do this? We train computers to have that knowledge. Input nodes (distance between the eyes) --> a series of network layers that activate other nodes (distance between mouth and lips) --> an output node: the decision to make! Is this someone that I know or not? We are training a neural network, not just programming a computer: wrong answers get network connections turned down, while right answers get connections turned up/strengthened (a toy sketch of this training loop follows below).
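A minimal runnable sketch of the training loop just described — a toy network I'm supplying for illustration (the two 'measurements', the hidden-layer size, and the decision rule are all made-up assumptions, not the textbook's network). Wrong answers create error, and the error signal turns connection weights down or up.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((200, 2))                      # e.g., two face measurements per example
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)   # "someone I know?" (toy ground truth)

W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)   # input -> hidden connections
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)   # hidden -> output connection

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(2000):
    h = sigmoid(X @ W1 + b1)                  # hidden nodes activate
    out = sigmoid(h @ W2 + b2)[:, 0]          # output node: the decision
    err = out - y                             # wrong answers create error...
    grad_out = (err * out * (1 - out))[:, None]
    W2 -= 0.5 * h.T @ grad_out / len(X)       # ...which turns connections up or down
    b2 -= 0.5 * grad_out.mean(axis=0)
    grad_h = (grad_out @ W2.T) * h * (1 - h)
    W1 -= 0.5 * X.T @ grad_h / len(X)
    b1 -= 0.5 * grad_h.mean(axis=0)

accuracy = ((out > 0.5) == (y == 1)).mean()
print(f"training accuracy: {accuracy:.2f}")   # rises as the connections are tuned
</syntaxhighlight>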

AI is getting better through trial and error. We are getting more realistic neural-network models and better ways to train them, in order to get the top-down models that are necessary for perception.

Now we transition from identifying and recognizing objects to how perception guides our action!

What, Where, and How [perception guides our action]

 * The perception pathway determines what we are looking at and runs from the visual cortex ventrally to the temporal cortex. This is the "what" pathway. That's a red, bouncy ball!
 * The action pathway uses perceptual information to guide ongoing actions. This is the "where" or "action" pathway, since it runs to our parietal lobe (where sensation from the skin comes in and where we plan our actions and how to move our body). This information gets processed in parallel with [separately from] the other pathway. It is responsible for the 'fast reflex' to an incoming object (when a ball is chucked at you and you identify it only after you've swatted it away).

'''How do we know about the action pathway? What's the evidence? MONKEY STUDY!'''

...from research done on animals. A lesion study was done.

In a set of subjects, the inferior temporal cortex was damaged. The temporal lobe is for categorizing (recognizing) while the occipital lobe is for basic task perceptions (detecting features). In another set of subjects, the posterior parietal cortex was damaged.

In the tasks, the monkeys were trained to find food in different situations. In the object discrimination task, they learned that food was hidden under a certain object; object discrimination was essential here. In the location discrimination task, the food was in the foodwell closer to a landmark object. These two processes are parallel and separate.

The monkeys with damage to the inferior temporal cortex could NOT perform object discrimination; they cannot identify what objects are. The monkeys with damage to the posterior parietal cortex could NOT do the location discrimination task. This pattern is known as a double dissociation.

In fact, with damage to the action pathway, it's not only the "where" information that is lost but also the ability to line up and execute actions. Figuring out the orientation of an object is PART of locating it, since we need to match an object even when it appears at different orientations.

3.3 - Mental Imagery

 * Aim: Describe the relation between perception and mental imagery. What are the mental processes behind imagining a song/past event?

What are we doing when we imagine things that are not there?

Moving away from perception and going into imagination...

Mental imagery is the act of forming a percept in the mind without sensory input. Imagination is a cognitive process and is important in our decision-making and problem-solving. Some people have vivid mental imagery while others don't: some see things play out like a movie, while others are unable to form a picture at all.

Aphantasia is the inability to form images in the mind. When we think, do we think only in words, or are we thinking in a 3-D world?

Mental rotation is the ability to compare and match rotated images. The task shows two 3-D objects at different orientations and asks: is the second object the SAME object or a mirror image? In this task, it seems every participant uses mental spatial images plus spatial imagination, because reaction times increase steadily with the amount of rotation required (a toy illustration follows below).
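A small illustration of that reaction-time pattern — the numbers here are synthetic values I generated for demonstration, not real data: if participants rotate a mental image at a roughly constant rate, response time grows linearly with the angle between the two objects.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
angles = np.array([0.0, 40.0, 80.0, 120.0, 160.0])   # rotation between the objects (deg)
rt = 1.0 + 0.02 * angles + rng.normal(0.0, 0.05, 5)  # hypothetical response times (s)

slope, intercept = np.polyfit(angles, rt, 1)         # fit RT = intercept + slope * angle
print(f"baseline ~{intercept:.2f} s, plus ~{slope * 1000:.0f} ms per degree of rotation")
</syntaxhighlight>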

The Long, Philosophical Debate
The debate: depictive vs. propositional explanations of mental imagery.

Will someone who loses their perceptual pathways also lose their ability to do spatial mental reasoning?
 * Depictive explanation - The brain represents mental images the way it represents real images coming through the eyes (Kosslyn). When we do spatial tasks, we always use certain regions of the brain; it doesn't matter whether those regions get activated by sensory input or by attention/long-term memory, we use the same perceptual machinery in both cases. Imagination = perception without the object. An example of a study Kosslyn did is the island map study. "Analog", as in an analog clock vs. a digital clock, basically means image-like.
 * Propositional explanation - Mental images are held in a post-perceptual, abstract way, more like a linguistic description than a picture (Pylyshyn). The whole point of perception is just to activate our conceptual information: looking at a chair, hearing the syllables of the word 'chair', or sitting in a chair would all activate our conceptual knowledge (all previous info) about chairs. When we think of a chair, what does this theory believe? The mental image of a chair [for ex.] is post-perceptual, abstract information ("object to the left", for ex.). This view is more 'parsimonious', since some people don't think in images at all.

Both models can represent the information and do spatial reasoning. This highlights parsimony: the simplest explanation should be accepted until contradictory evidence appears.

Imagery and Spatial Neglect
Spatial neglect is when patients cannot visually attend to objects on one side of their visual field. Patients with damage to the perception-processing ('what' and 'where' pathway) parts of the brain show spatial neglect in their mental imagery as well.

Dissociations between 'what' and 'where' in mental imagery mimic the what/where dissociations found in perceptual abilities.

This suggests that perception is not only a process of taking in info from our senses and accessing categorical information; it is also the system by which we activate categorical info about things NOT in our environment and use the spatial knowledge attached to it. ''If I move this couch to the other wall, will it fit? How can we make our world a better place?''

Mental imagery elicits stronger emotions than verbal thinking does.

Why Do We Care?

Thinking about the difference between thinking in words vs. in images helps us assist those with brain damage, alleviate the specific difficulties they may be facing, understand why someone's behavior may change dramatically, and choose treatments. When we imagine things, the emotions are STRONGER.


 * For patients with PTSD/anxiety, images are rated as more real than they are by people without those disorders. For example, nightmares replaying an earlier attack or rape.
 * With depressed people, imagining suicidal acts is associated with an increased risk of suicide.
 * Learning to imagine positive futures may help with depression. Used in cognitive-behavioral therapy.
 * Playing video games distracts people from rehearsing trauma in their minds, leading to weaker, less vivid emotional memories.