Evaluation Theories/Contingency Theories of Evaluation

Announcements: (13:02:58)

Class on Contingency Theories, Evaluation Design and Methods; and implications for practice. (J: how can this class be on contingency theories when the assigned reading consisted of two tiny case studies in social program evaluation?)

'''Outline of this Class:'''


 * 1) Who has informed your knowledge about those domains?
 * 2) What recommendations?

Intro by Silvana B.
There isn’t one right answer in this field, which can be uncomfortable, because in school we’re here to find out, “What is the best practice?”:
 * 1) Evaluation is a growing field that has yet to be perfectly conceptualized
 * 2) There is a lot of diversity.

So this is a challenging topic.

We’re going to look at what this can look like. Why do we need contingency theories of practice?

Diversity:
 * 1) Clients
 * 2) People
 * 3) Sectors
 * 4) Funding & Accountability Requirements
 * 5) Disciplinary Approaches

(J: 1) How does this connect to “diversity”? 2) Are these mutually exclusive and collectively exhaustive (Ethan Rasiel, 1999, 'The McKinsey Way', p. 6))

We will look at the “Practice” component of Evaluation.

Contingency Theory of /Practice/
 * 1) /Social Programming/
 * 2) /Knowledge/
 * 3) /Values/
 * 4) /Use/

Shadish, Cook, & Leviton in last section provide a great discussion of how we know about these (p. __ to __?)

Social Programming
They had this idea that was not necessarily realistic, and over time we’ve developed a better picture of how social programs change and how

We have to think about wanting to create social change with a realistic perspective: that we don’t expect all this change to occur just because we’ve provided somebody with evidence.

Our evaluation practice can come from having a really good understanding of how social change occurs.
 * 1) Political Context
 * 2) Organizational Structures
 * 3) Not all parties will be interested in change
 * 4) Voices besides evidence are vying for the attention of decision-makers.

We don’t yet know how to fix this, but there’s general agreement on these tenets:
 * 1) Change is hard
 * 2) Ameliorating social problems happens slowly
 * 3) (see ppt)
 * 4) (see ppt from PSYCH 315z – Comparative Evaluation Theory 2014-04-02)

Q: Are we drawing bounds around social programming, or defining A: I’m looking at social programming. ..

They pose these questions for evaluation of social programs:
 * 1) '''Is the public interest better served by radical or incremental change?'''
 * 2) '''Should the evaluator identify and work with change agents, or merely produce results and leave decision-making to others?'''

Your opinions on questions like this will dramatically shape the way you practice. (J: Essay: what is your opinion on these questions? (500 words each, or 700-1000 combined if there is duplication).)

Q: Do you ever experience multiple types of theorists in a scenario where not only one is selected, what would the evaluation results look like, in for example the four different theorists? (J: They wouldn’t look at the same things in the first place, because of their very different interests and perspectives.) A: In a perfect world, all of our research on evaluation would involve unlimited research where we can test the value of these techniques in comparison to each other. The questions we can ask are, “Do we get a better understanding of the evaluation approach of different users?”

Q/C: It almost feels like we’re scientists trying to discover gravity. . . and saying, “but it’s really impossible to prove, but hopefully we’ll find a way to all get on the same page. . .” (J: You didn’t keep up with the

Use
This has received possibly the most attention within the field of (Program) Evaluation.

Some areas that we can agree on are:
 * 1) In the beginning we focused on a small definition of use
 * 2) Over time, we have found that there is a broad continuum of /Use/: Instrumental; Conceptual; Enlightenment. (J: what is the continuum?)
 * 3) Conceptual and Enlightenment use are more common than Instrumental use.

Given what we learned about issues of political context of social programming and ___, as a practitioner, you need to be aware of this kind of challenge between the two.

Educate your stakeholders when you’re working with them: how can this evaluation change our thinking? How can we incorporate data-driven decision making into our corporate structure in order to use this over time?

According to Shadish, Cook, & Leviton these are the things unanswered:
 * 1) Should conceptual or instrumental use have priority?
 * 2) Should the evaluator identify and attend to intended users? Which users?
 * 3) What increases the likelihood of use?

Q: Is identifying primary intended users contingent on context? (J: yes - blue sky evaluation; or “evaluation for the sake of evaluation” might not have intended users (though there are intended beneficiaries)) A: One said, “no, it wasn’t part of the context” — Parties that have investment in vs. intended users is contingent; who qualifies as a user also depends on how you define “/use”.

Values
This is a hotly debated topic: some see valuing as done by evaluators; others see it as forming an opinion on good or bad depending on what you see inside the program. . . but there is this general idea that evaluation cannot be value-free: as a practitioner, where do you stand?


 * 1) Are you responsible for valuing?
 * 2) Whose opinions will you bring into the process?
 * 3) How do you form / are you going to form value judgements?

People lean towards the idea that the more perspectives you investigate, the closer you’ll get to an understanding of the program that is as inclusive as possible.

(References Post-Positivist and Constructivist - but I believe these are unhelpful in developing a contingent evaluation theory (at worst), or, at best, incomplete - very incomplete).

I think it’s unlikely that we’ll reach agreement on these unresolved questions. (J: I don’t. We can reach a contingent agreement. The only thing in our way is knowledge of all the contingencies, so if that is too great, then it depends on the definition of “we” and the definition of “agreement”)

Knowledge
We have hundreds of years of study going into how you form credible evidence for knowledge, but there are a lot of traditions: if you look at psychology vs. anthropology

We will have an entire week on credible evidence. (J: Will it include larger interdisciplinary work, or be mired in the social sciences?)


 * 1) How knowable is the social world?
 * 2) What priority should be given to different kinds of knowledge?
 * 3) What methods should evaluators use, and what are the key parameters that influence that choice?

Q: Are there categories of different kinds of knowledge? A: there’s categories of how we construct knowledge. The rule-books for what is credible evidence in one field vs. another. (J: this needs a lot more work on the epistemology side - it’s pretty one-dimensional understandings. . .)

Practice

 * 1) What should the role of the evaluator be?
 * 2) What values should be represented in the evaluation?
 * 3) __ see the ppt.

Contingency Theory of Evaluation In action:
 * 1) Aligned to context
 * 2) Responsive to stakeholders
 * 3) Use broad knowledge-base
 * 4) Flexible and adaptable

Would help practitioners anticipate these challenges and respond to them appropriately.

Choosing a Design:

Contingency theories are question-driven and characterized as method-neutral.


 * 1) Context
 * 2) Stakeholder Values & Concerns
 * 3) Questions
 * 4) Design
 * 5) Methods

(J: Your knowledge of the available techniques influences your questions? But it shouldn’t - you shouldn’t be bound to knowing what the tools can do, because if you ask the right questions, you can create the tools that you need for it. That’s innovation.)

People may be really good at running an RCT: they have that hammer and they bang it on everything. (J: With a /CTE, they have a theory for why that hammer and nail works; and cases that it theoretically will not work in. (Granite, for example. . .you’d need a very specific hammer, and very specific nail. . . for other things you might need a stapler))

Bunche-Da Vinci Case Revisited: Contingency Theories
(Tiffany Berry Section)

What are the sacrifices of Henry’s approach from last week? What are you willing to give up, and can you give that up? But what are the opportunities?

Why didn’t anyone last week pick Greene? (We had a stakeholder meeting to “choose” our evaluator) A1: One parent wanted her; it was Greene/Donaldson. A2: Parent perspective; you’d expect them to have children that would still be in the school at the end of the evaluation, so there would still be a change that was good for their children? A2.1: Are you prioritizing your first child over your last child? A3: Greene didn’t talk about involving children, Donaldson was the only one that talked about getting ___ and looking at misalignment.

(J: This seems to be an example of using other people’s works and good project management / team building over being an expert in a particular area (like, “Education evaluation”))

When people say, “We’re ready for an evaluation!” I ask them, “are you really?”

Jill Wholly (Sp?) talked about developing Evaluation Viability Assessments. If you go in with the stakeholder values and concerns; as the practicing evaluator, if it’s critical for their involvement. And if you don’t have time now, then maybe we should do it in six months: spending a significant amount of time before you get to the questions. . .how much time is spent with client to figure out their Stakeholder Values & Concerns; and Questions: once you get those, it is “easy” to map on the questions. The hard part is, “What will be the question in the first place?” Those questions are completely dependent on the context and stakeholder value/concerns.

Sometimes doing that pre-work, you argue yourself out of doing an evaluation: they realize they don’t need evaluation, they need an org. dev. consultant to come do some strategic planning. And we have in the community a lot of charlatans who haven’t been trained in evaluation with the same depth and rigor that we train here. (J: what’s the comparative effectiveness of these “charlatans” vs. the “trained evaluators”?)

Q: In terms of methods, could it be a threat to the validity of what you’re doing, if you’re only selecting people really motivated and have a capacity of what you’re doing: are you just choosing the cases that are already going to be better off to begin with? Mentor: What do you think about that? A: If you’re measuring the program, the results won’t necessarily be good. . .(J: I don’t think people care about sick organizations as much as they care about sick people) A: If they are not going to use your results, it’s not a good spend of resources. Why lose three years of your life to an organization that won’t be able to use your results?

Q: Is it common practice in evaluation to assess primary and secondary stakeholders before the evaluation process starts? (J: how do you define the “Evaluation process”? Any project has pre-planning. (<- need to figure out what this is really called)) … Take the PMPI Stages for a Project:

Mentor: To get to the questions, you often have to talk to a lot of people, not just take the CEO’s questions. I value organizational improvement: maybe what the CEO wants to focus on is not really related to what the organization needs.

As to one of those questions: Apply to the PhD Program in Eval, and try to study that! (J: I think you can study that without being part of a PhD program, so rock on Wikiversity / edX / every other innovative learning mechanism students!) - Because I think you’ve seen similar things in the social service: I talked about this in my other class; various welfare reform; different types of aid: but there was another group that got to choose. The motivation group; they got to choose which group they got to be in; and they were the ones with the most positive effects. People are not randomly assigned in the real world to interventions. (J: there was just an evaluation of AA) - working with clients that want me there is such a nice thing to have: working with clients that don’t want me there is a painful experience every day.
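(J: A minimal sketch of the self-selection point above, in Python. Every number is a made-up assumption for illustration: a hypothetical program effect of 2, an outcome that rises with a person's motivation, and 10,000 simulated people. None of it comes from the welfare studies mentioned in class. The function names are my own. It compares the naive "participants minus non-participants" estimate when motivated people choose the program against the same estimate under random assignment.)

```python
import random
from statistics import mean

TRUE_EFFECT = 2.0  # hypothetical program effect, assumed for illustration


def outcome(motivation, treated, rng):
    """Hypothetical outcome: motivation helps whether or not you are in the program."""
    return 10.0 * motivation + (TRUE_EFFECT if treated else 0.0) + rng.gauss(0.0, 1.0)


def self_selected(motivation, rng):
    """Assumption: only the more motivated half chooses to join."""
    return motivation > 0.5


def randomly_assigned(motivation, rng):
    """Coin-flip assignment, independent of motivation."""
    return rng.random() < 0.5


def estimate_effect(assign, n=10_000, seed=0):
    """Naive estimate: mean outcome of participants minus mean of non-participants."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        motivation = rng.random()  # uniform on [0, 1)
        is_treated = assign(motivation, rng)
        group = treated if is_treated else control
        group.append(outcome(motivation, is_treated, rng))
    return mean(treated) - mean(control)


if __name__ == "__main__":
    print("assumed true effect:     ", TRUE_EFFECT)
    print("self-selected estimate:  ", round(estimate_effect(self_selected), 2))
    print("random-assignment result:", round(estimate_effect(randomly_assigned), 2))
```

(J: Under these assumptions the self-selected estimate should come out well above the assumed true effect, because it mixes the program's effect with the head start that motivation gives, while random assignment should land close to it. That is the validity worry raised in the question about only working with motivated clients.)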

Who was most “contingent” in that they cared about context, were method neutral, talked about the things they would still need to find out about? A: Donaldson. He was honest about saying, “this is what was on my mind.” A2: I was surprised that King wasn’t more “contingent”

(J: definitions from New Oxford American Dictionary: contingent |kənˈtinjənt| adjective 1 subject to chance: the contingent nature of the job. • (of losses, liabilities, etc.) that can be anticipated to arise if a particular event occurs: businesses need to be aware of their liabilities, both actual and contingent. • Philosophy true by virtue of the way things in fact are and not by logical necessity: that men are living creatures is a contingent fact. 2 (contingent on/upon) occurring or existing only if (certain other circumstances) are the case; dependent on: resolution of the conflict was contingent on the signing of a ceasefire agreement.)

Q: ___ A: Fit within this evaluation context.. .. Q: ___ A: I would, but they are all contingent on the place in which they are done in. If you think about Mathematica: they have a repository of evaluation reports with a particular bent. They get multi-million dollar projects that allow them to do random assignment and assess that causality piece. When you think about evaluation firms, they list the different reports they’ve done. Then you have the “What Works Clearinghouse” with the Department of Education that look at internal validity kinds of causal questions. A2: [link?! Listing of award] winners for the AEA’s best evaluations each year.

If there was a place that we were required to submit reports, and then there was funding to review those reports, I think there would definitely be value-added.

Group Activity: Evaluation In Action: Interviews with Greene & King
We want four people in each group; so count off to 12: (there are 60 people in the class… maybe 48 people in the class, having now done the math:)

Between Greene and King: what was evaluated; what were the types of evaluation questions; strengths & weaknesses; what contextual issues were particularly salient in their approach; what might you have done differently in these two chapters?

Okay: just cluster in groups of 3-4 when you get back and we’ll assign you a chapter.

(Break: 14:19:21 to 14:36:12)

'''Instructions:''' Talk about these 8 Questions, then spend the majority of the time on Questions 4 through 8. We have 20 minutes for this piece.

Questions:
 * 1) What program was being evaluated?
 * 2) What were the evaluation questions which guided their project and how were they chosen?
 * 3) What type of design and methods were used to answer the question?
 * 4) What were some of the strengths and weaknesses of the evaluation approach?
 * 5) Do you think the design and methods were a good match for the context and questions?
 * 6) What contextual issues were particularly salient in their evaluation?
 * 7) How did the evaluator respond to those issues?
 * 8) What might you have done differently in this evaluation and why?

King Questions

 * 1) What was going right? (p. 194)
 * 2) What was going wrong? (p. 194)

King Methods approach:

 * 1) Data Collection Team
 * 2) Evaluator Team
 * 3) Herself
 * 4) Social Psychologist
 * 5) __
 * 6) Self-Study Team
 * 7) ≈70% school employees
 * 8) ≈ Parents, people from advocacy groups, community members

What sort of experiences are the students having in their classes? Are the paras overstepping their bounds? Are parents getting good communication on a regular basis?

King Strengths

 * 1) Took time to identify limitations; work through ways to address limitations
 * 2) Innovative ways to hit multiple things at once: Buy-In; Stakeholder Engagement; And at the same time Processed data.
 * Q: Do you think having that many people involved helped the data analysis process? (J: Distributive processing (She said
 * J: Key question is
 * 1) Project could have come from “who are these random parents doing your data analysis” - but they put in all these

In a self-study, they may have to have a whole slew of data behind

Salient Contextual Issues
Responsive to Contingency Theory
 * 1) Highly political context; implemented in response to suite of complaints from parents; responded to that with a way for them to engage with each other.

Greene
Lived experience, exactly what the process is; what the program is; how it’s being implemented; ways it succeeded; how it fell short.

She used mixed methods; mainly case-studies; and a pilot program

What was the National Research Leadership Program? A: A program training people working with environmental issues on trying to __ (J: take page and citation)

Methods Approach

 * 1) Quantitative
 * 2) Interviews
 * 3) Best-Case Scenario Studies

Mentor: when you’re looking at impact you see

Greene Strengths & Weaknesses
Strengths:
 * 1) Tried to build Rapport with interviews; the pre-work; phone calls; then actual interview; - that seemed well-thought out. (J: Good stakeholder management)

Weaknesses:
 * 1) Not a lot of elaboration on what went into the standard of “impact”.
 * 2) In her interview, it seemed that she was very vague in how she went about giving credible evidence for her claims.
 * 3) She discussed barriers to real world implementation; but we would have liked to see more prescriptive things.
 * 4) The language; possible bias from her knowing the people so well that she worked with. Mentor: Yes, what does that mean for the objectivity of determining the success of the program?
 * 5) Threshold for success based on “best cases” -

Salient Contextual Issues

 * 1) Could have used more qualitative methods.
 * 2) Rich understanding gave her rapport and relationships; the ability to move forward from that pilot study.

Contingency Theory in Action
This is how I feel: I’m trying to ride a unicycle while being responsive to context and adapting my design so it gets used in the most effective way possible; you’re constantly engaged in this juggling act.

Any one thing can come in and throw you off course and make you eat it on your unicycle! Pressures; politics: a CEO gets fired in the midst of your evaluation: all of these things have to be re-juggled, and re-prioritized.

It’s super-fun, but it’s a challenge. For all my Pos. Psych people who know about flow: challenge - skill ratio; you have to understand all these different pieces that are
 * 1) Interconnected
 * 2) Transactional: they change over time depending on the context in which they’re embedded.

Next week we’ll talk about politics and interpersonal skills. Contingency theories; political issues in practice, and how you deal with them: Does it compromise ethics? What do you do?

Appendix A: Example Group: Jean King
1. What program was being evaluated?

A: The SPECIAL EDUCATION PROGRAM AT THE ANOKA-HENNEPIN SCHOOL DISTRICT (p. 183)

2. What were the evaluation questions which guided their project and how were they chosen?

BROADER: “dual intent: to provide a broad array of perceptual data collected from numerous stakeholders using diverse methods and, simultaneously, to create a process for continued evaluation activities.” p. 185

Metaanalysis: This question is perhaps too focused: King started from a political situation, and very broad questions; and they created

3. What type of design and methods were used to answer the question?


 * 1) Data Collection Team:  (3 Evaluators)
 * 2) Facilitate.  -
 * 3) Type up
 * E
 * 1) Self Study Team:

A: The TEAM:

infrastructure for the self-study included three components

(1) process planning by the Data Collection Team;

(2) a team of the three evaluators and the district special education administrators, which met twice a month (the Data Collection Team);

(3) a large self-study team with representation from as many stakeholders as we could identify, which also met once a month (the Self-Study Team) (p. 186) - “the Self-Study Team framed issues and concerns for the study, reviewed instrumentation, analyzed and interpreted data, and developed commendations and recommendations.” (p. 186) See the Self-Study Process

4. What were some of the strengths and weaknesses of the evaluation approach?

Strengths:
 * 1) Incredible Distributed-Computing!
 * 2) Ongoing meetings and involvement helps with buy-in
 * 3) Moving beyond opinion using evidence.
 * 4) Meetings had childcare etc. : really going all the way on logistics

Weaknesses
 * 1) Overall approach: how does she engage people who don’t want to spend the time to be involved in the Table Teams?
 * 2) In sense of Self-Study? Usually outcomes are important.
 * 3) Did it address the needs of the self-study?

Discussion: Data collection and data analysis: - Even if you said instead of calling it, “data analysis activity” what if you called it “making sense out of our findings”? Are there other ways she could have included them?

5. Do you think the design and methods were a good match for the context and questions?


 * 1) Yes; given the political process.

6. What contextual issues were particularly salient in their evaluation?


 * 1) The political issues.

7. How did the evaluator respond to those issues?


 * 1) Created “DATA DIALOGUE” FOCUS GROUP TECHNIQUE:

8. What might you have done differently in this evaluation and why?

Given what we know here:
 * 1) How to get parents more involved?
 * 2) Small groups;