[bit.ly/dsia02c] Part C of the 2nd Lecture on Descriptive Statistics: An Islamic Approach [DSIA L02C]. We continue our study of Malcolm Gladwell’s (MG) article on ‘College Rankings’. We will consider the questions of “How are the Numbers Computed?” and “What do they Mean?” The 15-minute video is followed by a 1400-word writeup.
MG starts by noting that the Purpose and Audience for College Rankings have changed over time. The rankings were initially meant as a rough guide for “consumers” (students choosing colleges). It was not imagined that colleges would use these rankings as benchmarks of performance, proof of good management, and status markers in the rivalry among colleges. Nor was it imagined that educational policies would be designed to engineer a rise in ranking. As we have discussed, changing purposes require changes in measurements, and the rankings have NOT been changed to suit the new purposes, with harmful results.
Going into the methodology of the ranking itself, it is based on seven major factors:
- Undergraduate academic reputation, 22.5 per cent
- Graduation and freshman retention rates, 20 per cent
- Faculty resources, 20 per cent
- Student selectivity, 15 per cent
- Financial resources, 10 per cent
- Graduation rate performance, 7.5 per cent
- Alumni giving, 5 per cent
MG registers Two Major Complaints about the ranking process:
- Universities being ranked are extremely diverse (heterogeneous) – How can you compare apples and oranges?
- Each university is extremely complex and multidimensional – dozens of departments, campuses, programs – how can a SINGLE number be assigned to this?
I will explain that there are many other issues worth considering about this ranking later on. For the moment, we note that many, many questions arise from this description of the ranking process:
- Why these SEVEN factors? Why not others? What is the basis for selection?
- Why these weights?
- WHAT do these factors MEASURE?
- How are the factor scores computed?
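To make these questions concrete, here is a minimal Python sketch of how such a ranking is actually computed. The first set of weights comes from the seven factors quoted above; the colleges, their factor scores, and the alternative weighting are entirely hypothetical, invented for illustration. The point is that the single “quality” number is just a weighted sum, and an equally defensible alternative choice of weights can reverse the ranking:

```python
# Weights from the ranking methodology quoted above (fractions of 1.0).
WEIGHTS_A = {
    "reputation": 0.225, "retention": 0.20, "faculty": 0.20,
    "selectivity": 0.15, "finance": 0.10, "grad_performance": 0.075,
    "alumni_giving": 0.05,
}
# A hypothetical but equally defensible alternative weighting.
WEIGHTS_B = {
    "reputation": 0.10, "retention": 0.25, "faculty": 0.15,
    "selectivity": 0.10, "finance": 0.05, "grad_performance": 0.30,
    "alumni_giving": 0.05,
}

# Made-up factor scores (0-100) for two fictional colleges.
colleges = {
    "College X": {"reputation": 90, "retention": 70, "faculty": 85,
                  "selectivity": 95, "finance": 80,
                  "grad_performance": 60, "alumni_giving": 75},
    "College Y": {"reputation": 70, "retention": 90, "faculty": 75,
                  "selectivity": 65, "finance": 60,
                  "grad_performance": 95, "alumni_giving": 70},
}

def composite(scores, weights):
    """Weighted sum of factor scores -- the single 'quality' number."""
    return sum(weights[f] * scores[f] for f in weights)

for label, w in [("weights A", WEIGHTS_A), ("weights B", WEIGHTS_B)]:
    ranked = sorted(colleges, key=lambda c: composite(colleges[c], w),
                    reverse=True)
    print(label, "->", ranked)
```

Under weights A, College X comes out on top; under weights B, College Y does. Nothing in the data changed; only the arbitrary weights did.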
We start the discussion by asking a simple question: Does a number actually measure what it is supposed to measure – that is, Are Numbers Accurate? To explain this, MG considers the example of the Suicide Rate. Here the Target IS Measurable – that is, there really is a NUMBER of people who committed suicide in the past year. But no one knows what that number is. The statistics which are available are distorted by many biases. It is extremely difficult to guess the intentions of a dead person. Someone must classify a death as a suicide, and who that someone is varies greatly by country: depending on culture and customs, the classification could be done by a police officer, the family, or doctors. Whether or not the death is then reported in official statistics is yet another matter. Because of this diversity, it would be a hopeless task to compare suicide rates across countries with any degree of confidence. INSTEAD, one should ask the PURPOSE of the comparison. If quality of life is the target, then more direct measures based on surveys of welfare may give better results.
In addition to the criticisms by MG, I would like to focus on the issue that when we look at a number, it is essential to be clear about the TARGET – WHAT is that number trying to measure? So, when I am given a number measuring the Quality of a College, I must ask “What do you MEAN by Quality of College?” One way to specify this quality (and there are many other possible definitions) is to consider student learning: the student ENTERS with knowledge and skills, and EXITS with MORE knowledge and skills. The DIFFERENCE between the two is the Educational Outcome, what the college contributed to the learning process of the student. Of course, this is a Multi-Dimensional quantity – learning and skills occur on many different dimensions which are not comparable with each other. It is hard to reduce multidimensional performance to a single number unless some clear and specific purpose of education is specified. For example, if we consider how well the education provides medical skills in terms of ability to treat patients, it might be possible to come up with a single number which aggregates the contribution of all dimensions to the single purpose. This is a complex issue, which will arise in many different contexts.
A SECOND ISSUE of importance, in terms of IMPROVING how we do statistics: From VAGUE & IMPRECISE measures of INPUTS, move to measures of OUTPUT. Stiglitz-Sen-Fitoussi recommended moving to consumption, which directly measures what human beings get, instead of production, which measures goods produced that COULD potentially get to the consumers. To illustrate this idea, consider one of the seven factors used in the rankings: Faculty Resources. According to the reasoning given for this factor, Student Engagement with faculty is an important part of the educational process. Instead of directly measuring Student Engagement (which is vague and qualitative, and hard to define and measure), we use PROXY measures, which are INPUTS which go into producing Student Engagement. These proxies are:
- Class Size
- Faculty Salary
- Proportion with Ph.D.
- Student Faculty Ratio
- Proportion of Full-Time Faculty.
It is true that these factors all have the POTENTIAL to create a better student educational experience. These are INPUTS into the educational process. But how effective are they? Do they actually achieve this potential? Do these factors really matter?
Suppose we specify the TARGET of our quality measure as before: “How much do students GROW in the educational process?” MG cites educational research by Terenzini & Pascarella, a meta-study of 2,600 papers, which finds NO RELATIONSHIP between student engagement and the standard list of variables used in nearly all methods for measuring the quality of colleges: educational expenditures per student, student/faculty ratios, faculty salaries, percentage of faculty with the highest degree in their field, faculty research productivity, size of the library, [or] admissions selectivity.
If these INPUTS do not matter, then what DOES matter? It turns out that the key variables are the qualitative and non-measurable, or SOFT, variables: educators engage students when they are purposeful, inclusive, empowering, ethical, and process-oriented. For a summary of take-aways from the Terenzini and Pascarella in-depth studies, see: Pathways to Success: Student Engagement.
Focus on what can be measured takes attention away from the important qualitative factors, which often cannot be reduced to numbers. For an example of this (not discussed in the MG article), consider the question: “Do SAT scores predict academic success?” There is a huge controversy about the issue, but the facts are clear. SAT is solidly correlated with first-year performance, and the correlation weakens with time. BUT the effect is very small. For practical purposes, it is reasonable to conclude that we should NOT use the SAT for college admissions. WHY? Again, the key factors which lead to success are not measurable. Research shows that the Student Characteristics strongly correlated with Success are Drive, Motivation, and Perseverance. These are character traits which are not measured by SATs. Another way to think about this question is to ask: “Can we take students with low SATs and turn them into Super Performers?” The answer is YES, and there is a lot of evidence that teachers who motivate and inspire can take students from any background and turn them into star performers.
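The claim that a statistically solid but small correlation carries little practical weight can be checked with simple arithmetic: the share of variance in an outcome explained by a predictor with correlation r is r squared. A short sketch, where the correlation values are illustrative assumptions rather than real admissions data:

```python
# Sketch (illustrative correlations, NOT real admissions data): why a
# statistically solid but small correlation has little practical value.

def variance_explained(r: float) -> float:
    """Share of outcome variance accounted for by a predictor
    whose correlation with the outcome is r (i.e., r squared)."""
    return r ** 2

# Hypothetical correlations between SAT score and later performance.
for r in (0.5, 0.3, 0.2):
    print(f"r = {r}: the test explains {variance_explained(r):.0%} of "
          f"the variance; the remaining {1 - variance_explained(r):.0%} "
          f"reflects unmeasured traits such as drive, motivation, "
          f"and perseverance.")
```

Even a correlation of 0.3 leaves more than 90 per cent of the variation in outcomes unaccounted for, which is why such numbers say so little about any individual student.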
Concluding Remarks: It is helpful to look at the bigger picture. The rise of Logical Positivism in the 20th Century led to an extreme emphasis on the observable and measurable, and a complete neglect of the qualitative and unmeasurable aspects of our lives. This has led to the drive to MEASURE everything. But the Most Important Things in life are not measurable. We live our lives without measuring in numbers those things which matter the most to us – loving and being loved. This ability to deal with qualitative and unmeasurable phenomena needs to be extended to the bigger world of education and management. Even when complex, multidimensional phenomena ARE measurable, multiple measures CANNOT be reduced to one number. False philosophies lead us to PRETEND that a single number can MEASURE the quality of colleges. This type of confusion arises from failure to think clearly about the TARGET – What is being measured, and Why? To improve statistical analysis, we must learn to think clearly about the bigger questions, instead of confining attention to the numbers alone.