Arbitrariness of Rankings

[bit.ly/dsia02b] Part B of the second lecture in “Descriptive Statistics: An Islamic Approach” (DSIA02b) considers the issue of comparing two numbers to decide which is higher. Even though this task is trivial from the statistical point of view, it becomes very complex when we try to understand the real-world context in which the numbers are being compared. This is illustrated through an example involving the ranking of cars.

One of the best sources of learning is reading articles and books. Good articles and books encapsulate deep wisdom, which authors have gathered from their life experiences. Ultimately, the only source of knowledge is life experience itself. Since we have only one life to live, we can gather only a small amount for ourselves. Reading gives us access to the fruits of the life experiences of millions of scholars, across centuries of written works. It is essential to be selective in this reading, because the amount of false and misleading information is vastly greater than that which is useful and relevant. Furthermore, even the useful and relevant material is so extensive that we will only have a chance to read a very small portion of it in our entire lifetime. One of the tasks of a good teacher is to provide guidance in this regard. Having read thousands of articles, the teacher can select the few that stand out for students. If the teacher can point the student to one article that summarizes the wisdom of 1000, he has not only guided the student to a useful article, he has also saved the student the time required to read the other 999 and arrive at judgments of their relative worth.

One of the best articles explaining what it means to compare numbers, in the context of college rankings, is the following: Gladwell, Malcolm. “The Order of Things: What College Rankings Really Tell Us.” The New Yorker 87.1 (2011): 68-75. Downloadable copy: Gladwell Rankings PDF.

In this lecture, we will read the article together. I will provide simple and clear explanations of what is being said, so as to enable students to read and understand the original article. Although the article is about college rankings, it starts by illustrating the ranking problem in the context of cars. The goal of the article is to show that all rankings are deceptive – they are just one of the ways to “lie with statistics”. Even though Car and Driver magazine comes up with a clear winner in its rankings, the winning car is NOT the best in any clear sense of the word. In fact, the question itself is meaningless. It is impossible to rank cars without considering the PURPOSE of the ranking.

Suppose that there are three dimensions along which cars are evaluated – Appearance, Engine, Price. Let us put aside the issue of how we come up with numbers for the subjective categories, even though this is also important. Let us suppose a panel of experts can judge, on a scale of 1 to 10, the objective rankings of cars on these three dimensions. The first concerns external appearance, style, attractiveness. The second concerns engine performance judged by many different criteria. We have omitted one of the criteria used in the Gladwell article: “the subjective feel of driving,” which has to do with how the car handles when it is driven in different situations. We have replaced this by the price, which can be evaluated objectively. Here is a set of hypothetical numbers which evaluates three cars along these three dimensions.

Car Name    Appearance  Engine  Price
Porsche     6           9       3
Lotus       8           7       6
Chevrolet   5           5       9

 

Note that high numbers mean a high rating, so the rating of 9 given to Chevrolet for Price means that it is the cheapest car, having the best price among the three cars being evaluated.

Note that each of the three cars is best in one of the three dimensions. Lotus is best in appearance, Porsche has the best engine, while Chevrolet has the best price. How can we find out which is the best car overall? The CORRECT answer to this question is that we CANNOT do this. The ranking of the cars depends on the PURPOSE of the evaluation – WHY we are trying to rank the cars. Without specifying a purpose, we cannot rank the cars. The standard methodology in use is deceptive – another illustration of “How to Lie with Statistics”. It assigns weights to all three factors to come up with a combined score. Let us look at how this is done. I will use C&D to denote a hypothetical version of the Car and Driver magazine which is discussed in the actual article. The statements below about C&D correspond only roughly to Gladwell’s article, and are meant to simplify a more complex discussion. With this warning, we consider how C&D comes up with a ranking of cars, even though this is impossible to do without considering the purpose of the ranking.

C&D editors feel that what is inside the car, the engine, is the most important factor. They assign it a weight of 50%. Because they are car enthusiasts, they find that a sleek and stylish appearance is very important, and the price is not so important. So, they assign a weight of 40% to appearance, and 10% to price. Once these weights have been assigned, the score for each car can easily be calculated as the weighted sum of its three ratings. Multiplying by 10 to avoid decimals, we find that, with these weights, Lotus gets 73, Porsche gets 72, while Chevrolet gets only 54. The message from this ranking is that the Lotus and Porsche are close to each other and both are distinctly superior to Chevrolet. The numbers create an OBJECTIVE feel – this is not a matter of the personal tastes of the C&D editors, but an objective evaluation of the characteristics of the cars.
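The arithmetic behind these scores can be checked with a minimal sketch. The ratings are the hypothetical numbers from the table above, and the weights are the ones attributed to the hypothetical C&D editors; the names RATINGS, CD_WEIGHTS and weighted_score are introduced here only for illustration.

```python
# Hypothetical ratings from the table above, each on a scale of 1 to 10.
RATINGS = {
    "Porsche": {"appearance": 6, "engine": 9, "price": 3},
    "Lotus": {"appearance": 8, "engine": 7, "price": 6},
    "Chevrolet": {"appearance": 5, "engine": 5, "price": 9},
}

# Weights in percent, as chosen by the hypothetical C&D editors.
CD_WEIGHTS = {"appearance": 40, "engine": 50, "price": 10}


def weighted_score(ratings, weights):
    """Sum of rating x weight (in percent), divided by 10.

    With percent weights, dividing by 10 gives the same scale as
    "multiplying by 10 to avoid decimals" in the text.
    """
    return sum(ratings[k] * weights[k] for k in weights) / 10


cd_scores = {car: weighted_score(r, CD_WEIGHTS) for car, r in RATINGS.items()}

# Print the cars from highest to lowest combined score.
for car, score in sorted(cd_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{car}: {score:g}")
# Lotus: 73
# Porsche: 72
# Chevrolet: 54
```

The only inputs that come from the cars themselves are the ratings; the weights are a choice made by whoever does the scoring.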

This message is completely wrong. The rankings are created as a MIXTURE of subjective weights and objective characteristics of the cars. To bring this out, Malcolm Gladwell argues as follows. He says that Car and Driver’s editors used the SAME weights for this evaluation that they use for SUVs (Sports Utility Vehicles). Now SUVs combine elements of practicality with a sporty feeling, but the cars being evaluated are high-class luxury cars. He says that the typical buyer of sporty luxury cars is a lot more interested in the APPEARANCE of the cars than in the engine. These cars are bought for show. If we change the weights to 50% on appearance and 40% on the engine, with price still at 10%, then Lotus emerges as a clear winner. The scores are now: Lotus 74, Porsche 69, and Chevrolet 54. Putting even more weight on appearance would put Lotus even further in the lead.

Next consider a buyer who has a modest income, but a great love of luxury. He would be very happy to buy a sports luxury car, if only he could afford one. As long as the car is classified as a luxury car, it is all the same to him. He is maximally concerned with the price. If we put a weight of 50% on the price, and 25% each on Appearance and Engine, the Chevrolet will emerge as the winner with 70, while Lotus and Porsche lag behind with 67.5 and 52.5 respectively.
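The three sets of weights discussed so far can be compared side by side. The sketch below is only an illustration: the labels "C&D editors", "show buyer" and "budget buyer" are names made up here for the three weightings in the text, and the ratings are the hypothetical numbers from the table.

```python
# Hypothetical ratings from the table, each on a scale of 1 to 10.
RATINGS = {
    "Porsche": {"appearance": 6, "engine": 9, "price": 3},
    "Lotus": {"appearance": 8, "engine": 7, "price": 6},
    "Chevrolet": {"appearance": 5, "engine": 5, "price": 9},
}

# Weights in percent. The scheme labels are illustrative, not from the article.
SCHEMES = {
    "C&D editors": {"appearance": 40, "engine": 50, "price": 10},
    "show buyer": {"appearance": 50, "engine": 40, "price": 10},
    "budget buyer": {"appearance": 25, "engine": 25, "price": 50},
}


def rank(ratings, weights):
    """Return (car, score) pairs sorted from highest to lowest score."""
    scores = {
        car: sum(r[k] * weights[k] for k in weights) / 10
        for car, r in ratings.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


winners = {}
for scheme, weights in SCHEMES.items():
    ranking = rank(RATINGS, weights)
    winners[scheme] = ranking[0][0]
    print(scheme, ranking)
```

The same three rows of ratings produce a different ordering under each weighting, which is the sense in which the ranking mixes the characteristics of the cars with the preferences of the person doing the ranking.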

So depending on the tastes of the buyer, and the purpose for which the car is being bought, the ranking would be different. Malcolm Gladwell explains that there are two situations in which it is possible to come up with an objective ranking. One situation is when we focus on only one factor. If we look only at price, or at power of engine, or at appearance, then we can evaluate two cars X and Y and decide if X is better than Y in appearance or not.

According to Malcolm Gladwell, the second case in which objective rankings can be done is when all of the cars are similar to each other on the dimensions being ranked. He thinks that it is the diversity of the cars being ranked that leads to the sensitivity of the ranking to weights. This is a mistake. Even if the cars are similar to each other – homogeneous, in Gladwell’s terminology – the problem of sensitivity to weights will remain exactly the same as in a heterogeneous group. According to Gladwell, the problem arises because Car and Driver tries to cover the field and rate a very diverse group of cars. This is not true.
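The claim that similarity does not remove the sensitivity to weights can be checked with a small sketch. The three cars below are invented for this purpose and do not come from Gladwell’s article or from the table above; their ratings differ by at most one point on every dimension, so they form a homogeneous group. The weights are the three sets used earlier in this lecture, with labels made up here.

```python
# Hypothetical, deliberately similar cars: no rating differs by more than 1.
SIMILAR_CARS = {
    "Car X": {"appearance": 8, "engine": 7, "price": 6},
    "Car Y": {"appearance": 7, "engine": 8, "price": 6},
    "Car Z": {"appearance": 7, "engine": 7, "price": 7},
}

# Weights in percent (the three weightings used in the lecture).
SCHEMES = {
    "C&D editors": {"appearance": 40, "engine": 50, "price": 10},
    "show buyer": {"appearance": 50, "engine": 40, "price": 10},
    "budget buyer": {"appearance": 25, "engine": 25, "price": 50},
}


def scores_for(cars, weights):
    """Weighted sum of ratings, on the same 'times 10' scale as the text."""
    return {
        car: sum(r[k] * weights[k] for k in weights) / 10
        for car, r in cars.items()
    }


winners = {}
for scheme, weights in SCHEMES.items():
    scores = scores_for(SIMILAR_CARS, weights)
    winners[scheme] = max(scores, key=scores.get)
    print(scheme, scores, "winner:", winners[scheme])
```

In this made-up example each weighting picks a different winner, even though the cars are nearly identical. If anything, small differences between the cars make it easier for a small change in weights to flip the order.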

The source of the problem lies in the failure to specify the PURPOSE for which the ranking is being done. When we explain WHY we want to rank the cars, then we can correctly specify the weights for the different factors. The purpose is subjective – it depends on the person who is buying the car. For example, someone might set a budget of $20,000 for the car, and then say that he wants to get the sportiest car that he can for this price. He can then assign his personal subjective preferences for external attractiveness and engine quality to come up with a ranking. Or, he need not convert qualitative information to numbers at all. He could just look at cars within his budget and classify them as A, B, C – extremely attractive, attractive, and average looking – in appearance. Then, depending on his personality, he might check the engine characteristics of the A-rated cars to ensure that they are satisfactory for his purposes, and buy the most attractive one. Or he might go for a compromise between Appearance and Engine. None of these methods of choosing cars corresponds to creating a numerical ranking of the cars.
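The second procedure described above, choosing within a budget by letter grade, can be sketched without computing any combined score. Everything in this sketch is hypothetical: the car names, prices, grades and the engine_ok flag are invented, and falling back to B-rated and then C-rated cars when no A-rated car has a satisfactory engine is an assumption made here, not something the text specifies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Car:
    name: str
    price: int        # in dollars
    appearance: str   # buyer's grade: "A", "B", or "C"
    engine_ok: bool   # buyer's own yes/no judgment of the engine


# Entirely hypothetical cars, for illustration only.
CARS = [
    Car("Car P", 24000, "A", True),
    Car("Car Q", 19500, "A", False),
    Car("Car R", 18000, "A", True),
    Car("Car S", 15000, "B", True),
]


def choose(cars: List[Car], budget: int) -> Optional[Car]:
    """Pick a car within budget, best appearance grade first.

    No weights and no combined score are used. Ties within a grade
    are broken by list order (an arbitrary choice in this sketch).
    """
    affordable = [c for c in cars if c.price <= budget]
    for grade in "ABC":
        for car in affordable:
            if car.appearance == grade and car.engine_ok:
                return car
    return None


choice = choose(CARS, 20000)
print(choice.name if choice else "no car fits")
```

The buyer’s purpose enters directly through the budget, the grades and the engine check, so there is no need to pretend that a single number summarizes the cars.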

This brings us to the META-QUESTION: Why are we discussing numerical measures of car quality? This is because there has been a huge emphasis on measuring things and assigning numbers to qualitative concepts. The idea of “measuring” intelligence by a single number – the IQ – was invented in the 20th Century. But this is NOT a good idea. Complex multidimensional characteristics like “intelligence” cannot be reduced to a single number. In order to relate knowledge to our life experiences, we have to break knowledge out of the boxes to which it has been confined in the West. It is exploring these meta-questions that leads us to an understanding of the world we are living in, which has shaped our ways of thinking. It is this understanding that offers us liberation from the boxes to which education confines our thought.

This entry was posted in Uncategorized by Asad Zaman.

About Asad Zaman

Asad Zaman [BS Math MIT (1974), Ph.D. Econ Stanford (1978)] has taught at leading universities like Columbia, U. Penn., Johns Hopkins and Cal. Tech. Currently he is Vice Chancellor of Pakistan Institute of Development Economics. His textbook Statistical Foundations of Econometric Techniques (Academic Press, NY, 1996) is widely used in advanced graduate courses. His research on Islamic economics is widely cited, and has been highly influential in shaping the field. His publications in top-ranked journals like Annals of Statistics, Journal of Econometrics, Econometric Theory, Journal of Labor Economics, etc. have more than a thousand citations as per Google Scholar.

