Beaufort and Rubrics

A quick post about the Beaufort scale for wind speed (see the table below) as a paradigm of a rubric.

Rubrics are really important in evaluation.

The Beaufort Scale is just a beautiful example of clearly defined, easily observable criteria. I love the way different features become important at different wind speeds: smoke, for example, matters only at the lowest forces. At force 6, umbrella use becomes difficult - umbrellas are not mentioned elsewhere but it gives you a great idea of what force 6 feels like and so helps to anchor the scale.

In particular, I love the way the individual statements are mostly relatively objective, or rather, inter-subjective, i.e. they are likely to be understood by different people from different backgrounds in a similar way. In the social realm it is often hard to be so objective and so rubrics in project evaluation often include formulations like “effective”, “reasonably good overall” or “just about adequate” which are again not inter-subjective and so in a way beg the question.

The scale must have been a big improvement for the Navy before the advent of mechanical windspeed measurement.

Another fascinating thing is that it manages to break a quantity like wind speed into no fewer than 13 different levels (0 to 12). In evaluation, we mostly see rubrics limited to 4-7 levels.

Wikipedia says the Beaufort scale was extended in 1946, when forces 13 to 17 were added, but they are only used in typhoon countries.

I always thought it would be hard to really distinguish 10, 11 and 12 though. And on the other hand, numbers are still needed beyond 12 especially in the tropics.

I wonder if that is because the scale is linearly anchored to windspeed. I would have thought that perception of windspeed, like most other things like light and sound, would be logarithmic so that there should be more frequent divisions at the lower ends and bigger jumps at the higher extremes. If the relationship between number and windspeed were logarithmic, 10, 11 and 12 would cover ever increasing spreads of windspeed and would neatly extend to cover just about any hurricane.
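To make the comparison concrete, here is a minimal Python sketch. For the actual scale it uses the commonly quoted empirical fit v = 0.836 × B^1.5 (in m/s), which already grows somewhat faster than a straight line. It sets this against a purely hypothetical geometric (logarithmic) scale in which each step multiplies the windspeed by a fixed ratio. The geometric scale, its anchoring at forces 1 and 12, and all the function names are assumptions made for illustration, not part of any official definition.

```python
def empirical_speed(b):
    """Wind speed in m/s from the commonly quoted fit v = 0.836 * B**1.5."""
    return 0.836 * b ** 1.5


def geometric_speed(b, low=1, high=12):
    """Hypothetical scale: each step multiplies wind speed by a fixed ratio.

    Anchored (an assumption) so that it agrees with the empirical fit
    at forces `low` and `high`.
    """
    v_low, v_high = empirical_speed(low), empirical_speed(high)
    ratio = (v_high / v_low) ** (1 / (high - low))
    return v_low * ratio ** (b - low)


def step_widths(speed, first=1, last=12):
    """Width in m/s of each step from force n-1 to force n."""
    return [speed(n) - speed(n - 1) for n in range(first + 1, last + 1)]


if __name__ == "__main__":
    empirical = step_widths(empirical_speed)
    geometric = step_widths(geometric_speed)
    # Print the width of every step under both scales, side by side.
    for n, (e, g) in enumerate(zip(empirical, geometric), start=2):
        print(f"{n - 1:>2} to {n:>2}: empirical {e:5.2f} m/s, geometric {g:5.2f} m/s")
```

On the hypothetical geometric scale the lowest steps come out narrower and the top steps wider than on the empirical fit, which is the pattern suggested above.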

This is an example of where assuming a linear relationship with a physical quantity has unfortunate consequences. The developers of the scale would probably have been better advised to ignore the physical windspeed and concentrate on what can be inter-subjectively distinguished. It is not even obvious that the span of each rubric in terms of physical windspeed, or its logarithm, or indeed in terms of anything else, needs to be equal at all - even subjectively. That all depends on what the rubric-based rating is going to be used for.

Beaufort number | Description | Sea conditions | Land conditions
0 | Calm | Sea like a mirror. | Calm. Smoke rises vertically.
1 | Light air | Ripples with the appearance of scales are formed, but without foam crests. | Smoke drift indicates wind direction. Leaves and wind vanes are stationary.
2 | Light breeze | Small wavelets, still short but more pronounced; crests have a glassy appearance and do not break. | Wind felt on exposed skin. Leaves rustle. Wind vanes begin to move.
3 | Gentle breeze | Large wavelets. Crests begin to break; scattered whitecaps. | Leaves and small twigs constantly moving, light flags extended.
4 | Moderate breeze | Small waves with breaking crests. Fairly frequent whitecaps. | Dust and loose paper raised. Small branches begin to move.
5 | Fresh breeze | Moderate waves of some length. Many whitecaps. Small amounts of spray. | Branches of a moderate size move. Small trees in leaf begin to sway.
6 | Strong breeze | Long waves begin to form. White foam crests are very frequent. Some airborne spray is present. | Large branches in motion. Whistling heard in overhead wires. Umbrella use becomes difficult. Empty plastic bins tip over.
7 | High wind, moderate gale, near gale | Sea heaps up. Some foam from breaking waves is blown into streaks along wind direction. Moderate amounts of airborne spray. | Whole trees in motion. Effort needed to walk against the wind.
8 | Gale, fresh gale | Moderately high waves with breaking crests forming spindrift. Well-marked streaks of foam are blown along wind direction. Considerable airborne spray. | Some twigs broken from trees. Cars veer on road. Progress on foot is seriously impeded.
9 | Strong/severe gale | High waves whose crests sometimes roll over. Dense foam is blown along wind direction. Large amounts of airborne spray may begin to reduce visibility. | Some branches break off trees, and some small trees blow over. Construction/temporary signs and barricades blow over.
10 | Storm, whole gale | Very high waves with overhanging crests. Large patches of foam from wave crests give the sea a white appearance. Considerable tumbling of waves with heavy impact. Large amounts of airborne spray reduce visibility. | Trees are broken off or uprooted, structural damage likely.
11 | Violent storm | Exceptionally high waves. Very large patches of foam, driven before the wind, cover much of the sea surface. Very large amounts of airborne spray severely reduce visibility. | Widespread vegetation and structural damage likely.
12 | Hurricane force | Huge waves. Sea is completely white with foam and spray. Air is filled with driving spray, greatly reducing visibility. | Severe widespread damage to vegetation and structures. Debris and unsecured objects are hurled about.

My forthcoming book: “Learn to speak Evalian”

I have been working for a while now on a book provisionally entitled “Learn to speak Evalian”. This little post is just a place-holder with information about the book; I will update it when there is more to say.

I intend to publish it open-access, i.e. it should be primarily available online free of charge, perhaps with on-demand printing too. There will be a beta version and comments will be very welcome. It may even stay in a kind of permanent beta phase as I update and add to it.

If you are interested, drop me a line (steve AT pogol.net) and I will let you know when the first version is ready.

The book is not meant as an introduction to monitoring and evaluation (M&E) [1]. It will make most sense to people who have some practical experience and who are familiar with the theoretical outline of more than one well-known approach to these topics such as the logical framework approach (LFA) or Outcome Mapping.

The book might be of interest for social scientists in general but I am writing in particular for monitoring and evaluation (“M&E”) professionals - people who are responsible for monitoring and evaluating the success of and processes within many different kinds of projects and programmes, for example in international development, education and so on.

What’s Evalian?

The book is disguised, in a light-hearted way, as an introduction to “Evalian”, the fictional language of a fictional country, Evalia. Evalian is a language very like English in which M&E concepts can be more easily expressed than in English. In particular Evalians are more precise in the way they express “Variables” and their possible (factual and counterfactual) “Values”, and the way in which these can be linked up into more or less deterministic “Mechanisms” which they describe with “Theories” about those Mechanisms. So concepts like “Effectiveness” and “Impact” are quite easy to define and use in Evalian. What’s more, many of the key ideas of a wide range of evaluation approaches from Outcome Mapping to RCTs can be quite neatly expressed in Evalian.

As a bonus, well-formed Evalian can be pasted into the Theory Maker web app to produce corresponding diagrams.

Many of the key ideas are based on the theory of causal networks set out in Pearl (2000) [2].

So, this book will teach you to speak Evalian, the fictional language of a fictional country, Evalia.

Why a fictional language?

Because the Evalians manage to communicate with one another about things without the misunderstandings which plague English and (I assume) the other languages which M&E staff use today. They can recite a Theory of Change or a project proposal as if it were an amusing story or an inspiring poem, and no-one argues about what it means or the words it contains. They often conduct Randomised Controlled Trials, almost for fun, but they also love Outcome Mapping and many other approaches to evaluation; and they see no contradiction in this.

So, learning “Evalian” helps us to introduce some words, concepts and conventions which we can use to think and write about projects and programmes from an evaluation perspective. Using these conventions, we can have a fairly standard way to communicate how we think projects work and what effects can be attributed to them.

So we will define a slightly new way of making causal statements, as well as some key words like Variable, Value of a Variable, and so on, for talking about these statements. We will do it in a way which is (I hope) compatible with most approaches in social science textbooks, but which is a bit different from them too:

  • Evalians, although they are skilled at mathematics, are very comfortable with fuzzy and vague formulations and are very good at drawing precise conclusions from vague premises; they do not feel they are being sloppy when they don’t express a Variable in terms of numbers.
  • Evalians are particularly good at describing Differences between factual and counterfactual states of affairs; the concept of “Difference” is very important in evaluation.

Preview of first section

One of the most convenient things about Evalian is that it is very like English, with just a few tweaks. When Evalians are speaking about evaluation matters - variables, efficiency, outcomes, etc - they use a special kind of intonation which they capture in writing by using a special grey background like this:

This sentence is in Evalian.

For everything else, Evalians just use English. They do use quite a lot of Evalian though, because they like to pass time in the long evenings talking about things like attribution and contribution, bless them.

Written Evalian should be pretty easy for non-native speakers to read and write. However, it can be difficult for non-native speakers to speak.

For example, this piece of Evalian:

Teacher skills
 Teacher presence on training course

says something like “Teacher presence on training course contributes to Teacher skills”. The second line is indented by one space, and Evalians have a special kind of intonation for this which is difficult for non-native speakers to hear or say themselves.
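The indentation rule can be sketched in a few lines of Python. This is only an illustration of the idea, assuming that a line indented more deeply than the nearest line above it contributes to that line. The function name parse_evalian is hypothetical, and this is not a description of how Theory Maker itself is implemented.

```python
def parse_evalian(text):
    """Return a list of (cause, effect) pairs from indented text.

    Assumption for this sketch: a line contributes to the nearest
    preceding line that is indented less than it is.
    """
    edges = []
    ancestors = []  # stack of (indent, label) for the current branch
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        indent = len(raw) - len(raw.lstrip(" "))
        label = raw.strip()
        # Drop anything at the same or deeper indent: it cannot be a parent.
        while ancestors and ancestors[-1][0] >= indent:
            ancestors.pop()
        if ancestors:
            edges.append((label, ancestors[-1][1]))
        ancestors.append((indent, label))
    return edges


if __name__ == "__main__":
    example = "Teacher skills\n Teacher presence on training course"
    for cause, effect in parse_evalian(example):
        print(f"{cause} contributes to {effect}")
```

Run on the two-line example above, the sketch yields a single pair linking the training course to teacher skills, which is the reading given in the text.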

In Chapter xx, we discuss a bit more how to speak Evalian. Right now, let’s concentrate on learning the written language. As unfortunately many Evalians are hard of hearing, they also use diagrams which exactly mimic statements in Evalian. We are grateful to them, because non-Evalians too often find diagrams easier to understand (especially with longer texts).

Another convenient thing about Evalian is that we English speakers can just use bits of it without learning the whole language. Hopefully this will lead, at least piecemeal, to better understanding between English speakers when speaking about things that matter in evaluation. So just as you can sparkle at a party (sometimes) by throwing in a bit of French, you can make a mark with your M&E colleagues with even just a few phrases of Evalian.

Theorymaker.info, a website which understands Evalian

The book is a companion to Theory Maker, at theorymaker.info, a (free and open-source) web app for quickly sketching out many different kinds of Theory of Change and other kinds of diagram like project theories and evaluation plans relevant to M&E.

Theory Maker speaks Evalian! At the Theory Maker website, you will see a text window on the left which you can type into. If you type in Evalian, a corresponding diagram is produced on the right.

Below most of the Evalian phrases in the book, you will see a diagram, the same one you would get at theorymaker.info if you typed it into the text window. In the book, you will see an “edit” button below each diagram. If you are reading the electronic version of this book, when you click the button you will be taken to the website where the text will be conveniently pasted in for you. There, you can play with the text to see how it works, and perhaps adapt it to your own needs.

  1. Sometimes “M&E” is used to mean a relatively low-level function, namely the mere monitoring of projects and programmes, contrasted with the more illustrious discipline of evaluation. I make no such distinction and refer to both as “M&E”.

  2. Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.

Everything should be evidence-based - if only the evidence would make up its mind already

This question came up on an Evaluation mailing list and was forwarded to none other than Andrew Gelman, and for our purposes it can be boiled down to: what do we really know about financial motivation in organisations? Do we know enough to be able to say something like “in this case use this reward system, it will bring you optimal results”?

Reading through the illustrious answers it seems clear that there are dozens of different theories and variations of theories, each with some research evidence, but nowhere anything like an evidence-backed consensus.

Policy, project and programme design ... everything should be evidence-based - if only the evidence would make up its mind already.

Motivation and reward are comparatively easy to conceptualise and research. If we can’t get consensus here, how are we going to get consensus with hundreds of harder real-world problems like “how much should cash programmes after a natural disaster target women? - single women? how much? family size? conditionality?”

How can we construct Theories of Change if there is such a lack of consensus about what leads to what?

