Wednesday, 4 November 2015

The Strange Case of G. E. Bartlett - Part 1

This one has two of my favourite ingredients: numbers and a detective story. The time is 1929; the place is the London School of Economics. Hubert Llewellyn Smith is directing the New Survey of London Life and Labour (NSLLL) and Arthur Lyon Bowley is in charge of sampling London households. In the field are more than 150 interviewers collecting information on household income.

Fast forward to 2015. I want to categorize the occupations recorded in the 1931 Census. The NSLLL contains information on the occupation and earnings of the household residents. I figure it could give me some guidance about the similarities between occupations. The NSLLL was probably the largest social survey carried out in Britain during the inter-war period and more to the point it is, to my knowledge, the only one that has (mostly) been digitised. Even more to the point, I happen to have a copy of it on my hard disk.

The question is: can I trust these data? Why not? I hear you ask. Well, mainly because of the activities of one of the interviewers, a certain G. E. Bartlett, who appears to be responsible for conducting not far short of 20% of all the interviews. Bartlett regularly clocked up over 400 interviews a month and in October 1930 managed 600, roughly 20 a day if he worked 7 days a week. Strictly speaking we don't know for sure that Bartlett was a 'he', but the evidence on the surviving handwritten cards suggests it was so. He certainly had an incentive. He was making more than a shilling per interview, and £30 for a month's work was roughly 3 times the median working-class earnings level.
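For the sceptical, the arithmetic is easy to check. This is only a back-of-the-envelope sketch: it assumes a flat rate of exactly one shilling per interview (the figure above is "more than a shilling") and counts all 31 days of October as working days. The function names are invented for the illustration.

```python
SHILLINGS_PER_POUND = 20  # pre-decimal sterling: 20 shillings to the pound


def interviews_per_day(interviews: int, days_worked: int) -> float:
    """Average number of interviews per working day."""
    return interviews / days_worked


def monthly_pay_pounds(interviews: int, shillings_per_interview: float) -> float:
    """Monthly pay in pounds for a given piece rate in shillings."""
    return interviews * shillings_per_interview / SHILLINGS_PER_POUND


# Assumption: a seven-day week across all 31 days of October 1930.
october_rate = interviews_per_day(600, 31)
# Assumption: exactly 1 shilling per interview (the true rate was somewhat higher).
october_pay = monthly_pay_pounds(600, 1.0)

print(f"{october_rate:.1f} interviews per day")
print(f"£{october_pay:.0f} for the month")
```

At a rate slightly above one shilling the monthly figure would be a little over £30, which is consistent with the numbers quoted above.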

Bartlett's Stakhanovite work rate and, more importantly, seeming peculiarities in portions of the data he recorded certainly made Simon Abernethy - a Cambridge history postgraduate - suspicious. In a very interesting paper he argues, very plausibly, that though it is unlikely Bartlett literally sat at home and made the data up, the evidence is consistent with him estimating a large portion of the earnings data he was supposed to be collecting. On first reading I found Abernethy's account pretty convincing. But then I started to wonder.

Admittedly things look bad for Bartlett, but was 600 interviews a month as implausible as it sounded? Bear in mind these were nothing like modern survey interviews. Very little data was actually collected - the interview probably lasted no more than 10-15 minutes - and the addresses were heavily clustered. Most of Bartlett's households were in Battersea, Camberwell, Lambeth, Southwark, St Pancras and Wandsworth, and the sampling fraction was about 1 in 50. Someone who knew the areas well could probably make fairly rapid progress. Perhaps 20 interviews a day for someone working on it full-time was not as surprising as it seemed.

Then I had another thought. If you believe that someone is guilty and that proof of that guilt is to be found in unusual data patterns, then unless you carefully specify, before peeking at the data, which unusual patterns you are looking for, you are bound to turn up something. There are a very large number of observations in the NSLLL and plenty of scope for sub-group analysis. Seek and ye will find. We all know about the "look elsewhere effect", don't we? Perhaps what Abernethy finds is nothing more than extreme values that are due to chance (regardless, so to speak, of what the p-values say). Indeed, what he does is look at small subsets of the data - particular occupations, for example - and show that data collected by Bartlett differ in some ways from data collected by the rest of the interviewers.
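To see how easily a subgroup search throws up false alarms, here is a small simulation sketch. Everything in it is invented for illustration: the 40 subgroups, the 50 earners per sample, and the earnings distribution (mean 60 shillings, standard deviation 15) are assumptions, not NSLLL figures. Both samples in every comparison are drawn from the same distribution, so any "significant" difference is pure chance.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def z_for_difference(a, b):
    """Two-sample z statistic for a difference in means (large-sample approximation)."""
    se = (sample_variance(a) / len(a) + sample_variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se


def count_false_alarms(n_subgroups, n_per_sample, rng):
    """Count subgroups where two samples from the SAME distribution differ at |z| > 1.96."""
    alarms = 0
    for _ in range(n_subgroups):
        one = [rng.gauss(60, 15) for _ in range(n_per_sample)]   # hypothetical "suspect"
        rest = [rng.gauss(60, 15) for _ in range(n_per_sample)]  # hypothetical "everyone else"
        if abs(z_for_difference(one, rest)) > 1.96:
            alarms += 1
    return alarms


rng = random.Random(1929)  # fixed seed so the run is repeatable
trials = 200               # number of simulated "searches"
hits = [count_false_alarms(40, 50, rng) for _ in range(trials)]

share_with_alarm = sum(h > 0 for h in hits) / trials
avg_alarms = mean(hits)
print(f"searches with at least one 'significant' subgroup: {share_with_alarm:.0%}")
print(f"average number of false alarms per search: {avg_alarms:.1f}")
```

With 40 independent comparisons at the 5% level, the chance of at least one false alarm is 1 - 0.95^40, or roughly 87%, so a search of this size that turned up nothing would be the real surprise.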

But what happens if we look at all of the data and, to provide a bit of comparison, distinguish from the rest the second, third and fourth most industrious interviewers? J. Hopker, J. Ludgate and A. N. Winter, though not in Bartlett's league, were each responsible for surveying more than 800 households. Is there any evidence that they produced unusual results too?

Time for some data analysis. I'm working with the public release version of the data, which is restricted to the 'working class' households (Abernethy has also digitized a large portion of the so-called 'middle class cards' but these data are not yet in the public domain). My sample consists of everyone who is either employed or self-employed and aged over 13. Observations are clustered within households and the primary variable of interest is earnings expressed in shillings per week (12 pence to a shilling, 20 shillings to a pound). I exclude a few cases where the earnings that are reported are in some sense 'joint' and not attributable to a single earner.

For all sorts of reasons interviewers differed in the level of earnings they reported. About 10 percent of the total variation is between-interviewer variation. To put that in context, the table that follows also gives numbers for some other salient sources of variation.
The amount of variation attributable to interviewers is roughly similar to the amount attributable to household membership and very roughly double that attributable to the fact that people with similar incomes tend to live in proximity to each other. The heavy hitter here, though, is occupation - which is good for me given that this is what first brought me to these data. Of course interviewer, household, geographical area and occupation are confounded with each other, so we can't read these numbers as unique contributions to the total amount of earnings variation. Interviewer "effects" will undoubtedly shrink once we control for other sources of variation.
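The calculation behind a "share of variation" figure is not spelled out here. One common choice, assumed in this sketch, is the between-group sum of squares divided by the total sum of squares from a one-way breakdown. The toy earnings and interviewer labels below are made up purely to show the mechanics.

```python
from collections import defaultdict
from statistics import mean


def between_group_share(values, groups):
    """Between-group sum of squares as a share of the total sum of squares."""
    grand = mean(values)
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    total_ss = sum((v - grand) ** 2 for v in values)
    between_ss = sum(len(vs) * (mean(vs) - grand) ** 2 for vs in by_group.values())
    return between_ss / total_ss


# Made-up toy data: weekly earnings in shillings and the interviewer who recorded them.
earnings = [50, 60, 70, 55, 65, 75]
interviewer = ["A", "A", "A", "B", "B", "B"]

share = between_group_share(earnings, interviewer)
print(f"{share:.1%} of the variation is between interviewers")
```

Computed one grouping at a time like this, the shares for interviewer, household, area and occupation overlap, which is why they cannot be read as unique contributions.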

But before we do that let's examine the data a little more closely. In the figure below I superimpose a histogram of the earnings information (in units of 1 shilling) collected by Bartlett (white bars with black borders) on top of the earnings information collected by all the other interviewers (in green).

The question is: are these distributions different? The answer is (obviously): yes. But so what? You wouldn't expect them to be exactly the same. They share some features and  differ in some ways.

Actually the standout feature is the heaping of observations on certain values, multiples of 20 shillings for instance. Heaping is to be expected for at least 4 reasons. Firstly, it might reflect reality - employers paying an hourly rate calculated to deliver a nice round number for a standard working week. Secondly, respondents may be rounding their actual wages up or down to a particularly salient value - say £3 a week. Thirdly, the interviewers might be rounding what they are told. Fourthly, the interviewers might be using their own estimates (perhaps based on good local knowledge) rather than actually asking the respondents about their earnings.

Any and all of these things could be happening. We don't know, and it is very difficult to draw conclusions from just looking at the distributions. Where most of the interviewers heap, Bartlett also heaps. Sometimes he heaps a bit more, sometimes a bit less. A lot of his data is crowded into the 50-80 shillings range, which might be taken to suggest that something untoward was going on. It might also just reflect the fact that he interviewed in particular areas with particular concentrations of occupations that received similar wages. To get any further we need to impose more structure on the data.
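A crude way to put a number on heaping is the share of reported earnings that land exactly on a multiple of 20 shillings. The sketch below is illustrative only: the function name and the two short lists of earnings are invented, not taken from the NSLLL.

```python
def heaping_share(earnings, base=20):
    """Share of reported earnings that fall exactly on a multiple of `base` shillings."""
    on_multiple = sum(1 for e in earnings if e % base == 0)
    return on_multiple / len(earnings)


# Invented weekly earnings in shillings, for illustration only.
one_interviewer = [40, 60, 60, 52, 47, 80, 60, 55]
everyone_else = [40, 58, 61, 52, 47, 75, 60, 33]

print(f"one interviewer: {heaping_share(one_interviewer):.1%} on multiples of 20s")
print(f"everyone else:   {heaping_share(everyone_else):.1%} on multiples of 20s")
```

If earnings in whole shillings were spread evenly, only about 1 value in 20 would fall on a multiple of 20, so a share well above 5% indicates heaping. It does not say which of the four explanations is responsible.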

The basic idea is to estimate some regressions that control for a lot of stuff. We could do this in a number of different ways but I'm going to keep it simple. The dependent variable is weekly income in shillings and I include fixed effects for the 433 occupational groups and the 36 areas. There is a dummy for gender and for whether the respondent is employed or self-employed. Hours of work are controlled for, as are age and age squared. Finally I distinguish four interviewers (Bartlett, Hopker, Ludgate and Winter) and a residual group, and the dummy indicators for these are interacted with gender. The estimated coefficients for this interaction are of central interest.
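Written out, one way to parameterize this specification is the following. The notation is introduced here purely for illustration, and it assumes hours enter linearly, which is not stated above. $D_{ik}$ flags whether person $i$ was interviewed by named interviewer $k$, with the residual group of interviewers as the reference category.

```latex
\[
y_i = \alpha_{o(i)} + \gamma_{a(i)}
      + \beta_1\,\mathrm{female}_i + \beta_2\,\mathrm{selfemp}_i
      + \beta_3\,\mathrm{hours}_i + \beta_4\,\mathrm{age}_i + \beta_5\,\mathrm{age}_i^2
      + \sum_{k=1}^{4} \left( \delta_k + \theta_k\,\mathrm{female}_i \right) D_{ik}
      + \varepsilon_i
\]
```

Here $y_i$ is weekly income in shillings, $\alpha_{o(i)}$ and $\gamma_{a(i)}$ are the occupation and area fixed effects, $\delta_k$ is interviewer $k$'s deviation from the residual group for men, and $\delta_k + \theta_k$ is the corresponding deviation for women.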

The table below gives the average deviation for each separately identified interviewer from the conditional mean earnings level recorded by the other interviewers taken as a group. This deviation is expressed separately for male and female wage earners. There are of course a number of different ways to parameterize this interaction, but for our purposes this seems to be the most enlightening.
What this suggests is that Bartlett's numbers on average had working class respondents earning around 2s and 6d to 3 shillings more than the figures obtained by the majority of the interviewers. The difference between his male and female figures is statistically significant, but of little substantive importance. 2s and 6d is, as my grandmother would say, a lot of money if you don't have it, but it's also just about 5% of the median working class wage and a somewhat lower percentage of the median male working class wage. If Bartlett was guessing or "estimating" he was, on the whole, actually doing a pretty good job! He was also not alone in his "inaccuracy".

Hopker appears to have erred in the opposite direction, underestimating both male and female earnings, while Ludgate, though he does a good job for the males, seems to find particularly well paid women. Of the 4, only Winter is, as it were, consistently on the money.

The point is I'm pretty sure that if I looked at the next 5 most prolific interviewers I could find differences of this magnitude, and probably also if I continued looking all the way down to where the group size is so small that it would take truly massive differences to produce statistically "significant" effects. My conjecture is that these results on their own don't reveal anything particularly unusual about Bartlett's modus operandi.

In Part 2 I'll look at a different indicator of the unusualness of Bartlett's data - the variance - and I'll tell you who I think this man of mystery was and how he managed to do all that interviewing.
