Monday, 16 September 2013

Promises promises....

Interesting developments over at Stratification & Culture Research Network. Mike Savage has posted a statement outlining the future research agenda of the Great British Class Survey research group. I'm looking forward to the article he promises that will deal with (refute?) the arguments of the GBCS's critics. In fact I'm wondering how they are going to manage it within the constraints of the 8,000-word limit that Sociology insists on. There is just so much to say in order to do justice to the points the critics raise. Still, I've made it easy for him to address my particular criticisms by listing them in the form of questions at the end of my critique, and it will be easy for the interested reader to keep a tally of how many are actually seriously addressed.
Also of interest is more information about the archiving of the GBCS survey data. This will eventually be deposited at the UK Data Archive, though we are asked to be patient because of the large amount of work that will have to be put into cleaning it. Fair enough: my guess is that there are lots of issues to do with data format and coding, and half a million cases is quite a lot to chew on even if there aren't that many variables.
What is less understandable is the delay in archiving the GfK survey. This has only 1026 cases and (I assume, though we are not told this) about the same number of variables as the GBCS. There surely can't be any complex issues involved in archiving a data-set of this size and no reason to delay depositing it. I think I know a little about archiving data from a depositor's point of view and quite a lot from a user's point of view. To clean, document and archive a data-set of this size, even if it just consists of columns of ASCII numbers, is about the work of a weekend for someone who knows what they are doing: let's be generous and say a week to allow for the unforeseen.
And the point is this. The class schema generated by the GBCS team is entirely dependent on the data from the GfK survey, not the data in the GBCS survey, which plays no effective role in defining the GBCS class categories. The GfK survey is the foundation of the whole enterprise. Why delay in disseminating these data? The GBCS team may prefer us to believe that it is the GBCS data that people should be interested in, but these are, as I pointed out elsewhere, just the mountain of bad data sitting on the molehill of (relatively) good GfK data.
I can see the slightly ludicrous prospect of the GBCS team getting into print to "refute" the claims of the as yet unpublished critics who will have had no chance to actually examine the data on which the original set of claims was made. It seems to me that there is a serious disproportion between the amount of effort put into puffing the GBCS and the amount of effort put into facilitating the assessment of the science. I find it difficult to think of any reasons why the GfK data should not already be in the public domain.

