Andrew Gelman has an interesting piece on his blog about the politics of Wikipedia edits. The scientific point at stake here is that prediction before you have peeked at the data (fitted a model) is a completely different thing from prediction after you have fitted a model, and it is...err...essentially dishonest to pretend that these are one and the same thing or are of equivalent scientific value. Think about it this way. Fit your favourite model for a binary outcome - discriminant function, logistic regression or whatever - to a sample of data and define a decision rule to calculate how many cases you got in the right box. Now apply that same model with the same parameter values to a new set of data. You will usually do worse, sometimes much worse, because first time round you capitalised on chance: the fitted parameters were tuned to the quirks of that particular sample as well as to any real signal. It's multivariate analysis 101, or at least it should be.
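To make the point concrete, here is a minimal sketch in Python of the fit-then-reapply exercise described above. Everything in it is an assumption made for illustration, not anything taken from Gelman's piece: the data are simulated, the predictors are pure noise with no relation to the outcome, and a simple nearest-centroid rule stands in for "your favourite model". The function names (`make_data`, `fit_centroids`, `predict`, `accuracy`) and the sample sizes are hypothetical. Because there is no signal at all, any in-sample accuracy above 50% is capitalisation on chance.

```python
import random


def make_data(n, p, rng):
    """Simulate n cases with p pure-noise predictors and a balanced binary outcome.

    The outcome is assigned independently of the predictors, so there is
    no real signal for any model to find (an assumption of this sketch).
    """
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [i % 2 for i in range(n)]
    return X, y


def fit_centroids(X, y):
    """Fit step: compute the mean predictor vector for each outcome class."""
    p = len(X[0])
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in range(p)]
    return centroids


def predict(centroids, x):
    """Decision rule: assign the case to the class with the nearer centroid."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return 0 if sq_dist(centroids[0]) <= sq_dist(centroids[1]) else 1


def accuracy(centroids, X, y):
    """Proportion of cases the decision rule puts in the right box."""
    hits = sum(predict(centroids, x) == t for x, t in zip(X, y))
    return hits / len(y)


rng = random.Random(0)                       # fixed seed so the run is repeatable
X_train, y_train = make_data(40, 50, rng)    # small sample, many predictors
X_new, y_new = make_data(2000, 50, rng)      # fresh data, same generating process

model = fit_centroids(X_train, y_train)      # parameters estimated once...
train_acc = accuracy(model, X_train, y_train)
new_acc = accuracy(model, X_new, y_new)      # ...and reused unchanged on new data

print(f"accuracy on the fitting sample: {train_acc:.2f}")
print(f"accuracy on new data:           {new_acc:.2f}")
```

The small sample and large number of predictors are chosen deliberately to exaggerate the effect; with more cases per fitted parameter the gap between the two figures narrows, but the in-sample figure remains the flattering one.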