In a recent post about the ecological fallacy (definition here), Andrew Gelman cites an earlier post of his on what he dubs the 'secret weapon'. It's a simple enough technique: fitting the same regression separately to each of several years of data and plotting the estimated coefficients over time. Yet too few economists use it.
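For readers who want to try it, here is a minimal sketch of the idea in Python. The data file and column names (year, y, x1, x2) are hypothetical, and statsmodels, pandas and matplotlib are assumed to be available:

    # A minimal sketch of the 'secret weapon': fit the same regression
    # separately for each year and plot the coefficient of interest over time.
    # The CSV file and column names are hypothetical.
    import pandas as pd
    import statsmodels.api as sm
    import matplotlib.pyplot as plt

    df = pd.read_csv("panel.csv")            # hypothetical panel data set
    results = []
    for year, grp in df.groupby("year"):
        X = sm.add_constant(grp[["x1", "x2"]])
        fit = sm.OLS(grp["y"], X).fit()
        results.append((year, fit.params["x1"], fit.bse["x1"]))

    years, coefs, ses = zip(*results)
    plt.errorbar(years, coefs, yerr=[2 * s for s in ses], fmt="o-")
    plt.axhline(0, linewidth=0.5)
    plt.xlabel("Year")
    plt.ylabel("Coefficient on x1 (± 2 s.e.)")
    plt.title("Year-by-year estimates of the x1 coefficient")
    plt.show()

One chart like this often tells you more about how a relationship has shifted over time than a single pooled estimate.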
For example, most cross-section and panel studies eschew such visual methods in favour of fancier econometrics. I see the two approaches as complementary, and personally am more impressed by charts which can clearly show me what is happening with the data or the regression over time than by yet another new econometric test based on yet another zillion Monte Carlo simulations.
Many econometricians I have worked with don't even bother looking at the data (a simple scatter plot, for example) before trying to fit a model to it. They simply assume the data are fine, without looking for patterns or checking for outliers. (This reminds me of the joke about econometricians which a Treasury Secretary once told a business audience... for another day, I think.)
Or fitting a model to the first half of the data and predicting the second half to see how well it holds up.
Posted by: Lord | Monday, March 06, 2006 at 07:57 PM
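[A rough sketch of the split-sample check Lord describes, assuming a simple OLS model; the data file and variable names are hypothetical:]

    # Fit on the first half of the sample, predict the second half,
    # and see how well the model holds up out of sample.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("timeseries.csv").sort_values("year")   # hypothetical data
    half = len(df) // 2
    train, test = df.iloc[:half], df.iloc[half:]

    X_train = sm.add_constant(train[["x1", "x2"]])
    fit = sm.OLS(train["y"], X_train).fit()

    X_test = sm.add_constant(test[["x1", "x2"]])
    pred = fit.predict(X_test)
    rmse = np.sqrt(np.mean((test["y"] - pred) ** 2))
    print(f"Out-of-sample RMSE on the second half: {rmse:.3f}")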
I'm guessing I'm in the minority on this, but when it comes to trying to understand the underlying structure of a set of data, by far the most productive approach for me has been visualization first, then econometrics (and then repeat several times).
Posted by: Marc Shivers | Monday, March 06, 2006 at 08:25 PM
I'm with you, Marc - it's an iterative process.
Posted by: New Economist | Monday, March 06, 2006 at 08:28 PM
It's tricky to do what Marc suggests without curve fitting, especially when you iterate through repeated attempts to predict the data as well as you can. There's a good argument for enumerating your theoretical beliefs about the data generating process before you dive in.
Posted by: Naunihal Singh | Wednesday, March 08, 2006 at 04:59 AM
Some econometricians (fortunately few) choose variables to fit their theories rather than testing an unbiased model and letting new theories fall out of the data. Also, I was taught a scattergram (or other graphs) is part of the methodology. I agree with Naunihal that a sound model, i.e. one that includes all of the most relevant factors, is most important.
Posted by: Arthur Eckart | Wednesday, March 08, 2006 at 06:10 AM
A sound model, i.e. one with the most relevant factors, tested for serial correlation, heteroskedasticity, omitted variable bias, etc.
Posted by: Arthur Eckart | Wednesday, March 08, 2006 at 06:24 AM
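[A small sketch of the diagnostics Arthur lists, using statsmodels; the data file and column names are hypothetical. Durbin-Watson checks for serial correlation, Breusch-Pagan for heteroskedasticity, and Ramsey's RESET is one rough check for omitted-variable or functional-form problems:]

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.stattools import durbin_watson
    from statsmodels.stats.diagnostic import het_breuschpagan, linear_reset

    df = pd.read_csv("data.csv")                 # hypothetical data set
    X = sm.add_constant(df[["x1", "x2"]])
    fit = sm.OLS(df["y"], X).fit()

    print("Durbin-Watson statistic:", durbin_watson(fit.resid))
    lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(fit.resid, X)
    print("Breusch-Pagan p-value:", lm_pvalue)
    print("Ramsey RESET:", linear_reset(fit, power=2, use_f=True))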