Tuesday, July 22, 2014

Decompose that Denominator, Bro

Noah Smith wants to challenge your silly preconceptions about your invincibility. Good on him. His job would have been a lot easier, though, if he had taken the simple step of decomposing his denominator.

His formula:


So far so good. But note that the denominator isn't very illuminating.

A better way of stating the case:


Look at what this formulation does. It obliges you to calculate the relative probability of Type I and Type II errors. This expanded version explicitly forces you to consider the quality of evidence, both for and against.
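Assuming the formula in question is the textbook statement of Bayes' rule, with H for the hypothesis and E for the evidence (both labels are my own shorthand, not necessarily Noah's notation), the compact and decomposed versions look like this in LaTeX:

```latex
% Compact form: the denominator P(E) hides where the evidence comes from
P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}

% Decomposed form: P(E) expanded by the law of total probability
P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)}
```

In the second version, P(E | ¬H) is the false-positive rate and 1 − P(E | H) is the false-negative rate, which is where the Type I versus Type II comparison enters.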

Of course, Noah's critique of unknown (unknowable?) alternative hypotheses remains salient. Mankind's persistent epistemological blind spot is non-ergodicity. There will always be missing terms in that denominator.

But we do the best we can with what we've got.

#phronesis

Wednesday, December 11, 2013

The IQ Horseshoe

Part of the paper I'm writing with Alex Nowrasteh over at Cato has bits in it on opinions on specific policy issues. If you're reading this (and I have every reason to assume you are reading this), there's a pretty good chance that you're familiar with my fondness for Stata's MARGINSPLOT command. You may also be familiar with my fondness for ordered probit regressions using GSS data. It's a wonderland, people. Combine the two, and I'm like Jack Skellington in Sandy Claws gear, filled with dreadful Christmas cheer.

Take a look at these, would you?




What doth these? Margins, of course. Specifically, margins of this regression:

oprobit natcrime i.partyid i.immcat i.polviews i.female i.race i.age i.degree i.wordsum loginc, cluster(year)

Of the form

margins wordsum#female, predict(outcome(n))

where n ∈ {1, 2, 3}, obviously.

Basically what I've done is asked the GSS how folks feel about (in this case) federal spending on crime prevention and looked at the margins based on their IQ proxy, which is the ability to get vocabulary words right. I then, for reasons not entirely germane to this post in particular, wrung two separate series out of each margins command, divvied up by sex. Nothing especially fancy.
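For anyone who wants the arithmetic behind those margins, here is a sketch in LaTeX under the standard ordered probit setup. The symbols (κ for cut points, β for coefficients, Φ for the standard normal CDF) are conventional labels, not anything pulled from the Stata output, and the averaging line assumes the default behavior of leaving all other covariates as observed.

```latex
% Probability that respondent i gives answer n (n = 1, 2, 3)
\Pr(y_i = n \mid x_i) = \Phi(\kappa_n - x_i'\beta) - \Phi(\kappa_{n-1} - x_i'\beta),
\qquad \kappa_0 = -\infty, \ \kappa_3 = +\infty

% Margin for vocabulary score w and sex f: set those two covariates for
% everyone, keep the rest as observed, and average the predicted probabilities
\text{margin}_n(w, f) = \frac{1}{N} \sum_{i=1}^{N}
  \Pr\bigl(y_i = n \mid \text{wordsum} = w,\ \text{female} = f,\ \text{other covariates of } i\bigr)
```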

But the results are very curious, and they seem to more or less hold up across a multitude of these spending questions. At the low end, we see noise (indicated by the big error bars), which is what we expect of folks with low IQ: they have inconsistent, often irrational beliefs. No big news there. But the hell of it is in the point estimates. What's with the horseshoe? Someone who only gets 2 of 10 vocabulary words right is about as likely to think that the government spends too little on crime as someone who gets 9 of 10 words right. But someone who gets 5 right is about 7% more likely to hold the same opinion. Now, I haven't done pairwise significance testing, but the eyeball check shows point estimates outside rival error bars, so it's probably teasing the 95% threshold at least, but even without that, I find the shape of these curves compelling.
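The eyeball check can be swapped for arithmetic. Here is a minimal Python sketch of a z-test on the difference between two estimated margins. The inputs are hypothetical placeholders, not GSS estimates, and the default of zero covariance between the two estimates is an assumption that margins from a single regression will not generally satisfy.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, built from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def difference_test(est_a, se_a, est_b, se_b, cov_ab=0.0):
    """Two-sided z-test for est_a - est_b.

    cov_ab is the covariance between the two estimates; 0.0 treats them as
    independent, which is an assumption rather than a property of margins.
    """
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2 - 2.0 * cov_ab)
    z = diff / se_diff
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return diff, se_diff, z, p_value


# Hypothetical margins and standard errors, for illustration only.
diff, se_diff, z, p_value = difference_test(0.72, 0.012, 0.65, 0.020)
print(f"difference = {diff:.3f}, se = {se_diff:.4f}, z = {z:.2f}, p = {p_value:.4f}")
```

One caution on reading the plots: a point estimate sitting outside a rival's error bar is weaker evidence than two intervals that do not overlap at all, and overlapping intervals do not by themselves rule out a significant difference.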

And I'm not entirely sure what to make of it. For the crime stuff anyway, we might imagine that low-IQ folks are of the "I don't trust the police" type and the high IQ folks are in the "violent crime has empirically decreased in the past century" camp, with the middling folks picking up the slack, but that's just a wild guess.

Anyway, I thought it was a cool little snippet of empirical evidence I thought would be fun to share. Enjoy the rest of your day, and drive safely.

Friday, November 22, 2013

Medical Professionals, Political Preferences, and Party Identification

It started with this tweet from Matt Y.
My good friend Bryan Caplan remarked:
Click through for the rest of the conversation. The upshot is that I got to wondering how well self-reported political preferences predict party affiliation among doctors. If you've stopped by here before, you might recognize this question as one I've applied to immigrants.

For medical professionals, there are different effects than with typical immigrants. For doctors, one of the big treatments is education. There's also a bit of natural teamsmanship that happens when a person is a member of various professional organizations (the AMA, e.g.), so my hunch is that political preferences (liberal vs. conservative) would do a worse job predicting partisan affiliation for docs compared to everyone else. Let's see if I'm right.

A quick note on variable definitions

For this, I defined medical professionals as GSS respondents who had Census occupation codes related to the medical profession, and let me tell you, it was kind of frustrating, since there are three code batches in the data coming from 1970, 1980, and 2010. I can't fault the Census for not making up its mind, since the types of jobs out there have changed a lot in those 40 years (shout out to my man Izzy K!). So the 'medicalprofessional' variable isn't strictly physicians, but includes dentists and diagnosticians and all that. Since the original question was about medical professionals, I judged this to be fair. If you're unhappy with these inclusions, please feel free to write me a referee report and I'll adjust my approach in the final published piece (snicker).

Econometric specification (I can't be bothered to write out the equations, so forgive me my sins, gentle reader):

oprobit [Party Identification] [Political Views] [Medical Profession Dummy]

margins [Political Views]#[Medical Profession Dummy], predict(outcome([0-7]))

marginsplot
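For the gentle reader who does want the equations, a rough LaTeX sketch of that specification follows. It assumes political views enter as category dummies with the first level omitted (the command as written does not say), and the symbols β, γ, κ, and "med" are illustrative labels of mine.

```latex
% Latent partisanship index for respondent i
y_i^{*} = \sum_{k=2}^{7} \beta_k \,\mathbf{1}[\text{polviews}_i = k]
          + \gamma\,\text{med}_i + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, 1)

% Observed party identification is the bin that the index lands in
\text{partyid}_i = j \iff \kappa_j < y_i^{*} \le \kappa_{j+1},
\qquad j = 0, \dots, 7, \quad \kappa_0 = -\infty, \ \kappa_8 = +\infty
```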

The graphs below are the margins plots for each of the seven results, starting with Strong Democrat, all the way to Strong Republican, plus a bonus, Other Party. Please note that for these plots, the original regression is uncontrolled, and it has no fancy standard errors. It's as plain-jane, bare-bones as an ordered probit regression gets. Which is a weird way to describe it, since ordered probits are pretty abstruse to non-specialists.

But please don't be intimidated. It's just a nice, clear way to express probability. You want to know what the odds are that a randomly-chosen moderate doctor will identify as a Democrat? I got you covered, bro. Check out graphs 1-3. Easy! You can also get a pretty good idea of base rates of docs vs non-docs by just looking at whether or not the series are above each other. Also easy!


So avoiding further ado, here are the Democratic Party margins plots
Fig 1: Strong (D)

Fig 2: Not Strong (D)

Fig 3: Independent, Near (D)


Okay, so what are we looking at with these? First off, medical professionals (the series in red) tend to be less likely than all other respondents to identify with the Democratic Party, with the exception of liberal docs going independent with (D) tendencies. So, even if doc voters are leaning (D) in the past few elections, they're still not identifying (D) in response to GSS surveys. That's fine, it doesn't prove Caplan wrong. Indeed, when you look at voting patterns, it shouldn't surprise you that well-educated people buck party lines in the voting booth. Not a particularly noteworthy result, ladies and gentlemen.

But that little crossover there sure is interesting, isn't it? Let's see what the probabilities for straight Independents are:
Fig 4: Independent

The effect is even stronger. Compared to other respondents, liberal docs are more likely to be independent than their conservative pals. Good gravy. Of course, strictly speaking, the 95% intervals overlap, so I can't call it a statistically significant effect without a formal test of the difference, especially considering that I didn't use robust standard errors in the original regression and my eyeball checks suggested some heteroskedasticity (anyone else have to look the spelling of that word up every damn time?).


That leaves the Republican Party margins plots.
Fig 5: Independent, Near (R)

Fig 6: Not Strong (R)

Fig 7: Strong (R)

Medical professionals are more likely to be Republicans. But pay attention to where the largest departures are. Liberal/very liberal docs are more likely than liberal/very liberal ordinary citizens to go the independent/(R)-leaning route. Very conservative docs are even more likely than regular citizens to identify strongly with the Republican Party.

What's all that about? Well, to find out, let's run a controlled regression and see if results change.

oprobit partyid i.medicalprofessional i.polviews i.immcat i.wordsum i.female i.race, vce(robust)

You have no way of knowing if this is true, but I am liveblogging these results. The stuff above I ran before I started writing this post, but from here on out, I am composing this blog post without first looking at the results. How daring of me! Whatever, here's the ordered probit results in handy-dandy HTML table form now:


Iteration 0:   log pseudolikelihood = -47539.228  
Iteration 1:   log pseudolikelihood = -45102.581  
Iteration 2:   log pseudolikelihood = -45100.282  
Iteration 3:   log pseudolikelihood = -45100.282  

Ordered probit regression                         Number of obs =      24342
                                                  Wald chi2(23) =    4100.19
                                                  Prob > chi2   =     0.0000
Log pseudolikelihood = -45100.282                 Pseudo R2     =     0.0513

                                      Robust
partyid                     Coef.  Std. Err.       z   P>|z|    [95% Conf. Interval]

1.medicalprofessional     .037221   .0372723    1.00   0.318   -.0358314    .1102734

polviews
  2                     -.0042378   .0572077   -0.07   0.941   -.1163628    .1078872
  3                      .2543847   .0558183    4.56   0.000    .1449828    .3637866
  4                      .4465486   .0541506    8.25   0.000    .3404153    .5526818
  5                      .7458233   .0557274   13.38   0.000    .6365996     .855047
  6                      1.073725   .0568146   18.90   0.000    .9623707     1.18508
  7                      1.071495   .0686654   15.60   0.000    .9369131    1.206077

immcat
  1                      .0759376   .0297705    2.55   0.011    .0175884    .1342868
  2                     -.2237079   .0358137   -6.25   0.000   -.2939016   -.1535143
  3                      -.094237   .0322473   -2.92   0.003   -.1574406   -.0310334

wordsum
  1                     -.0183601   .1146955   -0.16   0.873   -.2431592    .2064389
  2                     -.1406865   .1074688   -1.31   0.191   -.3513215    .0699486
  3                     -.1088001   .1033581   -1.05   0.293   -.3113783    .0937782
  4                     -.0730759   .1012778   -0.72   0.471   -.2715768    .1254249
  5                     -.0292433    .100435   -0.29   0.771   -.2260924    .1676058
  6                      .0542896   .1000425    0.54   0.587   -.1417901    .2503694
  7                      .0885638   .1003957    0.88   0.378   -.1082082    .2853357
  8                      .1268125   .1010346    1.26   0.209   -.0712117    .3248366
  9                      .1211953   .1016877    1.19   0.233    -.078109    .3204995
  10                     .0712821   .1027608    0.69   0.488   -.1301253    .2726896

1.female                -.1058552   .0134334   -7.88   0.000   -.1321842   -.0795261

race
  2                     -.8595719   .0219832  -39.10   0.000   -.9026582   -.8164856
  3                     -.2088881   .0355149   -5.88   0.000   -.2784961   -.1392802

/cut1                   -.7557187   .1122173                   -.9756606   -.5357768
/cut2                   -.0160698   .1121541                   -.2358878    .2037481
/cut3                    .3430329   .1121672                    .1231892    .5628765
/cut4                    .7309348   .1122396                    .5109493    .9509203
/cut5                    1.022414   .1123512                    .8022098    1.242618
/cut6                     1.72903   .1128808                    1.507788    1.950273
/cut7                    2.782055    .116267                    2.554176    3.009934

You'll want to do yourself a favor and look at that on a proper computer monitor. For obvious reasons. Category definitions are what you'd imagine. Medicalprofessional = 1 for medical professionals; polviews runs from 1=extremely liberal to 7=extremely conservative; female = 1 for female; race: 1=white, 2=black, 3=other. Anyway, let's redo the Independent affiliation margins to see if it's any different (are you excited? I'm excited! Let's go!)
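If you'd like to turn that table into probabilities by hand, here is a minimal Python sketch. It is a single-respondent calculation, not a reproduction of the margins command, which averages over the whole sample. The cut points and the medical-professional coefficient are copied from the output above; the function names are my own, and the mapping of codes 0 through 7 onto the figure labels assumes the ordering described earlier in this post.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, built from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Cut points /cut1 ... /cut7 from the regression output.
CUTS = [-0.7557187, -0.0160698, 0.3430329, 0.7309348, 1.022414, 1.72903, 2.782055]

# Assumed ordering of partyid codes 0..7, following the figure captions.
LABELS = [
    "Strong (D)", "Not Strong (D)", "Independent, Near (D)", "Independent",
    "Independent, Near (R)", "Not Strong (R)", "Strong (R)", "Other Party",
]


def outcome_probabilities(xb, cuts=CUTS):
    """Ordered probit probabilities for one respondent with linear index xb."""
    bounds = [-math.inf] + list(cuts) + [math.inf]
    return [
        normal_cdf(upper - xb) - normal_cdf(lower - xb)
        for lower, upper in zip(bounds, bounds[1:])
    ]


# xb = 0 is a respondent at the omitted base level of every factor.
baseline = outcome_probabilities(0.0)
# Same respondent, switched to a medical professional (coefficient .037221).
medical = outcome_probabilities(0.037221)

for label, p_base, p_med in zip(LABELS, baseline, medical):
    print(f"{label:<22} {p_base:.3f} {p_med:.3f}")
```

The positive coefficient nudges the index toward the Republican end of the scale, but the shift is tiny next to the polviews coefficients, which matches the unimpressive z-statistic on the medical dummy.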

Fig. 8: Independent, with Controls


Wow, that took a long time to run. I have more important things to do than run all of these over again, so I won't. Anyway, let's see what we've got... hm. Same results, with less statistical significance. How anti-climactic.

Maybe we can salvage this pig though. How does medical profession stack up against, say, gender for predicting party affiliation? In other words, let's find interactions of [medical professional]#[political views]#[gender]

Fig 9: Independent, with Controls, by Gender

Okay, so that's not so easy to see, but the red and yellow series are for females and the blue and green are for males. Note that the professional gaps (red-to-yellow and blue-to-green) are modest compared to the gender gaps (blue-to-red and green-to-yellow). I did not expressly choose the colors, if that matters.

So what's the takeaway of this last graph? Well, from where I sit, there's nothing particularly compelling about the medical profession when it comes to party identification, at least not compared to gender (and there are a few other horse races I could run, provided I could muster the interest, which I can't).

But that does not mean that docs necessarily agree with the general population on specific issues. That might be worth a closer look. I imagine I could chase this down a pretty deep rabbit hole, which is why I think I should stop for now.

Which is exactly what I'm doing now. Stopping.

Wednesday, October 16, 2013

More on the Political Externalities of Immigration

This is a follow-up to a post I wrote a while back on the attitudes of (rather than towards) immigrants. There, I asked whether or not migrant status predicted party affiliation (it does) and how well the interaction of immigrant status and political views predicted party affiliation (there's a strong bias towards the Democratic party among immigrants). Today, I'll take a look at how these variables interact to predict specific policy positions.

Monday, April 8, 2013

Coming to the Party in the USA, then Dancing with the Stars on the Ceiling in the Streets

Hello friends. You might be wondering why I'm posting this here and not over at Euvoluntary Exchange, where I'm typically found. The short answer is that this is off-topic. The long answer is that this is off-topic and I can't think of an easy way to shoehorn it on-topic. So here we are. I'm putting a fold here because this post will be gigantic thanks to all the tables and graphs I'll be using.

Wednesday, August 1, 2012

Supply et Demand

Recently, there was a mongo power outage in India. 350 million+ people were sans electricity in a country not well known for having an especially reliable grid even under the best conditions. While this may not be a tragedy comparable to the Haiti quake or the Japanese tsunami, wholesale power loss threatens hospital patients and can compromise secure nuclear locations if recovery is sufficiently delayed.

Analysts blame the power losses on a lackluster monsoon season: farmers were running well pumps longer and harder than usual and there was insufficient river flow to crank the hydroelectric plants. That's interesting, but the reporting on it this morning on NPR was cringeworthy. Either the host or the guest (I have a short commute, so I wasn't listening long enough to nail down reliable identification) described it as a "problem with supply and demand; demand increased and the supply wasn't there."

That's actually paraphrased, so the quotes are inappropriate. Please accept my lazy editing for the purposes of making a broader point.

And that broader point is simply this: in economics, the terms "supply" and "demand" are elements of a story about prices and quantity. The problem with Indian power distribution isn't one of a spike in demand and a lag in a supply response, it's one of interference with price signals, unimpressive property rights, and more than all else, an insatiable leviathan who feeds on the entrepreneurial spirit. Starting even an ordinary retail business is tough enough in India. Improving the power generation and distribution network is next to impossible. Describing this as some kind of technical failure or blaming it solely on the weather is intellectually dishonest.

I understand the appeal of looking just at proximate causes and assigning blame. It's easy and it's a decent way to grind an ideological axe if that's your thing. It's poor reasoning though. You probably don't have to go back to the first protozoa crawling out of the primordial soup, but you can at least point to significant contributing factors, especially when they've got far-reaching consequences. In this case, the Indian regulatory apparatus is a tight throttle on the prosperity of the Indian people. Why not report on that?

Saturday, March 3, 2012

In Quaerere Parentum

If Buchanan is right about parentalism, the implication for welfare reform is that social control will shift to non-pecuniary margins. The sort of warping of choice sets under these conditions could exacerbate sorting effects.

A negative income tax could imply social fractionalization the likes of which even Paul Atreides himself can scarcely imagine. Could the United States balkanize? Yes--this is my new most shocking prediction.