Monday, April 8, 2013

Coming to the Party in the USA, then Dancing with the Stars on the Ceiling in the Streets

Hello friends. You might be wondering why I'm posting this here and not over at Euvoluntary Exchange, where I'm typically found. The short answer is that this is off-topic. The long answer is that this is off-topic and I can't think of an easy way to shoehorn it on-topic. So here we are. I'm putting a fold here because this post will be gigantic thanks to all the tables and graphs I'll be using.
There's a bit of a puzzle in my ongoing investigation of immigrants' policy preferences. In a whole host of ordered probit regressions, I keep finding a few stylized facts:
  1. First-generation immigrants hold very similar policy preference profiles to native-born Americans, particularly when controlling for political party affiliation.
  2. Second-generation immigrants buck party lines on a wide range of questions.
  3. Third-generation immigrants are almost completely indistinguishable from the native-born population.
But what of party selection? The expected trope is that liberals identify with the Democratic Party and conservatives identify with the Republican Party. The misfits that identify with the Libertarian Party are too few to be reliable in the relatively small sample size of the GSS. Is this trope grounded in good empirics? Let's find out!

The Plan
To get at what's going on, I'll do what I can to answer the following questions:
  • Does immigrant status predict party affiliation? (this is a set of uncontrolled regressions)
  • Do political views predict party affiliation?
    • For everyone (uncontrolled and controlled)
    • For 1st, 2nd, and 3rd gen immigrants (uncontrolled and controlled)
All data for this investigation were taken from the most recent release of the General Social Survey and analyzed using Stata 12. Background music used during analysis was my custom Spotify channel "Other Industrial", which I should really rename, since I've got, like, Erasure and Bloodhound Gang and Metallica and whatever else catches my fancy in there now. For in-text notation purposes, I'll put all the GSS variables in parenthetical all caps, but I'll do my best to switch to plain English only in the tables and suchlike. Clarity is important.

First Steps
First off, let's see how strongly, in general, political views (POLVIEWS) predict party affiliation compared to other covariates. This is an important econometric point, since it's way easy to get snookered by mere statistical significance. Depending on how often you referee, you may or may not get sheaves of empirical papers that leave out something as basic as beta estimates. Statistical significance tells you if the effect is different from zero, but it doesn't say bupkis about whether or not the effect is particularly strong. I owe it to you, gentle reader, to peer into the swerve power of relevant independent variables, to standardize the elephant vs the shrew in this horserace. That's what the 'beta' option is for. It reports your coefficients in standard-deviation units, the same thing you'd get by converting every variable to z-scores before running the regression. Cool, huh?
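For anyone who wants to see where a beta estimate comes from, here's a minimal sketch that rebuilds one by hand. It assumes the GSS extract is already loaded with PARTYID coded as described in this post; the scalar names (b_pol, sd_x, sd_y) are made up for illustration, and the covariate list is trimmed for brevity.

```stata
* Standardized (beta) coefficient by hand: b * sd(x) / sd(y)
quietly regress partyid polviews loginc female if partyid != 8
scalar b_pol = _b[polviews]

* Use the estimation sample so the standard deviations match the regression
quietly summarize polviews if e(sample)
scalar sd_x = r(sd)
quietly summarize partyid if e(sample)
scalar sd_y = r(sd)

display "beta for polviews = " b_pol * sd_x / sd_y
```

The number displayed should match the Beta column that regress prints for the same specification when the beta option is switched on.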

Here's the first regression result. I used the same covariates that Caplan and Miller used for the piece that eventually ended up as part of The Myth of the Rational Voter. The chief explanatory variable I want to draw to your attention is respondents' political views (POLVIEWS). This runs in seven categories from extremely liberal to extremely conservative. The dependent variable, party affiliation (PARTYID) runs in eight categories from strong democrat to strong republican, with the eighth as "other party". The standard tale would imply a pretty little monotonic relationship 1-1, 2-2 all the way up to 7-7. When you look at the crosstab, this is mostly what you get. Observe:

Rows: think of self as liberal or conservative. Columns: party affiliation.

                             strong not strong       ind,                  ind, not strong     strong      other
                                  D          D     near D        ind     near R          R          R      party      Total
extremely liberal               476        232        243        168         43         53         51         56      1,322
liberal                       1,749      1,382        934        633        250        380        168         70      5,566
slightly liberal              1,120      1,737      1,154        759        448        679        208         60      6,165
moderate                      2,641      4,394      2,350      3,357      1,622      2,916        900        224     18,404
slightly conservative           725      1,358        712        893      1,027      2,049        812         90      7,666
conservative                    671        825        401        691        791      1,534      2,070         90      7,073
extremely conservative          222        166         88        156        137        188        506         34      1,497
Total                         7,604     10,094      5,882      6,657      4,318      7,799      4,715        624     47,693
Table 1

The diagonal is as thick as you'd expect. Not a whole lot of extremely liberal strong Republicans.
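If raw counts make the diagonal hard to judge, row percentages help. A minimal sketch, assuming the same GSS extract and variable names used throughout this post:

```stata
* Share of each ideology row falling in each party category
* (row = row percentages, nofreq = suppress the raw counts)
tabulate polviews partyid, row nofreq
```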

So let's hop right to a linear regression. I freely acknowledge that there are some severe problems with using an OLS approach to this data. We have no ex ante reason to expect these self-reported (and unnaturally constrained, I might add) categories to track monotonically or to boast normal underlying distributions or anything like that. OLS isn't great to use here, but for a quick and dirty look at magnitudes, it's not embarrassingly terrible. Here's the specification (writing out the model equations gilds the lily for a blog post, and I'm not sure I want to bother to learn how to do so in HTML anyway, so please bear with me). Naturally, all commands will be compatible with Stata. Note that I omit "other party" in the OLS regression. Yes, it screws up the standard errors, but we're not super concerned with statistical significance anyway, so I won't wring my hands about it too much.

reg partyid polviews loginc female age age2 i.race i.finalter i.joblose if partyid!=8, vce(robust) beta


Linear regression (OLS)
n            = 16084
F(13, 16070) = 357.24
Prob > F     = 0.0000
R-squared    = 0.2127
Root MSE     = 1.792

                                        Robust
party affiliation             Coef.  Std. Err.        t   P>|t|    Beta
pol. views                    .5091      .0109    46.72   0.000    .344
(log) income                  .1017      .0181     5.62   0.000    .043
female                       -.2206      .0287    -7.68   0.000   -.055
age                          -.0444      .0069    -6.46   0.000   -.278
age^2                         .0004      .0001     4.97   0.000    .214
race
  2 (black)                 -1.5265      .0402   -37.96   0.000   -.252
  3 (other)                  -.6308      .0608   -10.38   0.000   -.070
financial situation alter
  2 (worse)                  -.1180      .0397    -2.97   0.003   -.023
  3 (same)                   -.0812      .0326    -2.49   0.013   -.019
prob. of job loss
  2 (fairly likely)          -.0185      .0875    -0.21   0.833   -.002
  3 (not too likely)         -.0242      .0707    -0.34   0.732   -.005
  4 (not likely)              .0802      .0679     1.18   0.238    .019
  5 (leaving l.f.)           -.0208      .8050    -0.03   0.979   -.000
constant                     1.1378      .2113     5.38   0.000
Table 2

As more or less expected, political views beats the pants off of everything else: a one standard deviation move toward the conservative end goes with a .344 standard deviation move toward the Republican end. The closest contenders are age (the older you get, the more likely you are to be a Democrat, though with notable nonlinear effects) and race (if you're black, you're more likely to identify (D)). Yet, even holding all that stuff constant (and notice how puny income is), political ideology dominates, just plain dominates. So far, so good. Caplan-Miller holds when we use the most recent release. No surprise there.

Does Immigrant Status Predict Party Affiliation?
This is a very basic question. Without controlling for anything else, if we throw the crib door wide, let the people crawl inside, what will happen to the prosthetic foreheads?  I want to be a little more careful about my assumptions for this question. I'll abandon (well, more relax than abandon) the OLS conditions and go straight for an ordered probit.

If you're not all that familiar with LDV (limited dependent variable) models, don't be put off. The intuitions are still the same. The independent variables have some marginal effect on the probability of an outcome. The advantage to using this approach is that we can very carefully parse each margin to find discontinuities. It's not enough to just cram a dummy for immigration into an OLS regression, even with an interaction term, since that approach lacks precision and discipline. Crane stance, Daniel-san.

Since generational fade-out or assimilation is something people seem to care about, we'll want to compare margins for a) first generation, b) second generation and c) third generation immigrants. I constructed these variables from BORN, PARBORN, and GRANBORN. I consider a respondent to be an immigrant (IMM) iff not born in this country. I consider a respondent to be second generation (SG) iff born in this country and both parents not born in this country. I consider a respondent to be third generation (TGO) iff born in this country, both parents born in this country, and all four grandparents not born in this country. 
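Here's a rough sketch of how those three indicators could be built. The numeric codes are my recollection of the GSS codebook, not something taken from the analysis above, so treat them as assumptions and check them against your own extract before running anything.

```stata
* ASSUMED GSS codes (verify in the codebook):
*   born:     1 = born in the US, 2 = not born in the US
*   parborn:  0 = both parents born in the US, 8 = neither parent born in the US
*   granborn: number of grandparents born outside the US (0 to 4)

* First generation: respondent not born in this country
generate byte imm = (born == 2) if !missing(born)

* Second generation: born here, both parents born elsewhere
generate byte sg = (born == 1 & parborn == 8) if !missing(born, parborn)

* Third generation: born here, both parents born here, all four grandparents born elsewhere
generate byte tgo = (born == 1 & parborn == 0 & granborn == 4) if !missing(born, parborn, granborn)
```

The trailing if !missing(...) leaves each indicator missing whenever one of its inputs is missing, rather than silently coding nonresponse as zero.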

Perhaps this is a bit more restrictive than is necessary for analysis, but if these empirics are to be deployed in an immigration debate, I think it's better to err on the side of caution. Few people would spend a lot of time arguing that a person who has one grandparent from Romania, but the rest from Iowa, is in any way distinguishable from any other cornhusker. Split-parent homes are a slightly different matter. That one's ambiguous enough to omit. My wife is Lithuanian and my daughter was born in Virginia. I'm from dirtkicker Okie stock, so what does that make my little girl? Immigration policy (and probably public opinion) is already pretty favorable towards jus soli citizenship and immigration rights by marriage, so I think it's fair enough to call one-immigrant-parent respondents full-blown Americans. Running the regressions with the oddballs omitted doesn't meaningfully change the results anyway, so we've got that going for us, which is good.

So here is the first regression to run: an ordered probit with Political Party as the dependent variable (PARTYID) and immigrant status (IMM|SG|TGO) as the one and only independent variable. After that's been run, I crank through marginal results for each DV category running the spectrum from "Strong Democrat" to "Strong Republican". Here's the code in case you want to run these guys at home:

* First-gen vs. native-born: one set of margins per party category
forvalues aa=1/7 {
    quietly oprobit partyid i.imm if partyid!=8 & sg!=1 & tgo!=1, vce(robust) nolog
    quietly estadd margins imm, predict(outcome(`aa'))
    eststo mar`aa'
}
estout mar1 mar2 mar3 mar4 mar5 mar6 mar7, cells(margins_b margins_se)

* Second-gen vs. native-born
forvalues aa=1/7 {
    quietly oprobit partyid i.sg if partyid!=8 & imm!=1 & tgo!=1, vce(robust) nolog
    quietly estadd margins sg, predict(outcome(`aa'))
    eststo mars`aa'
}
estout mars1 mars2 mars3 mars4 mars5 mars6 mars7, cells(margins_b margins_se)

* Third-gen vs. native-born
forvalues aa=1/7 {
    quietly oprobit partyid i.tgo if partyid!=8 & imm!=1 & sg!=1, vce(robust) nolog
    quietly estadd margins tgo, predict(outcome(`aa'))
    eststo mart`aa'
}
estout mart1 mart2 mart3 mart4 mart5 mart6 mart7, cells(margins_b margins_se)

Obviously, you'll need to have estadd installed (it comes with the estout package, so "ssc install estout" will fetch it). Obviously. There's more than one way to skin a cat though, so you might have luck with some other approach. Here are the tabulated results. Please note that, yes, the sample changes a little in each run, so the native-born margins jump around a little, but the magnitude of the effect is piddling, so don't sweat it too much, says I.

(margins, with standard errors in parentheses)

                     Str D   Not Str D  Ind Near D         Ind  Ind near R   Not Str R       Str R
Native-Born          .2037       .1161       .1536       .0902       .1642       .1025       .0140
                   (.0020)     (.0015)     (.0017)    (.00147)     (.0018)     (.0015)     (.0006)
First-Gen            .2117       .1173       .1521       .0876       .1556       .0931       .0120
                   (.0027)     (.0016)     (.0018)     (.0015)     (.0026)     (.0025)     (.0007)
Second-Gen           .2290       .1177       .1362       .0818       .1398       .0760       .0078
                   (.0031)     (.0016)     (.0019)     (.0017)     (.0034)     (.0031)     (.0006)
Third-Gen            .2092       .1175       .1425       .0893       .1660       .1036       .0159
                   (.0034)     (.0015)     (.0016)     (.0015)     (.0034)     (.0036)     (.0010)
Table 3

I have to admit I had to do a few double-takes (multi-takes?) at these results. The unpartitioned margins show incontrovertible marginal effects of immigrant status on party affiliation, but when we look closer, the picture isn't quite so simple.

I suppose it would help if I explained how to read this table. The marginal effect of immigrant status is presented without parentheses, the (robust) standard error is presented with parentheses. So to get the 95% confidence interval for the actual marginal effect (holding nothing else constant) of being native-born on identifying as "Strong Democrat", add and subtract 1.96 times 0.0020 (about 0.0039) to 0.2037, meaning that the actual marginal effect of being native-born on party identification is between roughly 0.1998 and 0.2076. Now, there are rigorous statistical methods to tell us whether or not the contents of one cell are actually different from that of another. I could do a big ol' hypothesis-testing pairwise comparison of native-born to the other categories in each column, but this is an in-depth blog post, not a superficial journal article draft, so I'll spare you (and myself) the tedium. Instead, let's just eyeball what we've got and see what jumps out.
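For the record, a 95% interval under the usual normal approximation is the estimate plus or minus about 1.96 standard errors. Here's a minimal sketch of that arithmetic for the native-born "Strong Democrat" cell of Table 3; the scalar names are just placeholders.

```stata
* 95% confidence interval from a margin and its standard error
scalar m_hat  = 0.2037             // native-born margin, Strong Democrat
scalar se_hat = 0.0020             // its robust standard error
scalar z_crit = invnormal(0.975)   // roughly 1.96

display "lower bound: " %6.4f m_hat - z_crit*se_hat
display "upper bound: " %6.4f m_hat + z_crit*se_hat
```

Swap in any other cell's margin and standard error to eyeball whether two intervals overlap, keeping in mind that overlap is only a rough stand-in for a proper pairwise test.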

And what jumps out is decidedly on the right. Margins for second-generation immigrants on all of the Republican cells sit considerably further below the native-born margins than those for first- and third-generation respondents. It's almost as if first-generation immigrants have political party preferences similar to those of the native-born population, but then their kids jump ship for hard-left Democratic Party affiliation and then their grandkids mosey back over to the status quo. Not to wax prophetic, but I do wonder if this helps establish a pattern we might see later on.

How Does Immigrant Status Interact with Political Views to Predict Party Affiliation?
Let's now look at carefully folding political ideology into this here cake batter. The skeleton form of the ordered probit that will let us look at our margins is like so:

oprobit partyid [i.imm|i.sg|i.tgo] i.polviews [loginc female age age2 i.race i.finalter i.joblose i.year] if partyid!=8 & [sg!=1&tgo!=1|imm!=1&tgo!=1|imm!=1&sg!=1], vce(robust)

The year fixed effects make these output tables, well, let's call them unwieldy. They also contain not much in the way of useful information. Probit and logit coefficients can't be interpreted the same way as OLS coefficients. Margins can though, more or less. That's what I'll share, almost the same as I did above. For these regressions, there's a lot more information, so I'll show you the same things... in graph form! Note also how I changed POLVIEWS to an indicator variable. You must do this to get all the interaction effects you're looking for.
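To make the coefficient-versus-margin distinction concrete, here's a small sketch using the first-gen comparison and the variable names from this post. It isn't one of the regressions reported here, just an illustration of how margins turns an ordered probit into something readable as a change in probability.

```stata
* Ordered probit coefficients are on a latent index, not a probability scale
quietly oprobit partyid i.imm i.polviews if partyid!=8 & sg!=1 & tgo!=1, vce(robust)

* Average change in Pr(partyid == 1) when imm switches from 0 to 1,
* which reads much more like an OLS coefficient
margins, dydx(imm) predict(outcome(1))
```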

Let's first do the uncontrolled regressions, shall we? Here's an example of the model:


oprobit partyid i.imm i.polviews if partyid!=8 & sg!=1&tgo!=1, vce(robust)


And an example of the margins command:

margins r.polviews@imm, contrast(nowald) predict(outcome(1))

And here's how I built the graph:


marginsplot, name(sg1uc) nolab title("Adj. Predictions of Pol. Views on Party with 95% C.I.") ytitle("Pr(Str. D)") xtitle("{&larr}  More Liberal   More Conservative {&rarr}") legend(order(1 "Native-Born" 2 "First-Gen"))


Note the oddity in the "legend" option. It's different from what you might be used to in a twoway graph. If you've little experience with marginsplot, it's a good little bit of trivia you might want to sock away for future use. Also note that it'll take your computer a while to crank through all 21 of these suckers, so caveat um... computor. Another little tidbit I found is the {&larr|&rarr} commands. Those will give you left arrows and right arrows, respectively. You can find a complete list of symbolic characters at http://www.stata.com/bookstore/pdf/g_text.pdf.
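If you'd rather not type these out one at a time, a loop over the seven party categories does the job for a single comparison. This is a sketch rather than the exact script behind the figures; the graph names (imm1uc through imm7uc) are made up, and the titles and legend options are left off to keep it short.

```stata
* Fit the model once, then plot one margins profile per party category
quietly oprobit partyid i.imm i.polviews if partyid!=8 & sg!=1 & tgo!=1, vce(robust)

forvalues k = 1/7 {
    quietly margins r.polviews@imm, contrast(nowald) predict(outcome(`k'))
    marginsplot, name(imm`k'uc, replace) nolabels
}
```

Repeat with i.sg and i.tgo (and the matching if conditions) to get the other two graphs in each trio.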

So here are the seven trios of graphs. Within each trio, reading left to right and top to bottom, you're looking at the margins profiles for, in order, native-born vs. first-gen; native-born vs. second-gen; and native-born vs. third-gen. The x-axis is your standard liberal-to-conservative spectrum, the y-axis is the partial probability of identifying with the selected party. Just like in Table 3, the party identification runs from Strong Democrat through Independent to Strong Republican. It might still be a wee bit confusing so I'll comment along the way. Here we go!

Fig. 1: Immigrant Margins on "Strong Democrat" (Uncontrolled)

Right off the bat, we have the response curve we expect. Generally speaking, the folks we expect to see identifying as "Strong Democrat" are also folks who tend to identify as more liberal. But take a peek at the second-generation graph in the top right. Across the belief spectrum, second-generation immigrants are more likely than native-born folks to identify as the bluest of blue. Then in the bottom graph, the third-gen folks are right back to looking no different than native-born. Sound familiar?

Fig. 2: Immigrant Margins on "Not Strong Democrat" (Uncontrolled)

Basically the same story. Second-gen immigrants are drawn to the Democratic Party in disproportionate numbers relative to native-born folks. Third-gen folks have assimilated.

Fig. 3: Immigrant Margins on "Independent Near Democrat" (Uncontrolled)

Ditto, but look at how the story is a hair different. Now we have second-gen folks who consider themselves conservative actively embracing identification as blue Independent. The effect is absent for liberal-leaning respondents. How very curious.

Fig. 4: Immigrant Margins on "Independent" (Uncontrolled)

The effects we saw in the last graph are moderated here just a little bit. Conservative-leaning second-gen folks are still disproportionately identifying as Independent. Let's see if we round the corner on the next graph...

Fig. 5: Immigrant Margins on "Independent Near Republican" (Uncontrolled)

Here we go. Second-gen folks look similar to native-born folks when it comes to leaning over the right side of the fence. Very interesting. Let's step a little further into Republican territory (if we dare).

Fig. 6: Immigrant Margins on "Not Strong Republican" (Uncontrolled)

Well, this is the mirror of what we saw earlier. Native-born conservative Americans identify Republican. Second-generation conservative Americans don't. Hm. I can't imagine why.

Fig. 7: Immigrant Margins on "Strong Republican" (Uncontrolled)

Ditto. It's almost as if, now hold on a minute here... it's almost as if there's something about the Republican Party that alienates second-generation immigrants. How interesting. 

Now it could be that there are other explanations besides the xenophobia associated with the political kayfabe tied to the (R) brand. Maybe there are some personal characteristics of respondents that can explain the gulf. With that in mind, let's add in some control variables and see what falls out. Here's a modified regression specification:

oprobit partyid i.tgo i.polviews loginc female age age2 i.race i.finalter i.joblose i.year if partyid!=8 & imm!=1&sg!=1, vce(robust)


Now, I could pretty easily do the margins on each of these indicator variables, and I will in the real-deal, proper journal-worthy investigation, but it's gilding the lily here. Let's just stick with what should be the biggest predictor of party affiliation: ideology. Let's get right to it then, shall we?

Fig. 8: Immigrant Margins on "Strong Democrat" (Controlled)

Well this is a little bit different. We see the same basic margins curves as in Fig. 1, but without the second-generation divergence. Interesting.

Fig. 9: Immigrant Margins on "Not Strong Democrat" (Controlled)

Okay, take a deep breath. Both second- and third-generation respondents are more likely to identify as "Not Strong Democrat" after controlling for stuff like income, age, race, change in financial status, job loss and survey year fixed effects? Good gravy. Could it be that ideology was perhaps masking some other effects in the uncontrolled regressions?

Fig. 10: Immigrant Margins on "Independent Near Democrat" (Controlled)

Again, we see some flight from the extreme party affiliation we saw in the last graph. More conservative-identifying second- and third-gen respondents have a marginally higher propensity to identify in the Independent range.

Fig. 11: Immigrant Margins on "Independent" (Controlled)

Ditto, with the effects concentrated again on the conservative side.

Fig. 12: Immigrant Margins on "Independent Near Republican" (Controlled)

So far, we've been seeing 2/3G immigrants piling up on the Independent pile. Could this be the crossover category?

Fig. 13: Immigrant Margins on "Not Strong Republican" (Controlled)

Yes it could. We now see immigrants' children and grandchildren shying away from the Republican Party after controlling for a few demographic and economic variables.

Fig. 14: Immigrant Margins on "Strong Republican" (Controlled)

And the same holds true for strong identification with the Republican Party. Immigrants just ain't having any of that nonsense.

So there is some difference with party identification, and by the time we get to the third-gen folks, the main explanatory variable, political ideology, has its roots in other causes. What those might be deserves further investigation, but the naive generational convergence suggests to me that the story is a heck of a lot richer than what you usually hear, which is something like, "if we let a bunch of immigrants in, they'll come here and vote themselves a welfare state." Since it seems, if anything, immigrants are more likely to side with Independents, this claim now has at least a little more empirical shade over it.

I'll continue this investigation in the future. Stay tuned.



