Methodology and Data Harmonization

How to Model Parental Education Effects on Men and Women’s Attainment? Cross-National Assessments of Different Approaches

Research in social stratification shares the assumption that social origin operates through assets embedded in the family structure, yet scholars’ views on how resources are transmitted across generations vary significantly. This variation has produced a range of measures of family background and distinct empirical models. A simplified schema yields three main methodological approaches: (a) models using one parent’s characteristics; (b) models using characteristics of both parents; and (c) models accounting for gender-specific effects of social origin. In this paper we analyze how models of each type perform when applied to cross-national data from the European Social Survey (Round 3). We focus on the impact of parental education on children’s success, while controlling for parents’ social class position. Individual success is conceptualized primarily in terms of educational attainment, but also in terms of occupational standing. Although our analyses do not disclose consistent patterns across all studied countries (none of the models performs uniformly better, or worse, in the majority of countries), some regularities are noticeable. In particular, with respect to explaining educational attainment, we find that it is generally preferable to include measures of both parents’ education rather than use models based on one parent’s characteristics. The best-fitting model, in terms of explained variance, combines father’s and mother’s education and includes an interaction term between these variables. In the case of occupational standing, we generally consider the model that accounts for both father’s and mother’s education as the preferred solution, at least when direct effects are statistically significant. In addition, the hypothesis that the intergenerational transmission of parental education affects men and women differently is, in light of these outcomes, supported only in some of the countries.
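The three approaches can be written schematically as regression specifications. The notation below is an illustrative assumption, not taken from the paper: Y_i is the outcome (educational attainment or occupational standing), P_i the education of the single parent used in approach (a), F_i and M_i father’s and mother’s education, C_i the control for parents’ social class, and G_i a gender indicator. The exact functional forms the authors estimated may differ.

```latex
\begin{align*}
\text{(a) one parent:}\quad
  & Y_i = \beta_0 + \beta_1 P_i + \gamma C_i + \varepsilon_i \\
\text{(b) both parents:}\quad
  & Y_i = \beta_0 + \beta_1 F_i + \beta_2 M_i + \gamma C_i + \varepsilon_i \\
\text{(b) with interaction:}\quad
  & Y_i = \beta_0 + \beta_1 F_i + \beta_2 M_i + \beta_3 F_i M_i + \gamma C_i + \varepsilon_i \\
\text{(c) gender-specific:}\quad
  & Y_i = \beta_0 + \beta_1 F_i + \beta_2 M_i + \delta_0 G_i
        + \delta_1 F_i G_i + \delta_2 M_i G_i + \gamma C_i + \varepsilon_i
\end{align*}
```

In this sketch, nonzero \delta_1 or \delta_2 would correspond to parental education affecting men and women differently, and \beta_3 captures the interaction between father’s and mother’s education.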


Decomposition Of Long-Term Changes In Political Opinions According To Group-Specific Markov Processes

In this paper I use longitudinal data for Poland to test the assumption that change in political opinions over time is not entirely due to universal and time-constant processes; rather, it depends on the initial conditions of a person’s state. Information on Poles’ evaluations of the past socialist regime, available at repeated intervals over a sufficiently long period (ten years), allows me to decompose long-term changes in the assessment of socialism into short-term change and the reliability of responses, according to group-specific Markov processes. I obtain three types of stochastic matrices: M_{t,t+10}, M_{t,t+1}, and M_{rel} = R, where M refers to a matrix of opinions at time t by opinions at a subsequent time, t refers to specific years, and R is the reliability matrix from the measurement of opinions over a one-month period. To assess the fit of the observed transition matrix for the 10-year period as a linear combination of the matrices M_{t,t+1} and M_{rel}, I apply random-effects maximum likelihood estimation in Stata, with the bootstrap option for obtaining the standard errors of the coefficients. Results demonstrate that Markov-type processes do not have significant explanatory power for long-term change in opinions about socialism. Substantively, this means that the ‘subjective’ legacy of the past, namely people’s views of the former regime, matters.
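As a rough illustration of what a Markov-type benchmark implies, the sketch below is not the estimation procedure described above (which fits a linear combination of matrices by maximum likelihood in Stata). It shows only the simpler baseline that a first-order, time-homogeneous Markov chain would predict: the ten-year transition matrix equals the tenth power of the one-year matrix. All numbers and the three opinion categories are hypothetical, and the names `M_1`, `M_10_OBSERVED`, and the helper functions are introduced here for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def matrix_power(m, n):
    """Raise a square matrix to a non-negative integer power."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, m)
    return result


def max_abs_diff(a, b):
    """Largest absolute cell-wise difference between two matrices."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


# Hypothetical one-year transition matrix over three opinion categories
# (negative, neutral, positive); each row sums to 1.
M_1 = [
    [0.80, 0.15, 0.05],
    [0.20, 0.60, 0.20],
    [0.05, 0.15, 0.80],
]

# Hypothetical "observed" ten-year transition matrix (made-up values).
M_10_OBSERVED = [
    [0.60, 0.25, 0.15],
    [0.30, 0.40, 0.30],
    [0.15, 0.25, 0.60],
]

# Prediction under a time-homogeneous first-order Markov chain.
M_10_PREDICTED = matrix_power(M_1, 10)

# A large gap would suggest that the simple Markov benchmark fits poorly.
GAP = max_abs_diff(M_10_PREDICTED, M_10_OBSERVED)
```

A large value of `GAP` in real data would point in the same direction as the paper’s conclusion, but a formal assessment requires the statistical fit described in the abstract.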

Representation of Post-Communist European Countries in Cross-National Public Opinion Surveys (from 2006)

The democratization of political systems and the change to market economies that people in various parts of the world have experienced over the past twenty-five years have resulted in increased general interest in the state of public opinion. The heightened concern with understanding public views on socio-economic and political transformations has led to a rise in the number of cross-national surveys, both academic and commercial. However, the participation of countries in cross-national research on public opinion is very uneven, not only because of economic factors but also for political and cultural reasons: the well-developed countries of the Northwest participate more often, whereas there are many laggards in the South and East. To the extent that the excluded or under-represented countries are systematically different from those included, comparative studies are likely to encounter serious problems. Substantively, knowledge will be limited, impeding the ability to legitimately generalize findings and interpretations beyond the included regions. Methodologically, in research that treats countries as the framework for attitudes and behavior, and uses the techniques of contextual analysis and hierarchical modeling, results may be seriously biased in that the under-represented countries distort the distribution of macro-level variables. With this in mind, the present discussion focuses on the post-communist countries of Europe and examines their representation in cross-national research projects to determine the extent to which the coverage is uneven, and what factors account for the inequality. In contrast to studies that examine the growth of public opinion research from the perspective of data quality and comparability of the countries included, this study analyzes why some countries are repeatedly left out or under-represented, whether the omitted countries differ systematically from those included, and what consequences are likely to occur in comparative research.
To address these issues we (1) describe major cross-national public opinion surveys, indicating the share of the European post-communist countries, (2) provide information about data availability from these surveys, and (3) discuss methodological issues relevant to comparative analyses in the social sciences in general, and especially in sociology and political science.

Democratic Values and Protest Behavior: Data Harmonization, Measurement Comparability, and Multi-Level Modeling in Cross-National Perspective (from 2014)

This article describes the research project Democratic Values and Protest Behavior: Data Harmonization, Measurement Comparability, and Multi-Level Modeling. This survey data harmonization project engages with the relationship between democracy and protest behavior in comparative, cross-national perspective by proposing a theoretical model that explains variation in political protest in light of individual-level characteristics, country-level determinants, and interactions between the two types of factors. Methodologically, the project requires data with information at both the individual and the country level that varies over time and across space. While the social sciences have a growing wealth of survey projects, the data are often not comparable. This project selects variables from existing international surveys for ex post harmonization to create an integrated dataset consisting of a large number of variables, with individuals nested in countries and time periods. Throughout this process, the focus is on three important and well-defined fields of methodology, namely data harmonization, measurement comparability, and multi-level modeling.

Harmonization of Cross-National Survey Projects on Political Behavior: Developing the Analytic Framework of Survey Data Recycling (from 2016)

This article describes challenges and solutions to ex post harmonization of survey data in the social sciences based on the big data project “Democratic Values and Protest Behavior: Data Harmonization, Measurement Comparability, and Multi-Level Modeling.” This project engages with the relationship between democracy and protest behavior in comparative perspective by proposing a theoretical model that explains variation in political protest through individual-level characteristics, country-level determinants, and interactions between the two. Testing it requires data with information at both the individual and country levels that vary across space and over time. The project’s team pooled information from 22 well-known international survey projects into a data set of 2.3 million respondents, covering a total of 142 countries and territories, and spanning almost 50 years, to construct common measures of political behavior, social attitudes, and demographics. The integrated data set is appended with country variables from nonsurvey sources. Mapping the methodological complexities this work raised and their solutions became the springboard for the analytic framework of Survey Data Recycling (SDR). SDR facilitates reprocessing information from extant cross-national projects in ways that minimize the “messiness” of data built into original surveys, expand the range of possible comparisons over time and across countries, and improve confidence in substantive results.
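To make the idea of ex post harmonization concrete, here is a minimal sketch of recoding differently coded source variables into one common target variable while keeping the original value for traceability. It is not the SDR project’s actual workflow or code. The project labels `SURVEY_A` and `SURVEY_B`, their response codings, and the target variable name `demonstrated` are all hypothetical.

```python
# Hypothetical source codings for a question on taking part in a demonstration.
# Each map translates a project's raw response codes into a common 0/1 target.
RECODE_MAPS = {
    "SURVEY_A": {1: 1, 2: 0},        # assumed coding: 1 = yes, 2 = no
    "SURVEY_B": {1: 1, 2: 0, 3: 0},  # assumed coding: 1 = have done, 2 = might do, 3 = would never do
}


def harmonize(record):
    """Return a harmonized record; unmapped or missing codes become None."""
    mapping = RECODE_MAPS[record["project"]]
    raw = record["raw_value"]
    return {
        "project": record["project"],
        "country": record["country"],
        "year": record["year"],
        "demonstrated": mapping.get(raw),  # common target variable
        "source_value": raw,               # kept so the recode can be audited
    }


# Made-up respondent records from two hypothetical source projects.
SOURCE_RECORDS = [
    {"project": "SURVEY_A", "country": "PL", "year": 2002, "raw_value": 1},
    {"project": "SURVEY_B", "country": "PL", "year": 2005, "raw_value": 2},
    {"project": "SURVEY_B", "country": "DE", "year": 2005, "raw_value": 9},  # unmapped code
]

HARMONIZED = [harmonize(r) for r in SOURCE_RECORDS]
```

Keeping `source_value` next to the harmonized value is one simple way to document how each recode was made, which matters when the source projects were never designed to be comparable.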

The rise of cross-national survey data harmonization in the social sciences: emergence of an interdisciplinary methodological field

Cross-national survey data harmonization combines surveys conducted in multiple countries and across many time periods into a single, coherent dataset. Methodologically, ex post survey data harmonization is especially complex because it combines projects that were not specifically designed to be comparable. We examine the institutional and intellectual history of nine large-scale ex post survey data harmonization (SDH) projects in the social sciences from the 1980s to the 2010s. An interdisciplinary methodological field of SDH has slowly emerged, facilitated in part by a partnership between academia and government and by the coordinated contributions of social scientists, survey methodologists, and computer scientists. While there has been a learning process, it has consisted of accumulated practicalities, without the coordination or institutional apparatus one would expect from a 30-year effort.