The Society for Personality and Social Psychology (SPSP) kindly posted data on the diversity of its membership online. The raw data is provided in tabular form, as mutually exclusive counts of the members who fall into each cell created by crossing three factors: career stage, sex/gender, and race/ethnicity. Although analyses like χ² tests can be run on the table directly, there are more options once the data is expanded into a case-wise form in which each member occupies one row. In R, this is an easy task. After restructuring the data, I was curious to extract some descriptives! Download the restructured data and syntax.
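As a minimal sketch of that restructuring step (not the script linked above), the snippet below expands a table of cell counts into one row per member. The counts are toy values invented for illustration, not the SPSP figures, and the column name `n` is an assumption; `stage`, `gender`, and `ethnicity` follow the variable names of the case-wise dataset.

```r
# Toy cell counts, NOT the SPSP figures: one row per cell of the crossed table.
counts <- data.frame(
  stage     = c("Graduate Student", "Graduate Student", "Full Member", "Full Member"),
  gender    = c("Woman", "Man", "Woman", "Man"),
  ethnicity = c("Asian", "White", "White", "Latino"),
  n         = c(3, 5, 4, 2),   # hypothetical count column
  stringsAsFactors = FALSE
)

# Repeat each row index n times and drop the count column,
# so that every member occupies exactly one row.
casewise <- counts[rep(seq_len(nrow(counts)), times = counts$n),
                   c("stage", "gender", "ethnicity")]
rownames(casewise) <- NULL

# Sanity check: cross-tabulating the expanded data recovers the cell counts.
roundtrip <- as.data.frame(table(casewise), responseName = "n")
roundtrip[roundtrip$n > 0, ]
```

Cells with a count of zero simply contribute no rows, so they need no special handling.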
First, some basic pie charts of career stage (Figure 1), ethnicity (Figure 2), and gender (Figure 3).
So, among this population, what are the odds of being a member of any given racial or ethnic group? Do those odds differ across career stages or by sex*? I decided the best way to answer this question would be to calculate the odds of belonging to a particular ethnic group, overall and by career stage and sex. As you can see in the R script I used for the analyses, I obtained these odds by fitting an intercept-only logistic regression with membership in a given ethnic group as the outcome. The intercept in such a model is the log odds of membership, which can be exponentiated to give the odds. To estimate the odds within a given group, dummy variables were added to the logistic regression model, coded so that the target group was the reference category; the intercept then gives the estimated log odds for that reference category. Finally, moderation by career stage and sex was tested with a likelihood ratio test comparing the model that included those predictors against the model that did not. The following table summarizes these results:
Ethnicity | All S/P Psychologists | Undergraduates | Graduate Students | Early Career | Full Member | Retired | Career Stage | Men | Women | Sex |
---|---|---|---|---|---|---|---|---|---|---|
Asian | 0.13 | 0.12 | 0.179 | 0.133 | 0.099 | 0.008 | χ² = 65.17, p < 0.001 | 0.152 | 0.148 | χ² = 0.08, p = 0.774 |
Black | 0.028 | 0.049 | 0.032 | 0.03 | 0.022 | 0.008 | χ² = 11.8, p = 0.019 | 0.027 | 0.042 | χ² = 5.72, p = 0.017 |
Latino | 0.027 | 0.068 | 0.03 | 0.015 | 0.022 | 0 | χ² = 28.4, p < 0.001 | 0.03 | 0.037 | χ² = 1.4, p = 0.237 |
Middle Eastern | 0.004 | 0.005 | 0.006 | 0.008 | 0.002 | 0 | χ² = 9.36, p = 0.053 | 0.003 | 0.008 | χ² = 5.08, p = 0.024 |
Native American | 0.006 | 0.005 | 0.009 | 0.008 | 0.004 | 0.008 | χ² = 6.58, p = 0.16 | 0.005 | 0.012 | χ² = 5.69, p = 0.017 |
White | 1.299 | 0.981 | 1.031 | 1.276 | 1.634 | 2.263 | χ² = 77.12, p < 0.001 | 2.074 | 1.966 | χ² = 0.61, p = 0.436 |
Other | 0.298 | 0.358 | 0.313 | 0.285 | 0.28 | 0.292 | χ² = 5.32, p = 0.256 | 0.109 | 0.089 | χ² = 3.36, p = 0.067 |
Any Ethnic Minority | 0.77 | 1.02 | 0.97 | 0.784 | 0.612 | 0.442 | χ² = 77.12, p < 0.001 | 0.482 | 0.509 | χ² = 0.61, p = 0.436 |
* I decided to calculate the odds only by sex (i.e., men and women) instead of using the full gender information we were given, because transgender members are so rare that the odds of any given ethnicity are extremely small.
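The modelling steps described before the table can be sketched as follows. This is an illustration under stated assumptions rather than the analysis script itself: the data are simulated with arbitrary proportions (not the SPSP data), and the names `dat`, `is_asian`, `m0`, and `m1` are hypothetical.

```r
# Simulated case-wise data, NOT the SPSP data; proportions are arbitrary.
set.seed(1)
n <- 2000
dat <- data.frame(
  stage     = sample(c("Undergraduate", "Graduate Student", "Full Member"),
                     n, replace = TRUE),
  ethnicity = sample(c("Asian", "White", "Other"),
                     n, replace = TRUE, prob = c(0.15, 0.60, 0.25))
)
dat$is_asian <- as.integer(dat$ethnicity == "Asian")

# 1. Overall odds: the intercept of an intercept-only logistic regression
#    is the log odds of membership, so exponentiating it gives the odds.
m0 <- glm(is_asian ~ 1, family = binomial, data = dat)
odds_all <- exp(coef(m0)[["(Intercept)"]])

# This matches p / (1 - p) computed from the raw proportion.
p <- mean(dat$is_asian)
all.equal(odds_all, p / (1 - p), tolerance = 1e-6)

# 2. Odds within one group: make that group the reference level of the
#    dummy-coded predictor, so the intercept describes that group alone.
dat$stage <- relevel(factor(dat$stage), ref = "Graduate Student")
m1 <- glm(is_asian ~ stage, family = binomial, data = dat)
odds_grad <- exp(coef(m1)[["(Intercept)"]])

# 3. Moderation: likelihood ratio test of the model with the predictor
#    against the intercept-only model.
lrt <- anova(m0, m1, test = "Chisq")
lrt
```

Repeating step 2 with a different `ref` gives the odds for each group in turn. The likelihood ratio test in step 3 is the same regardless of which level is the reference.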
Below are the restructured data, the original tabular data in Excel, and the R script that converted the table to a case-wise dataset.
The case-wise dataset has three variables: stage, representing career stage; gender, representing the sex or gender of the member; and ethnicity, representing the ethnicity of the member. All categorical levels are stored as semantically meaningful strings.