Data Access

This site provides access to data from my published and forthcoming work. The data is ordered by topic (top of the page) and by paper (bottom of the page). Data users are kindly asked to cite the indicated reference papers.


[A] Occupation Codes

The occ1990dd occupation classification aggregates U.S. Census occupation codes to a balanced panel of occupations for the 1980, 1990, and 2000 Census, as well as the 2005-2008 ACS. The files below also make it possible to build an unbalanced panel of occ1990dd codes for the Census years 1950, 1960 and 1970.

Crosswalk Files

  • [A1] 1950 Census occ to occ1990dd.
  • [A2] 1960 Census occ to occ1990dd.
  • [A3] 1970 Census occ to occ1990dd.
  • [A4] 1980 Census occ to occ1990dd.
  • [A5] 1990 Census occ to occ1990dd.
  • [A6] 2000 Census occ to occ1990dd.
  • [A7] ACS occ to occ1990dd.
  • [A8] Aggregation of occ1990dd to occupation groups.

Additional Resources

  • [A9] Construction of occ1990dd occupation codes.

References

  • For [A1] to [A8]: David Autor and David Dorn. "The Growth of Low Skill Service Jobs and the Polarization of the U.S. Labor Market." American Economic Review, 103(5), 1553-1597, 2013.
  • For [A9]: David Dorn. "Essays on Inequality, Spatial Interaction, and the Demand for Skills." Dissertation University of St. Gallen no. 3613, September 2009.



[B] Occupational Tasks

The files below provide task data for occ1990dd occupations. The abstract, routine and manual task measures in file [B1] are based on data from the 1977 Dictionary of Occupational Titles, while the offshorability measure in file [B2] is based on task values from O*NET.

Data Files

  • [B1] Abstract, routine and manual task content of occ1990dd occupations.
  • [B2] Offshorability of occ1990dd occupations.

Reference for [B1] and [B2]

  • David Autor and David Dorn. "The Growth of Low Skill Service Jobs and the Polarization of the U.S. Labor Market." American Economic Review, 103(5), 1553-1597, 2013.



[C] Industry Codes

File [C1] provides a weighted crosswalk between NAICS 1997 6-digit industry codes and SIC 1987 4-digit industry codes. The variable weight indicates the share of a NAICS industry's 1997 employment that maps to a given SIC code. Data on employment by NAICS-SIC cell is provided by the U.S. Census Bureau, though some employment counts are reported only in brackets. The construction of the crosswalk file imputes employment within brackets based on establishment counts by NAICS-SIC cell and on the average number of employees per establishment by industry from the County Business Patterns data (see [F] below).

File [C2] aggregates SIC 1987 4-digit codes for manufacturing industries such that each resulting industry maps to one or several HS 6-digit product codes. File [C3] aggregates the consistent Census industry code ind1990 to a balanced panel of industries for the 1980, 1990 and 2000 Census and the 2006-2008 ACS.
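As an illustration of how a weighted crosswalk such as [C1] is typically applied, the following Python sketch converts NAICS-level employment into SIC-level employment. The column names (naics6, sic4, weight, emp) and all industry codes and values are made up for the example and are assumptions, not the actual layout of the file; only the idea that weight is the share of a NAICS industry's employment mapping to a SIC code comes from the description above.

```python
import pandas as pd

# Hypothetical employment by NAICS 6-digit industry (codes and values are invented).
naics_emp = pd.DataFrame({
    "naics6": [100001, 100002],
    "emp": [1000.0, 400.0],
})

# Hypothetical weighted crosswalk: one row per NAICS-SIC cell.
# weight = share of the NAICS industry's employment that maps to the SIC code.
crosswalk = pd.DataFrame({
    "naics6": [100001, 100002, 100002],
    "sic4": [2001, 2002, 2001],
    "weight": [1.0, 0.6, 0.4],
})

# Sanity check (assumption): shares add up to one within each NAICS industry.
weight_sums = crosswalk.groupby("naics6")["weight"].sum()
assert ((weight_sums - 1.0).abs() < 1e-9).all()

# One-to-many merge: each NAICS row is repeated once per SIC code it maps to.
merged = naics_emp.merge(crosswalk, on="naics6", how="left")

# Allocate NAICS employment to SIC codes in proportion to the weight.
merged["emp_sic"] = merged["emp"] * merged["weight"]

# Sum the allocated employment by SIC industry.
sic_emp = merged.groupby("sic4")["emp_sic"].sum()
```

Because the weights sum to one within each NAICS industry in this example, total employment is preserved by the conversion.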

Crosswalk Files

  • [C1] NAICS97 6-digit to SIC87 4-digit.
  • [C2] SIC87 4-digit to SIC87dd 4-digit.
  • [C3] Census ind1990 to ind1990dd.

Additional Resources

  • [C4] List of SIC87 and SIC87dd and corresponding ind1990dd manufacturing industry codes.
  • [C5] Comparison of 397 SIC87dd industry panel of Autor, Dorn and Hanson (2013) with compressed 387 SIC87dd industry panel of Acemoglu, Autor, Dorn, Hanson and Price (2014).

References

  • For [C1] to [C4]: David Autor, David Dorn and Gordon Hanson. "The China Syndrome: Local Labor Market Effects of Import Competition in the United States." American Economic Review, 103(6), 2121-2168, 2013.
  • For [C5]: Daron Acemoglu, David Autor, David Dorn, Gordon Hanson and Brendan Price. "Return of the Solow Paradox? IT, Productivity and Employment in U.S. Manufacturing." American Economic Review, P&P, 104(5), 394-399, 2014.



[D] Industry Trade Exposure

We concord 6-digit HS product-level trade data from U.N. Comtrade to 4-digit sic87dd manufacturing industries. The trade data covers the years 1991 to 2007, and reports both imports to the U.S. and imports to a set of eight other wealthy countries (Germany, Switzerland, Spain, Denmark, Finland, Japan, Australia, New Zealand). Trade flows are reported separately for different exporters (China, other low-wage countries, Mexico and CAFTA, USA, Canada, and rest of the world).

Data File

  • [D1] Trade flows by sic87dd industry, importer, exporter, and year.

Crosswalk File

  • [D2] HS 6-digit to sic87dd 4-digit.

Reference for [D1] and [D2]

  • David Autor, David Dorn and Gordon Hanson. "The China Syndrome: Local Labor Market Effects of Import Competition in the United States." American Economic Review, 103(6), 2121-2168, 2013.



[E] Local Labor Market Geography

Commuting Zones (CZs) provide a local labor market geography that covers the entire land area of the United States. CZs are clusters of U.S. counties that are characterized by strong within-cluster and weak between-cluster commuting ties.

The crosswalk files [E1] to [E5] below provide a probabilistic matching of sub-state geographic units in U.S. Census Public Use Files to CZs. In each file, the variable afactor indicates the fraction of a SEA/County Group/PUMA that maps to a given CZ. To allocate observations from Census microdata to CZs, merge the Census file with the corresponding CZ crosswalk file on the geographic unit, using a many-to-many merge. Then multiply the labor supply weight or person weight from the Census by afactor.

The file [E6] provides a mapping from counties (according to 1990 county definitions) to CZs, and the files [E7] and [E8] map each CZ to the state and Census division that comprise the largest share of its population.
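The allocation of microdata observations to CZs can be sketched in Python as follows. This is a minimal illustration rather than the code used in the reference papers: the column names puma, perwt and czone and all values are assumptions made for the example, and only afactor is a variable name taken from the crosswalk description.

```python
import pandas as pd

# Hypothetical Census microdata: one row per person with a PUMA code and a person weight.
persons = pd.DataFrame({
    "puma": [101, 101, 202],
    "perwt": [100.0, 50.0, 80.0],
})

# Hypothetical crosswalk in the spirit of [E4]/[E5]: a PUMA may map to several CZs,
# and afactor is the fraction of the PUMA that is allocated to each CZ.
crosswalk = pd.DataFrame({
    "puma": [101, 101, 202],
    "czone": [1, 2, 2],
    "afactor": [0.25, 0.75, 1.0],
})

# Many-to-many merge: every person row is repeated once per CZ its PUMA maps to.
merged = persons.merge(crosswalk, on="puma", how="inner")

# Multiply the person weight by afactor to obtain the CZ-specific weight.
merged["cz_weight"] = merged["perwt"] * merged["afactor"]

# Weighted population by CZ.
cz_population = merged.groupby("czone")["cz_weight"].sum()
```

Since afactor sums to one within each PUMA in this example, the adjusted weights add up to the original total of person weights. In Stata, a merge that forms all pairwise combinations within the geographic unit is usually done with joinby.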

Crosswalk Files

  • [E1] 1950 Census State Economic Areas to 1990 Commuting Zones.
  • [E2] 1970 Census County Groups to 1990 Commuting Zones.
  • [E3] 1980 Census County Groups to 1990 Commuting Zones.
  • [E4] 1990 Census Public Use Microdata Areas to 1990 Commuting Zones.
  • [E5] 2000 Census and ACS Public Use Microdata Areas to 1990 Commuting Zones.
  • [E6] 1990 Counties to 1990 Commuting Zones.
  • [E7] 1990 Commuting Zones to States.
  • [E8] 1990 Commuting Zones to Census Divisions.

Additional Resources

  • [E9] Change of county codes.
  • [E10] Construction of geography crosswalks.

References

  • For [E1] to [E9]: David Autor and David Dorn. "The Growth of Low Skill Service Jobs and the Polarization of the U.S. Labor Market." American Economic Review, 103(5), 1553-1597, 2013.
  • For [E10]: David Dorn. "Essays on Inequality, Spatial Interaction, and the Demand for Skills." Dissertation University of St. Gallen no. 3613, September 2009.



[F] County Industry Structure

The publicly available County Business Patterns (CBP) data provides employment counts by county and industry. The employment numbers are often reported only in brackets. The data cleaner files [F1] to [F3] implement a fixed-point algorithm to estimate employment numbers within the indicated brackets. They also impute employment that is reported only at aggregate industry levels to 4-digit SIC or 6-digit NAICS industries. The files [F4] to [F6] correct errors in county codes and aggregate industry employment data to the Commuting Zone level. Note that considerable computing power is required to run files [F1] to [F3].
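The aggregation step performed by files [F4] to [F6] can be pictured with the following Python sketch. It is not the code in those files: the column names (countyfips, sic4, emp, czone), the recode table and all values are hypothetical, and the sketch assumes that county-level employment has already been imputed and that a county-to-CZ mapping like [E6] is available.

```python
import pandas as pd

# Hypothetical cleaned county-by-industry employment (values are invented).
county_emp = pd.DataFrame({
    "countyfips": [1001, 1003, 99999],
    "sic4": [2011, 2011, 2011],
    "emp": [120.0, 80.0, 50.0],
})

# Hypothetical correction table: map an erroneous or outdated county code to a valid one.
county_recode = {99999: 1003}

# Hypothetical county-to-CZ mapping in the spirit of [E6]: each county belongs to one CZ.
county_to_cz = pd.DataFrame({
    "countyfips": [1001, 1003],
    "czone": [10, 20],
})

# Step 1: fix county codes before merging.
county_emp["countyfips"] = county_emp["countyfips"].replace(county_recode)

# Step 2: attach the CZ to every county row (many counties map to one CZ).
merged = county_emp.merge(
    county_to_cz, on="countyfips", how="left", validate="many_to_one"
)

# Every county should now have a CZ; unmatched codes would signal remaining errors.
assert merged["czone"].notna().all()

# Step 3: sum employment by CZ and industry.
cz_emp = merged.groupby(["czone", "sic4"], as_index=False)["emp"].sum()
```

Checking for unmatched county codes after the merge is a simple way to detect codes that still need correction.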

Data Cleaner Files

  • [F1] 1980 CBP county-level employment.
  • [F2] 1990 CBP county-level employment.
  • [F3] 2000 CBP county-level employment.
  • [F4] 1980 CBP aggregation to CZ-level employment.
  • [F5] 1990 CBP aggregation to CZ-level employment.
  • [F6] 2000 CBP aggregation to CZ-level employment.

Reference for [F1] to [F6]

  • David Autor, David Dorn and Gordon Hanson. "The China Syndrome: Local Labor Market Effects of Import Competition in the United States." American Economic Review, 103(6), 2121-2168, 2013.



[P] Data by Paper

The file packages below compile Stata data files, do files, log files and graphs, as well as tables and figures in Excel format. These packages are also available from the AER website.

File Packages (ZIP)

  • [P1] David Autor and David Dorn. "The Growth of Low Skill Service Jobs and the Polarization of the U.S. Labor Market." American Economic Review, 103(5), 1553-1597, 2013.
  • [P2] David Autor, David Dorn and Gordon Hanson. "The China Syndrome: Local Labor Market Effects of Import Competition in the United States." American Economic Review, 103(6), 2121-2168, 2013.
  • [P3] Daron Acemoglu, David Autor, David Dorn, Gordon Hanson and Brendan Price. "Return of the Solow Paradox? IT, Productivity and Employment in U.S. Manufacturing." American Economic Review, P&P, 104(5), 394-399, 2014.