Regional Perspectives on Russia: Data from the Electronic Repository of Russian Historical Statistics
April 15, 2024

Gijs Kessler
Just like Soviet history, Russian history has been written predominantly from an “aggregate” perspective, that is, with the primary focus on the “national” picture and the development of the country as a whole. Whereas the “decolonization” discourse of recent years has challenged Russo-centric views and given a strong impetus to research on non-Russian cultures, societies and histories, the systematic study of regional diversity, including that of ethnically Russian areas, is still in its infancy. The lack of good and readily available data, particularly regional data, is a major factor here.

The Russia-Ukraine war, together with the crackdown on freedom of speech and research in Russia, has made the situation worse. Formerly freely accessible data has been taken offline, while gathering and disseminating new data on politically sensitive issues has become increasingly risky. A new online project called Esli byt’ tochnym (“To be exact”) has risen to the challenge with online datasets and analysis concerning critical issues in the social development of Russia and its regions. Providing open data to inform a public discussion of the country’s past, present and future is central to the mission of the project.

Looking at Russia’s past, data availability is an even larger problem. During the long Soviet period, few statistics were published. Historical research took its cue from preset ideological tenets, and data-driven research was rare. As a consequence, compared to many other countries, even present-day researchers simply have much less readily available historical data at their disposal. This has further contributed to the dominance of national, all-Russia perspectives. Particularly in project-based research, exploring regional diversity is an option often ruled out by constraints of time, money and resources. Due to the large number of regions in the Russian state – up to a hundred at some points in time – researching and data-mining to systematically trace the regional diversity of experience presents a formidable challenge.

The Electronic Repository of Russian Historical Statistics, an online resource developed by the International Institute of Social History (Amsterdam) and the New Economic School (Moscow), aims to lower this barrier. It makes available a basic grid of indicators on the social and economic development of Russia’s regions over the last two centuries. This basic grid allows researchers to trace regional diversity and compare their findings to the national picture and, combining it with data from elsewhere, to world trends. With over 4,000 indicators and more than half a million data entries, this is a large and rich dataset that covers many needs. It allows users to save often rather scarce resources on preliminary studies and instead channel those resources toward additional data-mining activities that more specific research agendas might require.
Figure 1: Homepage of the Electronic Repository of Russian Historical Statistics (ristat.org).

Five historical cross sections

Data in the Electronic Repository of Russian Historical Statistics span a period of over 200 years, covering five historical cross sections with intervals of roughly 50 years: 1795, 1858, 1897, 1959 and 2002. The choice of the exact years was determined by the availability of demographic data: 1897, 1959 and 2002 were census years, whereas for the pre-census years of 1795 and 1858 taxpayers’ censuses (revizii) are available.

It should come as no surprise that data coverage varies considerably across these five cross sections. Generally speaking, the degree of coverage and detail tends to increase over time, but this does not hold true for all topics and subtopics in the database. A full overview of data availability by cross section can be found here.

A source of great riches

Russian historical data represents a largely untapped resource. Due to the country’s long bureaucratic tradition and strong centralist rule, a solid statistical tradition emerged, rare for a country that had a developing economy well into the 20th century. Good-quality and surprisingly well-standardized data was gathered as early as the late 18th century, and both data-gathering and statistical procedures subsequently became considerably more professional. The heyday of Russian statistics was the late 19th and early 20th centuries, including the early years of Soviet rule. During these years, a wealth of statistics and statistical analysis was published, testifying to the great level of sophistication achieved in both data-gathering and data-processing. Of course, centralization also had its downside – though it increased data standardization, statistics were also an instrument for standardization and more effective rule of diverse populations.

The glory days of Russian statistics were brought to an abrupt end by Stalin’s modernization push in the 1930s, when the publication of statistics practically stopped until the dictator’s death in 1953. It resumed only hesitantly afterward before gaining importance again during the late Soviet decades. The statistics published during the Soviet period, however, represent merely the tip of the iceberg. Throughout the Soviet period, a gargantuan amount of statistical data was gathered to inform both policymaking and the planning and administration of the state-run economy. Not meant for publication, this data was primarily for internal use. It was meticulously stored in the archives, off-limits for practically all researchers until the very collapse of the Soviet Union. Over the last 30 years, historians have started unearthing the vast wealth of data stored in the archives. The Electronic Repository of Russian Historical Statistics owes its very existence to the insights and findings of these pioneering efforts.

Topics covered

The data made available through the Electronic Repository of Russian Historical Statistics cover seven basic topics: population, labor, agricultural output, industrial output, service sector output, capital and land.

Each of these topics contains several subtopics. Population, for example, covers much more than just population size. It also includes a breakdown by age, rural-urban residence, religion, social group, marital status and education, as well as basic demographic variables like the number of births, deaths and marriages. The data on labor includes a breakdown of the workforce by occupation, source of income, sector and labor relation. All output (agricultural, industrial, services) data includes a breakdown by sector; the data on capital covers both assets and investments; and the data on land covers both land use and land ownership.

Underneath this basic grid, the data offers a vast array of details, ranging from crops sown and harvested to cultural establishments, religious practices, vaccination programs, infrastructure objects, landscape elements, trade networks, fire brigades and natural disasters.

Regions of Russia

All data in the Electronic Repository of Russian Historical Statistics is presented at the regional level according to the administrative division for the cross section concerned. For the pre-Revolution years, the data is presented at the level of individual governorates (gubernii), while for the Soviet and post-Soviet periods it is at the level of individual provinces (oblasti). Though the coverage is not always complete, the aim has been to present a regional spread for all cross sections that is as wide as possible. When data is missing for a particular region, this is clearly indicated in the database.

The regional spread of the data in the Repository is determined by the following considerations. The principal geographic focus of the Repository is the territory of the Russian Federation within its 2002 borders, as 2002 was the last year for which census data was available when work on this project started. The census data for 2010 has since become available, but it lacks the level of detail of the 2002 census. From this initial choice follow the main selection criteria for the earlier cross sections. For 1959, the database covers the territory of the Russian Soviet Federative Socialist Republic (RSFSR) within the larger Soviet Union. The boundaries of the 1959 RSFSR are identical to those of the 2002 Russian Federation. For the pre-Revolution cross sections, when a fundamentally different administrative division was in place, we included regions that wholly or partially fell within the territory of the 2002 Russian Federation. Only for the 1897 cross section does the database include some indicators for all regions of the Empire, as these could be harvested from the sources with little additional effort.
Figure 2: The data comes with elaborate documentation.

Sources

One of the principal aims of the Repository is to make regional trends comparable, both to other regions and to the aggregate picture. This is why we have built the Repository from sources that allow for valid comparisons, that is, sources that cover regional trends using a unified methodology. National censuses are among the best examples, but plenty of surveys and statistical series exist that fit the same criteria. Some were published or stored in the archives as consistent datasets with a regional breakdown. Others were stored or published on a region-by-region basis but conform to a unified template. Good examples of the latter are the statistical appendixes to the annual governors’ reports, which serve as the principal source of the Repository for the 1858 cross section. Along the same lines, the 1795 industrial output data in the Repository was aggregated from factory-by-factory standardized forms (vedomosti) available from the archives.

The choice of sources for each main topic of the Repository was the outcome of a dedicated research effort that weighed alternatives against each other to determine the source that best fit the bill. In addition, the merits and demerits of the sources were critically assessed. All of this information is stored alongside the actual data in an elaborate set of documentation.
Figure 3: Accessing data in its “historical” classification.
Where possible, preference was given to published sources purely for reasons of expediency. If no appropriate published data could be found, archival sources were used instead. The mix of published and archival sources varies across cross sections. The 2002 and 1897 cross sections are based entirely on published data, the 1858 and 1959 cross sections almost entirely on archival data, and the 1795 cross section on a mix of published and archival records.

Comparability

Comparability of the data in the Repository between cross sections was made possible by coding the data according to established taxonomies, classifications and typologies. This simultaneously makes the data comparable to the experience of other regions, countries and continents. We call this a “modern classification,” as opposed to the “historical classification” of data, which refers to the categories of the time as contained in the source. Data can be accessed in both its “historical” and its “modern” classification. Both classifications are offered in the original Russian and in English translation.
Figure 4: Accessing data in its “modern” classification.
Examples of the modern classification used in the Repository are the NACE classification of economic activities adhered to by Eurostat, the classification of land use of the Food and Agriculture Organization of the United Nations (FAO), the KLEMS classification of capital assets, the Cambridge Primary-Secondary-Tertiary-International (PSTI) classification of occupations, and the taxonomy of the Global Collaboratory for the History of Labor Relations.
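To make the idea of coding data into a modern classification more concrete, the following is a minimal sketch in Python of how source-era categories might be mapped to modern ones and then summed by region. It does not reproduce the Repository's actual coding scheme: the crosswalk table, the category labels, the region name and the figures are all invented for illustration, and the three-way sector split is only loosely modeled on the primary-secondary-tertiary logic of a classification such as PSTI.

```python
from collections import defaultdict

# Hypothetical crosswalk from "historical" (source-era) categories to a
# "modern" sector code. All labels are invented for illustration and do
# not come from the Repository.
CROSSWALK = {
    "field cultivation": "primary",
    "fishing and hunting": "primary",
    "metalworking": "secondary",
    "textile manufacture": "secondary",
    "trade": "tertiary",
}


def recode(records, crosswalk):
    """Sum values by (region, year, modern category).

    Records whose historical label has no entry in the crosswalk are not
    guessed at; their labels are returned separately for manual review.
    """
    totals = defaultdict(int)
    unmapped = set()
    for region, year, historical, value in records:
        modern = crosswalk.get(historical)
        if modern is None:
            unmapped.add(historical)
            continue
        totals[(region, year, modern)] += value
    return dict(totals), unmapped


# Invented example records: (region, year, historical category, persons).
records = [
    ("Region A", 1897, "field cultivation", 700),
    ("Region A", 1897, "fishing and hunting", 50),
    ("Region A", 1897, "metalworking", 120),
    ("Region A", 1897, "domestic service", 30),
]

totals, unmapped = recode(records, CROSSWALK)
```

Keeping the historical label on every record and applying the crosswalk only at aggregation time mirrors the dual access described above: the same data can be read in the categories of the source or in the modern ones.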

An impression of the uses of the data in its modern classification is presented in the following set of maps, which plot the data on occupational structure for the last three cross sections of the Repository by region. These maps show the share of the workforce by region employed in the secondary sector of the economy (manufacturing & mining) as defined by the PSTI classification of occupations. Note that the 1897 data covers the entire Empire, with the core territory covered by the other cross sections indicated with a black line.
Source: Kessler, Gijs and Timur Valetov, “Occupational change and industrialization in Russia and the Soviet Union, 1897-2002,” in Osamu Saito and Leigh Shaw-Taylor (eds), Occupational structure, industrialization and economic growth in a comparative perspective (forthcoming)
Source: Leeuwen, Bas van, Robin Philips and Erik Buyst (eds), An economic history of regional industrialization, Routledge Explorations in Economic History (Oxon; New York, 2021)
These maps show not only the expansion of industrial employment throughout the 20th century, but also its spread from a small number of industrial hotspots in European Russia toward the East and North of the country, as well as its subsequent retreat to certain core areas.
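The indicator plotted in such maps, the share of a region's workforce employed in the secondary sector, is simple arithmetic once the data is available by sector. The sketch below assumes a hypothetical layout in which each region has a workforce count per sector; the region names and numbers are invented and are not taken from the Repository.

```python
def secondary_share(workforce_by_sector):
    """Return the secondary sector's share of the total workforce.

    Returns None when no workforce data is recorded, so that a missing
    region is not mistaken for a region with a share of zero.
    """
    total = sum(workforce_by_sector.values())
    if total == 0:
        return None
    return workforce_by_sector.get("secondary", 0) / total


# Invented workforce counts by sector for three illustrative regions.
regions = {
    "Region A": {"primary": 750, "secondary": 150, "tertiary": 100},
    "Region B": {"primary": 200, "secondary": 500, "tertiary": 300},
    "Region C": {},  # no data available for this region
}

shares = {name: secondary_share(counts) for name, counts in regions.items()}
```

Because the shares are computed per region and per cross section from the same sector definitions, they can be compared directly across regions and over time.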

Because of the universal PSTI system of occupational classification used in the Repository, the data can easily be combined with data for other parts of the world, as shown in this map by Van Leeuwen, Philips and Buyst on regional industrialization in Eurasia around 1900.

On the Eurasian landmass, two areas stand out in terms of the size of the industrial workforce – Moscow and Japan, which soon would lock horns with each other in the Russo-Japanese War of 1904-05.

Future use

These maps unlock only a tiny part of the potential of the Repository, based on the very first use cases. As more and more researchers find their way to this resource, further regional perspectives on Russia’s historical development and diversity will be opened. This can be the fruit of dedicated studies and larger research projects, but also of student assignments and “data stories” from online journalism. The Electronic Repository of Russian Historical Statistics has been developed with a broad array of users in mind and offers an intuitive and easily accessible interface, which requires no specialist knowledge.

“Moscow is not Russia,” as conventional wisdom puts it. The Electronic Repository of Russian Historical Statistics offers the data to show us what the differences are. What is more, with the current restrictions on travel to Russia and access to the archives, the Electronic Repository of Russian Historical Statistics makes accessible important data on the history of Russia that has become off-limits to researchers.
  • Gijs Kessler

    Senior Research Fellow at the International Institute of Social History, Amsterdam