
The Digital Methods School
Empowering researchers to unlock closed societies through open data
The Digital Methods School advances the study of Russia and other semi-closed societies by equipping scholars and experts with accessible digital research methods. We provide curated databases, innovative tools, and hands-on training to turn digital traces into reliable empirical evidence—without requiring specialized technical skills. By democratizing access to cutting-edge methodologies, we foster rigorous, data-driven research that illuminates societies otherwise difficult to reach.
DIGITAL RESEARCH TRAINING ON RUSSIA
Designed for research in the social sciences and humanities, the program equips a broad community of scholars, policy analysts, and business experts with practical tools to collect, analyze, and interpret data on Russia — helping them generate evidence-based insights into a semi-closed society. The trainings are structured to be accessible to participants with no prior experience in digital research, while also offering advanced applications for experienced researchers.
Participants will gain hands-on experience with a wide range of methods, including text data analysis, administrative data research, integration of AI and large language models into the research process, and survey data design and evaluation under the specific constraints of Russia. These skills can be applied to pressing research problems, such as analyzing political, social, and economic attitudes under authoritarian conditions. Applications range from tracing the internalization of state narratives within populations to examining Russia’s strategic decision-making. More broadly, they open new pathways for uncovering empirical evidence in contexts where conventional fieldwork is restricted or impossible.
ORGANIZATION
All training sessions are conducted online and are free of charge. We hold one training block per semester, with each topic announced alongside a call for applications. Each training includes 15-25 hours of lectures and dedicated workshops. Lectures within each block are open to an unlimited number of participants, but require applications to be submitted by the stated deadline. Workshops, by contrast, have limited enrollment and focus on personalized work—either guiding students through practical tasks or supporting them in advancing their own research under the supervision of a lecturer.
Trainings are organized in a hybrid model:
Lectures distributed weekly
Workshops in intensive blocks
We strongly encourage the creation of knowledge commons and networking among participants. To foster this, we combine different platforms for communication and resource sharing: YouTube for tutorials and selected lectures, Discord for peer exchange, and Zoom for personalized interaction.
Curriculum
Introduction by Ivan Grek
Tuesday March 3 18:00 CET / 12:00 PM EST
Lectures: Every Wednesday March 4 - April 15 18:00 CET / 12:00 PM EST
Workshops: Every Thursday March 5 - April 16 18:00 CET / 12:00 PM EST
Sign up to join us.
March 3: Creative Digital Thinking, Introduction
by Ivan Grek, the director of the Russia Program
Creative Digital Thinking: Using Technology in Social Science and Humanities Research. We will explore how digital tools can transform research in the social sciences and humanities. This is an introduction to the field of digital research and an overview of practical solutions and analytical capabilities developed through the Russia Program’s studies.
Ivan Grek, PhD, is Director of the Russia Program at The George Washington University, where he focuses on the application of digital tools and innovative methodologies for scholarly and policy research on Russia. His research examines ideology, corporate structural power, and geopolitical transformation in the emerging world order, combining political theory with data-driven analysis.
March 4 and 5: Seeing the Invisible: Studying Russia through Alternative and Indirect Data Sources
by Alexander Keysut, Senior Researcher, Cedar
Since 2022, traditional methods of social science research in Russia — including fieldwork and surveys — have become increasingly difficult to implement. At the same time, a wide range of administrative, digital, and textual data remains available, allowing researchers to study social, political, and economic processes indirectly.
We will survey the ecosystem of data sources that remain available for studying contemporary Russia. The lecture will cover administrative registries, court and procurement records, business databases, digital platforms and social media, and the still-extensive official statistics. We will discuss what kinds of research questions each source enables, where and how these data can be accessed, and what biases and risks they involve. The goal is to provide a practical map of empirical evidence that researchers can still rely on under conditions of restricted access.
During the workshop on March 5 we will focus on working with Russian court decisions as a large-scale textual dataset. Participants will examine the use of large language models (LLMs) to extract entities and structured information from legal texts, with attention to practical workflows and validation issues.
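As an illustration of the validation issues mentioned above, here is a minimal sketch of one possible check: confirming that every entity an LLM reports actually appears in the source text. The function name `validate_entities`, the JSON layout, and the sample decision are hypothetical and are not taken from the workshop materials; no real LLM is called, and the model output is hard-coded.

```python
import json


def validate_entities(source_text: str, llm_output: str) -> dict:
    """Split extracted entities into those found verbatim in the source and those not."""
    records = json.loads(llm_output)
    haystack = source_text.casefold()
    report = {"supported": [], "unsupported": []}
    for record in records:
        # An entity counts as supported only if its text occurs in the source.
        found = record["text"].casefold() in haystack
        report["supported" if found else "unsupported"].append(record)
    return report


# Hypothetical court-decision fragment (invented for illustration).
decision = (
    "Истец ООО «Ромашка» обратился в Арбитражный суд города Москвы "
    "с иском к Иванову И. И."
)

# Stand-in for a model response; the third entity is deliberately not in the text.
llm_output = json.dumps(
    [
        {"type": "ORG", "text": "ООО «Ромашка»"},
        {"type": "COURT", "text": "Арбитражный суд города Москвы"},
        {"type": "PERSON", "text": "Петров П. П."},
    ],
    ensure_ascii=False,
)

report = validate_entities(decision, llm_output)
for record in report["unsupported"]:
    print("Not found in source:", record["type"], record["text"])
```

A verbatim check of this kind catches invented entities, but it is deliberately strict: a model that returns a name in the nominative case when the text uses an inflected form ("Иванов" versus "Иванову") would be flagged too, so flagged items are candidates for manual review rather than confirmed errors.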
March 11 and 12: Working with Russian Data Using AI: A Crash Course
by Damir Malikov, AI Lead at Kronika
A practical introduction to large language models for researchers working with Russian-context data. What LLMs can and can't do, where to find relevant datasets, how to collect and process data at scale — from no-code tools to programmatic pipelines. The session ends with a hands-on workshop: entity extraction from a Russian text corpus. No prior AI experience required.
March 18 and 19: Prompt Engineering: Unlocking the Full Power of AI
by Alexey Sidorenko, the director of Teplitsa. Technologies for Social Good
Large language models (LLMs) are increasingly shaping research, policy analysis, and media work. Their effectiveness depends on how interaction with them is structured. Prompt engineering provides a methodological framework for producing reliable, analytically useful outputs across both GUI environments and API integrations.
This two-day lecture equips participants with the ability to:
Understand how LLMs process instructions, tokens, and context windows
Design structured prompts for research, analytical, and policy-oriented tasks
Apply various techniques to improve depth and precision
Use role definition, constraints, and stepwise reasoning to enhance output quality
Identify bias, hallucinations, and uncertainty in model responses
Evaluate and validate AI-generated content for academic and professional use
The session frames prompting as a disciplined approach to human–AI interaction, with emphasis on reproducibility, epistemic reliability, and analytical accountability in politically sensitive and information-dense contexts.
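To make the idea of a structured, reproducible prompt concrete, here is a minimal sketch of a prompt template that keeps role definition, constraints, and stepwise instructions as separate fields. The class `PromptSpec`, its field names, and the example wording are hypothetical illustrations, not the lecturer's method; the code only builds a string and does not call any model.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the parts of a structured prompt."""

    role: str
    task: str
    constraints: list[str] = field(default_factory=list)
    steps: list[str] = field(default_factory=list)

    def render(self, document: str) -> str:
        """Assemble the prompt text in a fixed order so runs are comparable."""
        parts = [f"Role: {self.role}", f"Task: {self.task}"]
        if self.constraints:
            lines = "\n".join(f"- {c}" for c in self.constraints)
            parts.append("Constraints:\n" + lines)
        if self.steps:
            lines = "\n".join(f"{i}. {s}" for i, s in enumerate(self.steps, 1))
            parts.append("Work through these steps in order:\n" + lines)
        # Delimiters separate instructions from the material being analysed.
        parts.append("Document:\n<<<\n" + document + "\n>>>")
        return "\n\n".join(parts)


spec = PromptSpec(
    role="You are a research assistant analysing Russian-language news texts.",
    task="Summarise the main claim of the document in two sentences.",
    constraints=[
        "Use only information stated in the document.",
        "If the document does not support a claim, say so explicitly.",
    ],
    steps=[
        "Identify the actors mentioned.",
        "Identify the main claim.",
        "Write the summary.",
    ],
)

prompt = spec.render("Текст документа.")
print(prompt)
```

Keeping the specification as data rather than free text means the same prompt can be stored alongside results, reused across GUI and API settings, and varied one field at a time when comparing outputs.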
Alexey Sidorenko is director of Teplitsa: Technologies for Social Good, a capacity-building initiative supporting pro-democracy activists, civil society organizations, and independent media in authoritarian environments. For over a decade, he has helped journalists, activists, and NGOs build digital resilience through practical tools, security practices, and strategic thinking. He focuses on AI's impact on civil society and media: delivering AI training for journalists, running AI hackathons, advising newsrooms on responsible AI deployment, and overseeing the launch of factbutcher.com, a GPT-powered fact-checking chatbot. Alexey holds a Ph.D. in Geography (Moscow State University, 2010) and several AI certifications, including an IBM AI Product Manager certification and Google's Responsible AI for Developers certification.
March 25 and 26: From the Kremlin to Pandas
by Dr Bartłomiej Gajos, historian of Russia, visiting Fellow at Harvard University
A short, practical course for students. This session will show you how a historian uses data science to work with large Russian-language source collections. It focuses on the real challenges of inflected Slavic languages (especially Russian), demonstrating how you can turn messy historical materials into clean datasets ready to be analyzed.
The following questions will be addressed:
Why Russian and other Slavic languages are tricky for quantitative text analysis (many word forms, spelling and style variation),
How to make key preparation choices: normalisation, stemming vs lemmatisation, detecting duplicates and near-duplicates, and why these decisions change your results,
How to treat official sources critically (provenance, context, and what “the data” really represents),
How to go from PDF to dataset: extracting text from born-digital and scanned PDFs, cleaning Cyrillic text, adding metadata, and exporting to CSV / DataFrame.
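Some of the preparation choices listed above can be sketched with the Python standard library alone. The example below is a minimal illustration, not the course's own pipeline: it normalises Cyrillic text, flags near-duplicates by word-shingle overlap (Jaccard similarity), and exports the result as CSV. The function names, the 0.8 threshold, the metadata fields, and the sample documents are all assumptions made for illustration. PDF text extraction and stemming or lemmatisation are left out because they depend on third-party libraries.

```python
import csv
import io
import re
import unicodedata
from itertools import combinations


def normalise(text: str) -> str:
    """Apply one possible set of normalisation choices to Russian text."""
    text = unicodedata.normalize("NFC", text)   # compose й and ё from base + accent
    text = text.replace("\u00ad", "")           # drop soft hyphens left by extraction
    text = re.sub(r"-\s*\n\s*", "", text)       # rejoin words split across lines
    text = text.casefold().replace("ё", "е")    # a choice: treat ё and е as one letter
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace


def shingles(text: str, n: int = 3) -> set:
    """Return the set of n-word sequences in the text."""
    words = re.findall(r"\w+", text)
    return {tuple(words[i:i + n]) for i in range(max(len(words) - n + 1, 1))}


def jaccard(a: set, b: set) -> float:
    """Share of shingles the two sets have in common."""
    return len(a & b) / len(a | b)


# Invented sample documents with hypothetical metadata.
docs = [
    {"id": "d1", "year": 2022,
     "raw": "Указ Президента Российской Федерации о мерах по обеспе-\nчению безопасности"},
    {"id": "d2", "year": 2022,
     "raw": "УКАЗ  Президента Российской Федерации о мерах по обеспечению безопасности."},
    {"id": "d3", "year": 2023,
     "raw": "Постановление Правительства о бюджете на следующий год"},
]
for doc in docs:
    doc["text"] = normalise(doc["raw"])

# Pairs whose shingle overlap meets the (assumed) threshold.
pairs = [
    (a["id"], b["id"])
    for a, b in combinations(docs, 2)
    if jaccard(shingles(a["text"]), shingles(b["text"])) >= 0.8
]

# Export cleaned text plus metadata; the "raw" field is ignored.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "year", "text"], extrasaction="ignore")
writer.writeheader()
writer.writerows(docs)
csv_text = buffer.getvalue()

print(pairs)
print(csv_text)
```

Each step encodes a decision that changes results: rejoining hyphenated line breaks also merges genuinely hyphenated words that happen to fall at a line end, and folding ё into е removes a distinction some analyses need. Recording such choices in code makes them easy to revisit.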
