The dataset consists of transcripts of official public events involving Vladimir Putin, published on the official website Kremlin.ru. It includes addresses to the Federal Assembly, meetings, speeches and addresses, press conferences, interviews, articles, and other materials. Documents are split into paragraphs and sentences to support detailed analysis of the president's rhetoric in 2024.
The unit of observation is a sentence. Each sentence record includes a link to the speech on the official website, the date of the event, and an indicator of who speaks the sentence: the president or one of his interlocutors.
The dataset is available in Parquet format and as an archived CSV file (encoding: "UTF-8", delimiter: ";"). It covers 2024 and contains 76,073 observations (sentences) from 505 speeches, described by 8 attributes.
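
As an illustration of the CSV layout described above, the following is a minimal sketch of reading semicolon-delimited, UTF-8 sentence records with Python's standard csv module. The column names (url, date, is_president, sentence) and the sample rows are hypothetical stand-ins, since the actual attribute names are not listed here; adjust them to match the real file headers.

```python
import csv
import io


def read_sentences(stream):
    """Parse semicolon-delimited rows into a list of dicts, one per sentence."""
    return list(csv.DictReader(stream, delimiter=";"))


# Hypothetical in-memory sample standing in for the extracted CSV file.
# For a real file, pass open(path, encoding="utf-8", newline="") instead.
sample = io.StringIO(
    "url;date;is_president;sentence\n"
    "http://example.org/a;2024-01-10;1;First sentence.\n"
    "http://example.org/a;2024-01-10;0;Second sentence.\n"
    "http://example.org/b;2024-02-05;1;Third sentence.\n"
)

rows = read_sentences(sample)

# Each row is one sentence; distinct links identify distinct speeches.
speeches = {row["url"] for row in rows}

# Keep only sentences attributed to the president (assumed flag value "1").
president_rows = [row for row in rows if row["is_president"] == "1"]
```

Because the unit of observation is a sentence, counting rows gives the number of sentences, while counting distinct links gives the number of speeches.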