3. Scrape the data using LDAScraper
To automate the fetching and processing of LDA data, you can use LDAScraper. Below are installation and usage instructions.
pip install git+https://github.com/Ru-Pro/lda_scraper.git
cd
Import the scraper and provide the API key obtained from the LDA website:
from lda_scraper import LDAScraper
api_key = "your_api_key_here"
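Hard-coding the key is fine for a quick test, but it is easy to commit it by accident. A minimal sketch of one alternative, assuming you export the key in an environment variable first; the variable name `LDA_API_KEY` and the helper `load_api_key` are hypothetical and not part of LDAScraper:

```python
import os


def load_api_key(env_var: str = "LDA_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is an assumption made for this sketch;
    LDAScraper itself only needs the key passed in as a string.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your LDA API key."
        )
    return key
```

With the variable set, `api_key = load_api_key()` can replace the hard-coded assignment.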
As an example, let's fetch all filings related to Russia. We select the filings endpoint and define the filter parameters, listing for each filter the values that should match relevant filings:
baseurl = 'https://lda.senate.gov/api/v1/filings/'
parameters = {
    'client_ppb_country': ['RU'],
    'affiliated_organization_country': ['RU'],
    'foreign_entity_country': ['RU'],
    'client_country': ['RU'],
    'filing_specific_lobbying_issues': ['russian',
                                        'russia',
                                        'kremlin',
                                        'putin']
}
scraper = LDAScraper(
    baseurl=baseurl,
    api_key=api_key,
    parameters=parameters
)
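How LDAScraper turns the parameters dictionary into requests is not spelled out here. One plausible reading is that each filter/value pair is queried separately and the results are combined, since putting every filter into a single request would narrow the results rather than widen them. The sketch below only illustrates that assumption by building the query URLs, without touching the network; `build_query_urls` is a hypothetical helper, not part of the library:

```python
from urllib.parse import urlencode


def build_query_urls(baseurl, parameters):
    """Build one URL per (filter, value) pair.

    Assumption for illustration: every pair is an independent query
    whose results would later be merged.
    """
    urls = []
    for name, values in parameters.items():
        for value in values:
            urls.append(f"{baseurl}?{urlencode({name: value})}")
    return urls


example_urls = build_query_urls(
    "https://lda.senate.gov/api/v1/filings/",
    {
        "client_country": ["RU"],
        "filing_specific_lobbying_issues": ["russia", "putin"],
    },
)
for url in example_urls:
    print(url)
```

Printing the URLs before a long scrape is a cheap way to check that the filter names and values are spelled the way you intended.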
Start the scrape:
scraper.scrape_all()
Parse all the JSONL files into a single DataFrame, filling in missing values:
scraper.parse_all()
scraper.disclosers_df.head()
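To make the parsing step more concrete, here is a rough standard-library sketch of what reading JSONL and filling missing values can look like. It is not LDAScraper's implementation; the helper `parse_jsonl` and the sample records are made up for illustration:

```python
import io
import json


def parse_jsonl(lines, fill_value=None):
    """Parse JSONL text into rows that all share the same columns.

    Keys missing from a record are filled with `fill_value`, so every
    row ends up with the union of all keys seen in the input.
    """
    records = [json.loads(line) for line in lines if line.strip()]

    # Collect column names in first-seen order.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    return [
        {col: record.get(col, fill_value) for col in columns}
        for record in records
    ]


# Two illustrative records; the second lacks "client_country".
sample = io.StringIO(
    '{"filing_uuid": "a1", "client_country": "RU"}\n'
    '{"filing_uuid": "b2"}\n'
)
rows = parse_jsonl(sample)
```

A list of uniform dictionaries like `rows` can be passed directly to `pandas.DataFrame(rows)` if you want a DataFrame.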
Save the parsed DataFrame to CSV, JSONL, or Parquet files in processed_data_folder:
scraper.save_csv()
scraper.save_json()
scraper.save_parquet()