civictechdc/eavs_clc


Election Administration and Voting Survey (EAVS) Data

This repository contains code to download and process U.S. Election Administration and Voting Survey (EAVS) datasets to support research and analysis by the Campaign Legal Center's Voting Rights team.

Developed and maintained by volunteers from Civic Tech DC.

Set up development environment

This project uses uv as an environment manager.

To create and sync your Python environment locally, run:

uv sync

Tip

This project uses Just as a task runner. The justfile contains helpful recipes for common commands. Run

just

to print out a list of recipes and some short documentation.

Project Organization

├── LICENSE            <- License for this project
├── Makefile           <- Makefile with convenience commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data/
│   ├── interim/       <- Intermediate data that has been transformed.
│   ├── processed/     <- The final, canonical datasets ready for analysis.
│   └── raw/           <- The original, immutable data dump.
│
├── notebooks/         <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── pyproject.toml     <- Project configuration file with package metadata for
│                         eavs and configuration for developer tools
│
└── eavs/              <- Source code for use in this project.
    │
    ├── assets/        <- Development assets
    │
    ├── __init__.py    <- Makes eavs a Python module
    │
    ├── config.py      <- Store useful variables and configuration
    │
    └── ...            <- ...

EAVS dataset

The Election Administration and Voting Survey data is collected every two years, after each national election. It is a large spreadsheet with roughly 400 columns and 6,000 rows. Each column name combines a letter, a number, and a letter: A1a, A1b, ... C9a, and so on. The Data Codebook maps each column label to a description of the data in that column. To analyze and manipulate the dataset, we currently use pandas with calamine for fast Excel I/O.
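The column naming scheme can be sketched in code. This is a minimal illustration, assuming every survey column follows the letter-number-letter pattern exactly. Real files may contain labels that do not match (identifier columns, for instance), so non-matching labels return `None`. The names `parse_label`, `ColumnLabel`, and `group_by_section` are hypothetical and are not part of the `eavs` package.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: one section letter, a question number, one sub-item letter (e.g. "A1a").
LABEL_PATTERN = re.compile(r"^([A-Za-z])(\d+)([A-Za-z])$")


class ColumnLabel(NamedTuple):
    section: str   # survey section, e.g. "A"
    question: int  # question number within the section, e.g. 1
    item: str      # sub-item letter, e.g. "a"


def parse_label(label: str) -> Optional[ColumnLabel]:
    """Split a label such as "A1a" into its parts; return None if it does not match."""
    match = LABEL_PATTERN.match(label.strip())
    if match is None:
        return None
    section, question, item = match.groups()
    return ColumnLabel(section.upper(), int(question), item.lower())


def group_by_section(labels):
    """Group matching labels by section letter, skipping labels that do not parse."""
    grouped = {}
    for label in labels:
        parsed = parse_label(label)
        if parsed is not None:
            grouped.setdefault(parsed.section, []).append(label)
    return grouped
```

Grouping labels by section this way can make it easier to look up a block of related columns in the Data Codebook before renaming them.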

Working with the data

Data download

To download the data from the EAC Website, run:

uv run -m eavs.download

This downloads the raw data into data/raw/{year}/{version}/. It also verifies the data file contents against a SHA256 checksum.
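The checksum step can be illustrated with the standard library. This is a sketch of the general technique, not the code in `eavs.download`. The function names `sha256_of_file` and `verify_checksum` are hypothetical, and the expected digest is assumed to be a hex string.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex digest, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A mismatch usually means the download was truncated or the upstream file changed, so failing loudly is safer than continuing with the file.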

Data cleaning

Run:

uv run -m eavs.clean

This writes cleaned data with human-readable column names to data/cleaned/. The best file to work with is usually data/cleaned/timeseries.parquet.

Notebooks

Our Jupyter notebooks are for exploratory data analysis and dashboard prototyping. Any finalized features should be converted into Python scripts for reproducible builds. Running the notebooks requires jupyterlab (a dev dependency), as well as the relevant data in the data/raw directory. Reading a notebook should give you a good idea of which EAVS data files it requires.

  • 1.0-exploratory-data-analysis is the starting point for any exploratory data analysis work with the EAVS data. It shows you how to read, manipulate, and output the dataset.
