Data Analysis · EDA

Consumer Complaints EDA (2011–2024)

Analysis of 6.9M CFPB complaints: credit-report errors account for 32% of cases, volumes spiked after 2020, and FL, TX and CA emerge as geographic hotspots, visualized with heat-maps and dashboards.

Tools: R · ggplot2 · usmap · tidyverse · lubridate
6.9M records · 5 tools · Year: 2024

Business Problem

Rising CFPB complaint volumes create reputational, compliance, and revenue risks for banks and fintechs. Without a clear view of spikes, firms struggle to prioritise fixes, allocate resources, and stay ahead of regulators.

Objective | Business Value
Early-warning radar | Detect product failures before they trigger fines or class-action suits.
Peer benchmarking | Compare complaint rates by product & state to expose outliers.
Root-cause targeting | Trace surges to specific workflows (e.g., credit-report fixes) and redesign them.
Resource allocation | Focus CX budgets on hotspots (FL, TX, CA) and high-risk categories.
Regulatory readiness | Give risk committees KPIs that pre-empt CFPB investigations.

Figure 1. Volume of CFPB complaints over time, 2011-2024.

Data & Methodology

We analysed 6.9 million CFPB complaints received between 2011-01-01 and 2024-03-31, downloaded via the public API and processed in R. The five-step workflow below covers ingestion, cleaning, date enrichment, geo-joining, and visualization.

  1. Ingest & type-casting — Load raw CSV and convert dates / factors with readr::read_csv.
  2. Data cleaning — Drop duplicates and empty narratives (0.6%) using dplyr.
  3. Date enrichment — Add Year and Month fields via lubridate for time-series aggregation.
  4. Geo-join — Map two-letter states to FIPS codes (usmap, sf) for choropleths.
  5. Visual EDA — Create heat-maps, bar charts and line plots with ggplot2 & leaflet.
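Steps 1 to 3 above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the inline sample and the column names (`complaint_id`, `date_received`, `product`, `state`, `narrative`) are hypothetical stand-ins for the real CFPB export.

```r
library(readr)
library(dplyr)
library(lubridate)

# Hypothetical sample standing in for the raw CSV; column names are assumptions.
csv_text <- "complaint_id,date_received,product,state,narrative
1,2019-03-14,Credit reporting,FL,Wrong balance reported
1,2019-03-14,Credit reporting,FL,Wrong balance reported
2,2021-07-02,Debt collection,TX,
3,2021-11-20,Credit reporting,CA,Account is not mine
4,2022-01-05,Mortgage,FL,Escrow error"

# Step 1: ingest with explicit column types (dates and factors).
raw <- read_csv(
  I(csv_text),
  col_types = cols(
    complaint_id  = col_integer(),
    date_received = col_date(format = "%Y-%m-%d"),
    product       = col_factor(),
    state         = col_factor(),
    narrative     = col_character()
  )
)

# Step 2: drop exact duplicate rows and rows with a missing or empty narrative.
clean <- raw %>%
  distinct() %>%
  filter(!is.na(narrative), narrative != "")

# Step 3: add year and month fields for time-series aggregation.
clean <- clean %>%
  mutate(
    year  = year(date_received),
    month = month(date_received)
  )

# Monthly complaint counts, ready for a line plot.
monthly <- clean %>%
  count(year, month, name = "complaints")

print(monthly)
```

On the sample, the duplicate of complaint 1 and the narrative-free complaint 2 are dropped, leaving three rows. Step 4 would then attach FIPS codes to the `state` column before plotting; that step is omitted here.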

Limitation: the dataset captures only consumers who filed formal complaints; silent dissatisfaction is not recorded.

Key Insights

01 Credit-report errors are the single largest complaint category, accounting for 32% of all CFPB complaints filed since 2011.
02 Complaints doubled post-2020 vs. the pre-COVID baseline, driven by pandemic-era financial hardship.
03 FL, TX & CA generate 27% of total complaint volume, reflecting both population size and financial-sector concentration.
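The three headline figures above are simple shares and ratios. The sketch below shows one way such metrics could be computed with dplyr. The toy data is hypothetical and does not reproduce the reported 32%, 2x, or 27%. The source does not define the baseline, so the code assumes it is the mean yearly volume before 2020, compared with the mean from 2020 onward.

```r
library(dplyr)

# Hypothetical toy data; real column names and values will differ.
complaints <- data.frame(
  year    = c(2018, 2018, 2019, 2019, 2021, 2021, 2021, 2021,
              2022, 2022, 2022, 2022),
  product = c("Credit reporting", "Mortgage", "Credit reporting", "Debt collection",
              "Credit reporting", "Credit reporting", "Debt collection", "Mortgage",
              "Credit reporting", "Credit reporting", "Debt collection", "Credit card"),
  state   = c("FL", "NY", "TX", "OH", "FL", "CA", "TX", "NY",
              "CA", "FL", "IL", "TX")
)

# Insight 01: share of complaints per product category, largest first.
product_share <- complaints %>%
  count(product, name = "n") %>%
  mutate(share = n / sum(n)) %>%
  arrange(desc(share))

# Insight 02: mean yearly volume from 2020 onward relative to the earlier
# baseline (assumed definition: mean yearly volume before 2020).
yearly       <- complaints %>% count(year, name = "n")
baseline     <- mean(yearly$n[yearly$year < 2020])
post_2020    <- mean(yearly$n[yearly$year >= 2020])
growth_ratio <- post_2020 / baseline

# Insight 03: combined share of complaints filed from FL, TX and CA.
hotspot_share <- mean(complaints$state %in% c("FL", "TX", "CA"))

print(product_share)
print(growth_ratio)
print(hotspot_share)
```

Raw shares like these reflect state population as much as complaint propensity, which is why the third insight attributes the hotspot figure partly to population size.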

Conclusions

The CFPB dataset highlights persistent issues in credit reporting and debt collection. Geographic analysis shows disproportionate complaint volumes in populous, financially complex states. The pronounced uptick post-2020 underscores how economic shocks translate into consumer-finance friction.

Credits

Analysis by Juan Camilo Sierra Escobar, December 2024.
Data sourced from the Consumer Financial Protection Bureau (CFPB).

Repository

View full code on GitHub
