
Data Diff Checker

CSV regression testing tool for API workflows

The Problem

When working with e-commerce data feeds, you often need to compare API responses between production and development environments. A change that looks harmless in code can silently break field mappings, strip HTML content, or introduce subtle data inconsistencies.

Manual comparison doesn't scale when you're dealing with thousands of products and dozens of fields per product.

The Solution

Data Diff Checker is a Python CLI tool that automates regression testing for CSV-based API responses. It fetches data from both environments concurrently, normalizes the output, and generates detailed diff reports showing exactly what changed.
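To make the normalize-and-diff step concrete, here is a minimal sketch using only the standard library. It is not the tool's actual code: the function names `load_rows` and `diff_rows`, the `id` key column, and the whitespace-only normalization are all assumptions chosen for illustration.

```python
import csv
import io


def load_rows(csv_text, key="id"):
    """Parse CSV text into {key value: normalized row}. `key` is a hypothetical column name."""
    rows = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Normalize: trim whitespace so cosmetic padding is not reported as a change.
        # Surplus cells (stored under the None key) are skipped; missing cells become "".
        clean = {k.strip(): (v or "").strip() for k, v in row.items() if k is not None}
        rows[clean[key]] = clean
    return rows


def diff_rows(prod, dev):
    """Compare dev against prod and return (added, removed, changed)."""
    added = sorted(dev.keys() - prod.keys())
    removed = sorted(prod.keys() - dev.keys())
    changed = {}
    for k in sorted(prod.keys() & dev.keys()):
        fields = sorted(set(prod[k]) | set(dev[k]))
        # Record (prod value, dev value) for every field that differs.
        deltas = {
            f: (prod[k].get(f), dev[k].get(f))
            for f in fields
            if prod[k].get(f) != dev[k].get(f)
        }
        if deltas:
            changed[k] = deltas
    return added, removed, changed


# Made-up sample data: row 1 differs only by trailing whitespace, row 2 by price.
PROD = "id,title,price\n1,Mug,9.99\n2,Lamp,24.00\n"
DEV = "id,title,price\n1,Mug ,9.99\n2,Lamp,25.00\n3,Desk,120.00\n"

added, removed, changed = diff_rows(load_rows(PROD), load_rows(DEV))
print(f"+{len(added)} added, -{len(removed)} removed, ~{len(changed)} changed")
# prints "+1 added, -0 removed, ~1 changed"
```

Keying rows by an identifier rather than by position means a reordered feed is not reported as a wall of changes. Only genuinely added, removed, or modified rows show up.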

Key Features

Technical Details

Language: Python 3.11+
Async: aiohttp + asyncio
Output: CSV diff reports
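As a rough illustration of how asyncio and aiohttp could fit together for concurrent fetching, the sketch below requests each prod/dev pair in parallel and caps the number of tests in flight. The names `make_fetch`, `fetch_pair`, `run_tests`, and `max_concurrency` are hypothetical, and the real tool's scheduling may differ. The fetcher is injected, so the demo runs with a stand-in and no network access. `make_fetch` assumes an `aiohttp.ClientSession` created by the caller.

```python
import asyncio


def make_fetch(session):
    """Build a fetcher around an aiohttp.ClientSession supplied by the caller (assumption)."""
    async def fetch(url):
        async with session.get(url) as resp:
            resp.raise_for_status()
            return await resp.text()
    return fetch


async def fetch_pair(fetch, prod_url, dev_url):
    """Request the same feed from both environments at the same time."""
    prod_body, dev_body = await asyncio.gather(fetch(prod_url), fetch(dev_url))
    return prod_body, dev_body


async def run_tests(fetch, url_pairs, max_concurrency=5):
    """Run all pairs concurrently, with at most max_concurrency pairs in flight."""
    sem = asyncio.Semaphore(max_concurrency)

    async def one(pair):
        async with sem:
            return await fetch_pair(fetch, *pair)

    # gather preserves input order, so results line up with url_pairs.
    return await asyncio.gather(*(one(p) for p in url_pairs))


async def fake_fetch(url):
    """Stand-in fetcher for the demo; returns a body derived from the URL."""
    await asyncio.sleep(0)
    return "body:" + url


results = asyncio.run(run_tests(fake_fetch, [("p1", "d1"), ("p2", "d2")]))
print(results)
```

The semaphore keeps a large test suite from opening hundreds of connections at once, while each pair still completes in roughly the time of its slower request.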

Sample Output

┌─ Data Diff Checker ─ Elapsed: 01:23 ────────────────────┐
│ Fetches: [████████████░░░░░░░░░░░░░] 48/200 (24.0%)     │
│ Diffs:   [██████░░░░░░░░░░░░░░░░░░░] 20/100 (20.0%)     │
├─ Recent Activity ───────────────────────────────────────┤
│ 14:32:15 [Test 47] Starting (prod first)...             │
│ 14:32:16 [Test 45] PROD done (status=200)               │
│ 14:32:17 [Test 44] +0 added, -0 removed, ~3 changed     │
│ 14:32:17 [Test 43] No differences                       │
└─────────────────────────────────────────────────────────┘

Why I Built It

I needed a reliable way to verify that changes to data feed configurations wouldn't break existing functionality. After manually comparing CSV exports one too many times, I built this tool to automate the process and catch regressions before they hit production.