Data Engineer signal: hash map + dedupe in a data quality monitor context. This is a ProdMatch-owned data engineer drill, framed as a March 2026 PayPal Data Trust simulation, not a copied platform question.
Company context: PayPal · Data Trust
Freshness: March 2026
Product surface: data quality monitor
ProdMatch interview simulation based on product-team patterns; not a claim of a real company question.
For a data quality monitor, collapse duplicate checks by external ID. Keep only the newest event for each ID, and return one entry per ID in first-seen order.
Example
Input: [(a,1),(b,1),(a,3)]
Output: [(a,3),(b,1)]
Try framing your own approach first. The 30 seconds you think before peeking is where learning happens.
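One possible approach, as a minimal sketch rather than a reference solution: a single pass with a hash map keyed by external ID. The function name `dedupe_latest` is hypothetical, and the sketch assumes each event is an `(external_id, timestamp)` pair where "newest" means the largest timestamp. It relies on Python dicts preserving insertion order (guaranteed since Python 3.7), so updating an existing key leaves its first-seen position untouched.

```python
from typing import Hashable, Iterable, List, Tuple

Event = Tuple[Hashable, int]  # assumed shape: (external_id, timestamp)


def dedupe_latest(events: Iterable[Event]) -> List[Event]:
    """Keep the newest event per external ID, ordered by first appearance."""
    latest = {}  # external_id -> newest timestamp seen so far
    for ext_id, ts in events:
        # The first sighting inserts the key and fixes its position.
        # Later sightings only overwrite the value, and only if newer,
        # so out-of-order arrivals cannot replace a newer event.
        if ext_id not in latest or ts > latest[ext_id]:
            latest[ext_id] = ts
    return list(latest.items())


print(dedupe_latest([("a", 1), ("b", 1), ("a", 3)]))
```

This runs in O(n) time and O(k) extra space for k distinct IDs. If "newest" were instead meant as "last to arrive", the timestamp comparison could be dropped and the value overwritten unconditionally.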