1focus break

#1focus break software#

Having caused massive data incidents with two-line SQL “hotfixes”, I’ve come to believe (Daniel Kahneman rolls his eyes) that we significantly overestimate the share of externally caused data issues and dramatically underestimate how often we (people) break things for ourselves and others. When a number looks off, it could be us pausing ads, a delay in the data ingestion pipeline, or a real problem, but we won't know until someone spends a couple of hours digging into it.

Data breaks for reasons outside of our control:

- Vendor providing financial data shipped us a dataset omitting three markets.
- Event was duplicated in the streaming pipeline, causing a fanout in the warehouse.
- Airflow scheduler errors out: a task never ran but shows as completed.
- An online machine learning model that powers search is no longer online.

Someone in our organization, maybe even on the same team, maybe even you, made a breaking change:

- Software engineer removes the promo_code field from the user_signup event because “this context is now provided by another microservice.” Result: Marketing’s reporting is broken. Fixing it takes two weeks since it requires new instrumentation. The same scenario applies to changes to transactional tables replicated from OLTP databases into the warehouse for analytics.
- Analytics engineer changes the definition of tax_charge in a dbt model per CFO’s request. Result: Sales team over-reports revenue for three months. Apparently, they relied on the data synced from dbt into Salesforce to track revenue.
- Analytics engineer renames a field in a dbt model “for consistency.”
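Self-inflicted breaks such as a removed or renamed field can often be caught before they ship by comparing a table's or event's columns before and after the change. The following is a minimal sketch of that idea, not a description of any particular tool: `SchemaDiff`, `diff_schemas`, and the example column names and types other than promo_code and user_signup are hypothetical, and schemas are assumed to be available as plain `{column: type}` dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaDiff:
    """Columns that differ between two versions of a table or event."""
    removed: list[str] = field(default_factory=list)
    added: list[str] = field(default_factory=list)
    type_changed: list[str] = field(default_factory=list)

    @property
    def is_breaking(self) -> bool:
        # Removing a column or changing its type can break downstream
        # consumers; adding a column is treated as non-breaking here.
        return bool(self.removed or self.type_changed)


def diff_schemas(before: dict[str, str], after: dict[str, str]) -> SchemaDiff:
    """Compare two {column_name: type} mappings."""
    return SchemaDiff(
        removed=sorted(before.keys() - after.keys()),
        added=sorted(after.keys() - before.keys()),
        type_changed=sorted(
            name
            for name in before.keys() & after.keys()
            if before[name] != after[name]
        ),
    )


# Hypothetical schema of the user_signup event before and after the change.
signup_before = {"user_id": "string", "promo_code": "string", "created_at": "timestamp"}
signup_after = {"user_id": "string", "created_at": "timestamp"}

signup_diff = diff_schemas(signup_before, signup_after)
if signup_diff.is_breaking:
    print(
        f"Breaking change: removed={signup_diff.removed}, "
        f"type_changed={signup_diff.type_changed}"
    )
```

A rename shows up as one removed column plus one added column, so it is flagged as breaking. A changed definition that keeps the same column name and type, like the tax_charge case, would pass this check; catching it would require comparing the values themselves.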