Implementing soft deletes is a lesson every developer learns early in their career. The fact that Atlassian did not implement them in their cloud is mind-boggling.
So many opportunities missed to avoid this! Look at one ID and ensure it is what you expect it to be. Run the script in dry-run mode. Run the script for 1 customer. Probably more!
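A minimal sketch of what "dry run by default, one customer first" could look like. The names here (`delete_accounts`, `delete_fn`, `limit`) are made up for illustration and are not from Atlassian's script:

```python
from typing import Callable, Iterable, List, Optional


def delete_accounts(
    account_ids: Iterable[str],
    delete_fn: Callable[[str], None],
    dry_run: bool = True,
    limit: Optional[int] = 1,
) -> List[str]:
    """Return the IDs that were (or would be) deleted.

    The defaults are deliberately conservative: dry run, one account.
    The caller has to opt in to anything more destructive.
    """
    selected = list(account_ids)
    if limit is not None:
        selected = selected[:limit]
    for account_id in selected:
        if dry_run:
            # Only report what would happen; nothing is deleted.
            print(f"[dry-run] would delete {account_id}")
        else:
            delete_fn(account_id)
    return selected
```

Calling `delete_accounts(ids, real_delete)` with no extra arguments only prints the first ID, so a human gets a chance to eyeball it before passing `dry_run=False` and raising `limit`.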
This was addressed in the write-up (it’s very long, so missing it is easy). They ran the script against 30 accounts first to verify it worked, and it did, because the list of 30 IDs they tested against came from a different source than the other 750ish. It’s a shitty mistake to make but I’m certain I’ve made similar ones.
One of the favorite tricks I've ever seen is how Twilio uses human-readable prefixes[0] on their various identifiers - you will never mistake a device (HSxxxxxxx) for an account (ACxxxxxxxx). It's prevented us (Twilio customer) from making similar mistakes in the past.
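A rough sketch of how a prefix check can guard a destructive call. The `HS` and `AC` prefixes are the ones mentioned above; `require_prefix` and `delete_device` are hypothetical helpers, not part of any Twilio SDK:

```python
def require_prefix(identifier: str, expected_prefix: str) -> str:
    """Raise if the identifier does not carry the expected type prefix."""
    if not identifier.startswith(expected_prefix):
        raise ValueError(
            f"expected an ID starting with {expected_prefix!r}, got {identifier!r}"
        )
    return identifier


def delete_device(device_id: str) -> str:
    # Refuse to run if someone hands us an account ID by mistake.
    require_prefix(device_id, "HS")
    # The real deletion would go here; this sketch only reports it.
    return f"deleted device {device_id}"
```

With this in place, `delete_device("AC123")` fails loudly instead of quietly deleting whatever happens to have that ID.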
I like the idea of human-readable identifiers. But it generally feels like this class of error could be prevented with more type safety in the API and data model? Like deleteDevice(123) and deleteAccount(123), rather than delete(123). This is how REST is designed: the type of resource is already baked into the URL.
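One way to sketch that idea is to wrap raw integers in distinct ID types, so a device ID and an account ID with the same number are not interchangeable. `DeviceId`, `AccountId`, and the `/devices/` path below are illustrative assumptions, not any real API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceId:
    value: int


@dataclass(frozen=True)
class AccountId:
    value: int


def delete_device(device_id: DeviceId) -> str:
    # A static type checker would flag a wrong argument before runtime;
    # the isinstance check also catches it when no checker is in use.
    if not isinstance(device_id, DeviceId):
        raise TypeError(f"expected DeviceId, got {type(device_id).__name__}")
    # The resource type is baked into the URL, as with REST.
    return f"DELETE /devices/{device_id.value}"
```

Here `delete_device(AccountId(123))` raises a `TypeError`, even though the underlying number is a perfectly valid device ID.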
It's puzzling to me that we, as software developers, spend so much effort trying to automate such one-off deletion tasks, only for the automation to inevitably go wrong and result in data loss.
After GDPR almost all companies need to have something like a “universal delete”. There are safer ways to deal with data retention policies but I can understand why such a script exists.
For large systems, adding soft delete would be more of a PR move than actual work. In some projects I know, implementing soft delete is as hard as writing the project from scratch. Having scripts and manual restoration for particular records is the way those businesses communicate with customers.
You can do a soft delete by setting a flag (an expire date) and having apps ignore records with an expire date. A scheduled job then deletes all items whose expire date is earlier than the current time.
Alternatively, move the deleted data to a temporary location and then delete the temporary location after a short period of time.
Or better, combine both patterns, so that expired rows get moved to a temporary location before being hard deleted some time later.
GDPR says you have to delete data when requested. As far as I know you have 30 days to acknowledge the request and up to 60 days to action the request. It’d be completely reasonable to do a soft delete for 7-14 days before doing a hard delete to prevent this kind of error.
That's exactly what we do: a soft delete followed by a scheduled hard delete 7 days later. The customer is also notified that they won't be able to restore their account after that. Soft deletes are usually automatic and well-tested (for example, a subscription expired) but sometimes there are requests by management or customers to delete accounts manually and this approach really helped avoid disastrous situations like the one at Atlassian because you have a whole week to realize you deleted the wrong accounts (customers will most likely complain much sooner).
Yes, GDPR has a grace period of 30 days or so; it's never been a problem in practice.
Yes. As I understand it, GDPR does not care if customer data belongs to a business or an individual. And even if it did, business data will likely include PII for employees, which would need to be deleted.
GDPR does give you a grace period, so you can soft delete immediately and then hard delete after some period shorter than the GDPR deadline. However, actually implementing such a system can be rather difficult and potentially expensive.
There's no reason you can't have both. Use soft deletes for everything except for a formal GDPR right to be forgotten request (or any other compliance situation).
Soft deletes are just the first step. After some period of time you can automatically purge or manually purge. Which is what they’ve supposedly committed to doing.
> except for a formal GDPR right to be forgotten request (or any other compliance situation)
That "any other compliance situation" includes things you probably want to have a soft delete for, like deleting/closing an account. Customers accidentally deleting their accounts and then wanting them restored happens more frequently than one would hope.
Great case study of a monumental fuck up!