Also, sometimes we (developers) like to use wacky data for testing purposes. For example, I like to put Batman as a dummy user, and my QA likes to upload cat pictures when testing uploads/images.
We do it so it's obvious it's test data, and also we're too lazy to think of more "real" data.
Just saying, some users expect real(ish) data for testing. I had a client who was totally not happy when he saw Batman and Superman in the test data.
There was a bank that wasn’t happy with “Rich Bastard” being used as dummy data but not being replaced in the mail merge, resulting in a couple thousand of their wealthiest customers getting a mailing with the salutation “Dear Rich Bastard,”
I learned a long time ago to be very careful with mock, dummy, or test data.... because some people will just push anything to prod, take screenshots during your demo and paste it into the official documentation... you name it.
I was giving a demo on how to set up multiple computers in a federated setup using Active Directory, ADFS, etc... I had about 5 VMs named things like Hank, Peggy, Bobby, Boomhauer, Bill, and a test user HHill, 123 Rainy Street, Arlen, TX -- someone screenshotted and took notes during the demo and now that's in some formal training material somewhere. Thankfully, it's all internal.
When I am doing dev work and I need an available port, just any port, I use 666 -- because it's never used by anything and also DOOM. I gave a sprint demo and I used 660 instead of 666 to demo that the customer can specify the port number on screen X. Someone put that in the internal and also the customer-facing documentation... so now my company's product is set up on 660 by default, even though it's completely user-configurable. Thank God I didn't demo with 666...
I've never really understood developers' apparent need to add cutesy stuff into their work product's test data, variable names, easter eggs and so on. Adding this stuff is all downside risk with no technical benefit that you can explain in a written postmortem that will be read by your boss's boss's boss.
I mean, I get the motivation: You're working on a boring, dry, SeriousBusiness project, and have a creative itch that needs to be scratched. We all have a nonzero desire for a little joy and irreverence at work. But, man, scratch that itch with hobby projects, not stuff that's going out into the public! Or start a "wear a funny shirt day" at work or something like that. I know this is unpopular and makes me look like Debbie Downer, but our projects already have enough technical risks without deliberately adding more.
There's a saying: "Don't post anything online you wouldn't want grandma to see." The developer equivalent is "Don't use test data you wouldn't want the client [or boss] to see." This also applies to variable names, function names, and comments in code.
For a project that involved creating fake companies and user records, I purposely chose to use characters from Star Trek, Star Wars, and the Simpsons for each of the different companies. They're whimsical, non-offensive, and as an added bonus, if I see Homer Simpson listed alongside James T. Kirk, I instantly know there's a data integrity problem.
That last bit is the main reason why I use odd or otherwise out of place test data[0]. Test data should never leak into production. Ideally there should be no means of that happening.
[0] Recent example: tissue sample, species: dog, tissue type: bone. Valid combination, just not present anywhere in prod.
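A rough sketch of how that "Homer next to Kirk" check could be automated. The lookup table, the company names, and the function name are all made up for illustration; the idea is only that each test company draws its users from a single fictional universe, so a mix means records have crossed over.

```python
# Hypothetical fixture: which fictional universe each test user belongs to.
UNIVERSE_OF_USER = {
    "Homer Simpson": "simpsons",
    "Marge Simpson": "simpsons",
    "James T. Kirk": "star_trek",
    "Spock": "star_trek",
    "Leia Organa": "star_wars",
}


def mixed_universe_companies(records):
    """Return the companies whose users come from more than one universe.

    `records` is an iterable of (company, user_name) pairs.
    Users missing from the lookup table count as their own "unknown" universe.
    """
    seen = {}
    for company, user in records:
        universe = UNIVERSE_OF_USER.get(user, "unknown")
        seen.setdefault(company, set()).add(universe)
    return sorted(c for c, universes in seen.items() if len(universes) > 1)


# Example data (assumed): one company has picked up a user from another universe.
records = [
    ("Acme", "Homer Simpson"),
    ("Acme", "James T. Kirk"),
    ("Globex", "Spock"),
]
suspect = mixed_universe_companies(records)
```

Here `suspect` ends up containing only "Acme", the one company where two universes got mixed.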
Well, the problem is, in almost all the examples here so far, said stuff was not meant to go out into the public. If your customers end up seeing your product's test data and---heavens above!---variable names, there is an organizational issue that needs to be addressed, cutesy stuff or no cutesy stuff.
Also, isn't the point of QA testing just to throw any and all data at your system? Would you rather have a system that's tested against the eventuality that someone abuses UTF-8 in a textbox or a full SeriousBusiness system with zero whimsy and cutesy stuff? Someone's whimsical cutesy stuff is someone else's street address.
I think you just put a finger on why I absolutely loathe SeriousBusiness Banking Software: it was designed, implemented, and tested in such a vacuum that even normal users end up putting a toe out of line and breaking the assumptions of the spec. You have to be extremely average, down to your name, to peacefully coexist with it.
If dummy data ever poses a "technical risk" to your projects, I might argue you're using the term wrong.
Variable names are different, and I'll give you that, but creating humorous dummy data in lower environments shouldn't ever be an issue. Injecting a little fun legitimately helps overcome despair, and the harder and more difficult your project/company is, the more it needs a dose of lightheartedness.
No matter what the scrum boards that reduce us to story points say, we're all human beings. When everything is very high stakes, you're in a perpetual state of fight or flight. It's literally physiologically bad for you. Blowing off steam helps.
As a test of our new Sev1 alerting system, I created a phony alert "The hordes of Mordor are descending upon our data center".
I read the comment to mean that we have enough technical risk that we don't need more general risk. This stuff adds risk. As I have heard: "do you want to read your joke in a courtroom for a non technical audience?" - in some fields more so than others, but there is always a risk that something will go horribly wrong, your system will be involved and your code shows up either directly or as a side effect of discovery.
I can't remember the details, but I've heard a story multiple times about a fake-sounding name being used in testing -- I think US military payroll? -- and causing problems when a real person had that name. Can anyone here remember this?
In any case, "batman" is just about plausible enough that it could be real. I tend to use names like "Mr. Testy Testalicious" which (a) contain the string "test", and (b) are so wildly absurd that I'm confident nobody will ever collide with it.
Caterina Fake, co-founder of Flickr, famously had issues with IT systems:
Tim: There’re so many places we could start, but in the process of doing homework for this, I found mentioned, and I wanted to do a fact check on this, of you having plane tickets automatically cancelled, and other issues related to your last name. Is that accurate? Did those things actually happen?
Caterina Fake: This has happened to me many times, in fact. And I discovered that it was actually the systems at KLM and Northwest that would throw my ticket out, my last name being “Fake.” And I have missed flights and have spent way too many hours with customer service trying to fix this problem. Here’s another thing too, is that I was unable for the first two years of Facebook to make an account there also. And probably all of my relatives.
lol so much data gets converted into strings at some point when passed around. Definitely encountered systems where you have to check for both null and "null"
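For anyone who hasn't run into it, the double check looks roughly like this. It's a minimal sketch with a made-up helper name, not taken from any particular system. It will also misclassify a field whose real value is the text "null" (say, a surname), so it only makes sense where that can't be legitimate.

```python
def is_missing(value):
    """True for None and for the literal text "null" left behind by stringification."""
    if value is None:
        return True
    # Some upstream layer may have turned a real null into the four-letter string.
    return isinstance(value, str) and value.strip().lower() == "null"
```

An empty string is deliberately not treated as missing here; whether it should be depends on the field.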
This seems like a good spot for the link to @patio11's "Falsehoods Programmers Believe About Names"
So, as a public service, I’m going to list assumptions your systems probably
make about names. All of these assumptions are wrong. Try to make less of
them next time you write a system which touches names.
I get what he's doing, but some of these are not actionable:
> People’s names are all mapped in Unicode code points.
So... what? What do I do with this? My program has to use something to represent text, and since I fail to be a large multinational consortium, I can't invent my own character set and expect it to work.
Also:
> Confound your cultural relativism! People in my society, at least, agree on one commonly accepted standard for names.
This is pretty much true in countries with naming laws, yes.
> People have names.
People in a database will have certain records which will not be NULL. Whether you call one of those records a 'name' outside the context of that database really isn't my concern.
Unicode is not the only character set (or the best one); this is a falsehood programmers believe about character sets (I wrote a list of this too but I do not remember if I had published it). However, that is not the most severe issue, due to the other things mentioned, such as if people do not have names (or if there are multiple ways to enter them, or if people sometimes change their name, or have the same name as other people, etc).
> Unicode is not the only character set (or the best one); this is a falsehood programmers believe about character sets
Unicode is the best if I want to communicate with other people. I lived through the 1990s; you won't convince me that playing "guess the encoding" with dozens of subtly-incompatible standards (and non-standards, and almost-standards) was a good time, or that having to override a web browser's helpful guess was fun.
Try to understand these issues or rather how they could affect your business processes and software implementations down the line rather than dismissing them on a technical level.
You can store the Unicode representation just as you normally would. But what you don't do is assume that your Unicode representation is the only representation of the actual name.
More concretely, there are names that have multiple equally valid ways of writing them. You can probably expect that usually the same one is used, but you should absolutely not require this when building your business processes.
Even more concretely, as an example there are transliteration or simplification / shortening rules that allow people with otherwise strange or long names to buy an airline ticket. The actual, real name may not be any of the ones you have in your system. This matters e.g. when searching for someone or in customer support.
As for people without names (or unknown names), you should probably recognize that the handling might differ by country. E.g. records with "John Doe" in the US might have to be handled differently: analogous to "NULL != NULL" in SQL, John Doe != John Doe. Or maybe even "Jane Doe == John Doe" in some cases. See also "FNU LNU" (First Name Unknown, Last Name Unknown) used in the US.
And although I don't have knowledge about all the countries in the world, it may very well be that this leads to situations where the "no name" has to be handled specially or at least understood to be a special case, completely differently from other cases.
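To make the "John Doe != John Doe" idea concrete, here is a small sketch. The placeholder list and function names are assumptions for illustration only; which names count as placeholders will depend on the country and the system.

```python
# Assumed placeholder names; a real list would be country- and system-specific.
PLACEHOLDER_NAMES = {"john doe", "jane doe", "fnu lnu"}


def normalize(name):
    """Case-fold and collapse whitespace so trivial variations compare equal."""
    return " ".join(name.casefold().split())


def names_match(a, b):
    """Compare two names, treating placeholders like SQL NULL: never equal, even to themselves."""
    a, b = normalize(a), normalize(b)
    if a in PLACEHOLDER_NAMES or b in PLACEHOLDER_NAMES:
        return False
    return a == b
```

With this, two "John Doe" records are never merged on the name alone, while "Hank Hill" and "hank  hill" still match.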
> So... what? What do I do with this? My program has to use something to represent text, and since I fail to be a large multinational consortium, I can't invent my own character set and expect it to work.
Maybe don't rush to remove your "legacy" encoding support because "everyone is using UTF-8"? Or at least check with some Japanese users with obscure names first.
Our first user at one company was Richard Test. He had user ID 1001. Well-meaning people deactivated his account several times over the years because it looked fake to them.
Sorry, Richard. I hope you were more amused than annoyed.
I’ve told your story about Mr. Test to several people over the years but I’ve never been able to remember where I got it from. I’m glad to have finally found it again, and thank you for the anecdote!
One of the audio checks I've heard over the many conventions I've volunteered for is "Ice Ice Icicles, Cue Cue Cuticles, Test... Test... Testicles" with the final word pronounced like Hercules.
I used to use Testy Tester until one of my coworkers commented that she was acquainted with the Tester family, and there were quite a few of them in the area. These days I usually have completely separate systems for testing, but even there I use something like Zzzperson for test data.
I've seen too many stories of placeholder text ending up in production... so I better make it worthwhile and include some Lovecraft quotes [0] because everyone needs more gibbering, cyclopean, eldritch adjectives in their lives.
I give Anubis a special pass, because they sell a business-oriented version without the character. The true cost of using FOSS is you don't have any say in what the developer does.
This just confirms OP’s point that "you don't have any say in what the developer does", since the only way to get your modifications in if the developer disagrees is to maintain your own version of the code.
I forget how the phrase goes, but it's something like, "Someone else can do it better than you, but no one will ever care more about what you need than yourself." The point basically being that there are tradeoffs: you are either okay with imperfection, or you have to do it yourself. It appears true, whether it be for software development or home repair.
Sure, and then at some point you get conflicts because whatever thing you modified is not supported anymore and/or the syntax changed for some reason and/or other random issue, and then good luck. Forking works well only if upstream is already stable or you’re fine running an old version.
When designing, the standard practice is to use Lorem Ipsum - sort of mangled Latin that works like normal text but is very recognisable. This backfired once when I did a website for the Jesuits - the feedback they gave was that the design looks good but they were all baffled by the text and could I do something about it please.
I’d not considered that they might be the only client where everyone was fluent in Latin.
Reminds me of the Catholic friend who once told me that he had done IT support for every Catholic religious order with a presence in the city where he lived, except two.
The Carthusians didn't use computers, and the Jesuits didn't need his help.
I'm a big fan of using Emoji for names of test/dummy users. It helps test your application and dev stack's end-to-end Unicode compliance. It is less likely to conflict with real data (so far as I'm aware we haven't yet seen children born named with emoji, though that is likely a matter of time). It is often very visibly test data that stands out. But also and maybe more important, you can have fun with it.
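A quick sketch of the kind of round-trip check I mean. The helper name and the test user are invented, and this only exercises Python's own UTF-8 and JSON layers; a real end-to-end check would push the name through the database, the API, and the UI as well.

```python
import json


def survives_round_trip(name):
    """Check that a name is unchanged after UTF-8 encoding and JSON serialization."""
    via_bytes = name.encode("utf-8").decode("utf-8")
    # json.dumps escapes non-ASCII by default; json.loads should restore it.
    via_json = json.loads(json.dumps(name))
    return via_bytes == name and via_json == name


# Hypothetical test user: an emoji outside the Basic Multilingual Plane plus the word "Test".
TEST_USER = "🦇 Test Batman"
ok = survives_round_trip(TEST_USER)
```

Emoji like this one take four bytes in UTF-8, which is exactly the kind of character that tends to expose a layer that quietly assumes less.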
We had a Dev environment that showed a doge meme on the auth page that had been there for like...7 years or something? "So auth. Much secure. Wow." etc.
Every other environment had standard boiler plate corporate logo + whatever product name. We kept the meme stuff in Dev just so you could be visually reminded, "Oh right, this is the crazy broken one."
Cue 7 years later, an emergency where we just had to impress a new client with a demo of how the product would work. And of course, the only thing that was really in a semi-ready state...was Dev. We couldn't move it over to a different one for some stupid reason or another.
Number one comment after the demo? "This looks very unprofessional. We do not want a dog logo on the login page. Is your team taking this seriously?"
We ran into the same thing back in the dot-com era. The development group had a sample customer set up as "Master Bait & Tackle", which had an outdoor outfitter theme. With items like fishing reels, lures, backpacks, etc. Entirely innocent (apart from the name).
Sales & Marketing got wind of how consistent the data was in it and wanted a copy they could use in presentations and for trade-show exhibits. We all said absolutely not. But they went around us and got a copy anyway.
It did not go well when a potential customer made a comment about the name during a demo.
Lesson learned. Always use the word "Test" in your test data. Always.