It's not non-empirical. He was careful to give it the same experiment twice. The dependent variable is his judgment, sure, but why shouldn't we trust that if he's an experienced SWE?
Unless he was able to sample at temperature 0 (and got fully deterministic results both times), the result could just be random chance. And experience as an SWE doesn't imply experience with statistics and experiment design.
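
To put a rough number on the random-chance point, here is a minimal sketch. It assumes each run is an independent pass/fail draw with the same success probability `p_success` for both conditions, which is my own simplification and not anything measured in the original experiment. The function name `disagreement_rate` is hypothetical too. It estimates how often two identical conditions, sampled once each, give different outcomes.

```python
import random


def disagreement_rate(p_success: float, trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of simulated experiments in which two identical conditions,
    each sampled exactly once, produce different pass/fail outcomes.

    Assumption: every run is an independent Bernoulli draw with the same
    success probability, so any difference between the two is pure noise.
    """
    rng = random.Random(seed)  # seeded so the simulation is reproducible
    differ = 0
    for _ in range(trials):
        a = rng.random() < p_success  # single run of condition A
        b = rng.random() < p_success  # single run of condition B (same p)
        if a != b:
            differ += 1
    return differ / trials


if __name__ == "__main__":
    p = 0.5  # hypothetical success probability, chosen only for illustration
    simulated = disagreement_rate(p)
    analytic = 2 * p * (1 - p)  # P(A passes, B fails) + P(A fails, B passes)
    print(f"simulated: {simulated:.3f}, analytic: {analytic:.3f}")
```

Under these assumptions, one sample per condition shows a "difference" with probability 2p(1-p), so a single pair of runs says very little about which condition is actually better. Only when sampling is fully deterministic (p of 0 or 1 in this toy model) does that probability drop to zero.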