> Most research is flawed or useless, but published anyway because it's expedient for the authors to do so.
For the record I now know a professor at one of the premier institutions in the world, who is a total fraud. Their research was fraudulent in grad school. Their lab mates tried to raise concerns and nothing happened. That person graduated, with everyone on the committee knowing the issues. Then they got a premier post-doc position. People in their lab (who I caught up with at a conference) mentioned their work was terrible. Now they’re a professor at a top tier university.
Along the way, everyone knew and when people tried to bring up concerns higher ups in the institution suppressed the knowledge. Mostly because their fraudulent work was already cited tens of times.
This wasn’t directly in my field, but I saw it go down and followed it.
In my day job, I just throw out papers that don’t publish datasets and code. Most CS work is equally useless. It’s all a farce.
EDIT: For some insights, I recommend the book “Rigor Mortis: How Sloppy Science Creates Worthless Cures, Crushes Hope, and Wastes Billions” by Richard Harris.
> I just throw out papers that don’t publish datasets and code
There was an interesting research paper that claimed you could raise your IQ with a computer game called dual n-back, and a lot of papers tried to replicate it. My mind was blown when I realized none of them used the same code. None of them actually shared the code they used for their research, yet they all claimed they were testing the same thing when they obviously weren't.
To me, refusing to share the source code to a program that could be used to replicate your research seems like a big middle finger to science itself. It shows a total disregard to study replication.
Exactly, it'd be like omitting the "methodology" section of the paper. There's no way to verify or prove wrong a paper if you don't know how they reached their conclusions.
Even more than sharing code, I think the key is to share raw data, postprocessed data and describe methods as exceptionally clear equations and pseudocode.
Some well-intentioned papers share code which is hard to run years afterwards. It's sometimes much simpler to reimplement things than to get the code to run, if they are described accurately.
Besides, some articles hide ugly things, nasty tricks and lies in the code, which make their results a lot less believable and valid. Being super upfront about models in terms of equations and pseudocode is important.
Of course, we should also have standards to make code reproducible. Perhaps depositing a VM with everything ready to run.
> Perhaps depositing a VM with everything ready to run.
The danger then is that the VM has so much undocumented complexity that if anything goes wrong, or goes "well" when it shouldn't, no one can explain why. Which also reintroduces a vector to hide nasty tricks.
This. The point of sharing reproducible steps, and not just the experiment itself, is that the work can be fully reproduced independently, not just independently verified to show what the paper claims.
Results that haven't been independently replicated are suspect. There are just too many factors that can lead an experiment to give some results that are not transferrable or not relevant.
The worst aspect of this is the lack of will or funding to replicate, replicate, and replicate again all significant results that get published. Post-processed data can be altered, but a TB of raw data is meaningless as well if it hasn't been produced properly, has been obfuscated, or is weirdly formatted.
Data availability is a red herring for the vast majority of the science being done right now (almost everything that does not depend on a multi-million-dollar experiment). If data availability is an end in itself, we would just have moved the goalposts and have a data quality problem instead of a reproducibility problem.
> Perhaps depositing a VM with everything ready to run.
Yes, this is hugely important. We need clearer requirements for what constitutes properly 'published' data, code and methods. It should include all raw data (both used and discarded) as well as complete snapshots of each preliminary and intermediate processing step in a time-stamped chronology.
This is an area where the expertise and best practices of software development, documentation and source control could help inform standards and tooling which would dramatically improve the quality of scientific publishing and replication.
> Some well intentioned papers share code which is hard to run years afterwards.
Umm, not really? How many years are we talking about here? Because even COBOL is runnable. Sure, some OS-related quirks may need changing, but getting the raw source is much more likely to expose a subtle flaw in the researcher's method than the equations alone. For example, an incorrect negative check (comparing to a string vs. checking < 0) or an off-by-one error.
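To make the point concrete, here's a purely hypothetical sketch (none of this is from any real paper): the write-up says "mean of the last n samples", but an off-by-one in a slice silently averages only n-1 of them. The equations look fine; only the source code reveals the discrepancy.

```python
# Hypothetical illustration of a subtle bug that equations alone hide.

def trailing_mean_buggy(xs, n):
    window = xs[-n + 1:]   # off by one: takes the last n-1 samples
    return sum(window) / len(window)

def trailing_mean_correct(xs, n):
    window = xs[-n:]       # the last n samples, as the equation says
    return sum(window) / len(window)

data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(trailing_mean_buggy(data, 3))    # 4.5 (averages [4.0, 5.0])
print(trailing_mean_correct(data, 3))  # 4.0 (averages [3.0, 4.0, 5.0])
```

Both functions reproduce the published equation on paper; only reading (or re-running) the source shows they disagree.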
Don't think COBOL; think Python 2.1 (not 2.2) with this specific version of PIL, and some of the code implemented as a CPython module (for speed) in C that only compiles properly with one specific EGCS fork that adds support for IA-64. Parts of the code are written in slightly-buggy inline IA-64 assembly, but it's okay because after a system call, that particular operating system makes sure those registers are zeroed each loop iteration, if Python's line-buffering its `print`s so that the system call gets consistently run.
Also, the Python script crashes on loading the data files unless there are two (but not more than two) spaces in its full file path. This is not documented anywhere.
Yeah. I can easily run FORTRAN code from my PhD supervisor's PhD supervisor, written way back in the 1980s, but I cannot run some of the Python scripts written by a post-doc 10 years ago. It's a mess of abandoned libraries and things that work only with some specific versions of some libraries, and compiled modules that rely on the dodgy behaviour of an old compiler. Perl seems to be better, if only because people used to rely much less on external libraries.
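Dependency rot like this is at least partly mitigable by publishing the exact environment alongside the results. A minimal sketch, assuming nothing about any particular paper (the function name and package list are illustrative):

```python
import sys
from importlib import metadata

# Record the interpreter and library versions the code was run
# against, so a future reader knows what to reconstruct.
def environment_manifest(packages):
    manifest = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            manifest[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            manifest[name] = "not installed"
    return manifest

print(environment_manifest(["numpy", "scipy"]))
```

Dumping this dictionary into the paper's supplementary material costs a few lines and removes most of the guesswork a decade later.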
But properly running code is not the solution either. I can count on the fingers of one hand the downloads of some of the codes we've published (easily compilable modern Fortran), and AFAIK nobody ever published anything using them. Having a multitude of codes available does not mean much if nobody runs them, assuming they can even be compiled. And I would guarantee that none of the scientists who download these codes would be able to understand what they do in any detail.
Indeed, that's not proper science, too many moving parts. If it cannot be easily replicated by anyone at any time, it is just an experiment which would need more refinement to get published. No experiment should be accepted if it requires a Rube-Goldberg machine.
Sharing code and data can be as harmful to science as it is beneficial. You don’t want to reuse the same instruments that conducted the last experiment to validate it.
Independent replication is inherently expensive, but also critical to the field at large. Some sort of code vault that releases the code and data after a period could be a solid compromise.
Well yes, but no. You want to avoid any systematic bias, which if you reuse tooling/instruments/code, you run the risk of, but source code is also something which can be peer-reviewed.
Reproducing the code base from scratch just isn't going to be tractable for large scientific pieces of software (e.g. CERN).
Better to have the source code and dig through it carefully as part of the review and replication than insist everyone write their own cleanroom copy just to avoid systematic bias.
Suppose someone decided to build another LHC, but to save money they would use exactly the same design, construction crew, and code used to build the first one. Would you consider that a perfectly reasonable choice or would it seem risky?
That said I am all for peer reviewing code, but that’s not where this ends up. People are going to reuse code they don’t bother to dig into because people are people. The obvious next step would be to reject any paper that’s reusing code from a different team, but that’s excessive.
That said reusing code or data is much safer than reusing code and data.
I’m going to be upfront and say I don’t understand your position. You seem to be making a number of questionable assumptions: 1) That people would blindly reuse code to the harm of science, including on a multi-billion-dollar project like the LHC. That’s not likely to ever happen.
2) That rejecting those who plagiarize others’ code in a peer-review process is somehow excessive or problematic.
I can’t begin to understand where you are coming from.
I'm not sure if I have exactly the same concerns as the person you're replying to, but I've definitely noticed problems coming from this form of "replication" through just re-running existing code. I don't think that means code shouldn't be released, but I think we should be wary of scientific code being reused verbatim too frequently.
If existing code is buggy or doesn't do what its authors think it does, this can be a big problem. Even if the idea is correct, whole swathes of downstream literature can get contaminated if authors didn't even attempt to implement an algorithm independently, but just blindly reused an existing docker image that they didn't understand. I consider that poor practice. If you claim that your work is based on Paper X, but what you really mean is that you just ran some Docker image associated with Paper X and trusted that it does what it says it does, instead of independently reimplementing the algorithm, that is not replication.
In older areas of science the norm is to try to replicate a paper from the published literature without reuse of "apparatus". For example, in chemistry you would try to replicate a paper using your own lab's glassware, different suppliers for chemicals, different technicians, etc. Analogies here are imperfect, but I would consider downloading and re-running an existing docker image to be dissimilar from that. It's more like re-running the experiment in the first lab rather than replicating the published result in a new lab. As a result it can, like re-running a chemistry experiment in the same lab, miss many possible confounds.
Of course, you do have to stop somewhere. Chemistry labs don't normally blow their own glass, and it's possible that the dominant way of making glass in your era turns out to be a confound (this kind of thing really does happen sometimes!). But imo, on the code side of things, "download and rerun a docker image" is too far towards the side of not even trying to replicate things independently.
For that reason a special multimedia sharing tool was created, now called the WWW, so that the international physicist community could share their code, papers and CAD designs for the LHC experiments. Quite a success for them, but the rest of academia is still resistant.
It's not reasonable, but for different reasons. Building a carbon copy of LHC would not add anything new in the way of science. A better example would be LIGO. Adding another gravity wave detector of the exact same spec but halfway around the world would be fantastic, because it increases our triangulation ability, and keeping the engineering, hardware, and software the same reduces cognitive load of the scientists running it. Yes that means any "bug" in the first is present in the second, but that also means you have a common baseline. In fact there will inevitably be discrepancies in implementation (no two engineering projects are identical, even with the same blueprint), and you can leverage that high degree of similarity to reduce the search space (so long as subsystems are sufficiently modular, and the software is a direct copy).
The original comment was with respect to some n-back training program. There are so many other potential sources of bias in an experiment like that that you'd be foolish not to start with the exact same program. If an independent team uses a different software stack and can't replicate, was it the different procedure, software, subjects, or noise?
The first step in scientific replication is almost always, "can the experiment be replicated, or was it a fluke?" In this stage, you want to minimize any free variables.
It's a matter of (leaky) abstractions. If I'm running a chemistry replication, I don't need the exact same round bottom flask as the original experiment; RBFs are fungible. In fact I could probably scale to a different size RBF. However, depending on the chemistry involved, I probably don't want to run the reaction in a cylindrical reactor, at least not without running in the RBF first. That has a different heating profile, which could generate different impurities.
Likewise, I probably don't need the exact same make/model of chromatograph. However, I do want to use the same procedure for running the chromatograph.
Ideally, that would be a concern of the peer review. When I finished my undergraduate degree, I had to present a paper describing a program. I had to show that program working as part of my presentation. Anyone reading my "paper" and knowing I got a passing grade already knows the code I supplied works with the data I supplied, so I don't think it's that important for them to run it themselves.
Essentially, this would be like trying to reproduce a paper and starting by checking if their mathematical apparatus is correct. It's not useless, and it can help detect fraud or just plain bad peer review of course, but I wouldn't call that an attempt to reproduce their results per se.
It could be a nice quick sanity check, in the sense that if they've completely lied or you've completely misunderstood how to use the program, you won't get the same results. So it could tell you that you shouldn't even bother trying to replicate their claims. But there's a risk that people might mistake re-running the code for reproducing the findings of the paper.
The paper is entered into the scientific record, not the code, which will inevitably become obsolete (old language, old frameworks, implementation details tied to old hardware or operating systems, maybe some source code will be lost, github will go out of business). If the code is necessary, then crucial details have been left out of the paper, so it is not reproducible (although there are some journals that let you submit code as an artifact).
The risk here is that the original code contains a bug that is inadvertently reproduced by the replicating scientists after reading it. It can easily happen in some fields that some computation looks very plausible in code, but is actually incorrect.
"To me, refusing to share the source code to a program that could be used to replicate your research seems like a big middle finger to science itself."
I used to work in HPC (current PhD student; I've since switched specialties) and I was extremely surprised that not only did these people not share code (the DOE actually encourages open-sourcing code), but they would not even do so upon request. Several times I had to get my advisor to ask the author's advisor. Several times I found out why I couldn't replicate another person's work. It is amazing to me that anyone in CS does not share code. It is trivial to do and we're all using git in some form or another anyway.
Sharing source code for a HEP experiment is not that easy, or sometimes even possible. A lot of the work is done by different groups, a whole shared framework is used, and a huge amount of raw data is collected and reconstructed. Even if all of that were made available, analyzing it would require a lot of people and a lot of resources. So even if it were published (which would be a major effort in itself), it wouldn't help much for replication purposes.
So what? Tough shit! It's still simply the only way to replicate or audit something.
If it's hard or tedious... well, so what? So is life. Most of science is exactly that rigor.
You might as well say no one else can grow a potato because of all the work your farm had to do all year to arrive at your crop. Yes, any other farmer will have to do all the same stuff. It's an utterly unremarkable observation.
How is it possible to peer review a paper when the only people qualified to do so are the ones involved in the research? Seems like a massive issue. Don't want to cast too much shade here, but a fair number of people have called out high energy physics for having problematic methodology. See Constructing Quarks by Andrew Pickering. Ultimately, it should be CERN's job to release the data, yes, even if it is terabytes in size, because that's the whole point of science.
>How is it possible to peer review a paper when the only people qualified to do so are the ones involved in the research? Seems like a massive issue.
Peer review isn't what it's chalked up to be. Basically it's two other people in the same field saying "There's no blatantly obvious issues with this paper" and even that isn't always guaranteed. Reviewers don't make any efforts to actually replicate the research, crunch the data, etc.
Re-running the author's code with their data will likely just repeat any methodological issues they had (whether accidental or fraudulent).
In medical studies, one level is to reevaluate the data from the same set of patients to see whether any bias or errors crept in, perhaps with a different or newer statistical methodology. The best is for the study to be repeated with a fresh set of patient data to see if the underlying conclusions were valid.
Reading this, I am reminded of a quote by Edsger W. Dijkstra from 1975 [1]:
> In the good old days physicists repeated each other's experiments, just to be sure. Today they stick to FORTRAN, so that they can share each other's programs, bugs included.
"Re running the authors code with their data will likely just repeat any methodological issues they had"
Yes, but that's useful too.
COVID-19 lockdowns largely kicked off in the west due to an epidemiological model from Imperial College London, written over a period of many years by the now notorious Professor Neil Ferguson.
When his team uploaded a pre-print of his paper and started sending it to government ministers, the code for his model wasn't available. They spent months fighting FOIA requests by claiming they were about to release it, but just had to tidy things up a bit first. When the code was finally uploaded to GitHub the world discovered the reason for the delay: the model was a 15,000-line C trash fire of race conditions, floating-point accumulation errors and memory corruptions, in which basically every variable was in global scope and had a single-letter name. It was a textbook case of how not to write software. In fact it was a textbook case of why variable names matter, because one of the memory corruptions was caused by the author apparently losing track of what a variable named 'k' was meant to contain at that point in the code.
Not surprisingly, the model didn't work. Although it had command line flags to set the PRNG seeds these flags were useless: the model generated totally different output every time you ran it, even with fixed seeds. In fact hospital bed demand prediction changed from run to run by more than the size of the entire NHS Nightingale crash hospital building programme, purely due to bugs.
And as we now know the model was entirely wrong and the results were catastrophic. Lockdowns had no impact on mortality. There are many people who looked at the data and saw this but here's just the latest meta-analysis of published studies showing that to be true [1]. They destroyed the NHS which now has a cancer treatment backlog and pool of 'missing' patients so large that it cannot possibly catch up, meaning people will die waiting for treatment from the system they supported with their taxes for their entire lives. They destroyed the tax base, leaving the government with an unpayable debt that can be eliminated only via inflation meaning they will have soon destroyed people's savings too. It's just a catastrophe of incompetence and hubris.
The incorrectness of the model wasn't due only to programming bugs. The underlying biological assumptions were roughly GCSE level or a bit lower (GCSE is the exams you take at 15/16 in the UK), and it's quite evident that high school germ theory is woefully incomplete. In particular it has nothing to say on the topic of aerosol vs droplet transmission, which appears to be a critical error in the way these models are constructed.
Nonetheless, even if the assumptions were correct such a model should never have been used. Anyone outside the team who had access to the original code would have seen this problem immediately and could have sounded the alarm, but:
1. Nobody did have access.
2. ICL lied to the press by claiming the code had been published and peer reviewed years earlier (so where was it?)
3. Then when it was revealed the model wasn't reproducible, they teamed up with another academic at Cambridge and lied again by publishing a report+press release claiming it actually was reproducible and claims otherwise were misinformation.
4. And then the journal Nature and the BBC repeated these false claims of reproducibility.
All whilst anyone who looked at the GitHub issues list could see it filling up with determinism bugs. If you want citations for all these claims look here, at a summary report I wrote for a friendly MP [2].
So. It's good that the Royal Society is telling people not to engage in censorship, but their justifications for taking that stance reveal they're still living in lala land. By far the deadliest and most dangerous scientific misinformation throughout COVID has come from the formal institutions of science themselves. You could sum all the Substacks together and they would amount to 1% of the misinformation that has been published by government-backed "scientists", zero of whom have been banned from anything or received any penalty whatsoever. For as long as the scientific institutions are in denial about how utterly dishonest their culture has become we will continue to see a weird inversion in which random outsiders point out basic errors in their work and they respond by yelling "disinformation".
The meta-study you cite has quite bizarre conclusions. It basically states that isolation behavior is highly effective at preventing Covid deaths, but that lockdowns were a bad predictor of such behavior in people. But, they still seem to attribute huge economic impact to the lockdowns, not the pandemic and natural response (voluntary behavioral changes) to it.
Overall it seems that they would have been better served by adding lockdown compliance to their models, which would likely explain much of the difference. It's absurd to claim that lockdowns don't work when a country like Vietnam (100 million people, first detected cases of COVID-19 community spread outside China) had <100 total COVID-19 deaths in 2020. Strict targeted lockdowns and strict isolation requirements after every identified case, for contacts up to the third degree (a contact of someone who was a contact of someone who came in contact with a patient): these all must have played a role, and any study that fails to explain such extreme success is simply flawed.
Are you sure it states that? Maybe you mean this paragraph:
"If this interpretation is true, what Björk et al. (2021) find is that information and signaling is far more important than the strictness of the lockdown. There may be other interpretations, but the point is that studies focusing on timing cannot differentiate between these interpretations. However, if lockdowns have a notable effect, we should see this effect regardless of the timing, and we should identify this effect more correctly by excluding studies that exclusively analyze timing."
"It's absurd to claim that lockdowns don't work when a country like Vietnam (100 million people, first detected cases of COVID-19 community spread outside China) had <100 total COVID-19 deaths in 2020"
Many countries claim an improbably low level of COVID deaths. That doesn't mean their numbers can be taken at face value, although poorer countries do seem to be less badly hit simply because they have far fewer weak, obese and elderly people to begin with.
It's certainly not absurd to claim lockdowns don't work. You're attempting to refute a meta-analysis of studies looking at all the data with the example of a single country. That's meaningless. A single data point can falsify a theory but it cannot prove a theory. To prove lockdowns work there has to be a consistent impact of the policy.
It is completely absurd. "Lockdowns" is shorthand for "people not coming into contact with other people in a way that spreads the virus". Because the virus cannot teleport through walls, lockdowns necessarily prevent transmission.
This is the sort of mis-use of logic that has led to so many problems.
"Lockdowns" is shorthand for "people not coming into contact with other people in a way that spreads the virus"
It's shorthand for a set of government policies that were intended to reduce contact, not eliminate it, because "not coming into contact with other people" is impossible. People still have to go to shops, hospitals, care homes, live with each other, travel around and so on even during a lockdown.
Your belief that it's "completely absurd" to say lockdowns don't affect mortality is based on the kind of abstract but false reasoning that consistently leads epidemiologists astray. Consider a simple scenario in which lockdowns have no effect that's still compatible with germ theory - you're exposed to the virus normally 10 times per week every week. With lockdowns that drops to 5 times a week. It doesn't matter. Everyone is still exposed frequently enough that there will be no difference in outcomes.
"Because the virus cannot teleport through walls, lockdowns necessarily prevent transmission"
Viruses can in fact teleport through walls: "The SARS virus that infected hundreds of people in a 33-story Hong Kong apartment tower probably spread in part by traveling through bathroom drainpipes, officials said yesterday in what would be a disturbing new confirmation of the microbe's versatility."
you're exposed to the virus normally 10 times per week every week. With lockdowns that drops to 5 times a week.
In this scenario, lockdowns work fine, but are insufficient! Perhaps with stringent use of N95 masks during contactless food ration delivery it can be reduced to 0.05 times per week. Then the pandemic ends and we go back to normal.
Also, if you only go out half as often, that's not exactly a lockdown either. A lockdown is people not coming into contact with other people in a way that spreads the virus. It's not people still often coming into contact with other people in a way that spreads the virus but not as often as before.
You don't appear to realize that food gets to your front door via a large and complex supply chain that involves many people doing things physically together at almost every point. Your definition of a lockdown is physically impossible even for cave men to sustain, let alone an advanced civilization. It isn't merely a matter of "works but insufficient".
This kind of completely irrational reasoning is exactly why lockdowns are now discredited. The fact that the response to "here's lots of data showing that lockdowns didn't work" is to demand an impossible level of lockdown that would kill far more people than COVID ever could simply through supply chain collapse alone, really does say it all.
> A lockdown is people not coming into contact with other people in a way that spreads the virus. It's not people still often coming into contact with other people in a way that spreads the virus but not as often as before.
That might be a definition issue. Here in Germany, we had several lockdowns. Many (most?) countries would say we had no lockdown at all.
The question isn’t whether lockdowns with compliance work, the question is how effective is lockdown as a policy. If you create a lockdown policy, what is the impact of that?
> First, people respond to dangers outside their door. When a pandemic rages, people believe in social distancing regardless of what the government mandates. So, we believe that Allen (2021) is right, when he concludes, “The ineffectiveness [of lockdowns] stemmed from individual changes in behavior: either non-compliance or behavior that mimicked lockdowns.”
> Third, even if lockdowns are successful in initially reducing the spread of COVID-19, the behavioral response may counteract the effect completely, as people respond to the lower risk by changing behavior.
Since it is an obvious consequence of the germ theory of disease that isolation stops the spread of disease, the only real question is if lockdowns are efficient at enforcing isolation (and in what conditions, with what costs etc). At best, the paper concludes that they are not, and that government propaganda about the risks of the disease works as well or better.
> You're attempting to refute a meta-analysis of studies looking at all the data with the example of a single country. That's meaningless. A single data point can falsify a theory but it cannot prove a theory. To prove lockdowns work there has to be a consistent impact of the policy.
I would say the null hypothesis would be that lockdowns do work, by relatively simple, almost mechanical, principles (lockdown forces isolation, isolation means the virus can't physically get from one person to another).
And there is basically no country that did well in the pandemic without significant isolation. Whether that isolation was caused by effective, if perhaps draconian, lockdowns (such as in China or Vietnam) or by cultural norms and self-preservation (such as in Finland or Norway), this remains true.
The real exceptions are countries that successfully isolated themselves from the outside world, and thereafter isolated only those carrying the virus, through border closures and strict quarantine requirements plus testing; Taiwan and New Zealand are examples.
For this type of response, broad statistical studies that equate very different ground-level phenomena (lockdowns varied wildly in form and in degree of compliance, but the paper ignores all that in its quantitative data) just to look good on a graph are mostly misleading. But what more can you expect from three economists trying their hand at epidemiology and sociology?
"The model generated totally different output every time you ran it, even with fixed seeds." - I remember seeing code takedowns of the model from anti-lockdown people who repeatedly cite this issue.
But there is a valid reason for this to happen, and it doesn't mean bugs in the code. If the code is run in a distributed way (multiple threads, processes or machines), which it was, the order of execution is never guaranteed. So even setting the seed will produce a different set of results if the outcomes of each separate instance depend on each other further in the computation.
There are ways to mitigate this, depending on the situation and the amount of slowdown that's acceptable. Since this model was collecting outcomes to create a statistical distribution, rather than a single deterministic number, it didn't need to.
The fact that the model draws from distributions will also be why different runs could produce vastly different results: those results would simply lie at different ends of a distribution. Only distributions are sampled and used, not single numbers.
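The execution-order point has a concrete low-level cause: floating-point addition is not associative, so a parallel reduction that combines partial results in a scheduler-dependent order can change the final bits even when every seeded random draw is identical. A minimal sketch:

```python
import math

# Two groupings of the same three numbers, as two different thread
# interleavings might combine them. The bits differ; the values are
# statistically identical.
a = (0.1 + 0.2) + 0.3   # one reduction order
b = 0.1 + (0.2 + 0.3)   # another

print(a == b)                            # False: the last bit differs
print(math.isclose(a, b, rel_tol=1e-9))  # True: equal to within 1e-9
```

This is why nondeterminism under a fixed seed can be benign when you only care about the sampled distribution, but it also means bit-exact reruns require a fixed reduction order.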
Regarding the GCSE level comment, my concern was the opposite, that the model was trying to model too much, and that inaccuracies would build up. No model is perfect (including this one) and the more assumptions made the larger the room for error. But they validated the model with some simpler models as a sanity check.
My view of the criticisms of the model was that they were more politically motivated, and that the code takedowns were done by people who may have been good coders, but didn't know enough about statistical modeling.
> "The model generated totally different output every time you ran it, even with fixed seeds." - I remember seeing code takedowns of the model from anti-lockdown people who repeatedly cite this issue.
But there is a valid reason for this to happen, and it doesn't mean bugs in the code. If the code is run in a distributed way (multiple threads, processes or machines), which it was, the order of execution is never guaranteed.
Then there's literally no point to using PRNG seeding. The whole point of PRNG seeding is so you can define some model in terms of "def model(inputs, state) -> output", and get the "same" output for the same input. I put "same" in quotes because defining sameness on FP hardware is challenging. But usually 0.001% relative tolerance is sufficiently "same" to account for FP implementation weirdness.
If you can't do that, then your model is not a pure function, in which case setting the seed is pointless at best, and biasing/false sense of security in the worst case.
As you mention, non-pure models have their place, but reproducing their results is very challenging, and requires generating distributions with error bars - you essentially "zoom out" until you are a pure function again, with respect to aggregate statistics.
It does not sound like this model was "zoomed out" enough to provide adequate confidence intervals such that you could run the simulation, and statistically guarantee you'd get a result in-bounds.
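For concreteness, here's a toy sketch of what a pure, seeded model looks like (the model function and its parameters are made up for illustration): same inputs plus same seed gives bit-identical output, and a relative tolerance handles comparisons across FP implementations:

```python
import math
import random

def model(inputs, seed):
    # A pure function of (inputs, seed): no shared mutable state and no
    # scheduling dependence, so reruns are exactly reproducible.
    rng = random.Random(seed)
    return sum(x * rng.gauss(0, 1) for x in inputs)

out_1 = model([1.0, 2.0, 3.0], seed=42)
out_2 = model([1.0, 2.0, 3.0], seed=42)

print(out_1 == out_2)                            # True: bit-identical
print(math.isclose(out_1, out_2, rel_tol=1e-5))  # True: and within tolerance
```

Once a model has this shape, a unit test can pin the output for a known seed; any environment-dependent drift then shows up as a test failure instead of being waved away as "stochastic".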
I reckon the PRNG seeding in such a case might be used during development/testing.
So, run the code with a seed in a non distributed way (e.g. in R turn off all parallelism), and then the results should be the same in every run.
Then once this test output is validated, depending on the nature of the model, it can be run in parallel, and guarantees of deterministic behaviour will go, but that's ok.
I didn't develop the model, so can't really say anything in depth beyond the published materials.
I just found it odd at the time how this specific detail was incorrectly used by some to claim the model was broken/irredeemably buggy.
Edit: Actually, there's perhaps one other situation where changing the seed might be useful, assuming you have used a seed in the first place. Depending on the distributed environment, there's no guarantee that the processes or random number draws will run in the same order - but in most runs they might, and that could bias the distribution of the samples you take. So you might want to change the seed on every run to protect yourself from such nasty phantom effects.
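As a sketch of that last point: when runs are intentionally non-deterministic anyway, drawing a fresh seed from the OS entropy pool each run avoids one fixed seed quietly biasing which part of the output distribution you keep sampling. Logging the seed keeps single-run reruns possible:

```python
import os
import random

# Fresh seed per run from OS entropy; record it so any single run can
# still be replayed in a deterministic (non-parallel) mode later.
seed = int.from_bytes(os.urandom(8), "big")
rng = random.Random(seed)

samples = [rng.random() for _ in range(5)]
print(seed, samples)
```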
My understanding is the bugginess is due to *unintended* nondeterminism - in other words, things like a race condition where two threads write some result to the same memory address, or singularities/epsilon error in floating point calculations leading to diverging results.
Make no bones about it, these are programming faults. There's no reason why distributed, long-running models can't produce convergent results with a high degree of determinism given the input state. But this takes some amount of care and attention.
> So you might want to change the seed on every run to protect yourself from such nasty phantom effects.
That's a perfect example of what I mean about a seed actually being worse. If you know you can't control determinism, then you might as well go for the opposite: ensure your randomness is high quality enough that it approximates a perfect uniform distribution. Skipping the seed here means you are more likely to capture the true distribution of the output.
The other takedown reviews focused on the fact that there was non-determinism despite a seed, without understanding that's not necessarily a problem.
Agreed on the second point about not having a seed, but I added the "assuming you have used a seed" caveat because sometimes people do use the seed for some reproducible execution modes (even multi-thread/process ones), which are fine, and it's just easier to randomly vary the seed rather than remove it altogether when running in a non-deterministic mode.
"There are ways to mitigate this, but since the model was collecting outcomes to create a statistical distribution, rather than a single deterministic number, it didn't need to."
This is the justification the academics used - because we're modelling probability distributions, bugs don't matter. Sorry, but no, this is 100% wrong. Doing statistics is not a get-out-of-jail-free card for arbitrary levels of bugginess.
Firstly, the program wasn't generating a probability distribution as you claim. It produced a single set of numbers on each run. To the extent the team generated confidence intervals at all (which for Report 9 I don't think they did), it was by running the app several times and then claiming the variance in the results represented the underlying uncertainty of the data, when in reality it was representing their inability to write code properly.
Secondly, remember that this model was being used to drive policy. How many hospitals shall we build? If you run the model and it says 10, and then someone makes the graph formatting more helpful, reruns it and now it says 4, that's a massive real-world difference. Nobody outside of academia thinks it's acceptable to just shrug and say, well, it's just probability, so it's OK for the answers to wildly thrash around like that.
Thirdly, such bugs make unit testing of your code impossible. You can't prove the correctness of a sub-calculation because it's incorporating kernel scheduling decisions into the output. Sure enough Ferguson's model had no functioning tests. If it did, they might have been able to detect all the non-threading related bugs.
Finally, this "justification" breeds a culture of irresponsibility and it's exactly that endemic culture that's destroying people's confidence in science. You can easily write mathematical software that's correctly reproducible. They weren't able to do it due to a lack of care and competence. Once someone gave them this wafer thin intellectual sounding argument for why scientific reproducibility doesn't matter they started blowing off all types of bugs with your argument, including bugs like out of bounds array reads. This culture is widespread - I've talked to other programmers who worked in epidemiology and they told me about things like pointers being accidentally used in place of dereferenced values in calculations. That model had been used to support hundreds of papers. When the bugs were pointed out, the researchers lied and claimed that in only 20 minutes they'd checked all the results and the bugs had no impact on any of them.
Once a team goes down the route of "our bugs are just CIs on a probability distribution" they have lost the plot and their work deserves to be classed as dangerous misinformation.
"My view on criticisms of the model were that it was more politically motivated"
Are you an academic? Because that's exactly the take they like to always use - any criticism of academia is "political" or "ideological". But expecting academics to produce work that isn't filled with bugs isn't politically motivated. It's basic stuff. For as long as people defend this obvious incompetence, people's trust in science will correctly continue to plummet.
If you check the commit history, you'll see that he quite obviously didn't work with the code much at all. Regardless, if he thinks the model is not worthless, he's wrong. Anyone who reviews their bug tracker can see that immediately.
> To the extent the team generated confidence intervals at all (which for Report 9 I don't think they did), it was by running the app several times and then claiming the variance in the results represented the underlying uncertainty of the data, when in reality it was representing their inability to write code properly.
Functionally, what's the difference? The output of their model varied based on environmental factors (how the OS chose to schedule things). The lower-order bits of some of the values got corrupted, due to floating-point errors. In essence, their model had noise, bias, and lower precision than a floating point number – all things that scientists are used to.
Scientists are used to some level of unavoidable noise from experiments done on the natural world because the natural world is not fully controllable. Thus they are expected to work hard to minimize the uncertainty in their measurements, then characterize what's left and take that into account in their calculations.
They are not expected to make beginner level mistakes when solving simple mathematical equations. Avoidable errors introduced by doing their maths wrong is fundamentally different to unavoidable measurement uncertainty. The whole point of doing simulations in silico is to avoid the problems of the natural world and give you a fully controllable and precisely measurable environment, in which you can re-run the simulation whilst altering only a single variable. That's the justification for creating these sorts of models in the first place!
Perhaps you think the errors were small. The errors in their model due to their bugs were of the same order of magnitude as the predictions themselves. They knew this but presented the outputs to the government as "science" anyway, then systematically attacked the character and motives of anyone who pointed out they were making mistakes. Every single member of that team should have been fired years ago, yet instead what happened is the attitude you're displaying here: a widespread argument that scientists shouldn't be or can't be held to the quality standards we expect of a $10 video game.
How can anyone trust the output of "science" when this attitude is so widespread? We wouldn't accept this kind of argument from people in any other field.
At the time, critics of the model were claiming the model was buggy because multiple runs would produce different results. My comment above explains why that is not evidence for the model being buggy.
Report 9 talks about parameters being modeled as probability distributions, i.e. it's a stochastic model. I doubt they would draw conclusions from a single run, as the code is drawing a single sample from a probability distribution. And, if you look at the paper describing the original model (cited in Report 9), they do test the model with multiple runs. On top of that they perform sensitivity analyses to check erroneous assumptions aren't driving the model.
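For concreteness, the multiple-runs practice looks something like this sketch (the toy `run_model` and its numbers are hypothetical stand-ins for a full simulation): run with many different seeds and report the spread, rather than reading numbers off one realisation:

```python
import random
import statistics

def run_model(seed):
    # Hypothetical stand-in for one full stochastic simulation run:
    # draws a "final outcome" with some run-to-run variability.
    rng = random.Random(seed)
    return rng.gauss(1000, 50)

runs = [run_model(seed) for seed in range(100)]
mean = statistics.mean(runs)
spread = statistics.stdev(runs)

# A rough 95% interval from the run-to-run spread, not from one run.
print(round(mean), round(mean - 1.96 * spread), round(mean + 1.96 * spread))
```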
I have spent time in academia, but I'm not an academic, and don't feel any obligation to fly the flag for academia.
Regarding the politics, contrast how the people who forensically examined Ferguson's papers were so ready to accept the competing (and clearly incorrect https://www.youtube.com/watch?v=DKh6kJ-RSMI) results from Sunetra Gupta's group.
Fair point about academic code being messy. It's a big issue, but the incentives are not there at the moment to write quality code. I assume you're a programmer - if you wanted to be the change you want to see, you could join an academic group, reduce your salary by 3x-4x, and be in a place where what you do is not a priority.
Your comment above is wrong. Sorry, let me try to explain again. Let's put the whole fact that random bugs != stochastic modelling to one side. I don't quite understand why this is so hard to grasp, but let's shelve it for a moment.
ICL likes to claim their model is stochastic. Unfortunately that's just one of many things they said that turned out to be untrue.
The Ferguson model isn't stochastic. They claim it is because they don't understand modelling or programming. It's actually an ordinary agent-like simulation of the type you'd find in any city builder video game, and thus each time you run it you get exactly one set of outputs, not a probability distribution. They think it's "stochastic" because you can specify different PRNG seeds on the command line.
If they ran it many times with different PRNG seeds, then this would at least quantify the effect of randomness on their simulation. But, they never did. How do we know this? Several pieces of evidence:
2. The program is so slow that it takes a day to do even a single run of the scenarios in Report 9. To determine CIs for something like this you'd want hundreds of runs at least. You could try and do them all in parallel on a large compute cluster, however, ICL never did that. As far as I understand their original program only ran on a single Windows box they had in their lab - it wasn't really portable and indeed its results change even in single-threaded mode between machines, due to compiler optimizations changing the output depending on whether AVX is available.
3. The "code check" document that falsely claims the model is replicable states explicitly that "These results are the average of NR=10 runs, rather than just one simulation as used in Report 9."
So, their own collaborators confirmed that they never ran it more than once, and each run produces exactly one line on a graph. Therefore even if you accept the entirely ridiculous argument that it's OK to produce corrupted output if you take the average of multiple runs (it isn't!), they didn't do it anyway.
Finally, as one of the people who forensically examined Ferguson's work, I never accepted Gupta's either (not that this is in any way relevant). She did at least present CIs, but they were so wide they boiled down to "we don't know", which seems to be a common failure mode in epidemiology - CIs are presented without being interpreted, such that you can get values like "42% (95% CI 6%-87%)" appearing in papers.
I took a look at point 3, and that extract from the code check is correct. Assuming they did do only one realisation, I was curious why - it would be unlikely to be an oversight.
"Numbers of realisations & computational resources: It is essential to undertake sufficient realisations to ensure ensemble behaviour of a stochastic model is well characterised for any one set of parameter values. For our past work which examined extinction probabilities, this necessitates very large numbers of model realizations being generated. In the current work, only the timing of the initial introduction of virus into a country is potentially highly variable – once case incidence reaches a few hundred cases per day, dynamics are much closer to deterministic."
So it looks like they did consider the issue, and the number of realisations needed depends on the variable of interest in the model. The code check appears to back their justification up: "Small variations (mostly under 5%) in the numbers were observed between Report 9 and our runs."
The code check shows in their data tables that some variations were 10% or even 25% from the values in Report 9. These are not "small variations", nor would it matter even if they were because it is not OK to present bugs as unimportant measurement noise.
The team's claim that you only need to run it once because the variability was well characterized in the past is also nonsense. They were constantly changing the model. Even if they thought they understood the variance in the output in the past (which they didn't), it was invalidated the moment they changed the model to reflect new data and ideas.
Look, you're trying to justify this without seeming to realize that this is Hacker News. It's a site read mostly by programmers. This team demanded and got incredibly destructive policies on the back of this model, which is garbage. It's the sort of code quality that got Toyota found guilty in court of severe negligence. The fact that academics apparently struggle to understand how serious this is, is by far a faster and better creator of anti-science narratives than anything any blogger could ever write.
I looked at the code check. The one 25% difference is in an intermediate variable (peak beds). The two differences of 10% are 39k deaths vs 43k deaths, and 100k deaths vs 110k deaths. The other differences are less than 5%. I can see why the author of the code check would reach the conclusion he did.
I have given a possible explanation for the variation, that doesn't require buggy code, in my previous comments.
An alternative hypothesis is that it's bug driven, but very competent people (including eminent programmers like John Carmack) seem to have vouched for it on that front. I'd say this puts a high burden of proof on detractors.
He is unfortunately quite wrong, see below. I don't believe he could have reviewed the code in any depth because the bugs are both extremely serious and entirely objective - they're just ordinary C type programming errors, not issues with assumptions.
Also food and other required resources are similarly unable to teleport through walls, so the people involved in growing, transporting, preparing and delivering them to your door can't do "lockdown" like the minority who are able to work from home.
I have been adjacent to industrial and academic partnerships where both the university and company wanted to maintain the IP of the work and in their minds that extended to the software. Paper was published and the source code was closely guarded. I wondered how people could replicate the findings without the fairly complex system the researchers used.
> Along the way, everyone knew and when people tried to bring up concerns higher ups in the institution suppressed the knowledge.
As you are doing here. I don't see a name or any specifics anywhere in your comment. Of course not including any names or specifics also means you could be making it up for a good story. We have no way of knowing.
Presumably OP does not have hard evidence and is only relating an anecdote. There isn’t much of an incentive to waste time and resources fighting for academic integrity or some other high-minded concept like this. Barring those incentives, we’ll be left with anecdotes on forums and a continued sense of diminished trust in academia.
OP sounded very confident that the persons allegedly involved _are_, in no uncertain terms, not just definitely total frauds but also definitely engaged in a giant fraud conspiracy that definitely goes all the way to the top. If what they meant was "someone told me once that..." they could have said that instead, but they've chosen to word things very differently. At best they've drastically overstepped reasonable limits of what claims one is able to rightly make, and that assessment feels extremely generous.
Yes. Exactly. Thank you for sharing what I was going to share. Corruption exists where it is allowed by the people who act out of cowardice.
As an aside, I've worked with plenty of academics and while I sometimes thought their research area was stupidly low stakes, the only researchers that I thought were truly wasting time were the ones that had to do research for a medical degree. Basically forced research.
Now I went to a premier university and I'm friends with some smart cookies, but I don't buy for a second the overall theme of this comment chain. There is a reason the West is incredibly wealthy and it isn't because our best and brightest are faking it.
There is a lot less fakery in science than poorly-designed studies, misleading endpoints, underdocumented or incorrectly documented methods, and cargo-culting. The success of the process comes from having a good filtration process to sift through this body of work, and the idea that there will always be some people in the system doing actually good work.
That said, I have also witnessed plenty of low-level fraud: changing of dates to match documentation, discarding "outlier" samples without justification or even documentation, etc. Definitely enough to totally invalidate a result in some cases.
Your first comment was effectively "Everyone knows this person is a fraud and no one is willing to stand up and put a stop to it".
Your second comment was effectively "Well I don't know they are a fraud and I'm not going to be the one who upsets people by trying to put a stop to it".
I don't say this as a criticism of you. I say this a defense of the people you are criticizing. Stopping people like this takes a lot of work and often some personal risk. Most of us aren't willing to do it despite us pretending otherwise.
>Stopping people like this takes a lot of work and often some personal risk.
I don't recall where I first heard it, but the principle that it takes 10x the amount of energy to refute bullshit as it does to generate it certainly seems to apply in this case.
A post-doc of a collaborator of a professor of mine was once found to have doctored data. The professor and her collabs only found out when they looked at the images and found out that there were sections that seemed to have been copy/pasted. Getting the papers retracted was a gruelling process that (iirc) took over 3 years.
>The people in charge were given evidence, face much less risk, and it's their job to put a stop to it.
That is an awfully authoritative statement to make based off what OP shared.
But either way, this has been proven time and time again, whether we are talking about simple corruption like OP mentioned or more serious forms of injustice. People will judge themselves based on motivations and others based on actions. We can excuse ourselves because of the personal inconvenience that acting would cause us. But other people don't get that luxury: they get criticized purely based on their lack of action, because if we were in their situation surely we would do the right thing. Being honest about this is an important step to actually fixing the system, because we need to identify the motivations which lead to the lack of action in order to remove them and encourage action. Simply vilifying people for not acting accomplishes nothing.
> That is an awfully authoritative statement to make based off what OP shared.
Which part do you disagree with? Maybe the 'much' on much less risk?
'given evidence' is true unless OP made up the story. (And even if OP made up the story then we're judging fictional people and the judgements are still valid.)
I think 'less risk' goes part and parcel with being administration rather than someone lower rank reporting a problem.
And it seems clear to me that it's their job.
So, criticism. Which is not a particularly harsh outcome. And doesn't necessarily mean they made the wrong decision, but them providing a justification would be a good start.
Inaction is often excusable when it's not your job. It's always important to look at motivations, but vilification can also be appropriate when there's dereliction. Sometimes there are no significant motivations leading to lack of action, there's just apathy and people that shouldn't have been hired to the position.
>'given evidence' is true unless OP made up the story.
OP never mentioned evidence. People just "knew" this person was a fraud but there was no mention of the actual evidence of fraud. The closest thing to evidence is that "their work was terrible" but that isn't evidence of fraud.
>I think 'less risk' goes part and parcel with being administration rather than someone lower rank reporting a problem.
We have no idea the risks involved. It would be highly embarrassing for a prestigious school and/or instructor to admit that they admitted a fraud into their program. Maybe this isn't even the first time this has happened to these people. Would you want to be the person know for repeatedly being duped by frauds? Maybe that would that ruin your reputation more than looking the other way and letting this person fail somewhere else where you would not be directly tied to their downfall. It is also incredibly risky to punish this person without hard evidence as that can lead to a lawsuit.
These are not meant to be definitive statements. They are just hypothetical that show how we can't judge people's motivations without knowing a lot more about the situation.
This website is pretty easy to post on anonymously. It takes five seconds to make a throwaway and another three to post a name.
While I agree that there are frauds out there in academia and elsewhere, I have no reason to believe that you’re not yourself some sort of fraud. You’ve essentially posted “I have direct knowledge of [unspecified academic fraud in which somebody claimed to have direct knowledge of [unspecified academic fraudulent conclusion] but can’t back it up] but won’t back it up”
Your overall point is… what? Fraud perpetuated by cowardice? Self-interest? An overall sense of apathy towards the truth? Your comment could be construed as any of those.
There are people that love to spread fear, uncertainty and doubt without having to rely on being truthful. People that are intentionally misleading, with the sole intent of leveraging people’s biases and emotions to confirm their notions and whip them up into an artificially created frenzy, make statements like yours.
Serious questions:
1. Do you actually give a shit about this big fraud you’ve brought up but not revealed?
And
2. Why did you post?
> 1. Do you actually give a shit about this big fraud you’ve brought up but not revealed?
To expose it directly here would likely have little effect generally, but would have an outsized effect personally (damaging trust). If it did have an impact it would likely expose those involved (many careers). I am not in the command chain, I have seen the evidence and it's overwhelming. But I'm not in a position to enact the requisite change.
That said, I don't actually give a shit about this particular case. It's widespread, insanely wide spread. Most studies / work cannot be replicated.
> 2. Why did you post?
As an anecdote to highlight something that I've seen. I also linked to several other informational pieces with public accounts (so don't trust mine, fine -- trust theirs).
> To expose it directly here would likely have little effect generally, but would have an outsized effect personally (damaging trust). If it did have an impact it would likely expose those involved (many careers). I am not in the command chain, I have seen the evidence and it's overwhelming. But I'm not in a position to enact the requisite change.
Sorry, I don’t mean to pick at you but… what?
If you were to anonymously post the name of an academic fraud, you personally would necessarily be found out, and it would ruin multiple careers?
The powers that be know who you are, know of the fraud and its nature and those involved? You’ve been privy to this big juicy secret that’s shared by many, but if its content were to be revealed, you would certainly be the one pointed out?
Are you the only person that could reasonably know about and publicly object to this fraud? If so, is it a necessary function of your social or professional life to cover up this fraud? If so, I’ll go back to “why did you post?” (“Sharing anecdotes” isn’t really an answer to “Why did you share this anecdote?”)
Not sure why people are that hostile. I'm sure that identifying someone can also identify you if the circle around that person is small enough, and if said person is powerful enough, people choosing to believe them over you would end careers, yes. He'd have to say "well, I worked on this particular study with him, and the data was made up vs actually collected."
I think it's more "how dare you hint scientists are corrupt!" that kind of drives the outrage.
> This website is pretty easy to post on anonymously.
Correct me if I am wrong, but this website, and the owners are in U.S. of A. If so, an appropriate court order would force the disclosure of logs, IP addresses, accounts, etc. There are plenty of examples where sites had to disclose sufficient details to track an 'anonymous' writer down.
With multi-million (billion?) dollar endowments on the line, I would be worried to presume just a throw away account would work.
VPNs exist and are trivial to use. As is the Tor browser. Yes, it's probably compromised already, but the FBI is not going to show their hand chasing a random libel case.
People on HN are so odd sometimes. Ah yes, I must be a liar if I don't want to spill all the details my friends told me in confidence and ruin our relationship.
Sure, sometimes people lie on the internet but this is not an outlandish story.
“Best case” scenario everyone loses their funding, including the colleagues who did report this stuff. The institution, the lab, etc would lose their funding. Those wanted to get PhDs in those labs will lose theirs. It’ll have a serious negative impact on everyone, even those who did the correct thing.
This is why academia is as corrupt as it is. They all evaluate each other’s paper, give each other grants and anyone who exposes anything loses their entire careers.
The two parent comments read to me (uninvolved) as wholly throwing all people, institutions, grant selection and everything else under the bus without distinction. The drivers for that are unknowable, but I guarantee that neither these statements nor their complete complement are the truth. Proof: without sufficient distinction, nothing claimed is distinguishable enough to even weigh. Add that the selection pressure in many institutions is many orders of magnitude more than ideal, and the consequences of the outcomes similar.
I don't know, which puts me exactly where I was ten minutes ago. They have no control over the situation, and if they hadn't posted I would know even less, so it's not suppression.
They have complete control over how complicit they are in preserving the coverup of acts they allege to know are definitely true.
> and if they hadn't posted I would know even less
The reality is that right now you know exactly as much as before they posted because what they posted was unsubstantiated. They may as well have said that their name is Princess Peach and that they're pregnant with the mustached child of a certain Italian plumber for all the good it does you.
One of three things must be true:
1) They have real firsthand knowledge that the claims are true and they're actively deciding to protect the identity(ies) of a conspiracy of rampant fraudsters whose actions are so egregious that they tarnish the very essence of the scientific academy itself.
2) They don't have any real knowledge that the claims are true, and the story is rumormongering.
3) I swear I had a third one, but now that I've written those two I can't think of what it was. I'll leave this placeholder here in case it comes to me.
Of course, there are frauds in any industry/profession.
But in my experience (math, science, & engineering), it is actually far less prevalent than in other places.
Forget the overzealous #sciencetwitter people. I have found that academia is one of the rare places where people at the absolute top of their field are often actually modest & aware of their ignorance about most things.
>I have found that academia is one of the rare places where people at the absolute top of their field are often actually modest & aware of their ignorance about most things.
This view doesn't really align with how public policy gets designed - while those people exist, they aren't the problem (or they are only insofar as they aren't voicing their positions loudly enough).
Coming from a place of modesty and acknowledging limitations is not the "believe science" movement.
Eric seems lightly inclined to fringe theories and self-importance, but nothing I'd call fraud. Bret has been pushing some pretty unfortunate stuff though, including prophylactic ivermectin as a superior alternative to vaccination:
> “I am unvaccinated, but I am on prophylactic ivermectin,” Weinstein said on his podcast in June. “And the data—shocking as this will be to some people—suggest that prophylactic ivermectin is something like 100% effective at preventing people from contracting COVID when taken properly.”
He wasn't just claiming that ivermectin might have some efficacy against SARS-CoV-2 (possible, though I doubt it), or that the risks of the vaccine were understated to the public (basically true; but it's a great tradeoff for adults, and probably still the right bet for children). Bret was clearly implying that for many people--including himself, and he's not young--the risk/benefit for prophylactic ivermectin was more favorable than for the vaccine. There was no reasonable basis for such a belief, and the harm to those who declined vaccination based on such beliefs has become obvious in the relative death rates.
The first article I've linked above is by Yuri Deigin, who had appeared earlier on Bret's show to discuss the possibility that SARS-CoV-2 arose unnaturally, from an accident in virological research. This was back when that was a conspiracy theory that could get you banned from Facebook, long before mainstream scientists and reporters discussed that as a reasonable (but unproven) hypothesis like now. So I don't think Bret's services as a contrarian are entirely bad, but they're pretty far from entirely good.
What's fraudulent about them is not their papers (there are none of any relevance to speak of) but their character. They are both self-proclaimed misunderstood geniuses who have been denied Nobel Prizes in spite of their revolutionary discoveries (in three different fields: Physics, Economics, and Evolutionary Biology). In actuality they are narcissistic master charlatans with delusions of grandeur.
Then the comment by ummonk above is off topic because there is no credible claim of scientific fraud. Lots of people are blowhards with odd opinions. So what?
And the ones who masquerade as having Nobel-worthy research chops to get the audience to believe their gripes about the scientific establishment are on topic enough.
It seems you don't like them for some reason, but complaining about the scientific establishment isn't fraud and has nothing to do with censorship. So what's your point?
One claims to have discovered a Theory Of Everything, putting forth a paper riddled with mathematical errors and with the caveat that it is a "work of entertainment". The other claims to have made a Nobel worthy discovery that revolutionizes evolutionary biology.
I wouldn't use the word 'charlatan' so freely. Without commenting on the validity of their work, the 'mainstream opinion' about them is most certainly negative. Yet instead of pivoting elsewhere, they stick by their convictions. They might be right or wrong in their opinions, but they are hardly doing it to win any favours. And it's certainly not wrong to stand by something you believe even when the mainstream discredits you; time will tell who was right.
I've got no dog in the fight, but I've listened to some podcasts by Bret Weinstein and Heather Heying, and compared to the absolute nonsense I see on TV today, it is a breath of fresh air. They're reading scientific articles, discussing implications in long form, and have been open and honest about their mistakes.
I have not seen or heard of Eric Weinstein so I can't comment.
I'm not sure what kind of bar you're using to compare your chosen media, but it seems extremely high, and I'd like to know what you consider to be suitably informative.
They really haven't. They continued to double down on ivermectin and other COVID-era flim flam as it came to light that more studies were dodgy or outright fraudulent. It's pretend science theatre from a former small-university lecturer who managed a couple of research papers in 20 years. For example, telling the audience with a straight face that it doesn't matter if the studies going into a meta-analysis are biased, because the errors will cancel out.
>There are many examples, at the end of the day... the people in the institutions will protect themselves.
This extends well beyond the science itself, for what it's worth. It's an open secret in every department that professors X, Y and Z have the same level of sexual restraint as Harvey Weinstein. It doesn't stop the "woke" institutions doing everything they can to protect these professors' reputations, though.
> Their lab mates tried to raise concerns and nothing happened. That person graduated, with everyone on the committee knowing the issues.
I understand that after some time it would be an embarrassment for the department, because they had hired and vetted them, etc. But why was it tolerated initially? Was it a case of nepotism? It seems they must have had some kind of special privilege or status to skate by so easily despite the accusations.
I agree that there is an integrity problem in some areas of research (due to incentives), however the Eric Weinstein reference is laughable. I'd consider him a prime example of pseudoscience for dollars / influence with zero credibility. Not saying he can't be on the right side of an issue sometimes, just that he's a particularly untrustworthy source, and the way he gets to his conclusions is just... wow.
> Most cs work is equally useless. It’s all a farce.
Hey, careful there: it must depend on the area, because all the type theory and functional programming research comes with (and always has come with) proofs and working code. It couldn't be more rigorous and useful.
It's almost like our fav person Ayn Rand got that right [0]. What is pure science? It's little without a consciousness, and there are a lot of perverse incentives. For one, I found the requirement to publish (something "worthwhile", which is subjective) during my PhD so stupid. You can work hard and cleverly in science for four years yet only debunk stuff, or not find anything at all (though that by itself should be publishable!), or you can get lucky and publish a lot. All the publish-or-perish pressure does is force subpar stuff into the community. Like you two, I've also grown a bit disillusioned with non-applied science as we have it.
Edit: I don't want to say (like Ayn Rand, who can be pretty black and white) that it is all bad and we should do away with it, but it's something we should be very aware of and try to build in mechanisms to protect ourselves from these effects.
Probably; I was being sarcastic. If you've been on HN for some time, you know that Ayn is not universally loved here. At times this is ironic, because her views match quite well with the opinions of people here wrt government involvement in the market, love, and now, indeed, state-sponsored science.
Now think about research on chemicals: everybody has a different source and different quality control (most academic labs do zero QC on incoming chemicals). I have bought chemicals from major and minor vendors, and I could tell you all kinds of horror stories... wrong molecules, inert powders added to increase weight, highly toxic impurities... Now add that to assays and academics that have been optimized for years to scream "I HAVE A NEW DRUG AGAINST X" anytime they stare too long at the test tube...
This is absolute baloney. I've ordered numerous research grade chemicals from multiple suppliers and not once has any of them been the wrong one nor outside of stated purity grade — and I regularly checked, since it's standard practice. If a solid organic material is in a lower grade of purity it is typically recrystallized.
Now, yes, impurities — even minor ones — can have significant effects. But that tends to arise only in rare circumstances, and chemists are quite aware of the need to check for it where it matters most, such as in catalysis research.
No one is going to scream "I have a new drug" for something for which the composition is unclear.
I don't know what world you live in, but it isn't one of a typical North American nor European university research lab.
Checking the quality of chemicals entering the lab (NMR, MS, IR...) was my job, and over 15 years I saw dozens and dozens of cases. Nowadays most labs call HPLC with UV sufficient for quality analysis; lots of things looked "fine" that way, that's for sure. Note that I was in the drug discovery world, not the inorganic chemistry world, where things are usually of much better quality.
There’s also a decent podcast by Bret and Eric Weinstein that dives into a similar story https://youtu.be/dJNjH4SP6vw?list=PLq9jO8fmlPee9ezOraOHAJ3g9...
There are many examples, at the end of the day... the people in the institutions will protect themselves.