Not just that, but models that are “independently trained” by a community of people using some kind of distributed p2p method.
If this LLM stuff is as important as it is made out to be (and I think it is), it is absolutely crucial that it isn't controlled by just a handful of large corporations and tech oligarchs. A world where everybody needs LLMs and the only source is gigantic tech companies would be incredibly dystopian.
Plus I seriously doubt the true innovation on these things will happen until the “little guy” can train and infer their own models. Right now these things are playing it way too safe. We should be reading articles about people using home-built LLMs to do crazy shit that sticks it to “The Man”. That’s how transformative technology changes things. It challenges the status quo. The only “quo” being challenged right now is some boring mega tech company peeing in some other mega tech company’s cheerios. Yawn. Wake me up when this technology threatens the entire fucking system (and not fake “AGI will take over all white collar jobs”… that is just corporate propaganda).
… I mean, remember Napster and all those file-sharing companies? Or the million iterations of Pirate Bay? Where is the LLM equivalent of that? Where is the “Linux” of LLMs that freaks out all the tech companies? Or the dark web of LLMs that attracts the eye of every three-letter agency in the world? Where is the revolution? It’s just a bunch of huge tech companies safely jerking each other off wearing three layers of protection with their corporate lawyers on speed dial. How completely boring.
From my testing, I see that one still shuts down Copilot completely. It may mean "late" in French, but it's also often used (at least where I live) to mark SR/slow-release versions of drugs. Apparently even writing software for pharmacies is now immoral and should be blocked... :D
Seems to work again. But how does this happen in the first place? How could someone possibly have thought "hey, I have an idea, let's put in a list of English words and just silently stop working if we see even one of them in a substring"? And people in that meeting would nod and say "yeah, that sounds like an easy safety fix, let's do that". This just feels odd. This isn't a piece of forum software written by a 14-year-old; this is a company worth billions and filled with smart people.
Companies that are, ironically, filled with privileged people who feel they "gotta do something", since they, while sort of well-meaning, are sheltered and disconnected from actual social struggles in real life.
This reminds me of rental car companies deciding they should switch their vehicle fleets to electric.
A few months ago, I visited San Jose for a wedding. When I picked up my rental car at the airport, the only options were electric, even though I had specified that I wanted an ICE vehicle at the time I made the reservation. During the four-day trip I wound up visiting five different charging stations (some of which only slow-charged, so they weren't able to replenish the battery in the time available), and I had to install three different apps. I still have like $20 of unused credit between them. I spent several hours waiting for the car to charge, not to mention making major detours looking for a fast-charging station. If I were to guess a part of the world where you'd expect to find the best possible electric vehicle charging infrastructure, San Jose wouldn't be far off the mark. But my trip wound up being dominated by range anxiety.
I drive a PHEV (plug-in hybrid) for my commute from home to work and I love it. Electric is the future, and it's great that electric vehicles and charging stations are becoming more common. But renting an electric car in a strange city today is about the worst possible scenario for a short range vehicle. You don't know where the charging stations are, the charging stations require different apps, your hotel might not have a charger, and so on. The people making decisions at car rental companies should know this!
When a feature is both conceptually flawed and technically unworkable, the real question is why it still shipped. Engineers typically push back—unless job security concerns make silence the safer option...
And when the decision-maker can also steer the narrative...say...by mobilizing downvotes...The outcome is predictable. :-))
The AI industry is concerned about the fact that the world will consider them to be basically endorsing everything their AIs say. Thus, they are very afraid of there being a situation where you write "gender: 'm<|>'" and hit autocomplete at the <|> and end up with something like "gender: 'male as is normal'" or "gender: 'male', 'female', 'wrong'" or any number of other bad situations.
They are not being randomly paranoid. Even if they did not have this fear, they would have rapidly developed it. We've all read the articles by muckraking journalists who take something an AI said and deliberately write clickbait about how stupid or evil or worthless the AI is, even if the journalist had to filter through hundreds of replies (or, implicitly, by waiting for the dumbest stuff to rise to the top of social media, thousands or millions of replies) to get it. We've also read the articles wherein someone uses the "fancy autocompleter", feeds it the moral equivalent of "Hey, how do you think you AIs will be taking over the world in five years?", is then shocked, shocked at the "fancy autocompleter" filling in the yarn they are clearly asking for, and goes running to either the media or, in particularly pathological cases, the academic literature making wild claims.
(I do not believe that "fancy autocompleter" is a complete description of LLMs, but in this particular case, it isn't a completely inaccurate mental model either. It shouldn't be a surprise that when you prompt it with X, you get more X.)
As a result the AIs are very heavily tuned to some combination of the political beliefs of the company writing them and the political beliefs dominant in the media coverage they are worried about, so they won't get very negative stories written about them. For this purpose, I'm taking the broadest possible definition of "political", not just "American politics in 202x", but the full range of "beliefs that not everyone agrees on and are things people are willing to exert some degree of power over". The AI companies have to take a stand, because taking a stand at least means someone can be on their side... if they just let the chips fall where they may they'll anger everyone because everyone can get the AI to say things that they in particular disagree with and they'll find themselves without friends. Unsurprisingly, the AI companies have been aligning their models with what they perceived to be the largest, most powerful political beliefs in their vicinity.
To be honest when I read them talking about "AI safety" I know they want me to be thinking "ensuring the AI doesn't take over the world or tell people to commit self harm" but what I see is them spending a lot of effort to politically align their AIs, with all that entails.
Your last paragraph is very true, and is the biggest scandal in AI. Unfortunately, we have let the group of people who believe that mere exposure to ideas one disagrees with can cause significant harm make the rules, so we see more and more idiotic things like this.
I have a simple question. If censorship is considered evil regarding the written word and communications between humans, why do we want to then censor LLMs differently? It is either counterintuitive or simply a false concept we should abandon. Perhaps it is more about training, similar to how children are 'monsters' and need to be socialized/tamed.
Because of the obvious PR implications of having a program one's company wrote spewing controversial takes. That's what it boils down to - and it's entirely reasonable.
Personally, I wish these things could have a configurable censorship setting. Everyone has different things that get under their skin, after all (and this would satisfy both the pro-censor and pro-uncensored groups). It's a good argument for self-hosting, too, because a local model can be filtered to your own sensitivities.
That would help with cases where the censorship is just dead wrong. A friend was working with one of the coding ones in VS Code, and expressed his frustration that as soon as the codebase included the standard acronym for "Highest Occupied Molecular Orbital" (HOMO) it just refused any further completion. We both guessed the censor was catching it as a false positive for the slur.
> We both guessed the censor was catching it as a false positive for the slur.
There is a name for this: the Scunthorpe problem. It is named after the incident in which the residents of the town of Scunthorpe could not register for an AOL account, because AOL had an obscenity filter that did not allow the town's name.
It has been a problem since 1996 and still causes trouble today.
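For the unfamiliar, here's a minimal Python sketch of why naive substring filtering produces these false positives. The blocklist entry and function names are purely illustrative; nobody outside Microsoft knows what Copilot's actual list or matching logic looks like:

    import re

    # Hypothetical one-entry blocklist, purely for illustration.
    BLOCKLIST = ["ass"]

    def naive_filter(text: str) -> bool:
        # Substring check: flags innocent words like "class" and "pass".
        lowered = text.lower()
        return any(bad in lowered for bad in BLOCKLIST)

    def boundary_filter(text: str) -> bool:
        # Whole-word check: avoids Scunthorpe-style false positives,
        # though it still can't judge context.
        return any(re.search(rf"\b{re.escape(bad)}\b", text, re.IGNORECASE)
                   for bad in BLOCKLIST)

    print(naive_filter("class Shipment: pass"))     # True  -- false positive
    print(boundary_filter("class Shipment: pass"))  # False

Note that even the word-boundary version wouldn't save the HOMO case, since there the entire token collides with the slur; only understanding the context would disambiguate.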
That's the right answer. And it's not like this is a potential risk that is only being theorized about. Microsoft already has very hands-on experience with disasters of this exact nature:
I censor myself all the time when speaking, depending on the context and who I’m speaking to. I do it because it’s typically in my best interest to do so, and because I believe that it is respectful in that moment. The things I say represent me. I don’t find it too surprising that a company might want to censor their own AI product as it represents them and their reputation.
If I make a word processor, it doesn't need any stance on the Israel/Palestine conflict. It's just a word processor.
But if I make an LLM, and you prompt it to tell you about the Israel/Palestine conflict? The output will be deeply political, and if it refuses to answer, that will also be political.
The technology industry does not know what to do because, unlike industries like journalism and publishing that are used to engaging with politics, a lot of the norms, power structures, and people in tech think we're still in the 1990s making word processors, no politics here.
Government censorship and having policy on what employees/representatives of your companies say are two different things.
There are a lot of things I can say as a citizen that would get me fired from my job, or at least a talking-to by someone in management.
At the moment at least these LLMs are mainly hosted services branded by the companies that trained and/or operate them. Having a Microsoft-branded LLM say something that Microsoft as a corporation doesn't want said is something they will try to control.
That's also different from thinking that all LLMs should be censored. You can train or run your own with different priorities if you wish. It's like how there's a lot of media out there that you can consume or create yourself perfectly legally that isn't sold at Walmart.
> If censorship is considered evil regarding the written word and communications between humans, why do we want to then censor LLMs differently?
Simple: the people who are very pro free speech (i.e. "censorship is evil"), and the people who want to censor LLMs are distinct groups (though both groups are vocal).
They're not being censored by the state in this case (unlike the gotcha everyone keeps using when deepseek is mentioned). They're being limited by their developers for brand safety.
That's easy. The person who used an LLM to generate the text and then published it as their own without even proofreading it is liable for the text they published.
I think that one difference is friction. Communication in the real world takes more effort to spread and has more limitations on the ability to scale that spread. E.g. it costs money to put up billboards and someone standing on a soapbox in the town square can only reach so many people. That friction provides greater opportunities for cooler heads to prevail and greater opportunities for people to counter questionable narratives.
It's somewhat similar to the laws some places have against providing free alcohol. Alcohol is still legal and abuse still happens. However, at least requiring people to spend money provides some friction to prevent things from escalating too much.
Beyond even PR concerns (which you could, in principle, ignore as silly, though of course in practice they are a significant hurdle), you also need to consider that free speech even for people is not absolute. If an LLM responds to a child's query with sexually explicit content, that likely breaks the law, and the company is liable for that. Similarly, if an LLM generates libelous statements about a real person when prompted to describe that person, the company is liable. If an LLM starts generating medical advice or legal advice, that might break certain laws as well (though perhaps some reasonable disclaimers could fix this too).
Because the people in charge are morons and the public is full of morons and the morons in charge are scared that the morons of the public will get mad and make a big deal about some inconsequential bullshit like "your llm said a naughty word!!"
I think you'll find that censorship is actually quite popular. The censors and the censorship-hungry know the connotations of the word "censorship." Nobody will say with their chest, straight up, that they believe censorship is a societal good and make a robust case for it. It's all coy fuckery to lie to us and to themselves that they're not censors.
This is really interesting. We have a company that claims to have an AI that can reason about text and that same company uses an old school hard coded censorship list. When a company doesn't use their own products, it usually tells you the product isn't up to the task.
You're strawmanning MS: pointing out a place where they don't use their product doesn't "prove" they don't use it at all. They very probably use it somewhere else, and decided that this particular functionality would be better served by an "old school hard coded list", which is also a very viable choice in many cases.
I didn't intend to imply that, but I see how my wording was unclear.
I mean that LLMs don't appear to be up for these censorship-like tasks. The evidence being that a highly visible team using LLMs uses much older tech for a highly visible function. It's useful to know the limits of tech, especially novel tech, and this use case appears to be one.
Indeed, I didn't understand your first post that way. What you just wrote is much clearer. Perhaps the tech decision was also influenced by the fact that this highly visible function is also highly sensitive.
One of the companies I worked with developed software for drug rehabs. Copilot would just constantly stop autocompleting whenever I went into code files that mentioned anything to do with drugs (or sex - that's not a new censor!). It's the main reason I switched to Supermaven (before they got bought out and gave up on updating their extension).
Same. I am back to Copilot sadly. I don't know that there are any other good alternatives for autocomplete (good as in: so much better than Copilot that it's worth switching to).
It feels like nobody's working to improve the autocomplete/copilot experience, everyone's focused on the "chat with the code and get AI to make all the changes for you" instead of "I know what I'm doing, just predict what I'm about to type and save me the effort of typing it out".
The real-world scenario that happened to us was that it completely misinterpreted words. For instance, it stopped working because our file contained the word "retard". This was a localization file for the French translation, where "retard" translates as "late". We needed to change the entire order of the file to avoid the problem and keep the auto-complete.
I can see why model providers wouldn't want to generate certain text, for their own interests.
But let's not pretend it's for the benefit of users. If a company could release an unmoderated model without real risk to themselves, then they should do so.
Probably not that useful for preventing the use of AI; just prompt it about generic shipments instead. In a real app the contents of the packages shouldn't be hard-coded anyway. It might stop the really stupid who can't figure that out, though.
• My dad had a story about an all-staff memo about an "African-American tie event".
• I had warnings from Apple about using "Knopf" in a description, which can only have come from an English-language bad-word filter whose entry for "knob" was literally (and inappropriately) translated into German; "Knopf" isn't at all rude in German.
Regardless of the why and the politics involved, having a coding assistant block on certain words, especially on common fields like gender, is going to disqualify it for many projects. The fact that a software project uses certain words doesn't imply anything about the political views of, or potential abuse in, that project. It could be that the project itself wants to censor the same words in content it handles, but now you can't use Copilot on that list. Or it might be a list of words the project wants to promote. You can't know. Making these sorts of assumptions in a coding tool is a bad idea, no matter which angle you look at it from.
But I'm not a huge fan of relying on online coding tools. Has anyone tried running Deepseek locally and use it for coding?
Cisgender puberty blockers DEI woke gay homosexual California save transgender children Estradiol fascism Critical Race Theory liberal China radical leftism
On a slightly related note, does anyone feel copilot is not as powerful as it used to be? It's kind of hard to pinpoint exactly what it is, but it feels like it often either does nothing, or generates the bare minimum.
I think one concrete thing is writing a comment about a function, then expecting that function below the comment, but instead you get more comments.
Seems the trend with most LLM tech is that once the users are there, the model is quantized or downgraded to the bare minimum level of usefulness, either silently or through new versions that are just not better in real use.
Yeah, I'm reluctant to trust these paid online LLMs. I want to run my own, but they're all far too big for that. Except Deepseek, which can apparently run on a Pi with an eGPU, and is apparently better than ChatGPT at coding. So running that locally should be possible and helpful and make you totally independent from whatever shenanigans these AI peddlers want to impose on you.
Agree 100%. I noticed that on the Copilot settings page [1] you can switch to the Claude Sonnet model (instead of a model trained by GitHub, I assume?). In my experience this improves things.
I have literally never gotten my Copilot integration in Visual Studio to provide anything usable. I don't mean this hyperbolically. I literally don't understand what we are paying for.
I've only tried copilot once and this is exactly my experience. I'd write a comment to try to prompt it, then it would keep writing the comment instead of the function. Best I could do was make it write the function as a comment then uncomment it.
Time to buy some beefy professional GPUs with plenty of VRAM and run models locally (for example, with Ollama and Continue.dev), to get around things like this.
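For what it's worth, here's a rough sketch of what querying a locally hosted model through Ollama's HTTP API can look like. It assumes `ollama serve` is running and that you've already pulled a code model; `deepseek-coder` below is just an example model name, substitute whatever you have locally:

    import json
    import urllib.request

    # Ask a locally running Ollama server for a completion.
    # No hosted service, and no server-side word filter, involved.
    payload = {
        "model": "deepseek-coder",  # example; use any model you've pulled
        "prompt": "# Python function that validates a gender field in a form\n",
        "stream": False,
    }
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])

Continue.dev can then point your editor at that same local endpoint, so the completion loop never leaves your machine.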
I assume that it will stop working if your code does some nuclear physics calculations /s.
That was a joke; actually, I have used it for that. But not for the cool nuclear stuff that you see in the movies.
And if you don't get the joke: it's about the o3-mini model card having a couple of pages on how they prevent it from answering some nuclear and radiation questions.
I remember reading the Apple QuickTime EULA [1] (changed from all-caps):
> The Apple software is not intended for use in the operation of nuclear facilities, aircraft navigation or communication systems, air traffic control systems, life support machines or other equipment in which the failure of the Apple software could lead to death, personal injury, or severe physical or environmental damage.
Does this have anything to do with Trump & Musk recently banning everything non-binary and terms like "diversity" from government? I guess it was about time to fix this.
Yeah, for sure, my friend. The populist election fear of "the gays", which led to the election of a guy who bribed porn stars, and of another who wasn't elected but whom you yourself already recognize as the true ruler, and who raised a transgender daughter. They are already omnipresent as gods, impacting every aspect of everyone's life wherever a pronoun may appear.
You are seriously underselling the impact that events like this have on planning. Any company worth their 2 cents has contingency plans ready for several situations - e.g. if President A wins vs. B - if there is enough plausibility that it might come with serious impact to the business.
> Any company worth their 2 cents has contingency plans ready for several situations
My guy, most major companies struggle to get a quarterly plan in place for what they're going to do by the start of the following quarter. There may be some department focused on lobbying and political risk that cares about this stuff, but there are no product and engineering teams at any company that have "contingency plans" just laying around to be activated for random shit like this. Nobody has time for that.
Not to mention, there's apparently some research saying code with swear words has higher quality, so if AI causes some decline there, now we know why: https://www.reddit.com/r/programming/comments/110mj6p/open_s...