Github has had stunning stability. I bet their architecture is nontrivial, that dozens of parts of their code are tied intricately to their architecture, and that opensourcing only the bits that will run on typical AMIs or on debian^Wdevuan would be little more than opensourcing helloworld.rb.
Right and I don’t want to discount that. The scale they’ve achieved and the team that delivered it are incredible but presumably the people who want it to be open source won’t need any of that.
They’ll be running it on a raspberry pi, not hosting the Rails or NodeJS repo.
I would describe GitHub's real "secret sauce" as the issue-tracking, wikis, project boards, and release management parts, that don't get represented in the repo itself.
Which is to say, if you wanted to commoditize GitHub (which is basically what "open-sourcing your secret sauce" means), you'd have to create some sort of library that allowed you to treat a git repo + all those other things as one structured data-object. You would be able to use said library to both operate on all those pieces of data locally; and to sync them between different Git hosting services that all share those features.
Or, better yet, figure out a way to put all those features into git itself, so that every git repo automatically transports those pieces of data alongside itself.
I meant the implementation of these side-features as objects sitting in the repo, with appropriate client commands for creating and editing them, etc.
Git already has notes, and signed commits/signed tags, which are all those same kind of "objects that just happen to be there." So they don't need to copy Fossil's architecture; they can just copy the way that said object types interact as dependents of commits (while letting them get blown away when commits themselves do.)
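To make the "issues travel with the repo" idea concrete, here is a minimal sketch. It is not how GitHub, Fossil, or git notes work: it simply stores each issue as a plain JSON file in the working tree, so any ordinary clone or push would carry it along. The `.meta/issues/` layout, the `Issue` fields, and both function names are hypothetical, invented for illustration.

```python
import json
from dataclasses import dataclass, field, asdict
from pathlib import Path

# Hypothetical layout: one JSON file per issue under <repo>/.meta/issues/
ISSUES_DIR = Path(".meta") / "issues"


@dataclass
class Issue:
    number: int
    title: str
    body: str = ""
    state: str = "open"
    comments: list = field(default_factory=list)


def save_issue(repo_root, issue):
    """Write one issue as a JSON file inside the repository tree."""
    directory = Path(repo_root) / ISSUES_DIR
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{issue.number}.json"
    path.write_text(json.dumps(asdict(issue), indent=2, sort_keys=True))
    return path


def load_issues(repo_root):
    """Read every stored issue back, ordered by issue number."""
    directory = Path(repo_root) / ISSUES_DIR
    if not directory.is_dir():
        return []
    issues = [Issue(**json.loads(p.read_text())) for p in directory.glob("*.json")]
    return sorted(issues, key=lambda issue: issue.number)
```

A real design would still have to decide how such files relate to commits and branches (the "dependents of commits" point above), which this sketch does not attempt.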
Maybe "secret sauce" is the wrong term. People ended up choosing GitHub purely because of network effects, I think. But those features are the "lock-in" preventing individual projects from easily migrating away.
If Git repos just "had" wikis, issues, etc. inside them, the lock-in wouldn't be there, so people would be switching between Git hosts all the time—and there wouldn't really be much value in a "git host" at all, beyond what just having a Git dir on your own server, plus a native-GUI Git client supporting the wiki/issues/etc. features, would get you.
Maybe there’s no secret sauce of any value and it’s just pointless github is closed source, like a form of DRM just being used against us cause we’re silly and let it be so even as we aspire to see open source flourish...
I started working on this actually. https://github.com/ioquatix/relaxo is a document database built on top of git. I recently added support for using a specific branch, so it should be trivial to allow a web app doing the things you suggest to store/run within the same git repo.
The problem with trying to commoditize those additional features and create a standard is that although these different services have similarities, they also have subtle differences. I don't know if the companies that create these products or their users would want to standardize, because the differences exist precisely because people have different preferences.
Exactly. I wonder how people would react if Microsoft made the GitHub source public (and available for further extension by the community) but didn't grant the license to redistribute/deploy it anywhere else. Would people be content with that, or would they be grabbing more pitchforks and protesting Microsoft's actions as some kind of toxic evil? I'm skeptical that that would make people happy, and if it doesn't, then it goes to show the ulterior motivation isn't actually just to extend the platform and "scratch my own itch", as they put it. It's to let themselves move off GitHub.
It's more than that. GitLab raised the bar here. Being able to run GitLab CE internally has de-risked the decision to test internally. For the next wave of customers in the space, familiarity with GitHub open source isn't enough.
Your reasons against "why" read more to me like "why not" as you listed things unaffected by whether they open source or not. The benefits for "why" are the same for all open source software and are not unique to this situation. The only question is if the costs are too great.
If Github was made with micro services architecture, it could be split into open source "Client side + Test-backend" and closed sourced "Production-backend". The backend can be composed of an interface and multiple implementations, such as a test impl and a prod impl. “The secret sauce” could be the "Production-backend", and MS has good in-house talent operating Azure, so there is no need for help from the OSS community to improve the backend.
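A rough sketch of that "interface plus test impl and prod impl" split, purely as an illustration. Nothing here reflects GitHub's actual code: `IssueBackend`, `InMemoryBackend`, `ProductionBackend`, and `render_issue_list` are all hypothetical names.

```python
from abc import ABC, abstractmethod


class IssueBackend(ABC):
    """Hypothetical interface that open-source client code would program against."""

    @abstractmethod
    def create_issue(self, repo, title):
        """Store a new issue and return its number."""

    @abstractmethod
    def list_issues(self, repo):
        """Return the issue titles for a repo, oldest first."""


class InMemoryBackend(IssueBackend):
    """Test implementation that could ship alongside the open-source client."""

    def __init__(self):
        self._issues = {}

    def create_issue(self, repo, title):
        titles = self._issues.setdefault(repo, [])
        titles.append(title)
        return len(titles)  # 1-based issue number within the repo

    def list_issues(self, repo):
        return list(self._issues.get(repo, []))


class ProductionBackend(IssueBackend):
    """Stand-in for the closed-source implementation; deliberately left unimplemented."""

    def create_issue(self, repo, title):
        raise NotImplementedError("closed-source production backend")

    def list_issues(self, repo):
        raise NotImplementedError("closed-source production backend")


def render_issue_list(backend, repo):
    """Client-side code depends only on the interface, not on a concrete backend."""
    titles = backend.list_issues(repo)
    return "\n".join(f"#{n} {t}" for n, t in enumerate(titles, start=1))
```

Contributors could run `render_issue_list(InMemoryBackend(), ...)` in tests without ever seeing the production side, which is the point of the proposed split.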
The last sentence of the open letter says: "Hopefully none of these are a surprise to you as we’ve told you them before. We’ve waited years now for progress on any of them. If GitHub were open source itself, we would be implementing these things ourselves as a community—we’re very good at that!"
I think that's what the OSS community wants, and not only me.
This just isn't possible. No company in their right mind would take their monolith and rewrite a bunch of stuff just to please a couple randoms on the internet.
I think you're drastically underestimating the amount of code Github is powered by and how freaking long any type of refactor/rewrite would take. We're talking about years.
Oh, was Github a monolith? I had not read any articles about their architecture, so I assumed that Github could be composed of micro services, and in that case splitting would be easier. I edited to "If Github was made with micro services architecture,..."
Does anyone have a link to any interview or article talking about the granularity of their architecture?
I've been reading the Github Engineering blog posts related to their architecture: https://goo.gl/amkJfV As far as I've read so far, Github looks like it is composed of micro services, or at least split into many distributed services.
Github has been a well known Rails monolith. Absolutely they have services that power all sorts of stuff (they have a few blog posts on it) but it's just impractical to even start a discussion on splitting it up for the reasons you initially argued for.
I've been through a number of large rewrites/reworks that took monoliths much like Github (with many many many services behind it) and split them up into modular pieces and it's an insane amount of work that can take years. You simply need very good reasons (including business reasons) to do that.
Moreover, companies at these sizes just have a LOT of code all over the place. Tooling, infra, supporting services, etc... Not to mention it's just not useful to have external contributors for a business product like Github. Doing code reviews, addressing bugs that were introduced, spending time discussing things with contributors takes an incredible amount of time.
Basically if the reason you want Github open sourced (and reworked into some weird architecture you described) is so that people can contribute to fix things and add features....Github could/will just hire more devs to work on that.
Thinking about the fact that Github has Github.com (production service) and Github Enterprise (self hosted), could it be like this?
Github.com = Github Core + Production Services and Infra
Github Enterprise = Github Core + Services needed for Self Hosting
Maybe it's not worth proceeding based on an assumption, but if there is something like a "Github Core" codebase shared between prod and self-hosted, could open sourcing the core be an option?
Thanks TheHydroImpulse for the insights. So Github IS a monolith. It really makes sense that splitting is too much work just for open sourcing, and I do not see the business gain in investing money and people into it.
Then a possible path might be isolating the least coupled (and small) components of the client side code and open sourcing those?
It's kind of funny how Open Source is now a strategic initiative of many big tech companies. They open source projects for a variety of reasons: 1) to attract tech talent, 2) to "build a community" (aka get customers to become unpaid tech support), 3) to increase quality by increasing eyes on lines of code, 4) to reduce cost of maintenance by allowing the community to provide development work, 5) contributing back to existing tools allows companies to re-use technology rather than having to roll their own, 6) defeat competition by releasing a supported product for free.
The last one is a really powerful move for a software company. Normally, people will use your software if it's free, but if there's performance or other limits, it decreases the number of potential users. External open source projects then pick up the user base you might have acquired later. By open sourcing your own product, you effectively eliminate the market need for the external projects, giving you back your user base (which helps your community, potentially leads to sales, etc).
For some companies this would be cutting things very close in terms of profit margins, but Microsoft has enough other income that it doesn't need to worry about an occasional loss. It has way more to gain from leveraging this product in all its other tech offerings, even if they made zero money off enterprise support.
>Microsoft open sourced Xamarin after acquisition and it was good for community.
I don't think Microsoft's previous open sourcing of Xamarin is a good indicator and may lead us to misunderstand MS's strategy. To me, it would be very out of character for MS to open source Github.
Yes, MS has released things like the C# compiler (Roslyn), Visual Studio Code (Javascript Electron app), and Xamarin to the open source community.
But, MS has not open sourced CodePlex, Visual Studio Team Foundation Server, Skype, Linkedin.
I see a difference between "programming tools" and "collaboration platforms" and Github is in the 2nd category. I see no strategic reason why MS would pay $7.5 billion for Github to just turn around and open source it.
Xamarin is an easy to use leading indicator because of Nat Friedman's involvement in both. Simply in choosing Nat Friedman as new CEO it is hard not to wonder if Microsoft at least has the intention to explore Open Source as an operating option for GitHub moving forward.
> But, MS has not open sourced CodePlex, Visual Studio Team Foundation Server, Skype, Linkedin. [...] I see a difference between "programming tools" and "collaboration platforms" [...]
They've open sourced some of those "collaboration platforms". It's not a clear black & white situation, I don't think.
A lot of what might possibly be considered "secret sauce" bits of the Azure stack are open source, and you could possibly cobble together your own mini-Azure if you needed to and for some reason it wasn't just cheaper to buy Windows Servers with Azure Stack out of the box or even to just use Azure's existing cloud.
The Azure Functions Host is the biggest example off the top of my head that a lot of people imagine would be closed source but is open source.
(Another example, which I've used over the years for different reasons, is Azure's "Kudu" website deployment engine.)
I don't know if there's a cut/dried point where Microsoft might currently be drawing the line between its closed source stuff and open source, but "programming tools" versus "collaboration platforms" doesn't seem to be it (even before getting into semantic arguments about the fuzzy boundary between such categories).
That said, there probably is no obvious strategic reason for Microsoft to open source GitHub at this point, and maybe all the money that was spent on the purchase is plenty more reason not to.
But Xamarin is an interesting leading indicator, and if there is a person to put in charge of GitHub with any interest in exploring the possibility of at least open sourcing more of GitHub, even if never quite "all" of GitHub, it is probably Nat Friedman.
Sure I get those examples of "bits and pieces" being open source. Likewise, Google open sources lots of things like Protobufs (BSD license), CityHash (MIT license), and Kubernetes (Apache license) -- but they don't open source their crown jewels of proprietary source code collaboration, the Google Cloud datacenter management stack, and of course, their latest iteration of PageRank. A lot of those Azure github repos I see are open source examples for client SDKs as opposed to building a full clone of Azure that's equivalent to RedHat's OpenStack.
Those limited examples didn't seem to be the spirit of Joel Handwell's wish. I think we can presume that he wants to download the entire Github source code, compile it, and self-host it like GitLab. I could definitely see MS open sourcing bits & pieces of Github but still not give away the entire stack.
To me, it looks like MS is taking Github in the direction of a hosted service for full Application Lifecycle Management. (For example, add more features to compete with Jira.) Github-Enterprise could possibly eventually overtake Microsoft's own Team Foundation Server as the preferred release management tool. It adds to MS portfolio of other cloud services like Office 365. Likewise, if we ask for MS to "please open source Microsoft Excel", it's probably not going to happen because it's not compatible with their strategy of selling Office 365 subscriptions.
>Those limited examples didn't seem to be the spirit of Joel Handwell's wish.
My wish is the same as that of the signers of the open letter dear-github (https://github.com/dear-github/dear-github), and it is not to host an open source github on a server other than github.com. I'm just hoping to see requested features implemented sooner through cooperation with the OSS community. I wish to still use github.com after it's open sourced. So for me, open sourcing a limited set of components can be a starting point, and the open source effort does not need to go all the way to the production backend, which consists of distributed, performance-optimizing hacks that usually do not directly affect UX/UI.
Right, that's mostly what I think. We might see a lot of individual puzzle pieces open sourced, but not see the top-to-bottom "full stack" overview/instruction set/orchestration tool for all of it. Like I said, you could probably cobble together most of Azure if you tried, but it's like getting a shuffled collection of Ikea flatpacks without any instructions and no allen wrenches. That's certainly a model GitHub could follow here.
> To me, it looks like MS is taking Github in the direction of a hosted service for full Application Lifecycle Management. (For example, add more features to compete with Jira.) Github-Enterprise could possibly eventually overtake Microsoft's own Team Foundation Server as the preferred release management tool.
In terms of that speculation I disagree. I don't see Microsoft adding Jira-like features to GitHub when they can already encourage people to use VSTS (Visual Studio Team Services) Work Items if they want the high touch issue tracking. There's already flows between GitHub's Issue tracker and VSTS, and that seems likely to increase. Which is very similar to the existing separation we already see Atlassian makes between Bitbucket Issues and Jira, with Atlassian working hard to upsell Jira to Bitbucket users with more complex Issue Tracking needs.
If anything, Microsoft is maybe in an even better position to make that transition and/or integration work: it never feels like anyone inside Atlassian uses Bitbucket's own Issue Tracker day-to-day, yet Microsoft plainly has teams today in both GitHub Issues and VSTS that need to smartly straddle the "fence" between the two.
It sounds like Microsoft's intent is similar for the rest of GitHub-related ALM. GitHub has gotten a lot of its support by being relatively ALM-agnostic. While there are many fans of Gitlab adding CI out of the box, there are also proponents that prefer the competitive GitHub Marketplace and the general ability to pick/choose CI/CD providers or other ALM tools.
Microsoft has already for years tried to position VSTS (and sibling project App Center) as the "best" ALM provider for GitHub CI/CD/Release management/etc. I don't expect that to change with them owning GitHub, and it's only in their favor I think to try to leave GitHub itself appearing relatively "ALM neutral" and leaving the upsell to GitHub's Marketplace.
Similarly, there are rumors/indications that GitHub Enterprise has underperformed in the marketplace versus the expense of maintaining its fork of the main GitHub codebase, and Microsoft's easier option is just to migrate its users to on-premises TFS. They'd likely lose hearts and minds doing that, but it seems more likely than migrating the other direction. I think, solely gut instinct, it's more likely they just keep both around and let the users decide, at least in the immediate term, but if one is "losing out" to the other, I would currently put money on GitHub Enterprise as being the one to be canned.
I am going to be in the minority here, but why does HN always think all software should be free and open source? GitHub generates large revenue and provides immense value for organizations and companies. It's the same argument I see on HN about people not wanting to pay for Slack and going through insane hoops and efforts to run alternatives. Makes no sense being a technical founder. I truly believe in supporting great software with my wallet. If the organization decides to make their software open source... That's awesome and great, but starting outrage campaigns and public outcry I can't get behind. Microsoft is a public company and they have every right to monetize GitHub.
I really don't see the value in open sourcing GitHub when Gitlab exists if you are looking for a replacement in case MS decides to go full tyrant, unless you believe GitHub Enterprise is an amazing project.
If GitHub wasn't built from the start for self-hosted deployments, it will probably be both expensive and a pain in the neck to run. GitLab probably has its issues, but I won't be surprised if there are assumptions made in GitHub's code that assume you are essentially running on high memory / high cpu machines to reduce latency.
GitHub Enterprise is built from the same codebase that runs GitHub.com. And GitHub Enterprise was designed from the beginning to run on your own hardware. As far as hardware requirements, well, that kinda depends on your workload. GitHub Enterprise system requirements were higher than GitLab but we found GitHub Enterprise much more reliable. GitLab will let you run on anything but they don't really know what will work. They're still trying to figure out how to scale gitlab.com which barely has any users compared to github.com. The experience at scale is huge.
Additionally, we found functionality like HA and DR were implemented very seamlessly in GitHub Enterprise. Easy to set up and not a hassle once it's running. GitLab basically tells you to build your own HA which isn't something we're interested in doing.
This would be interesting to hear about from an insider. I suspect you're right. Having worked at a company that open sources most of their code, I've seen first hand how much better the code quality is when you know it's going out to the world. When you know it's gonna be hidden and only a handful of people will ever see it, you're much less hesitant to push up hacks :-) We open sourced something that wasn't originally going to be open, and we spent weeks going through cleaning up and refactoring things to avoid public embarrassment.
What’s the business value in spending weeks of developer time refactoring things that are going open source? For some industries, like blockchain, it's required. For the average shop it seems like more expense with no profit.
In our case, the business value was that it was a contractual clause with a customer. The idea was that keeping a million dollar customer and spending weeks refactoring was cheaper than losing the customer and getting sued for breach of contract. So we're probably in a similar boat as block chain like you mention.
However, I think you underestimate the value of open source. Besides being the right thing to do (respects freedom, allows developers who built it to use their code later in line with license agreement, allows others to learn by contributing or at least reading, plus many other benefits) it can be a powerful recruiting tool. There's real business value in getting good hires.
I _hate_ people using "M$" un-ironically. It wasn't funny 10 years ago when it started and it's still not funny today. How can you take that one comment seriously with that "M$" part in it.
That's cool. I hated Micro$oft for a while, for being a bunch of evil money grubbing assholes who made it harder to run the operating system I wanted, applied proprietary extensions to open standards in order to force vendor lock-in [nearly destroying the open web], and who leveraged PC manufacturer deals to tax us for software we didn't want. They literally called Open Source a cancer and lied about what the GPL specified to discourage use of the software. M$ was a monopoly, and all about the cash. It was never supposed to be funny.
Also, irony means suggesting the opposite. I don't know why anyone would use M$ ironically.
Like the other commenter said, if it aligns with their business interest, they will. That said, I do find it funny that essentially the "home" of most open source projects itself isn't open source. I'm not a heavy user of GitHub, as we use a self hosted instance of Gitlab at work, so my only use is browsing projects & throwing up my various side projects on it, so maybe someone can enlighten me on my next question.
Does GitHub have some proprietary tech/functionality that is valuable enough to warrant it being closed source? As far as I can tell, GitHub's most valuable assets are its large user base and the brand itself; its actual functionality seems to be pretty standard amongst git hosting applications (Gitlab, Gitea/Gogs, Bitbucket, etc...). However, like I said, I myself am not a heavy user of it, so there could be some feature the others are lacking that GitHub would like to keep the code closed to prevent others from replicating.
I think you're right. The only "secret sauce" I could see them having are some unique operational/architectural features of their back-end code that make sense for an entity at Github's scale. No instance of Gitea/Gitlab is even in the same ballpark wrt repo count and traffic as github.com, obviously.
There is no way I can see them open sourcing GitHub. If you have studied the history of version control systems (ClearCase, Perforce, TFS, Subversion, etc.) you would know how insanely powerful controlling the hosting solution is. I can see Microsoft making private repos free and/or making GitHub Enterprise free for the first 10 users though.
It's definitely powerful, but I don't think the power of github is in the actual product. Gitlab is much better in some important ways (IMHO of course), but it's still not seen much adoption.
The power of Github is in the community and the fact that it's the de facto standard place to look for code. That would likely only increase if MS open sourced the code.
GitLab is not open source though. They call their approach "open core" which imho is just a marketing gimmick to appear as an open source solution when in fact they are not fully open source.
I am not a big fan of GitLab-the-company. But GitLab CE is licensed as MIT. You can do with it what you wilt.
I have a hunch that your actual complaint is that they too do with it what they wilt, and what they wilt is not "give to you everything they add on top of GitLab for free and forever".
Yes. That's what I meant, but also I don't think that's a bad thing. Building software is hard and expensive. I think it is valid to only open source certain parts of your system and reserve the right to monetize other parts and features.
Good point. I'm not criticizing. I'm a very happy GitLab customer and love what they do. But just wanted to point out that they don't call themselves "Open Source" and their business model is not necessarily an open source first kind of business.
It looks like there were misunderstandings regarding my intention. I'm totally not interested in running my own github instance on another server, just like the people who signed the open letter dear-github to plead with Github to add a +1 button to issues, which took really long to be recognized as something to implement. I'm sure if it were open source, somebody would have implemented the feature and made a pull request.
My intention is to make the github.com "Client Side Code" better by letting the OSS community join the development. So open sourcing the backend and architecture is not my interest, and possibly not in MS's interest either. I'm not sure if Github is made of micro services, but if it is, certain client side code could be open sourced first, and that alone could generate great contributions to improve it.
I'm a heavy user of Gitlab and disappointed because it is no longer fully open source, and some enterprise features would not be welcome contributions from the OSS community because they conflict with Gitlab's company revenue. In my ideal world, MS splits the Github source into client and backend, open sources "Github client + test backend" so the community can test whether client code works when they make pull requests, and keeps the production backend closed. Do you think MS would be interested in this?
I think the +1 button was a really nice feature and something they should've done a while ago, but it's hard to draw the line.
If everyone contributes their pet UI feature (e.g. custom theme, use XYZ monospace font), then there will be dozens of features that only 0.01% of users use, but it increases the overall complexity of the product and maintenance burden.
A +1 button could work as a voting system to reduce that maintenance burden too. A semi-democratic method could be introduced: merge what is most voted for and also looks good to MS.
My words were not clear; I meant, and edited to, a "semi-democratic" way. Voting is one indicator, but I also believe the strong vision and strategy of a leader is the key to success.
And my point was not the +1 button but splitting the code into open source "Client side + Test-backend" and closed sourced "Production-backend". The backend can be composed of an interface and multiple implementations, such as a test impl and a prod impl. Actually Github could be hosted in Azure and I do not care about their backend, as I go to github.com anyway.
Not sure if they will join the conversation, but just now I invited @satyanadella @jeffmcaffer @natfriedman @defunkt @bkeepers into the Github issue. If you want them to comment, cast a vote with +1.
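One way to read "semi-democratic" is: community votes rank the candidates, but maintainers keep a veto. A toy sketch under that assumption follows; the `Proposal` fields, the `min_votes` threshold, and the `limit` are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    title: str
    votes: int                 # number of +1 reactions from the community
    maintainer_approved: bool  # whether it fits the maintainers' vision


def select_for_merge(proposals, min_votes=10, limit=3):
    """Pick the most-voted proposals that the maintainers have also approved."""
    eligible = [p for p in proposals if p.maintainer_approved and p.votes >= min_votes]
    eligible.sort(key=lambda p: p.votes, reverse=True)
    return eligible[:limit]
```

With this rule a hugely popular pet feature still does not land unless the maintainers sign off, and an approved but barely requested one waits until it gathers enough votes.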
I think it'd be cool if they open sourced it, but w/ federation built in. So self-hosted public repos would call back to ms github, and show up in searches/etc. You can star/comment/etc on github.com even if it's a federated/self-hosted github.
That way the social aspect is still there, regardless of if the repo is hosted ON github.com or a remote copy.
after seeing what they did with xamarin, i would not be surprised if they did. but unlike xamarin, open sourcing github wouldnt help them gain more developers on their platform and i do not see any other business benefits. i would love to see this happen though
They could, but I'd argue GitHub is powerful for the community, shared user accounts between projects, and integrations. None of which you'd benefit from if you deployed your own instance.
If Microsoft open sourced GitHub, I'd still use GitHub.com, for the reasons I stated above. Heck GitLab is already OSS and I don't use that because GitHub wins via the network effect[0].
PS - I have no issue with people asking for this; I'm simply pointing out the "win" if Microsoft agreed, is lower than you'd immediately anticipate.
GitHub’s success is largely due to the network effect and its entrenched status as the canonical code repository.
Besides, libgit2, aka “the secret sauce”, is already open source. What are you waiting for?