"a real serious business would have created an infrastructure that will provide paying customers a better redundancy for their accounts"
He should apply the same argument he makes against Github to his own business. Putting together a Gitosis or similar setup mirrored to Github would have given him and his paying customers the necessary redundancy to deploy when one of the hosts is down.
EDIT: Figure while I'm here and it's not OT I should give a well-deserved shout-out to Gitosis which has made my life a lot easier the past two years or so:
Gitosis setup is a one-off, very low maintenance install (you add repositories and users by committing config changes and public keys to a git repository -- hardly anything out of your regular workflow). Once set up, mirroring to github or another service is a one-liner in your post-receive hook.
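For the curious, here is roughly what that mirroring setup looks like end to end. This is a runnable sketch: local bare repos stand in for the Gitosis host and for GitHub (in real life the mirror URL would be your git@github.com remote).

```shell
# Demo of the post-receive mirroring idea. Local bare repos stand in
# for the Gitosis-managed repo and for GitHub.
set -e
tmp=$(mktemp -d)

git init --bare -q "$tmp/hosted.git"   # your Gitosis-managed repo
git init --bare -q "$tmp/mirror.git"   # stand-in for GitHub

# Register the mirror as a push remote of the hosted repo...
git -C "$tmp/hosted.git" remote add --mirror=push github "$tmp/mirror.git"

# ...and make the post-receive hook the promised one-liner.
printf '#!/bin/sh\nexec git push --quiet github\n' \
  > "$tmp/hosted.git/hooks/post-receive"
chmod +x "$tmp/hosted.git/hooks/post-receive"

# Any push to the hosted repo is now mirrored automatically.
git init -q "$tmp/work"
git -C "$tmp/work" -c user.name=t -c user.email=t@example.com \
  commit --allow-empty -m 'first' -q
git -C "$tmp/work" push -q "$tmp/hosted.git" HEAD:refs/heads/master
git -C "$tmp/mirror.git" rev-parse master   # same commit as the hosted repo
```

Because `--mirror=push` sets `remote.github.mirror`, a plain `git push github` pushes all branches and tags, so the hook never needs updating as refs come and go.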
To borrow Jeff Bezos' analogy from Startup School, should all small businesses who pay a utility company for electricity also have enough generators to run their business should the utility company fail temporarily?
The raison d'etre for companies like Github is that small software companies can't afford to waste a lot of time building infrastructure. If they want to survive, they need to focus on their core competencies, their customers, and their products. If outsourcing infrastructure functions to Github doesn't allow you to do that, then there's not much reason to pay Github.
The electric company analogy is a good one, but I don't think you're looking at it correctly.
No service, not even electricity, has 100% uptime. If things are that mission critical, you need a backup plan. Should you disconnect from the grid and generate your own electricity? Probably not. But should you expect that it might go out at any time? Yes. If you are not running a UPS then no one is to blame except yourself. If things are so critical that a UPS won't last long enough, then you should have a generator backup.
On the contrary, what would the poster have done if his power or internet had gone down when he needed to deploy? Would he be cancelling those services? For some reason I don't believe he would.
The real key to all of this is not a question of avoiding downtime; it's learning from it and avoiding a repeat of the same incident. I think GitHub has done a great job of this, and they're very clear about what happened and what they've done to avoid it in the future. The poster, however, fails on this point. Not only does he admit he doesn't have a backup plan for deploying when GitHub is down... he flat out states that he knows what he could have done, and REFUSES to do it. Instead he's just going to cut and run, thinking he'll find another service that is somehow immune to unexpected downtime. I wish I could live in that world.
Moreover: Nothing comes without cost. If you want a service like GitHub but with higher reliability, you are going to pay. Possibly in money, possibly in ease of use, possibly in the time it takes to identify such a service (it's really hard to gather reliable uptime data, especially in a field that is constantly evolving).
And, all things being equal, I'm not anxious to pay more money for a more reliable Github. Because it's git, people. Why pay for five nines of uptime when you can just mirror your repo? Every five minutes, if you like? With a one-line cron task? One of git's most basic functions is to make an efficient mirror of itself.
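To make that concrete, here is a runnable sketch of the mirror-plus-cron idea, with a local repo standing in for the GitHub remote (the actual crontab line is shown in a comment; all paths are placeholders):

```shell
# Demo of keeping a local mirror fresh. In a real crontab this would be:
#   */5 * * * * git -C /srv/mirrors/project.git fetch --prune --quiet
set -e
tmp=$(mktemp -d)

git init -q "$tmp/upstream"            # stand-in for the GitHub repo
git -C "$tmp/upstream" -c user.name=t -c user.email=t@example.com \
  commit --allow-empty -m 'one' -q

# One-time mirror clone:
git clone -q --mirror "$tmp/upstream" "$tmp/mirror.git"

# Upstream moves on...
git -C "$tmp/upstream" -c user.name=t -c user.email=t@example.com \
  commit --allow-empty -m 'two' -q

# ...and the cron job's fetch brings the mirror up to date.
git -C "$tmp/mirror.git" fetch --prune --quiet
```

A `--mirror` clone fetches with the refspec `+refs/*:refs/*`, so every branch and tag is refreshed by that single fetch.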
One of our customers wanted an SLA with more teeth. They themselves, with a straight face, proposed an SLA that would give them ca. 8 GBP in credit against hosting fees per day of downtime. This is a multi-million pound business. We said yes, of course.
We'd happily negotiate something with real teeth, as I'm sure most service providers would, and I'm sure Github would too if a big customer pushed for it.
But you're 100% right - if we did it'd be expensive, because we'd have to turn right around and insure ourselves against the risks incurred if the amounts were remotely serious, and we'd pass that insurance cost straight on.
My employer is a local/regional media conglomerate: newspapers, magazines, TV, and a cable TV/Internet/phone service. So we have all sorts of fun toys to keep things running.
If I can't afford for electricity to be down, then yes, the business should have generators. I would never put my hosted servers in a data centre without generators, for example, as the stuff I put there is there instead of in our office because it needs a solid environment.
On the other hand, our office servers are on a single circuit and we don't worry about it, because a temporary electricity outage just means a few man-hours of lost productivity (rather than potentially large losses for our clients, leading to bad will and lost business for us).
If a Github outage is so critical to you that it's more than a few man-hours of lost productivity, then it means you're using it for something you either should have in house or should have adequate backup procedures for. And this holds regardless of what you substitute for "Github" in this paragraph:
If it's critical, you ensure you have a backup, as it's virtually guaranteed that there will be some single point of failure outside of your (and theirs) control.
Although his example for cancelling Github is a bit silly (not deploying due to its downtime even though he could have set up a local repository), his gripe with a paid service being down and unreliable, and his decision to cancel, is perfectly reasonable if he runs a business and wants a set uptime.
I think the point is that his alternatives don't guarantee 100% uptime either so the whole thing just comes off as some sort of tantrum.
him: I needed to deploy but Github was down! Screw them! I'm going to hosting company X instead.
me: Does hosting company X guarantee that they will never be down when you want to deploy?
him: ...
At least that's how it comes off to me. If those other hosting companies really do provide him with what he needs better than Github does, then more power to him, but I seriously doubt that they do at the price point they have set.
Even if they do "guarantee" it, what happens if they fail to meet their guarantee? Most likely they'll credit him with a refund of his hosting fees for the downtime or, if he's really lucky, some small multiple of it...
Words are cheap, especially in the world of typical SLAs.
Gitosis is great; however, I like Gitolite much better (ironically enough, hosted on GitHub - http://github.com/sitaramc/gitolite): per-branch permissions and much more intuitive project management.
Not really ironic because everybody hosts on GitHub now. The advantages far and away exceed the occasional downtime, unless you have very unusual uptime requirements as, apparently, the article author does.
I mean, I've been using GitHub 15 hours a day for a year, and last week was the first time downtime affected me, and even then I just couldn't push for a while. That's kind of the whole point of decentralized version control, right? So there isn't a single point of failure? I'd say the git ecosystem worked out just perfectly.
I'd say that if someone doesn't find GitHub's feature set to outweigh the level of downtime I've experienced, that person's use pattern probably doesn't fit the characteristics of a typical user. If I'm wrong, expect people to be leaving in droves now. My bet's on not.
I can imagine two situations where GitHub uptime is critical:
1. You're hosting websites via GitHub pages.
I'm hosting both my business and personal websites on GitHub. (Despite the recent outage, GitHub's uptime beats my previous el-cheapo host by a wide margin.)
2. You host client code and issues in a GitHub private repository.
This becomes an issue when clients want to add a bug to the issue tracker, etc.
I just use git init, ssh, and regular unix accounts, because it takes all of about 1 minute to set up, and is "the simplest thing that could possibly work". Presto, no more depending on external servers.
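Roughly like this — a runnable sketch in which a local path stands in for the ssh remote (in real life you'd first run something like `ssh you@yourserver 'git init --bare repos/project.git'`; the hostname and paths here are made up):

```shell
# The "git init + ssh + regular unix account" setup, in about a minute.
set -e
tmp=$(mktemp -d)

git init --bare -q "$tmp/repos/project.git"   # the "server-side" repo

git init -q "$tmp/work"
git -C "$tmp/work" -c user.name=t -c user.email=t@example.com \
  commit --allow-empty -m 'hello' -q
br=$(git -C "$tmp/work" symbolic-ref --short HEAD)

# Add it as a plain remote and push, exactly as with any hosted service.
git -C "$tmp/work" remote add backup "$tmp/repos/project.git"
git -C "$tmp/work" push -q backup "$br"
```

With ssh the remote URL is just `you@yourserver:repos/project.git`; git speaks its protocol over any plain ssh login, no server daemon required.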
Agreed: given that you are dealing with Git, having GitHub go down shouldn't be a problem. Gitosis is nice in a simple way, but having a web interface to quickly/easily create repos, browse, and pass around URLs is better.
I have been working on a little pet project called GitHaven (http://githaven.com), a simple web app that can be installed via a deb package. I have a test install running on http://git.meyerhome.net/ for those who are curious. I use it to quickly push up private repos when I need to. I originally created it because there seems to be a need for something between Gitosis and GitHub:fi that you can install wherever you need git hosting. If you're interested in trying out GitHaven for your own development, send me an email at ben@githaven.com.
Ignoring the fulminating a bit and concentrating on the substantive issue, there is a real leap in expected level of service once you start taking money from people, and again when you start hosting Mission Critical apps. (What an overused buzzword -- but a true overused buzzword!)
Saying "He shouldn't have had anything Mission Critical on Github, he should have designed his infrastructure competently" is beside the point. Properly designing a fault-tolerant workflow involving git isn't something which is in the easy reach of everyone, and some people would prefer to pay money rather than deal with the hassle of doing it themselves. (And trust me, it is a hassle. I have my own gitosis server because I had a business necessity to host my OSS on my own domain. Holy bleeping heck, I lost a day of my life to getting that working. That's close to $1,000 of engineer time replicating something very close to what every git hosting company offers for $9 a month.)
People who don't have the skills and don't have the desire to set up a fault-tolerant git infrastructure pay other people to do it for them. Other people being... well... Github. That means that Github has far more acute uptime demands than somebody's blog on OSS topics, or even paid services in less critical contexts. One reason my business is in a less critical context is that I knew the constraints I was working under made it unreasonable for me to offer customers the assurance that they could build their businesses on me.
Does Github offer the assurance that customers can build their businesses on it? If you want an SLA you will probably pay more than $9 per month (beyond the implicit SLA of "I'll take my business elsewhere if somebody else has better uptime for the same price").
SLAs are largely meaningless pieces of paper, invented either by folks who think that downtime can be wished away by writing the number 9 enough times, or as a way to quiet the discontent of non-technical stakeholders by pointing them to a suitably impressive number of 9s.
I'd love to see a copy of that if you don't mind. (Incidentally, should any of you ever want to do this to a quote on HN from me, feel free. I trust your judgement regarding whether it is substantive enough to merit attribution.)
We're all businessmen here and everybody knows I hold absolutely no ill will towards Github for what I am about to say, right?
The primary concern was what SEOs call link equity (a link to your site from a trusted domain is a signal of trust, Google uses trust as a signal of quality in determining rankings, rankings are worth money, thus inbound links are essentially illiquid capital), with brand recognition being a major secondary concern. Take a look at Rails blogs discussing the OSS packages they use: everybody cites the canonical version of the code and if that is on Github then they cite the page on Github, frequently to the exclusion of the author's post or page about it.
Related problem: Frequently authors don't have a substantive page about it. That would be a mistake if you're trying for link equity or personal branding, in my mind. Relatedly, I spent $250 on a logo.
I spent quite a bit of time on that project and wanted the various marketing benefits of it to accrue to me, not to Github. So I put it on my site, not theirs. Folks subsequently cloned the repository and put it in their Github accounts. I am totally OK with that: mine is the canonical one and the canonical one gets the links, mindshare, etc.
This is the problem I tried to solve with the Indefero hosting I offer. All the forges get their own domain name for free, and you can migrate all your data to your own server with your own Indefero installation in 3 easy steps (download, import, switch DNS).
Anyway, the problem with Github is that their level of quality leads people to assume every offering has the same level of quality. We (code hosting providers) all suffer when Github is down... that's what annoys me.
Git is decentralized. There's no way Github can affect your ability to deploy. Code can only be pushed there meaning it was written and exists somewhere else.
There's also no way to run your own private SSH or git server without paying a minimal monthly fee. Any such server or Internet connection will go down occasionally.
Nobody is going to stop you from finding an alternative that suits you better. Good luck.
Thank you, you are the only poster so far who seems to get it. I agree: the point of git is that you already have your entire repo. Who cares if your remote host is down? Deploy using your branch, and when github (or whatever) comes back up, push your changes to github. Done. I don't think of Github as a centralized server, but as a place to search and share code.
This is true, but it's a hassle. The one time I needed to deploy and github was down, I had to read a link on their website and set things up. It was a real drag and delayed my launch by at least 15 min!!!
This is really harsh, and it sounds like he's looking for a scapegoat for why he is behind on client projects: "i delayed 3 client deploys in the past week due to Github’s downtime".
GitHub is very transparent about their issues and from looking at their blog ( http://github.com/blog/597-a-note-on-the-recent-outages ), "Following three months of near 100% uptime, we’ve just been through three major outages in as many days."
The outages weren't "days long" so those delays in deploying a client's project shouldn't have been significant. If the client expected a deployment during a short window it is irresponsible for the development company to keep the code only in a single location. I cover this in a post about availability on my blog ( http://www.bretpiatt.com/blog/2009/10/03/availability-is-a-f... ).
We just had a discussion about this a couple of hours ago. If your deployment depends on an external service, your deployment is broken.
You're using git. Have a fabric script that pulls down the remote repositories that you need and rsyncs them to your servers. This works locally or on a remote box. If github is down, pass in a local directory to sync instead. This way you can always deploy.
He built his infrastructure wrong, and is trying to blame someone else. Sorry, but you are just as much at fault, if not more, than github.
I don't think that's an entirely fair argument to make.
For example, I don't think it's unreasonable to write a unit test that expects an SMTP server to be available. Maybe doing so would break the deploy if the SMTP server ever went down, but at that point you have two options: rearchitect your deploy, or find a more reliable SMTP server to test against.
If the author would rather spend his time and money looking for a more reliable git host than rearchitecting his deploy to account for service failures, why is that so bad?
It is so bad because it fails to take into account that there are multiple failure points involved (his git host, his internet connectivity, multiple transit providers and routers beyond either his or his hosts control, electricity supply etc.) and that getting a more reliable git host won't even address one of them - no hosting will ever achieve 100% uptime, if only because of eventual human error.
He'd be better served first addressing the significant flaw in his overall process of not having a fallback that bypasses as many failure points as possible.
I.e. why does he not have a constantly updated clone of the repositories locally at the provider hosting his actual sites, so that he can deploy as long as he can get hold of an internet connection somehow and is able to connect to his hosting environment?
This bypasses the local electricity supply, his office internet connection, the transit providers between him and Github, etc., because there are many easily accessible ways of getting an ssh connection in a pinch, and the remaining dependency is his production hosting, without which deployment wouldn't work anyway.
It would be a simple, cheap and prudent safety mechanism if being able to deploy immediately is business critical for him. It's a matter of a single command in a crontab on one or more of his production servers.
I've actually been in that exact situation. Yes, the unit test should expect that once it offloads the message to the SMTP server everything works correctly, but there should also be another set of tests that ensures the app fails gracefully when the SMTP server is not there. Once you have that, then you should find a more reliable solution to the SMTP server failures.
He makes some decent points about Github's response to their reliability issues, but that's about where it ends. This rant can be summed up as: Github is unreliable and they should work on this, pushing free users to the back burner in favor of their paying clients. Whether or not you agree with that, the rant was a bit unnecessary.
It did seem needlessly harsh. Github responded pretty well to the recent issues though.
Seems like the author just wanted to blame Github for all their problems, as opposed to recognizing that the internet is itself unreliable, factoring that in, and handling it gracefully. But maybe I'm being too harsh >_>.
How would they do that, though? If you look at the details of what caused the outages, it was not a matter of free vs. paid users. Even if they added extra redundancy for paid users that they didn't give free users, that wouldn't do any good in the case of a site-wide failure, which is what they had.
For that matter, I don't think they've ever had an issue that could be isolated to affect only free users, have they?
For Outage #3 they said "After inspecting the HTTP logs, we identified a Yahoo! spider that was making thousands of requests but never waiting for responses." Spidering isn't an issue for the paid private repos since they have nothing public to index. However, the web access to private repos went down due to the spidering of the free user pages.
I think I need to clear things up a little bit. (Yeah, it's my post.)
I agree that the post may have come off a bit too harsh, but as far as reliability for a paid service goes, Github has yet to provide a decent level.
Ever since Github's outages started (a while ago, back in the EY days even) I decided to push to 2 separate locations, thereby giving Github a further chance (the 2nd location is another paid account on Unfuddle.com). The delay on deployments was due to a simple need on our side that required pushing another change to github (updating a vendored private gem). While this process may not be perfect, that does not remove any of github's responsibility, as a paid service, to provide a better way of supporting paid accounts.
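For reference, pushing to 2 separate locations doesn't even need two push commands: git supports multiple push URLs on a single remote. A runnable sketch, with local bare repos standing in for GitHub and Unfuddle (all names are placeholders):

```shell
# One remote, two push destinations: a single `git push` updates both hosts.
set -e
tmp=$(mktemp -d)
git init --bare -q "$tmp/github.git"
git init --bare -q "$tmp/unfuddle.git"

git init -q "$tmp/work"
git -C "$tmp/work" -c user.name=t -c user.email=t@example.com \
  commit --allow-empty -m 'release' -q
br=$(git -C "$tmp/work" symbolic-ref --short HEAD)

git -C "$tmp/work" remote add origin "$tmp/github.git"
# Gotcha: the first --add --push replaces the implicit push URL,
# so re-add the original URL before adding the second host.
git -C "$tmp/work" remote set-url --add --push origin "$tmp/github.git"
git -C "$tmp/work" remote set-url --add --push origin "$tmp/unfuddle.git"

# One push now updates both hosts.
git -C "$tmp/work" push -q origin "$br"
```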
I will review my internal processes further to see if there's a better way of doing things, but as far as continuing to pay for an unreliable service... that part is over.
Your article was intended to stir up flames on Reddit, period. Github is an ambitious service to pull off, and I think they've done a fine job so far. It's a small team trying out a very new infrastructure, etc.
Maybe you should move your stuff to Google Code or Sourceforge. I personally prefer github b/c they do innovative stuff that other companies wouldn't have the balls to try (such as some of the ajax, caching, mostly real-time graphs, etc.).
I will not get into a "he said, she said" kind of thread, but I really had no intention of it turning into what it actually became.
Yeah, it's a small team and a very talented one, you are right. But graphs and ajax? Is this what you pay for? Or do you actually want the basic service to work as expected?
In a simplistic sense, yes, you are paying for graphs and ajax.
You can host git repos very cheaply on a random server. You can probably even make it work on shared hosting.
You pay for GitHub for a very convenient and pretty web interface (which is pretty well exemplified by "graphs and ajax"). Permissions on private repos are easy. You can look at code, commits, blames, etc. in ways that your eye can parse more easily. Graphs let you see some cool-but-probably-unimportant data. Your newsfeed tells you what's going on on repos you care about.
None of that's essential. You can see blames by typing `$ git blame <file>` at your favorite shell. But it's way uglier and doesn't show you blocks of lines that were last changed at the same time as intuitively.
Then you also get GitHub pages and a few other things like that, but it's pretty easy to set up jekyll on your own.
Basically, you just pay for a lot of small conveniences that add up to being worth not setting it up yourself and losing the better interface.
I do, but the outages have not impacted me since I keep a local backup (I also haven't needed to do any deploys during the brief outages).
I could host git repos more cheaply than github with a $20 slice over ssh. So yes, I pay extra to github for the additional features.
Edit: I also enjoyed reading about Github's custom stuff (like bert/ernie)... clever stuff that most teams would determine wasn't worth the effort, but that has worked very well for github.
"Four nines" is one of the reasons SLAs are pretty meaningless. That's an hour of downtime every year. One unplanned outage blows that.
Google's flagship product (search) probably breached four nines last year for some users (they had 20~40 minutes of effective downtime with that one "All sites have malware" incident) and they're much, much better than you, I, or Github will ever be.
I think this article is being a bit harsh and overly dramatic. However, he does have a point in that the github folks should probably try to architect things such that their paid service is isolated from any mishaps that come from the free service.
I don't currently have anything hosted at Github, but I use them regularly to check out other people's code, and their uptime history is disappointing. I know that they always explain what caused the downtime, and I'm glad that they do. Explaining what's wrong isn't enough, though. It seems like they have a poorly designed system that can be brought down by almost any piece failing. Their latest outage makes it sound like they don't know how to build a fault-tolerant system at all. They have the parts there, like DRBD, but when things go wrong everything breaks anyway. It's like making backups but never testing them, so that when you need them it turns out that they can't actually be used to restore anything.
It seems that they have the part that most companies miss, the communication, but that doesn't make up for the lack of reliability.
This kind of feels like another "don't know how to use my tools" kind of post.
I was doing some critical work during the last outage. I had to collaborate with another developer, and he was unable to push through github. So we exchanged code a different way.
I was also doing EC2 tests of the code we were working on during this window. I didn't have trouble getting that deployed (via git, no less) during this time.
It's easy to do the lazy thing and rely on centralized systems when it's completely unnecessary, rather than doing a small amount of work to greatly increase the reliability of your platform.
Yes, it is pretty bad for github to have so many problems in so few days and if he isn't happy with the service he should leave.
However, it seems like he doesn't really understand how to use Git if github being down bars him from doing his job or deploying his work. Having a single point of failure while using a distributed version control system is more the developer's incompetence than it is a problem of github being down. Github provides you a lot of nice features, fancy web-based interfaces, and some social aspect to coding, but relying solely on it shows a lack of understanding and throws away many of the benefits git provides a developer.
Customer doesn't feel like the price they are paying is worth the value and leaves.
Only difference is that there are alternatives to this "critical service."
Had this been a rant about the electric company going down in the snowstorm no one would have cared because it was understood these things happen.
Have an absolute need for power no matter what? Better get yourself a backup generator. Just because you pay for a service doesn't mean it won't go down. Just means you should expect them to fix it ASAP.
I pay github for a private repo I rarely use, however I really depend on them for my development as all the open source libraries and documentation I use are stationed there. I don't see switching to something else as an alternative.
So what can we do to help them keep their servers up for the community's benefit?
It's the Internet. Everyone has downtime, even Amazon.
My only complaint about Github is how my profile is turned into a giant ad for github when a non-logged-in user visits it. I think it's really underhanded to use a paying customer's profile page to convince random people to sign up for github. If they want to, they will figure it out, just like the other million users.
There used to be a pop-over when visiting a repository when not logged in. Perhaps that was changed recently (within the last week), because I can't reproduce it from my home machine.
There's nothing shameful about asking users for money. Putting a 'Sign Up' button or two on the page is just good design.
Would you prefer they buried a PayPal donation button on their About page?
I'm also curious about why jrockway thinks a Sign up button is underhanded. Maybe if there was some sort of bait and switch involved, I might see their point. But, the 112x23px button pretty clearly says "Pricing and Signup". Of the 920px x 3783px of content on kneath's profile page, that 112x23px represents 0.07402% of the page.
Plus, it may be your profile page. But it's your profile page on Github's server. If you don't like it, why not build a custom HTTP front end to your privately maintained git repos?
The large number of grammar and spelling errors makes me question whether the author is at all interested in a "real serious business". And WTF is up with the horrendous comma usage? It's similar on his Nautilus6 and LinkedIn websites, too.
Also, how precisely did temporary downtimes prevent deploys? At best, it appears they would've been delayed by a couple hours, and the great part about Git is that you can work around something like that pretty easily. Granted, downtime is annoying, but his extended difficulties seem to largely stem from his own mistakes, and for $7/month he's not exactly paying top dollar to guarantee uptime.
Like I noted in one of the comments on the post, the need to wait a few hours was due to a last-minute change we had to push.
Yes, I do need a better process (even if this case WAS a very specific one that is not likely to happen again), but still, if I pay, I want to get the service I pay for.
Github has had outages since forever, and although they do seem to do the right thing and improve, it seems that they are aiming in the wrong direction.
And about the spelling/grammar/whatever errors: I'm just not a native English speaker. I don't think that was the real issue in the post, but whatever, one day I'll learn to spell.
The spelling/grammar issues just add to the feeling that your post was a pissed-off rant. My own typing goes to crap if I'm in a rage. The key is to wait a period of time to cool off, then go back over the post and fix the errors before you send it, that's all.