You might not need a WebSocket (fanout.io)
160 points by 650REDHAIR on July 10, 2014 | hide | past | favorite | 66 comments


I live in Africa.

Websockets suck here. BGP routers here flap very often, nuking TCP connections because hey, power cuts. Guess what happens?

Tip: If you're "out to change the world", make sure you can toggle your wifi card off and on every 30 seconds and still get a decent-ish experience (yes, long-polling HTTP works, as does good old timer-based polling).


Indeed. These days, you have to expect users will wander out of their wi-fi range or have crappy mobile connections. Whenever I ride the train with my 4G tether, my TCP sockets die every few miles.

I lose ssh and my websockets, but long-polling apps keep working. Kind of ridiculous, but that's the world we live in.

Making your websockets resilient means pinging and aggressive reconnecting.
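A minimal sketch of that ping-plus-aggressive-reconnect pattern (the URL, intervals, and function names are illustrative, not from any particular library):

```javascript
// Exponential backoff with a cap: 1s, 2s, 4s, ... up to 30s.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Resilient wrapper around the browser WebSocket API.
function connect(url, onMessage) {
  let attempt = 0;
  let pingTimer;

  function open() {
    const ws = new WebSocket(url);
    ws.onopen = () => {
      attempt = 0; // healthy connection: reset the backoff
      // Ping every 25s so half-open (silently dead) connections get noticed.
      pingTimer = setInterval(() => ws.send('ping'), 25000);
    };
    ws.onmessage = (e) => onMessage(e.data);
    ws.onclose = () => {
      clearInterval(pingTimer);
      setTimeout(open, backoffDelay(attempt++)); // aggressive reconnect
    };
  }
  open();
}
```

Resetting the backoff on a successful open matters on flaky links: otherwise one bad stretch leaves you reconnecting slowly forever.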


If the connections are dropped quietly (client side does not find out the connection is gone) then how does the long poll case know that the existing poll conn is not going to return any data and a new one needs to be initiated?

Or is it just that long polls generally have short timeout compared with what you might implement for a websocket ping?


Correct, long polling clients generally have timeouts. I wouldn't say they are necessarily shorter or longer than typical websocket ping intervals (I've seen as low as 20 seconds and as long as 2 minutes), but they do need to be there for the client to recover quickly.
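The loop those clients run looks roughly like this (a sketch, not any specific library; the fetch function is injected so the loop can be exercised without a network):

```javascript
// Long-poll loop: each request waits for data or times out client-side,
// then immediately re-polls. In a browser, pass window.fetch as fetchFn.
async function longPoll(url, fetchFn, onMessage,
                        { timeoutMs = 30000, maxPolls = Infinity } = {}) {
  for (let i = 0; i < maxPolls; i++) {
    const ctrl = new AbortController();
    // Client-side timeout: this is what lets the client recover even when
    // the connection dies silently and the server never answers.
    const timer = setTimeout(() => ctrl.abort(), timeoutMs);
    try {
      const res = await fetchFn(url, { signal: ctrl.signal });
      onMessage(await res.json());
    } catch (e) {
      // timeout or dropped connection: fall through and poll again
    } finally {
      clearTimeout(timer);
    }
  }
}
```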


It would be nice if you could drop the HTTP overhead for long-polling, though; that's the main reason I use websockets.


I'm fine with sending a couple of extra kb if that means I actually get an answer :)


It sounds like EventSource would be a good solution in that case: low overhead, and you'd get auto reconnect for free.


Have you tried http://mosh.mit.edu/?


Mosh is great, I use it on an almost daily basis.

Setting an alias like "alias sssh='mosh'" greatly helps with muscle memory, too (of course, don't override "ssh", you might still need it).

Funny anecdote: I live in a city called "Moshi"


I don't recommend replacing ssh, but when you do want to shadow a command, you can always call the original using \, e.g.:

  \ssh


That looks awesome. Will definitely check it out. Broken ssh connections make me sad.


Have you tried socket.io-based web applications on the train? It does pings and reconnecting.


Do you know of any public socket.io apps?


Etherpad uses socket.io.

Here's one hosted Etherpad: https://etherpad.wikimedia.org/


Dasher (http://dasher.im) uses socket.io. We're using socket.IO-objc for our iOS client: https://github.com/pkyeck/socket.IO-objc

And will be launching a web client in the coming weeks.

So far, so good. We do handle the reconnects ourselves, but otherwise it's been treating us well.


So if connection resets are the problem, as opposed to router/proxy software problems... In cases where long polling seems to work as you say, why wouldn't an automatic WebSocket reconnect work just as well?


You're right, in that case it'll work just as well. The problem is that it's easy to get wrong, so the developer needs to know what he/she is doing.

Fortunately wrappers like Socket.io, SockJS, Primus, etc can help with this.


Also, it completely falls down behind many proxy servers, and the end user often has no control over that.


> Long-polling gets a bad rap.

Yes it does, but not for the reasons you mentioned. The main objection against long-polling in the olden days was that it would spin up a new server process for every request, leading to server overload. Websockets solved this by letting a single process hold many persistent connections.

However, these days that is arguably not the case anymore. I think most servers don't make a new process for every request anymore, but I could be wrong. Am I wrong?

Regardless, long-polling has left a sour taste in people's mouths and now they've moved on.

But I agree, websockets are overkill for most cases and I have personally never found them to be a very reliable piece of technology. They're super fragile and finicky and I never liked working with them.


Modern setups don't even involve spawning a new thread per request anymore as even that doesn't scale well enough.


>> most servers don't make a new process for every request anymore

That's correct. Most designs use either a prefork model (e.g. Apache) or an asynchronous event-driven model (e.g. Nginx). With prefork, a number of worker threads or processes are created on start up, and each services one request at a time. In the asynchronous model, a smaller number of (typically single-threaded) processes are created on start up, and each services one event at a time. An event could be an incoming request, a database query returning, etc. In either model, thread/process creation typically happens entirely on start up.


Side-note: Apache 2.4 finally has an event-driven mpm. A little late, as nginx rules the web now, but it's still nice to see they finally realized their style didn't scale.


Justin, this is awesome! The overhead of websocket is also surprising when all layers are considered including TCP Frames http://tavendo.com/blog/post/dissecting-websocket-overhead/


Interesting! I'm surprised that browsers wouldn't batch up many outbound messages into single TCP frames. The server behaves more the way I would expect.

That reminds me of a related point: The number of TCP frames can be considered more important than the number of bytes-per-frame when it comes to network performance. Most message exchanges, whether over WebSocket or HTTP, will occupy a comparable number of TCP frames, even if HTTP uses more bytes for the headers.


If you are interested in Server-Sent-Events, I made a small library a while ago that may be helpful: https://github.com/boppreh/server-sent-events


Some use cases for applying websocket in finance/banking: a) near-real-time quotes, b) order fill notifications, c) miscellaneous alerts generated by background (regulatory monitoring) tasks.

We also support long polling etc. but websocket servers are idling while our polling/push servers are at higher cpu utilization with comparatively lower loads.


If notifications only go one way, server-sent events should also be considered.

...but I don't remember the advantages anymore :)


Have a look at this [0]:

- Automatic reconnection

- Event ids

- Straightforward protocol built on plain old HTTP

[0] http://www.html5rocks.com/en/tutorials/eventsource/basics/
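The "straightforward protocol" part is easy to see on the wire: each event is a few `id:`/`event:`/`data:` lines over plain HTTP, per the SSE spec. A small serializer sketch (function name is mine, the format is the spec's):

```javascript
// Serialize one server-sent event in the text/event-stream wire format.
// `id` is what lets a reconnecting client resume: the browser sends it
// back automatically in the Last-Event-ID request header.
function sseFrame(data, { id, event } = {}) {
  let out = '';
  if (id !== undefined) out += `id: ${id}\n`;
  if (event !== undefined) out += `event: ${event}\n`;
  // multi-line payloads become multiple data: lines
  out += data.split('\n').map(line => `data: ${line}`).join('\n');
  return out + '\n\n'; // blank line terminates the event
}
```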


Once you introduce a websocket on your simple HTML page app, it makes sense to send all the user/browser-side events to the server over the websocket as well. Use socket.io-style aggressive websocket connectivity tests and fail over to normal REST AJAX.


WebSockets are super easy to use and the proper solution to real time data communication between browser and server.

Yes, there are other ways to accomplish this, but why would you choose a kludgey way like long polling when an elegant solution is available? It makes no sense to me.


WebSockets are super easy to use (we still love them and offer support for them), but if your users are on older devices or browsers you're going to run into problems.

You also have to take into consideration different network issues, for instance last year Verizon 3G blocked WebSockets without any documentation. Or fragile wireless networks where the socket will break.


SSE is also a proper solution. If you have a REST setup, you can use the same HTTP resources for both pulling and pushing.


While adding chat to a site, I reached the same conclusions. I started adding browser->server commands to my WebSocket instance, but then I realized I had everything already set up to do that in my API, and I was just splaying browser->server logic over two transports.

WebSockets definitely complicate the code, since you now have to deal with the connection before you can send a request. Keeping it constrained to realtime events cleaned things up nicely.

I was also trying to build some persistence into replaying chat messages on reconnection (data sync), but it was way easier to just turn "get_last_messages" into an endpoint that gets called on page load, or on socket reconnection.


If I understand your first problem correctly, you could have setup an RPC system so you only write your apis once. Use the same 'endpoint' of your api as the data call in your socket connection, and you only end up writing your api once.


I can definitely hit the API from the websocket server, but I just didn't even bother! :D I just wrote the websocket server to publish events, and left syncing issues to the API.


So I'm curious, what's the difference between fanout.io and something like pusher.com?

My company's application has some real-time components and we've just been using pusher to handle it because it's simple and cheap. Is fanout something I should consider instead?


Fanout's great for powering a public-facing realtime API. For example, you can see what Superfeedr did with it here: http://blog.superfeedr.com/stream-superfeedr/

If you're not making an API, and you're just pushing data to your own client applications that you control, then Pusher and Fanout are more or less the same. Although you might like Fanout's simpler pricing model and our more open philosophy.


That sounds great actually. We don't have an API now but we're planning to start one in the next 30 days and I was wondering about how we were going to do the real-time aspect.

I'm going to bookmark this.


What would be the best technology for building real-time browser based chat application available today where I own the client?


I think I'm too biased to answer this properly. :) All of the realtime-as-a-service options have good resiliency. Pair one of them with RESTful AJAX for the chat db/logic and you'll do well.


Interesting bit from Derby's FAQ[0]

  Why racer-browserchanel uses long pooling and not web sockets?
  Web sockets does not guarantee order of messages between client and server, which is critical for OT.

[0] https://github.com/derbyparty/derby-docs/blob/master/faq.md#...


Server Sent Events aren't used enough. I'm a big fan, so I wrote a little Node.js library which wraps (with options for preprocessing) an event emitter in an express middleware for the company I work for. https://www.npmjs.org/package/nudge


By the way, if anyone is dealing with websocket implementations for the web, have a look at SockJS. It makes things much easier and includes fallbacks for situations where sockets are not supported.

https://github.com/sockjs/sockjs-client

(no affiliation, I just like it)


Probably not, but they're so addictive! The novelty is yet to wear off, for me anyway.


Are you using WebSockets in a big production environment or just for hacks/projects?


Author here. For what reasons do HN folks choose WebSockets? Is it purely technical? Joining the bandwagon? Used by preferred frameworks? Would love to hear some opinions.


Conceptually, polling just seems dirty.

  if (someEventOccurred) {
    // do something over and over, every so often, that requires
    // multitudes of processes (authentication, DNS, etc.) across
    // many machines, and might never return any value to the user
  }

is not as nice as,

  on('the arrival of a message', doSomething)

edit: pardon the sloppy pseudo syntax...


Why not server-sent events[0]?

    var source = new EventSource('/some/url/path');
    source.onmessage = doSomething;
[0] http://dev.w3.org/html5/eventsource/


Agreed, but "long polling" is not really polling, because you're blocking on the HTTP request. Something over UDP sounds like the answer to the complaints about TCP connections getting dropped, though. TCP was built for streaming...


TCP technically relies only on state on the endpoints, so it "should" never be dropped unless the endpoints agree. But NAT and various other interlopers ruin this model, as state is then kept in middle-boxes.

UDP would evince the same problem when traversing NAT, since it also would require state to be kept in middle-boxes. The only difference is that instead of periodically reconnecting, you'd need to periodically ping.


Mostly technical, I need the performance, and my problem maps pretty well to raw sockets. The application I'm currently writing can push upwards of 60msgs/sec to the websocket client.

Using long-polling, etc. this just doesn't seem feasible. I'm sure my HTTP stack _could_ push small messages in <17msec fairly reliably... but that's not what it's built to do.

Tossing the request all the way down my HTTP stack just seems wasteful. There might be middleware hitting persistent caches, there might be a reverse proxy out in front, etc. -- This all takes non-negligible resources away from other requests, and this all takes time.

---

I should note that I do use SSE or long polling for things like event notifications. So in the end I fully agree with the article. WebSockets are not a tool that should be applied haphazardly. In my stack WebSocket upgrade requests hit an entirely different server, so I take great care in electing where to use them.


I don't use WebSockets, mostly because of a general lack of support and stability in the frameworks I use and the clients I want to support.

I typically keep an in-memory buffer server-side of events to send down. The client polls (as rapidly as desired, sometimes on a timescale of minutes, sometimes seconds), passing up the last package ID it received (or nothing if it's just starting up). The server, from memory, sends back all newer messages as the delta. Combined with keep-alive, this has worked rather well.

I would not recommend this for mobile users or users on metered connections, but the client behavior is easy to implement using ajax and throttling on the javascript side. By its nature it automatically recovers from network drops or reroutes with no additional logic.
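The server side of that scheme fits in a few lines; a sketch under the same assumptions (monotonically increasing IDs, a bounded in-memory buffer — class and method names are mine):

```javascript
// In-memory buffer of numbered messages. A polling client sends the last
// id it saw and gets back everything newer (the delta).
class MessageBuffer {
  constructor(maxSize = 1000) {
    this.maxSize = maxSize;
    this.nextId = 1;
    this.messages = []; // [{ id, data }]
  }
  push(data) {
    this.messages.push({ id: this.nextId++, data });
    if (this.messages.length > this.maxSize) this.messages.shift(); // drop oldest
  }
  // lastSeen = 0 (or omitted) means "send me everything you have"
  since(lastSeen = 0) {
    return this.messages.filter(m => m.id > lastSeen);
  }
}
```

Because the client resends its last ID on every poll, a dropped or rerouted connection needs no special handling: the next successful poll picks up the delta automatically.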


For me it was just "the thing everyone was talking about". Was amazed when I discovered EventSource and puzzled why it wasn't more widely known. Axed the socket.io complexity immediately. Thanks for putting the word out about this!


I used Socket.io for my site, so I don't care what the implementation. I just needed a way to push realtime events to the user.

Side note: Thinking about switching to SockJS since I use 1% of Socket.io features, but it's working well as is


I suspect we originally started using WebSockets because of the bandwagon, but it still feels like what we want. We use them to subscribe to various message feeds; client->server messages add/remove subscriptions, and server->client push messages deliver whatever arrives on the subscribed feeds.

We have a small abstraction over the browser WebSocket object that gives auto-reconnect (and catches disconnects through polling frames) and an interface that looks more like a Node.js EventEmitter. We fall back to (normal) polling if the socket takes more than a few seconds to connect.


But when you're in a good, stable, wired network on your PC, websockets means a faster experience right?

Maybe we should just switch the user to long polling after a couple of disconnects?


Server-Sent Events have a quicker handshake, send events down just as fast as WebSockets, and automatically reconnect (so they behave like a WebSocket on a stable connection, and like long polling on an unstable one).


You're probably answering my other comment? Thanks for the info anyway.


SSE also works across all proxies; even on a stable connection you might be behind a transparent proxy that borks websockets, so that's at least one reason to prefer SSE over websockets on a stable connection. A workaround for websockets there is to use wss: instead, since encrypted traffic passes through proxies untouched.


I like this. The other day we were thinking about how the suggested practice these days is to start with long-polling first, then attempt to upgrade to WebSockets (as opposed to the other way around). But, in some cases it's not the browser but the network that prevents WebSockets from working. This means if you wander to another network that doesn't support WebSockets, you actually still want a way to downgrade back to long-polling.

Maybe such best practices could also include some kind of poll/ping interval that adjusts based on the apparent stability of the link. Stable or wired connections could have very large intervals.
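One way to sketch that adaptive interval: grow it while the link stays healthy, snap back to a short interval after any failure (constants and the function name are illustrative, not a proposed standard):

```javascript
// Adapt the poll/ping interval to apparent link stability: back off
// toward a long interval while the connection behaves, reset to a short
// one the moment something fails.
function nextInterval(current, ok, { min = 2000, max = 60000, growth = 2 } = {}) {
  return ok ? Math.min(current * growth, max) : min;
}
```

A stable wired connection quickly settles at the 60s ceiling, while a flapping mobile link keeps getting pulled back to the 2s floor, which is roughly the behavior the comment above is asking for.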


That will introduce yet another delay/flapping point for people on unreliable connections:

You start with a longpoll and get some data, then negotiate websockets (hey, my network and browsers are websockets capable!). Then the connection drops. And then you go through another round of everything again when you get back online (http + longpoll + websockets, instead of just http + longpoll).

Odds are, your website doesn't need websockets anyway, and 99% of your users are just fine with waiting a couple of seconds more to get their notification (and in the process, people with unreliable connections can still use your site).

Of course, if you're doing online stock trading notifications or something like that, you might actually need these two seconds. But then I'll strongly question your choice of using a browser for that :)

EDIT: fixed stray "*"


> Odds are, your website doesn't need websockets anyway

Odds are your website doesn't need CSS, but it certainly can benefit from it! I don't disagree that almost all websites could get by just fine without websockets, but that seems like an incredibly anti-innovative approach to web development to me. Instead we should focus on making the experience better, as opposed to just avoiding innovation altogether.

> 99% of your users are just fine with waiting a couple of seconds

Pretty much every study done in the last few years has suggested that is completely false, that as wait times rise above a second, the number of visitors who simply close out rises exponentially.


I question the value of websockets in most cases, because in most applications I can think of (except maybe games and high-frequency trading), they don't bring me any value and actually prevent me from interacting with the website in a meaningful way (yes my connection is crappy, but that is the case for a big chunk of the online population).

Sure, I can't think of all applications that would benefit from websockets, but so far, none of the ones I've seen do. I'm certain you can find at least a few exceptions, but my point is that it is only this: exceptions.


Anything that has a volume of changing data that needs to be close to real time would benefit. Anything that has that requirement and also wants to be available on the web would most certainly benefit. High-frequency trading and games/betting/TV-related-data (e.g. NASCAR), etc are definitely cases. How about just anything that is currently done via polling and wants to reduce network overhead? There are a lot of those types of applications. (full disclosure - I work for Kaazing, the leading enterprise WebSocket company, though I'm most definitely not speaking for them in any capacity).


I'm sure there's some small difference between waiting a second for the page to load and receiving a notification a second later. Presumably the notification hadn't existed at page load, or what are you doing delivering it by web socket in the first place?


It's also interesting how Gmail doesn't try to hide its connection problems.



