
I guess this is basically the same as OVH's "VAC" system? I sometimes get these emails:

>We have just detected an attack on IP address x.x.x.x. In order to protect your infrastructure, we vacuumed up your traffic onto our mitigation infrastructure. The entire attack will thus be filtered by our infrastructure, and only legitimate traffic will reach your servers.

and then:

>We are no longer able to detect any attack on IP address x.x.x.x. Your infrastructure has now been withdrawn from our mitigation system.

I never need to do anything, but I don't think these attacks are real anyway.



> I never need to do anything, but I don't think these attacks are real anyway

What would it take to convince you an attack is real when it has been 100% mitigated and you never saw it in your backend infrastructure?

I ask as the engineering manager for DDoS protection at Cloudflare, and we stop a lot of attacks. But I feel this tension in the communication and product offering... if we do our job well enough that a customer's system does not see the attack, how does a customer see and feel the value?

An example is that as a reverse HTTP proxy we are implicitly also a full TCP proxy for HTTP traffic, and so we receive very large SYN or ACK floods. We stop these 100% by virtue of being the terminating TCP proxy, but also by using connection tracking, anycast, XDP + eBPF, and so forth... you won't see a single one of these SYN or ACK packets hitting your infrastructure... so what would we have to communicate to convince you that the attack existed?
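The connection-tracking part of that can be sketched in a few lines. This is an illustrative toy, not Cloudflare's actual implementation (which runs per-packet in the kernel via XDP/eBPF, with made-up thresholds here): count half-open connections per source and drop sources whose SYNs never complete the handshake.

```python
from collections import defaultdict

# Hypothetical sketch of connection tracking for SYN-flood detection.
# Real systems do this per-packet in the kernel (XDP/eBPF); the
# threshold here is illustrative only.
class SynTracker:
    def __init__(self, threshold=100):
        self.threshold = threshold
        self.half_open = defaultdict(int)  # src IP -> outstanding SYNs

    def on_packet(self, src, flags):
        """Return 'drop' or 'pass' for a simplified TCP packet."""
        if flags == "SYN":
            if self.half_open[src] >= self.threshold:
                return "drop"          # source looks like a SYN flood
            self.half_open[src] += 1
            return "pass"
        if flags == "ACK":
            # Handshake completed: forget one half-open entry.
            if self.half_open[src] > 0:
                self.half_open[src] -= 1
        return "pass"

tracker = SynTracker(threshold=3)
# A well-behaved client completes handshakes: SYN then ACK.
for _ in range(10):
    tracker.on_packet("10.0.0.1", "SYN")
    tracker.on_packet("10.0.0.1", "ACK")
# A flooding client sends bare SYNs and is eventually dropped.
verdicts = [tracker.on_packet("10.0.0.2", "SYN") for _ in range(5)]
print(verdicts)  # first 3 pass, then drops
```

The legitimate client never accumulates half-open state, so only the flooder crosses the threshold.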


>What would it take to convince you an attack is real when it has been 100% mitigated and you never saw it in your backend infrastructure?

I was running node_exporter, which exports a lot of detailed network info from my kernel to Prometheus. In the intervals leading up to, during, and after the attack, there is nothing there. Not even a blip.

I don't find it likely that OVH completely prevented any kind of volumetric attack from hitting me with zero detection latency. I just have doubts about there existing a perfect technology that doesn't have any false positives and also kicks in instantly. I'll keep an open mind.


Simple reporting with relevant metrics, not logs.


Do you publish metrics on “attacks prevented” (or access to logging and monitoring) for customers?


Yes.

For HTTP customers there are full SIEM logs under Firewall > Overview on our dashboard, and for paid tiers there are drill-down analytics in addition to the full SIEM logs. There is also log push to receive near real-time full HTTP logs into Google or AWS for your own analysis and these show if a firewall feature touched the request or if it was served from cache.

In addition, for HTTP customers we show graphs of SYN floods, etc., for the IPs your web properties are advertised on.

For L4 customers via Magic Transit we also have Network Analytics showing what we received at our edge network and a log of attacks detected and mitigated.

There is still lots of room for improvement... that's really what I'm asking: what does the ideal system look like, one where people see and understand the data and trust it?

For example, is it valuable to see the attack landscape and what is happening across our systems even when you are not the target? Would that help give perspective to attacks that do target you, and also increase faith that this system exists and is stopping attacks when attacks do not target you?


These are great examples of technical details, but they're difficult to translate into impact and business value.

Would a 100k-packet SYN flood have slowed my site down? Would it have taken it offline? Would it have caused the site to remain up but corrupt data on the backend for some reason?

Off the top of my head, I would think about offering a "replay attack against your staging infra" feature on higher tier plans. The price point should help prevent someone leveraging you as an attack platform, and customers will be able to understand the value that you're bringing to the table in a much more practical way.


I'd build a (metaphorical) visualization of the customer under siege, so they can watch it while they're being attacked and see what they'd be up against without your protection.


I think it'd be helpful to highlight the impact on YOUR infrastructure of an attack I am facing.

That would help add perspective to how disruptive the attacks are.


Yes, also perhaps some guidance figures on what the impact would have been had these measures not been in place.


It's hard to say what the impact on your systems would have been had we not stopped it... we don't know the full capability of your systems. Whether you can take a 10k packets-per-second ACK flood, a 1M pps one, or a 100M pps one depends on a lot of things we aren't privy to.

What we can tell you is the frequency, size and nature of attacks that Cloudflare sees, and when we can clearly identify that an attack was unambiguously targeting you specifically then we can tell only you about that too.

If there were a global dashboard which was vague about the target and source, merely the frequency, size and nature... would that be valuable?


> If there were a global dashboard which was vague about the target and source, merely the frequency, size and nature... would that be valuable?

Yes.

> What we can tell you is the frequency, size and nature of attacks that Cloudflare sees, and when we can clearly identify that an attack was unambiguously targeting you specifically then we can tell only you about that too.

Yes.

Also, even if you could tell us WHAT kind of attack it was that would be helpful too.


I should have made it clear I'm not a user, though I feel your frustration at being 'invisible'. That given: yes, I think a dashboard as you described would help. Perhaps you could add an interactive option to enter your system config, so you could see how an attack would have affected your infrastructure?


Maybe describe how big the attack was in a communication with the customer? i.e. how many connections per second, bandwidth used, etc.? If you could trace the attacker and prosecute them, that would be a lot better, of course (and possibly the way that would gain the most confidence). In other words, if any of your claims could be confirmed by a third party, it would be good. Or you could offer to let the attack through for a set amount of time before you move in.


You are looking at it as if DDoS protection provides some additional value that customers don't comprehend, rather than as a basic necessity for hosting providers to ensure the competitive quality of service they can offer customers, which is how it works in competitive markets. Trying to convince customers about attacks to inflate perceived value is like trying to convince customers that edge nodes are failing over to other nodes... but you don't do that, do you? Think about why you don't. Faking perceived value is AV-company levels of shadiness.


They are very likely real and OVH has a very good system. You can thank them for making free DDoS protection mainstream, dragging all other hosts kicking and screaming into providing DDoS protection.

In the past providers like Linode were happy to just null route your IP for several hours/days or charge you thousands to block a small flood.


> You can thank them for making free DDoS protection mainstream

AWS does charge for WAF and Shield, I believe.

I also remember comparing AWS Lambda@Edge vs Cloudflare Workers (though Lambda allows longer execution times and generally provides more flexibility in RAM, CPU, and runtimes, since it runs on a Linux VM vs V8 isolates for Workers); costs were something like 10x apart.

Can't wait for WebSockets support for Workers.


> I also remember comparing AWS Lambda at Edge vs Cloudflare Workers ... costs were something like 10x apart.

According to the AWS pricing example[1] 10 million requests per month on Lambda@Edge costs $9.13. The same thing on Cloudflare Workers[2] costs $5.00. So I would expect it to be closer to 2x. Although as you say there's a bit more flexibility with Lambda@Edge so it'll depend on your particular case.

I'm curious if your situation was different somehow that made for such a big cost difference between the two?

[1]https://aws.amazon.com/lambda/pricing/#Lambda.40Edge_Pricing [2]https://workers.cloudflare.com/#plans


$5 includes a generous free tier for Workers KV, which can hold up to 10MiB of data against a single key. Cloudflare does not charge for bandwidth consumed, I believe. Also, use of Cloudflare's zonal http-cache is free.

I guess, when I compared, I took Lambda@Edge's per-second billing into consideration rather than per-50ms billing (which brings the RAM usage cost down from $62.52 to $3.13, and the total from $68.52 to $9.13).
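Those figures can be checked with the rates from AWS's pricing page ($0.60 per million requests, $0.00005001 per GB-second of duration), assuming AWS rounds each total up to the cent:

```python
import math

# Lambda@Edge example: 10M requests/month, 128 MB RAM, 50 ms average run
# time. Comparing billing at 50 ms granularity vs rounding each
# invocation up to a full second.
requests = 10_000_000
memory_gb = 128 / 1024                # 0.125 GB
per_gb_second = 0.00005001            # Lambda@Edge duration rate

def cents_up(x):                      # assume totals round up to the cent
    return math.ceil(x * 100) / 100

request_cost = requests / 1_000_000 * 0.60               # $6.00

ram_cost_50ms = requests * 0.050 * memory_gb * per_gb_second
ram_cost_1s = requests * 1.0 * memory_gb * per_gb_second

total_50ms = cents_up(request_cost + ram_cost_50ms)
total_1s = cents_up(request_cost + ram_cost_1s)
print(cents_up(ram_cost_50ms), total_50ms)  # 3.13 9.13
print(cents_up(ram_cost_1s), total_1s)      # 62.52 68.52
```

Billed per full second, the duration charge dominates; billed per 50 ms, the per-request charge does.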

What really sealed the deal for me was the very low cold-start times with Workers. I'm not aware of recent improvements with Lambda@Edge, but the last time I tried them, it wasn't uncommon to hit 100ms+ start times.


That's interesting, thanks. I haven't really used Cloudflare Workers for much myself so it's interesting to hear folks' comparisons.


One more thing: I am not sure if Lambda@Edge charges based on wall-time or cpu-time. Workers' 50ms is cpu-time only, not wall-time. You could, in theory, spend 30s waiting for a fetch to return, as awaiting on the network doesn't count against a Worker's 50ms cpu-time limit.

Ref: https://developers.cloudflare.com/workers/about/limits/
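The cpu-time vs wall-time distinction is easy to demonstrate locally (plain Python here, not the Workers runtime): sleeping in an event loop, a stand-in for an awaited fetch, burns wall-clock time but almost no CPU time.

```python
import asyncio
import time

async def handler():
    # Stand-in for awaiting a slow fetch(): 200 ms of wall time,
    # essentially zero CPU time.
    await asyncio.sleep(0.2)

wall_start = time.monotonic()
cpu_start = time.process_time()
asyncio.run(handler())
wall_elapsed = time.monotonic() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print(f"wall: {wall_elapsed:.3f}s, cpu: {cpu_elapsed:.3f}s")
# Wall time is ~0.2s; CPU time stays far below a 50 ms budget.
```

A limit measured in cpu-time charges only the bottom number, which is why long network waits are effectively free.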


It wasn't just Linode (a provider that's generally much cheaper than OVH), but high-end providers like SoftLayer too (and, as far as I know, they still are).


SoftLayer, now IBM Cloud, does not provide “DDoS Protection”. Their filtering service only works up to a certain amount, IIRC max 5.5 Gbit or something like that. It is absolute trash and pretty much kills all traffic, legitimate or not. If you’re filtered for more than 6-8 hours, they will just nullroute the IP. I’ve had the misfortune of hitting this every few months, and it’s a major headache. If they remove you from the nullroute and you land back on it, they won’t remove you again for 24 hours. A few months ago we hit a bug with their detection where it considered outbound to be the same as inbound and produced some wildly off-base numbers that made no sense. We’re not a small account, I’d say medium-sized at this point, but I wanted to hop a plane to Dallas and strangle the techs there. It’s really gone downhill since IBM acquired them, and I dread to see how Red Hat fares...

If you ask for an estimated traffic size, so you can go to a service that does do filtering for a living, they won’t give it to you, stating “nobody does”. It took a lot of time to get numbers out of them, and it was finally finding a top-level employee through our account manager that led to them going “oh yikes, yeah, something is off.” Sigh.


How is Linode “much cheaper than OVH”? Their margins are probably way higher.


Linode's origin is as a VPS provider. OVH's is as a dedicated server provider. OVH's top server is twice the cost of Linode's and their cheapest is 5x the price (10 vs ~50).


Cheapest OVH dedi I see is the KS-1 for 3.99eur, which is less than the cheapest Linode VPS. OVH also doesn’t try to nickel-and-dime you by charging for data transfer.

I wouldn’t recommend the cheapest Kimsufi offerings though, something like the SYS-WS-1 goes for $33 and is easily comparable with Linode offerings priced at multiple times that.

OVH has a VPS product that’s far cheaper than what Linode offers, but I can’t speak to the quality of that offering.


Comparing prices for bare metal with prices for shared infrastructure is pretty much useless though.


Why is that?


I've been using OVH services to host game servers for several years now, and their "VAC" is a godsend. Other providers would prefer to terminate my account or offer some kind of protection for thousands of dollars.


I'm curious, does anyone know what that means specifically? How can they differentiate normal traffic from malicious traffic? What exactly triggers it? Is a ping flood over a slow (50 Mbit/s) internet connection enough? I am aware that the details are most likely kept private to protect against abuse, and are also a trade secret, but I have a very hard time finding even a general approach that might be similar to their solution.


Most probably they have DDoS appliances (e.g. Arbor, Corero) installed in their network. One implementation is to redirect all customer traffic to the appliance, which takes a sample of the traffic and matches it against a database of attack fingerprints. If it matches, the traffic is blocked; good traffic is allowed through to its final destination.


This is how we implemented it at an ISP I worked at. All our peering routers sampled traffic using IPFIX and sent it to an Arbor collector for fingerprinting and analysis. If the collector detected malicious flows, it would automatically send a BGP Flowspec message with the list of malicious flows to our peering routers, causing them to redirect the matched traffic to an Arbor TMS server, which would scrub the DDoS traffic from the dirty traffic and send the cleaned traffic back to our routers to be routed normally to the end user. There are other ways to mitigate DDoS, but this is what ended up working best for us.
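A hypothetical, very stripped-down version of that detection step: aggregate sampled flow records per destination, scale up by the sampling rate, and when the estimated packet rate crosses a threshold, emit a Flowspec-style rule for the offending flows. (Real Flowspec rules are carried in BGP NLRI and real collectors baseline per prefix; the thresholds, IPs, and rule shape here are all made up.)

```python
from collections import defaultdict

SAMPLE_RATE = 1000        # 1-in-1000 packet sampling (IPFIX-style)
PPS_THRESHOLD = 500_000   # redirect-to-scrubber trigger, packets/sec

def detect(flow_records, window_seconds):
    """flow_records: (dst_ip, proto, dst_port, sampled_packets) tuples.
    Returns Flowspec-like match rules for destinations over threshold."""
    per_dst = defaultdict(int)
    for dst, proto, port, pkts in flow_records:
        per_dst[(dst, proto, port)] += pkts * SAMPLE_RATE  # scale samples up
    rules = []
    for (dst, proto, port), est_packets in per_dst.items():
        if est_packets / window_seconds >= PPS_THRESHOLD:
            rules.append({
                "match": {"dst": dst, "proto": proto, "dst-port": port},
                "action": "redirect-to-scrubber",  # stand-in for a TMS next hop
            })
    return rules

# 10-second window: 6,000 sampled UDP/123 packets => ~600k pps estimated.
records = [("203.0.113.7", "udp", 123, 6000),
           ("203.0.113.9", "tcp", 443, 40)]
rules = detect(records, window_seconds=10)
print(rules)  # one rule, for 203.0.113.7 only
```

Only the flood crosses the threshold; the normal HTTPS traffic is left alone.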


They have their own system (a friend worked on it), but I don't know its details.


I don't know their solution exactly, but what usually happens is they look for common packet signatures. Most DDoSes aren't very sophisticated and can be blocked with fairly simple rules.
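For illustration (made-up rules, not OVH's): a lot of reflection/amplification traffic is identifiable from header fields alone, e.g. UDP packets with source port 123 (NTP) or 1900 (SSDP), so a handful of static match rules goes a long way.

```python
# Hypothetical static signature rules keyed on header fields alone.
SIGNATURES = [
    {"proto": "udp", "src_port": 123,  "name": "NTP amplification"},
    {"proto": "udp", "src_port": 1900, "name": "SSDP amplification"},
    {"proto": "udp", "src_port": 53,   "name": "DNS amplification"},
]

def classify(pkt):
    """pkt: dict with 'proto' and 'src_port'. Returns a rule name or None."""
    for sig in SIGNATURES:
        if pkt["proto"] == sig["proto"] and pkt["src_port"] == sig["src_port"]:
            return sig["name"]
    return None

print(classify({"proto": "udp", "src_port": 1900}))  # SSDP amplification
print(classify({"proto": "tcp", "src_port": 443}))   # None
```

In practice rules like these compile down to router ACLs or kernel filters rather than running in userspace per packet.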


Linode support is fantastic, and we host some critical infrastructure with them for this reason; we have been happy for years. DO has a managed DB, however, so we've been migrating some services to them. If Linode offers a managed DB, I'll move everything back.


It's much more advanced than you think: custom ASICs, etc...


Source?



Thanks


Hacker: ping -t 1.2.3.4

OVH: go into lockdown!



