
This is pretty shocking. What is PII doing in the query string in the first place? Disclosing pregnancy status from an insurance application sounds like a possible HIPAA violation and runs afoul of various state laws around 'Insurance Information and Privacy Protection'. E.g., http://www.leginfo.ca.gov/cgi-bin/displaycode?section=ins&gr.... See Section 791.13(k). That's just CA law, but many states followed with their own versions. (IANAL)

I think the really big penalties come into play when medical information is 'personally identifiable'. Since this data is going to Google, Facebook, and Twitter (really?!) with 3rd party cookies, or even without, it would be hard to argue this data is not personally identifiable.

It's not like they didn't know they were sending this data out. Or perhaps the highly advanced debugging prowess of "Chrome Inspector" is beyond their pay grade.

Edit: Oh, it's not even just a referrer leak; it's actually in the request in some cases, so blatantly intentional. :-(



> Oh, it's not even just a referrer leak; it's actually in the request in some cases, so blatantly intentional.

Be careful about throwing the term intentional around. There is nothing to suggest this is the case. It's just a shocking breakdown in security/testing processes and/or a bug. We see security/privacy issues every day. They are almost never intentional.


> They are almost never intentional.

Somebody had to specifically code the application to concatenate into the string:

> smoker=1&parent=&pregnant=1&mec=&zip=85601&state=AZ&income=35000


Reading the example more closely, that's part of a URL:

https://4037109.fls.doubleclick.net/activityi;src=4037109;ty...?

Unfortunately, a quick Google search doesn't explain what the oref parameter does, but from the name I'm assuming it's something like "original referrer".

You don't need malice to explain this – it's entirely plausible to imagine that some people wanted to track user activities and they had a staggering lapse in HIPAA auditing due to the rush of getting the site out and stabilized.


> it's entirely plausible to imagine that some people wanted to track user activities and they had a staggering lapse in HIPAA auditing due to the rush of getting the site out and stabilized.

Considering they spent 1.7 billion on the site, I simply cannot believe that they were so unorganised and lazy on their testing that they couldn't find this. Otherwise I don't know what to think anymore.


I don't think it's accurate to say "they spent 1.7 billion on the site".

I think the $1.7 billion figure comes from this OIG report http://oig.hhs.gov/oei/reports/oei-03-14-00231.pdf

However, the OIG report has a number of important caveats:

* The list of 60 contracts in the report includes contracts to support state websites and for programs unrelated to the website (for instance, I found an $85 million contract related to accountable care organizations, which doesn't seem to have any connection to the website).

* The $1.7 billion is not the amount expended, it's the estimated value at the time the contract was awarded if all the options are exercised. When you look at the individual contracts, this estimated value turns out not to be very useful. Some contracts had double the estimated expenditure, some had $0 expended. Looking at the total amount expended, you get a figure of $500 million.

So I think it's more reasonable to say that they spent $500 million on various projects to implement the law, including both the user-facing website and all the behind-the-scenes stuff.


> rush of getting the site out and stabilized.

I agree that there's no evidence, at least not yet, of malicious intent. But remember that the "rush of getting the site out" took place back in 2012-2013, with a launch in 2013. It's 2015 now.


Oh, sure – I just suspect that project has been in death-march mode for the last few years. I'd be shocked if the initial launch & stability rush wasn't immediately followed by “now that that's done, we have this backlog of postponed requirements…”


Would it excuse it if it were, say, not HealthCare.gov but rather some private company's website?

(probably not)


I don't see anyone excusing it – only discounting the supposition that it was intentional.

The only reason this is particularly newsworthy is that it's a .gov service connected to a contentious political issue – I mean, my health insurance company uses the same DoubleClick tracking service and I doubt I could even get a reporter to call me back if I tried to peddle some conspiracy theory about it.


Good digging, and I think you're right, this certainly explains how the data could have inadvertently made it from Referrer into the request.


There is also such a thing as criminal negligence, right?


It could have been passed along as a returnUrl. Never attribute to malice what can be explained by incompetence.


Here's the full URL, which was embedded into another URL:

  https://www.healthcare.gov/see-plans/85601/results/?county=04019&age=40&smoker=1&parent=&pregnant=1&mec=&zip=85601&state=AZ&income=35000&step=4

Looks like a plausible search URL from a <form> element with GET. Putting it into the query string instead of a POST body is a bit surprising, but I think not utterly negligent. Then some JavaScript code (maybe not even healthcare.gov's in-house code) looked at window.location.href and put it into another URL, and nobody noticed or stopped it. That is negligent, but more understandable, and fair to presume as unintentional, I think.
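To make that mechanism concrete, here is a minimal Python sketch of what "put the page URL into another URL" amounts to, and how easily the receiving side can unpack it. The tracker endpoint `tracker.example` is hypothetical, and using `oref` as the carrier parameter is an assumption based only on the name seen in the DoubleClick URL; nothing here is taken from healthcare.gov's actual code.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# The results-page URL quoted in this thread.
PAGE_URL = (
    "https://www.healthcare.gov/see-plans/85601/results/"
    "?county=04019&age=40&smoker=1&parent=&pregnant=1&mec=&zip=85601"
    "&state=AZ&income=35000&step=4"
)


def build_tracking_url(page_url: str) -> str:
    """Embed the full page URL in a tracking request.

    The host and the use of 'oref' are assumptions for illustration only.
    """
    return "https://tracker.example/activity?" + urlencode({"oref": page_url})


def recover_fields(tracking_url: str) -> dict:
    """Show what the receiving server could read back out of the request."""
    outer = parse_qs(urlsplit(tracking_url).query)
    inner_url = outer["oref"][0]
    inner = parse_qs(urlsplit(inner_url).query, keep_blank_values=True)
    return {key: values[0] for key, values in inner.items()}


leaked = recover_fields(build_tracking_url(PAGE_URL))
print(leaked["pregnant"], leaked["smoker"], leaked["income"])
```

Note that no field is named explicitly on the sending side: a generic "send the current URL" helper is enough to forward every form value.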

There's plenty to be legitimately upset about here, but your comment ("specifically code the application to concatenate") seems to imply the code has something outrageous like "&pregnant="+currentUser.pregnant+"&smoker="+currentUser.smoker somewhere, without you giving any evidence that's the case.


"Putting it into the querystring instead of a POST body is a bit surprising, but I think not utterly negligent."

Putting sensitive data into query strings has been considered bad practice for a very long time. To name just one problem among many, it goes into the browser history.
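For anyone who wants to see the difference rather than take it on faith, here is a rough Python sketch that builds the same form submission both ways without sending anything. The host `forms.example` and the subset of fields are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and a subset of the fields from the leaked URL.
BASE = "https://forms.example/results/"
FIELDS = {"smoker": "1", "pregnant": "1", "zip": "85601", "state": "AZ", "income": "35000"}


def as_get(fields: dict) -> Request:
    """GET: the form values become part of the URL itself."""
    return Request(BASE + "?" + urlencode(fields))


def as_post(fields: dict) -> Request:
    """POST: the form values travel in the request body, not the URL."""
    return Request(BASE, data=urlencode(fields).encode("ascii"))


get_req = as_get(FIELDS)
post_req = as_post(FIELDS)

# The URL is what ends up in history, bookmarks, and referrers.
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url)
```

The GET URL carries `pregnant=1` wherever the URL goes; the POST URL is just the bare endpoint, with the values only in `post_req.data`.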


You see it all the time though - it is either implemented that way from the outset, or it is set up with a proper POST and then someone in testing files a feature request or bug that says "when I bookmark the search page or email the link the search results reset - we need to be able to link to a results page" and wa-la.

It suggests that this data was surfaced accidentally and DoubleClick may not have been interpreting the actual parameters. Either way, it is horrible - there shouldn't be any ad networks or third-party requests within a domain scope that is handling health data.


   "when I bookmark the search page or email the link the search results reset - we need to be able to link to a results page" and wa-la.

Wallah = 'By Allah's Name' in Arabic

Voilà = Literally 'See' + 'There' in French

Only Americans seem to confuse these two words.


I'm not American, but kudos on the bigotry.

In case you didn't notice, I wasn't spelling any word - just phonetic punctuation, which is common in online chat.


Downvote for the insult. I would have corrected him as well, but this was unnecessary:

> Only Americans seem to confuse these two words.


It's just an observation that this switcheroo of Wallah/Voilà seems regional to that area of the world - would you say that's not true? No insult intended.

Other seemingly American conventions include saying: 'would of', 'could of', 'for all intensive purposes', 'I could care less', and plenty more. I travel to the US a couple times every month and notice things like this!


The implied insult is that this region is the only region where people are dumb enough to misspell these words.

All of your examples are examples of words and phrases that people have primarily heard (rather than read), and thus they spell them incorrectly. People who have read these phrases repeatedly are less likely to make these spelling mistakes.


Yes, it's hard to argue against that. I didn't focus on it because the privacy harm of leaking to third parties seems greater than that caused by the URL going into browser history or server logfiles (the most relevant concerns I see.)

This doesn't excuse them, but it's interesting to consider that probably tens of thousands of sensitive Google, etc. searches per day go into query strings. I guess the difference is that Google doesn't know if your freeform search is sensitive, whereas a "Pregnant?" checkbox is known to be sensitive. But maybe general search engines should not be using query strings either, just in case.


Browser history could be a problem if, say, family member #1 (pregnant, but hasn't disclosed it) signs up right before family member #2 (who doesn't know about the pregnancy, and who family member #1 doesn't want to know about the pregnancy).

Then there's the issue of public computers, which people signing up for Obamacare may well be more likely to use (I don't know that for a fact, but it seems plausible... I know in my area they've had workshops where you come in and someone helps you sign up using a public machine).


Given that these requests are using HTTPS (not that a non-secure POST is any better than a non-secure GET in that regard), could you give some other issues aside from the browser history issue?


Query parameters are public data. They are sent along with every single request to 3rd parties that your browser launches from the page! Analytics, image downloads, click-throughs, everything. I think browser history and server logs are actually the lesser evil here.

These query parameters in particular are not search terms, they are PHI you enter to determine the rates you will pay for health insurance. They are basically all the key parameters besides the age of household members which determine the price you will pay for a given insurance plan.
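As a rough illustration of the referrer point, this Python sketch compares the full page URL with two trimmed forms a site could expose instead. The helper names are mine, and the mapping to browser behavior is an assumption: the origin-only form is roughly what a `Referrer-Policy: origin` response header asks the browser to send.

```python
from urllib.parse import urlsplit, urlunsplit

# The results-page URL quoted earlier in the thread.
PAGE_URL = (
    "https://www.healthcare.gov/see-plans/85601/results/"
    "?county=04019&age=40&smoker=1&parent=&pregnant=1&mec=&zip=85601"
    "&state=AZ&income=35000&step=4"
)


def strip_query(url: str) -> str:
    """Drop the query string and fragment, keeping scheme, host, and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def origin_only(url: str) -> str:
    """Keep only scheme and host, similar in spirit to an origin-only referrer."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


print(PAGE_URL)               # full URL: every form value is visible
print(strip_query(PAGE_URL))  # path only: the ZIP in the path still shows
print(origin_only(PAGE_URL))  # origin only: no user-specific data
```

Even with the query string stripped, this particular path still contains the ZIP code, so the origin-only form is the safer of the two.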


The other big one (as another poster noted below) is server log files. Those may be accessible to someone who shouldn't necessarily have access to the data.


Not intentional. This was a bug.


Except when it is:

Medicare spokesman Aaron Albright said outside vendors "are prohibited from using information from these tools on HealthCare.gov for their companies' purposes." The government uses them to measure the performance of HealthCare.gov so consumers get "a simpler, more streamlined and intuitive experience," he added.

http://apnews.myway.com/article/20150120/us--health_overhaul...



