My Hacker News firehose (scripting.com)
105 points by davewiner on Nov 8, 2010 | hide | past | favorite | 54 comments


Hacker News isn't a potentially valuable Internet resource waiting to be mined for interesting new applications. That's Twitter. Hacker News is an actual community.

I don't know or care why the API got blocked, but for every HN add-on app I've been happy with, another has incredibly irritated me (paging: the guy who scraped all HN job postings and made his own job site with them). From what I can tell, it's not actually part of this site's philosophy to be a building block for other people's software ideas.


Are you against http://searchyc.com?

Are you against Gabriel Weinberg's Ask YC archive (http://www.gabrielweinberg.com/startupswiki/Ask_YC_Archive)? Would you like it less if he built it automatically instead of manually (cf. http://metaoptimize.com/projects/autotag/hackernews/)?

I agree that the job posting site was really tacky, but that guy entered all the data manually anyhow.

I think, in sum total, that most add-ons are pointless. A few are useful. And only a very small fraction are bad, and those should be dealt with on a case-by-case basis.


I like SearchYC.

I've never used the Ask YC archive, but I trust Gabriel Weinberg.

I'm not advocating an end to all add-on apps. I'm saying that there is no principle that animates Hacker News that requires Graham to bend over backwards to accommodate APIs. It's a simple, low-drama point.


You misunderstood what I was saying. Start with the firehose. I don't want it to make money, I want it so I can link this flow in with other flows that I'm following without having to visit all the sites. Not trying to make a business, just scratching my own itch having no idea where it leads.


I'm confused why you needed a feed of all stories as they're submitted.

Is that substantially different from the new feed? http://news.ycombinator.com/newest


Same content but instead of being HTML it's RSS 2.0.

http://static.scripting.com/hackernews/rss.xml

That feed stopped updating this morning. That's the issue. I want the API that it's built from to be turned back on.

Hope this helps alleviate the confusion. :-)
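(For anyone who wants to link a feed like that into their own flow: below is a minimal polling sketch over an RSS 2.0 feed, using only the Python standard library. The 5-minute interval, the function names, and the fallback-to-link-as-id behavior are my own assumptions, not anything the feed itself specifies.)

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "http://static.scripting.com/hackernews/rss.xml"

def parse_items(xml_text):
    """Return (title, link, guid) tuples from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        link = item.findtext("link", "")
        items.append((
            item.findtext("title", ""),
            link,
            item.findtext("guid", "") or link,  # fall back to link as the id
        ))
    return items

def poll(url=FEED_URL, interval=300):
    """Print each item once, re-fetching the feed every `interval` seconds."""
    seen = set()
    while True:
        with urllib.request.urlopen(url) as resp:
            for title, link, guid in parse_items(resp.read()):
                if guid not in seen:
                    seen.add(guid)
                    print(title, link)
        time.sleep(interval)
```

Dedup by guid matters here: the firehose re-serves old items on every fetch, so without it you'd reprocess the whole feed each poll.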


I think it's the other way around; you've missed my point. My point is, contrary to the point you raised in your blog post, it does not appear to be part of the charter of Hacker News to accommodate APIs.

Surely you've noticed, over the brief time you've been contributing to Hacker News, that the site functionality is very simple. People have asked for hundreds of features that haven't been implemented. I believe that it's not entirely lack of developer bandwidth that keeps those itches from being scratched.


Okay maybe I have missed your point.

Let me ask you a question then, to clarify.

You think there's a reason they have an RSS feed of the top stories but do not have a feed of all the stories as they come in?

Help me, an admitted newbie, understand. Why have one and not the other?


I've always assumed it was because the feed of all stories as they come in contains a huge amount of junk, and also because RSS on Hacker News is an afterthought. I use RSS as my primary interface to HN, but I also recognize that it's inferior to the site itself.


Have you tried this? This is something else I'm working on...

http://viewtext.org/article?url=http%3A%2F%2Fnews.ycombinato...

It will populate the RSS feed with the full content.


I'm generally disinclined to replace HN's existing RSS with someone else's RSS feed that might at any moment stop working. HN RSS sucks, but I never have to think about it.

As someone else pointed out, I also depend on SearchYC, which is an external application. The difference is that SearchYC is extraordinarily well known. HN can't cut off SearchYC without alienating a sizeable group of site veterans.

That's more than you probably wanted to know, but, hey, you asked.


I think that's an unfair and inconsistent application of logic.

I like reading HN and recently contributing tiny bits to HN because I think the community here is top-notch, friendly, and knows its stuff. HN never struck me as elitist; rather, it's more like a really friendly meritocracy. About the most elitist it gets is requiring a karma threshold to be able to downvote, but that's really just part of being a good meritocracy.

But to be accepting of one add-on application simply because of heavy site veteran patronage, while rejecting all others... THAT would strike me as heavy elitism and not in line with what seems to make HN so great. If add-on apps MUST be rejected, the criteria should be for other more respectable reasons, no? And mind you, it's all pg's call anyway.


That would be elitist. If it was what I was saying.


I think the idea behind HN is to have a place where you can go to chat when you are bored or feeling distractible. Having every story pushed at you as it's submitted is counterproductive -- it makes HN become a job (like reading email) instead of something to do once in a while when you feel like it.

I miss probably about 90% of the stories on HN. I enjoy it anyway.

HN is a place to go, not a list of articles.


Why is it that every time I visit scripting.com (about twice per year), I'm met with niggling little issues like reload loops or the lovely javascript link to the loopback address -

"http://127.0.0.1:5337/scripting2/editor/controls?username=da...

I think Dave needs a new tool.


Just curious, does that link cause you any problems in your browser? If so, what browser and on what platform?


Yes. Caused both firefox 3.6.12 and safari 5.0.2 on osx to have fits.

firefox on ubuntu and chrome seem to work without trouble.


Sorry about that. What do the fits look like?

How much can they penalize you for opening a web page that has (what amounts to) a broken link?

BTW, I use Firefox on the Mac, on machines that don't have my CMS running on it (that's what the link connects to) without problems.


>Apparently that's because Hacker News has blocked his API.

Does anyone know if this is actually the case? http://api.ihackernews.com gives me a 404, but I'm posting this from ihackernews's browser interface, which obviously is getting updates.


I took down the site after YC blocked my IP address. I believe it was blocked as a result of increased usage from the explosion in traffic, resulting in some really heavy usage of the API. My guess is that YC's software did this automatically to prevent abuse.

In addition, many folks weren't happy with me distributing the HN database. So there really wasn't a reason to keep it up.

I'm going to rework the API such that it works within the boundary of an acceptable scraping rate. I'm not putting the database back up.


That's a real shame, I was thinking of building something on it too. I wish there was a way to get HN pages in simple JSON/XML.


That's exactly what my API did. JSON, JSONP, and XML for pages, comments, profiles.


Yeah, I was in the midst of playing with Gosu and thought about playing with it to get HN posts from the API. Sadly it went down.


Thank you for the confirmation. Will the ihackernews browser interface remain active?

I hope so, because I love it and use it every day.

Whether or not it will, thank you so much for creating it, as well as the API!


That was the point of my piece. Ronnie has done some great work, and deserves all our gratitude.


Thanks, I appreciate it. Yes, it will remain up unless I'm told by PG to take it down :)


I'm curious about Giles Bowkett's comment on that story - is he really blocked on HN? If so, why?


yep, he really is blocked. based on his comments, i'm guessing the reason is because he wasn't able to remain civil and respectful.

http://news.ycombinator.com/user?id=gilesgoatboy

that's just the first one i found, i'm sure there are others.


There were other accounts too that are still active but that he may be locked out of. That particular account got silently hellbanned for sarcasm: http://news.ycombinator.com/item?id=551664

Some historical threads: http://news.ycombinator.com/item?id=196390 http://news.ycombinator.com/item?id=1015591


Having read the comments there, I don't see why they had him banned. Perhaps a few were offensive, but nothing the community can't handle. I prefer reading someone like this to safe comments that just recycle bits of PG's essays. HN could use some irreverence.


Update: the firehose feed is working again.

http://scripting.com/stories/2010/11/09/myHackerNewsFirehose...


In the past, clever things that others have done have been blocked because they placed a substantial burden on the servers.


What incentive does Hacker News have to allow access via an API? At the moment, introducing an API would cost YC money because it would take time to develop and it would potentially increase server costs.

However, allowing others to do the job seems less troublesome.

Creating artificial restrictions to _prevent_ other people from doing so seems like effort.

I would imagine that the owners of a site would only seek to stem the wholesale flow of information to another property, when inaction is likely to lead to a loss in value.

Where is the value in HN? It could be argued that the value lies in the posts and the related combined wisdom, but I think the true value is related to attention and focus. We come to HN to participate. If the content is allowed to spread to third-party applications (via an API), this primary source of value is lost.


How many Ycombinator startups build on the APIs of other sites? Wouldn't it be reasonable to expect them to reciprocate?


It would be nice to expect HN to reciprocate, but I don't think that it's likely.

API access always needs to be part of a strategy, because granting it will change the landscape within which your business operates.


The page doesn't load for me, Google Cache got it:

http://webcache.googleusercontent.com/search?q=cache:http://...


As we're seeing here, there are numerous unofficial APIs available (one distributed via a torrent download, Twitter bots, search tools, alternative look-and-feel front-ends), all of which must be doing their own parsing of this site. Wouldn't it actually be good to have an official API for all these hacking needs?


The best way to get around the ban I think is to have low-latency distributed scraping.

Basically, a plugin that submits to RR's site whenever a user is surfing on HN. RR's site then parses it as usual.

The plugin could have some setting like max KB submitted per minute, and RR's site would only request the full page if it was needed at the time.
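(That "max KB submitted per minute" setting amounts to a token bucket. The sketch below is only an illustration of the idea; the class name, the 60-second refill window, and the default budget are all my own assumptions, not part of any real plugin.)

```python
import time

class ByteBudget:
    """Token bucket capping submissions at `max_kb` kilobytes per minute."""

    def __init__(self, max_kb=64, clock=time.monotonic):
        self.capacity = max_kb * 1024  # budget in bytes
        self.tokens = self.capacity    # start with a full budget
        self.clock = clock
        self.last = clock()

    def allow(self, nbytes):
        """Return True and spend the budget if `nbytes` may be sent now."""
        now = self.clock()
        # Refill proportionally to elapsed time: a full budget every 60s.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) / 60 * self.capacity)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

The plugin would call `allow(len(page))` before submitting a scraped page and silently drop (or queue) anything over budget, so no single user's browsing ever turns into a flood.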


I think it's interesting that Dave Winer decided to do this as a blog post instead of a HN discussion.

Instead of having a conversation with the HN community he decided to have a conversation with his readers.


If you're submitting a link, put it in the url field. If you want to add initial commentary on the link, write a blog post about it and submit that instead.

http://ycombinator.com/newsguidelines.html


That's not what I "decided" at all.

I do my writing on my blog. That's the way I work.

Happy to participate in a discussion here.


I believe Dave Winer has always advocated writing on your own space and linking into the conversation. I feel this is very insightful advice and pro-web and really "getting it". I didn't get a chance yet to see what experiment Dave was building, his firehose, because scripting.com is down. But I'd give Dave the benefit of the doubt...


That's still a decision.


What is your point?


Huh? He's also the one who posted this to HN. It seems to me like he wanted to have conversations with both the HN community and his readers.


It would be stranger if he didn't post it on his blog. After all his blog has been around for almost 10 years longer than hacker news, was around before they were called blogs. Dave was also the guy that invented RSS.


Dave Winer did not invent RSS.


How is that different from Paul Graham's essays? They are always linked here and this is where the discussion takes place.


It's different because there is nowhere on Paul Graham's website where you can have a discussion about the individual essays (or at all).


FWIW, either your feed, or the feed you're pulling from is not correctly handling encoding.

"Announcing Brightbox Cloud – the UKu0027s first true cloud hosting platform"
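(One plausible cause, though this is an assumption since I can't see the producing code: the feed text contains JSON-style `\u0027` escapes whose backslashes got stripped somewhere in the pipeline, leaving the bare `u0027`. Decoded properly, the escape is just an apostrophe:)

```python
import json

# A JSON string literal with the escape intact decodes cleanly.
raw = r'"the UK\u0027s first true cloud hosting platform"'
decoded = json.loads(raw)
print(decoded)  # the UK's first true cloud hosting platform
```

If the backslash is lost before decoding, `u0027` is just four ordinary characters and no decoder can recover the apostrophe, which matches the garbled title quoted above.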


Am I the only one who wants the firehose and the API to come back? As long as it can be done without taking the servers down and killing HN, that is :-)


Regarding the "API blocked"...

Seems to be feeding fine as of Mon eve (Nov 8)


What all of these posts fail to consider is that philosophically pg may want you writing code instead of consuming his community 101 different ways.

Just my $0.02. I visit HN daily, but I'm busy enough with my own work that the website suits my consumption needs.

Anyone else like me out there?


I almost expected to get down-voted. What I was trying to convey is managing your time wisely.



