Setting up Django + NGinx + Green Unicorn in an Ubuntu EC2 instance (adrian.org.ar)
101 points by elopinologo on Dec 27, 2011 | 37 comments


Good for starters, but this tutorial uses quite a few bad practices, such as declaring your app's templates directory when by default Django searches every app for a templates directory. It also runs nearly everything as root.

This will get you up and running but just be aware that some of the items may not be the "correct way" of doing it.


>>such as declaring your apps templates directory when by default Django searches all apps for a templates directory.

I don't think so. Removing the templates directory gives you an exception. Maybe you are using another loader?
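
Which of the two behaviours you see probably comes down to the configured template loaders. A minimal sketch, assuming Django 1.3-era setting names (newer releases moved this into the TEMPLATES setting with APP_DIRS) and a hypothetical app called "myapp": with the app_directories loader enabled, each installed app's own templates/ directory is searched, so TEMPLATE_DIRS can stay empty.

```python
# settings.py fragment: a sketch using Django 1.3-era setting names.

# Hypothetical app name; the app must be installed for its templates/
# directory to be searched.
INSTALLED_APPS = (
    "myapp",
)

TEMPLATE_LOADERS = (
    # Looks in the directories listed in TEMPLATE_DIRS.
    "django.template.loaders.filesystem.Loader",
    # Looks in the "templates" subdirectory of every installed app.
    "django.template.loaders.app_directories.Loader",
)

# Nothing needs to be declared here when the templates live in
# myapp/templates/ and the app_directories loader is enabled.
TEMPLATE_DIRS = ()
```

If the app_directories loader is missing from the list, removing the explicit directory would be expected to fail with a template-not-found exception.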

>>Also running nearly everything as root.

Right, this was meant for starters so I didn't want to complicate things. But I just added some options so now Upstart will run Gunicorn under another user.


What do you suggest is the correct way? Running each app under its own user with groups for each resource?


The correct way on Linux to run services is for each service to have its own user. If you are just going to be running a single Django project I recommend creating a user simply called "django" and putting all the project's files in "/srv/www/", because that's what "srv" is for. Then run your Django project as the django user. If your Django project gets hacked and you have permissions set properly, then the most they can do is fiddle with all your Django files. If they have root access they can get your SSL certs and SSH keys, redirect your Nginx server, and do all kinds of other bad things.
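
As a rough illustration of that advice, here is a sketch of a Gunicorn config file (these are plain Python). The "django" user and the /srv/www/ path come from the comment above; the bind address and the worker count are assumptions, and this is not the config from the tutorial.

```python
# gunicorn.conf.py: a sketch, not the tutorial's actual configuration.
import multiprocessing

# Run the workers as the unprivileged "django" user and group.
# Gunicorn has to be started as root for this switch to be possible.
user = "django"
group = "django"

# Change to the project directory before loading the application.
chdir = "/srv/www/"

# Assumption: Nginx proxies requests to this local address.
bind = "127.0.0.1:8000"

# Assumption: a commonly used starting point for the worker count.
workers = multiprocessing.cpu_count() * 2 + 1
```

Gunicorn can load a file like this through its -c option.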


Sure, trying to mitigate damage with sandboxes is a good goal, but the reality is that if your server is hacked at all you're screwed. There are enough local shell exploits that once you are on a box, chances are you can get past user and group permissions pretty easily. Plus, they've hacked your box! Time to wipe it and start over. Period.

As for SSL certs, don't you store those on the load balancer? If you are running a single box and it goes down, pointing the LB at a new backend box is a lot easier than copying certs around.

As for SSH keys, I'm not quite sure what damage could be done there. The only thing on there is public keys, right?

As I said below, this is a big reason why I believe in PaaS if your application can deal with it (and most can). Sysadmin has become so commoditized that there is no point in wasting your own time/money dealing with this stuff.


>> Also running nearly everything as root.

Not really. It seems everything runs from a virtual environment under the Ubuntu user which is not root btw.


Just a few notes to keep in mind...

1. You should use the nginx PPA instead of the default Ubuntu one.

2. uWSGI is an excellent alternative to Gunicorn, and can be run as a system startup service using supervisord.

3. As you get further along in your Django project development, you may find you need to apply patches to Django core. In this case it helps to install from the git repo instead of pip, then check out the latest stable tag.

4. Instead of running `sudo /etc/init.d/nginx restart`, the idiomatic Ubuntu way is `sudo service nginx restart`.


Why the PPA? It's maintained by volunteers and has no connection with Igor Sysoev. What am I missing?


Probably because the official Nginx docs recommend it:

http://wiki.nginx.org/Install#Ubuntu_PPA


It's not actually a recommendation. What the Nginx docs say is roughly:

"Look, here is our official release under nginx.org, and there is some wonderful work from volunteers from which you can install development, stable and nightly builds without hassle"


>>1 / 4

Noted

>2

I chose Gunicorn for being more popular and easier to install. The performance difference seems to be very small, and I'm not sure how things have evolved in recent versions.

>3

Certainly not what I want for an introduction.


There was an article somewhere comparing the performance of various Django setups and as far as I can remember uWSGI+NGinx won out, although Gunicorn was a close second


Here is an excellent resource which made me a convert to uwsgi:

http://nichol.as/benchmark-of-python-web-servers.

Setup is a bit more difficult, but if you are already using nginx it isn't so bad.


Very often performance isn't the most important metric though. For me it's usually about what's the easiest to set up and administer.


Here's one of those articles: http://www.peterbe.com/plog/fcgi-vs-gunicorn-vs-uwsgi He actually proclaimed Gunicorn the initial winner, but after what was probably some real-life usage he changed his mind to uWSGI.

The uwsgi module is also now included with Nginx by default (as of version 0.8.40, I believe).

I've never used Upstart as in the original post but I believe Supervisor does the same thing and is maybe a little more commonly used (http://supervisord.org/)


As I recall, gevent support in uWSGI is still only in the -dev branch, and requires the as yet unreleased version of gevent (also -dev).

I have been using gunicorn+gevent in production for a while now and it has been outstanding.


Here's another article on setting up Django on EC2 -- but using NGinx and uWSGI: http://posterous.adambard.com/start-to-finish-serving-mysql-...


Anyone know about the performance of this stack vs Apache + mod_wsgi?


I remember reading on the Instagram blog that they found Gunicorn to be less CPU-intensive:

  We use http://gunicorn.org/ as our WSGI server; 
  we used to use mod_wsgi and Apache, but found 
  Gunicorn was much easier to configure, and less 
  CPU-intensive. 
http://instagram-engineering.tumblr.com/post/13649370142/wha...


Nginx+n performs pretty well in general against Apache, and in my experience nginx as a front to something like uWSGI for Django is very easy to set up and use. There was a comparison of the various Nginx+n stacks, I think Nginx+uWSGI won out, with Gunicorn in a close second.


Gunicorn generally takes up less RAM and less CPU than Apache + mod_wsgi. I've also had some major memory leaks with Apache that have never occurred while using Gunicorn; I believe it restarts a worker when it gets out of control.
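
For reference, a sketch of the two Gunicorn settings that seem closest to what is described here. The setting names are Gunicorn's; the values are illustrative assumptions, not recommendations.

```python
# gunicorn.conf.py fragment: values are illustrative assumptions.

# Recycle each worker after it has handled this many requests, which
# limits how far a slow memory leak in the application can grow.
max_requests = 1000

# Kill and replace a worker that has been silent for more than this
# many seconds (for example, one stuck on a hung request).
timeout = 30
```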


In short: faster, easier on RAM usage and easier to configure.


Why are people still doing this kind of stuff instead of using a PaaS?


There's always the ability to tweak specifics for your setup that aren't available in many platforms.

IMHO, you shouldn't use a PaaS unless you know how to set your stack up. Essentially, if you don't recognize or have respect for the convenience you're being given (by going through a platform) then you'll potentially make bad decisions down the road. Personally, I prefer to learn how to set it up and scale it myself initially which helps me better understand what to look for in a platform.

Plus, there's something fun about putting all the pieces together and watching your apps come up (especially if you don't normally handle this part of the stack).


What do you think of Cloud Foundry? It's an open source PaaS, so you can control any aspect of it that you don't like.


* Because I can.

* Because I can understand the technology behind the "magic".

* Because I can understand the technology behind the "magic" and design my system better.

* Because I can understand the technology behind the "magic" and design my system better, and if the PaaS provider goes out of business I have a chance to survive.

* Like Snap said before: I've got the power...

And most important: because I'm a masochist bastard who has trust problems...


But why not templatize the power into an open source system like Cloud Foundry, do it once and then forget about it? Why not design it once, use forever? You still get the power, you just have the time to think about other things, like actually getting shit done.


If you use Cloudfoundry.COM you are locked in, it's still in beta and we don't know the pricing model. If you use the open source version (Cloudfoundry.ORG) you still need "server guys".


I'd imagine on the whole due to cost. Heroku, for example, is a great stack, but costs a significant amount if you aren't flush with cash. This can be run on a small ec2 instance for a fraction of the cost, whilst it does have greater administrative overhead.


Why do people still think PaaS is the only way to host apps?


It isn't that people still think it is the only way. It is that people have realized that they don't want to be woken up in the middle of the night every time their server goes kaput. I have better things to do with my time.

I'd rather pay the 'extra' money to a PaaS so that I a) don't have to hire a sysadmin, or pretend to be one on TV, b) get a full night's sleep, c) can go on vacation without worrying whether the server is up, and d) don't have to worry about adding servers when my traffic has spiked and this one server isn't enough.

If you don't think EC2 instances go down, you are kidding yourself. Not only do they go down, but they 'appear' up. Talk about a mixed message. So, you also need to pay for a third party monitoring service just to tell you that your server is down.
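
The "appears up but is down" case is why an external check has to ask the app itself rather than trust the instance status. A minimal sketch of such a check in Python, assuming nothing about any particular monitoring service; the function name and the injectable opener parameter are made up for illustration.

```python
import urllib.request


def is_up(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True only if the app answers an HTTP request with a 2xx status.

    An instance that looks alive but refuses connections, times out, or
    returns an error page counts as down.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, refused connections and socket timeouts
        # are all OSError subclasses.
        return False
```

Something like this would need to run from a machine outside the instance being watched, on a schedule, with an alert when it returns False.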

At the end of the day, when you've added up all of those 'costs' of running your own servers, PaaS starts to look a lot more attractive. I certainly trust a company like Google, with App Engine, to be able to hire a lot more sysadmins, set up better monitoring and have better response times for fixing servers than I ever could.

I'd also love to never have to learn or remember the arcane list of setup configuration that this person blogged about. Instead, I'd rather be adding features to my app and making money.


Because depending on the size of what you are running PaaS can be very expensive.


Heroku's support for Django is promising, but it still has things to improve, like database access, which is a performance bottleneck.

Also, I don't feel comfortable having all this uncontrolled "magic" happening.

EC2's free tier covers a micro instance for a year, so I think it's worth giving it a shot.


With the cost and control advantages of cloud or owned servers, PaaS seems like a tough sell. Setting up and administering servers has become much easier and continues to improve.


And what shall the PaaS providers use for their web servers? PaaSaaS? :)


I know you are joking, but the idea is that different levels of service are given by different people, and they may be layered one on top of another. I believe the term used these days is IaaS, Infrastructure as a Service. Amazon EC2 would be IaaS. Heroku or DotCloud would be PaaS. A PaaS company can have their infrastructure provided by an IaaS company.

It's Turtles as a Service all the way down.


Many parts of Heroku are actually self-hosted: https://thestrangeloop.com/sessions/running-heroku-on-heroku



