Node.js in Production

Posted in Ops, Web

When running a Node.js application in production, you need to keep stability, performance, security, and maintainability in mind. Outlined here are what I think are the best practices for putting Node.js into production.

By the end of this guide, the setup will include 3 servers: a load balancer (lb) and 2 app servers (app1 and app2). The load balancer will health check the app servers and balance traffic between them. Each app server will use a combination of systemd and Node's cluster module to load balance and route traffic across multiple node processes on the machine. Deploys will be a one-line command from the developer's laptop, with zero downtime and no failed requests.

It will look roughly like this:

[Architecture diagram. Photo credit: Digital Ocean]

How this article is written

This article is targeted at those with basic operations experience. You should, however, be at least passingly familiar with what a process is, what upstart/systemd/init are, and what process signals are. To get the most out of it, I suggest following along on your own servers (but still using my demo node app for parity). Beyond that, there are some useful configuration settings and scripts that should make for good reference when running Node.js in production.

The final app is hosted here:

For this guide I will be using Digital Ocean and Fedora. However, it’s written as generically as possible so there should be value here no matter what stack you’re on.

I will be working off of vanilla Digital Ocean Fedora 20 servers. I’ve tested this guide a few times, so you should be able to follow along with each step without a problem.

Why Fedora?

All Linux distros (aside from Gentoo) are moving to systemd from various other init systems. Because Ubuntu (probably the most popular flavor in the world) hasn’t yet moved over to systemd (they’ve announced they will), I felt that it would be inappropriate to teach Upstart here.

systemd offers some significant advantages over Upstart including advanced, centralized logging support, simpler configuration, speed, and way more features.

Install Node.js

The first thing you'll need to do on the server is set it up to run Node. On Digital Ocean, I was able to get it down to just these 4 commands:
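The commands were embedded as a gist in the original post; presumably they looked roughly like the following sketch (exact package names vary by distro, so treat this as an assumption):

```shell
# Install node and npm from yum (this may be an old version)
yum install -y nodejs npm

# Install n, a tool for installing and switching node versions
npm install -g n

# Use n to install the latest stable build of node
n stable
```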

This installs node from yum (which might install an old version), then the awesome n package to install/switch node versions. Finally, n installs the latest stable build of node.

From here, run # node --version and you should see the latest node version.

Later we’ll see how to automate this step with Ansible.

Create web user

Because it is insecure to run your application as root, we will create a web user for our system.

To create this user: # useradd -mrU web (-m creates a home directory, -r makes it a system account, and -U creates a group with the same name)

Adding the application

Now that we’ve added node and our user we can move on to adding our node app:

Create a folder for the app: # mkdir /var/www

Set the owner to web: # chown web /var/www

Set the group to web: # chgrp web /var/www

cd into it: # cd /var/www/

As the web user: $ su web

Clone the sample hello world app repo: $ git clone

It consists of a very simple node.js app:

Run the app: $ node app.js.

You should be able to go to the server's IP address (on port 3000) in the browser and see the app up and running:

[Screenshot: the app running in the browser]

Note: you may need to run # iptables -F to flush iptables, as well as # firewall-cmd --permanent --zone=public --add-port=3000/tcp (followed by # firewall-cmd --reload) to open the firewall.

Another note: this runs on port 3000. Making it run on port 80 would be possible using a reverse proxy (such as nginx), but in this setup the app servers themselves will run on port 3000 and the load balancer (on a different server) will listen on port 80.


Now that we have a way to run the server, we need to add it to systemd to ensure it will stay running in case of a crash.

Here’s a systemd script we can use:
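The unit file was embedded as a gist in the original post; here's a sketch of what it plausibly contained (the WorkingDirectory and node binary path are assumptions; the NODE_ENV=production setting is referenced later in the comments):

```ini
[Unit]
Description=node-sample - a sample Node.js app
After=network.target

[Service]
Type=simple
User=web
WorkingDirectory=/var/www/node-hello-world
ExecStart=/usr/local/bin/node /var/www/node-hello-world/app.js
Restart=always
Environment=NODE_ENV=production

[Install]
WantedBy=multi-user.target
```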

Copy this file (as root) to /etc/systemd/system/node-sample.service

Enable it: # systemctl enable node-sample

Start it: # systemctl start node-sample

See status: # systemctl status node-sample

See logs: # journalctl -u node-sample

Try killing the node process by its pid and see if it starts back up!

Clustering processes

Now that we can get a single process running, we need to use Node's built-in cluster module, which will automatically load balance traffic across multiple processes.

Here’s a script that you can use to host a Node.js app.

Simply run that file next to app.js: $ node boot.js

This script will run 2 instances of the app, restarting each one if it dies. It will also allow you to perform a zero-downtime restart by sending SIGHUP.

Try that now by making a change to the response in app.js. You can see the server update by running $ kill -HUP [pid] against the master process. It will gracefully restart one process at a time.

You'll need to update the systemd configuration if you want it to boot the clustered version of your app instead of the single instance. Also, if you add an ExecReload=/bin/kill -HUP $MAINPID directive to your systemd config, you can run # systemctl reload node-sample to do a zero-downtime restart!

Here’s an example of the node cluster systemd config:
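This config was also embedded as a gist; a sketch of it, assuming the same paths as the single-instance unit, with the ExecStart swapped to boot.js and the ExecReload directive added:

```ini
[Unit]
Description=node-sample - a sample Node.js app (clustered)
After=network.target

[Service]
Type=simple
User=web
WorkingDirectory=/var/www/node-hello-world
ExecStart=/usr/local/bin/node /var/www/node-hello-world/boot.js
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
Environment=NODE_ENV=production

[Install]
WantedBy=multi-user.target
```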

Load Balancing

In production you'll need at least 2 servers in case one goes down; I would not deploy a real system on just a single box. Keep in mind: boxes don't go down only because they break. You might also want to take one down for maintenance. A load balancer performs health checks on the boxes, and if one has a problem it is removed from the rotation.

First, set up another Node.js app server using all of the previous steps. Next, create a new Fedora box on Digital Ocean (or wherever) for the load balancer and SSH into it.

Install haproxy: # yum install haproxy

Change /etc/haproxy/haproxy.cfg to the following (replacing the server IPs with your app IPs):
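The config was embedded as a gist in the original post; here's a plausible sketch of it (the IPs and timeouts are placeholders; the stats credentials match the my_username/my_pass pair mentioned below):

```text
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5s
    timeout client  50s
    timeout server  50s

listen http-in
    bind *:80
    balance roundrobin
    option httpchk GET /
    stats enable
    stats uri /haproxy?stats
    stats auth my_username:my_pass
    server app1 10.0.0.1:3000 check
    server app2 10.0.0.2:3000 check
```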

Now restart haproxy: # systemctl restart haproxy

You should see the app running on port 80 on the load balancer. You can also go to /haproxy?stats to see the HAProxy stats page. Credentials: (my_username/my_pass)

For more information on setting up HAProxy, check out this guide I used, or the official docs.

Deploying your code with Ansible

Most production guides would stop here, but I don't think the setup is complete without a deploy! Even without a deploy script, updating our code isn't a terrible process. It would look something like this:

  • SSH into app1
  • cd /var/www/node-hello-world
  • git pull the latest code
  • systemctl reload node-sample to restart the app

The major downside is that we have to do this on each server, making it a bit laborious. Using Ansible we can push our code out from the dev machine and properly reload the code.

Ansible tends to scare people. I think people assume it's similar to complicated tools like Chef and Puppet, but it's a lot closer to Fabric or Capistrano. It basically just SSHes into boxes and runs commands. No clients, no master server, no complicated cookbooks: just commands. It does have features that make it great at provisioning too, but you can use it just to deploy code if you wish.

Here’s the Ansible files needed if you’d like to deploy code like this:
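The Ansible files were linked as a gist in the original post. Here's a hedged sketch of what the inventory and playbook plausibly look like; the hostnames, paths, and repo URL are placeholders you'd replace with your own:

```text
# production: the inventory file (hostnames are placeholders)
[app]
app1.example.com
app2.example.com
```

```yaml
# deploy.yml: a sketch; replace project_repo with your repo URL
---
- hosts: app
  remote_user: web
  vars:
    project_root: /var/www/node-hello-world
    project_repo: "<your repo url>"
  tasks:
    - name: pull the latest code
      git: repo={{ project_repo }} dest={{ project_root }} version=master
      notify: reload node-sample
  handlers:
    - name: reload node-sample
      command: /usr/bin/systemctl reload node-sample
      sudo: yes
```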

Run it with the following from your dev machine (make sure you installed Ansible): ansible-playbook -i production deploy.yml

That production file is called an inventory file in Ansible. It simply lays out the hostnames of all the servers and their roles.

The yml file here is called a playbook. It defines the tasks to run. In this case, it gets the latest code from GitHub. If there are changes, it notifies a handler that reloads the app server. If there are no changes, that handler does not get called. If you wanted to also, say, install npm packages, you could do that here as well. By the way, make sure you use npm shrinkwrap if you don't check your packages into the repo.

Note: if you want to pull down a private git repo, you’ll need to set up SSH Agent Forwarding.

Ansible for Provisioning

Ideally, we would have the app server building part automated so we don’t have to go through these steps every time. For that we can use the following Ansible playbook to provision the app servers like we did manually before:
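That playbook was linked in the original post; a sketch of what automating the earlier manual steps might look like (module arguments follow era-appropriate Ansible syntax, and the unit-file source path is an assumption):

```yaml
# app.yml: provisions a Node.js app server (a sketch, not the original)
---
- hosts: app
  tasks:
    - name: install node and npm from yum
      yum: name={{ item }} state=present
      with_items:
        - nodejs
        - npm
    - name: install n for managing node versions
      command: npm install -g n creates=/usr/local/bin/n
    - name: install the latest stable node
      command: n stable
    - name: create the web user
      user: name=web system=yes createhome=yes
    - name: create the app directory
      file: path=/var/www state=directory owner=web group=web
    - name: install the systemd unit
      copy: src=node-sample.service dest=/etc/systemd/system/node-sample.service
    - name: enable and start the service
      service: name=node-sample enabled=yes state=started
```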

Run it like so: ansible-playbook -i [inventory file] app.yml.

Here is the same for the load balancer.

Final app

Here's the final result of all these steps. As mentioned, updating the inventory file and running the provision and deploy steps should build out a full app automatically.


Making other environments is easy. Simply add a new inventory file for each (e.g. ansible/staging) and start referencing it when calling ansible-playbook.


Test your setup. If for no other reason, it's really fun to try to find ways to knock your cluster offline. Use Siege in load-test mode. Try sending kill -9 to various processes. Knock a server offline. Send random signals to things. Run out of disk space. Just find things you can do to mess with your cluster, and ensure the availability percentage doesn't drop.

Improvements to be made

No production cluster is perfect, and this is no exception. I would feel pretty comfortable rolling this into production, but if I wanted to harden it further, here’s what I would do:

HAProxy Failover

Right now HAProxy (while stable) is a single point of failure (SPOF). We could change that with DNS failover. DNS failover is not instantaneous and would result in a few seconds of downtime while DNS propagates. I'm not really concerned about HAProxy itself failing, but I am concerned about human error when changing the LB config.

Rolling deploys

In case a deploy goes out that breaks the cluster, I would setup a rolling deploy in Ansible to slowly roll out changes, health checking along the way.

Dynamic inventory

I think others would rate this higher than I do. In this setup you have to commit the hostnames of the servers into the source code. You can configure Ansible to use dynamic inventory to query the hosts from Digital Ocean (or another provider). You could also use this to create new boxes. Really though, creating a server on Digital Ocean isn't the most difficult thing.

Centralized Logging

JSON logging is really the way to go since you can easily aggregate and search through the data. I would take a look at Bunyan for this.

It’d be nice if the logs for all of this were drained to one queryable location. Perhaps using something like Loggly, but there are lots of ways to do this.

Error Reporting and Monitoring

Again, there are lots of solutions for error reporting and monitoring. I haven't really liked any of the ones I've tried with Node, though, so I'm hesitant to suggest anything. Please post in the comments if there's a solution for either that you're a fan of.

For more tips, check out the awesome Joyent guide on running Node.js in production.

There you have it! This should make for a simple, stable Node.js cluster. Let me know if you have any tips on how to enhance it!


  Comments: 44

  1. Exactly the primer I needed. Thanks!

  2. jesus israel perales martinez

    this is a great article, Fedora rulez !!

  3. I’ve been using pm2 for node process management in production. It generates some nice init scripts and allows you to do some local “load balancing” by spinning up multiple instances of your program. It will also monitor your process and if it dies, pm2 will attempt a restart.

    The rest of my setup pretty much mimics your guide. Well said, sir!

  4. Why don’t you use pm2?

    • I tried it, it does way too much. I ended up in situations where I had multiple instances of the app running. Managing processes is better done by working directly with the process signals than trying to interact with a layer on top of it. Node cluster is really well designed so there’s no reason not to use it. As I show in this article, you can replace the use of it with ~50 loc that doesn’t even require altering your app code.

  5. pixelBender67

    awesome, I need to set up a VPS to host my node site

  6. Great article.
    Could you please let me know if the clustering using boot.js would work for a node.js, chat application where Redis is used as the store?


    • yes node cluster works with websockets too. The issue might be the 60 second timeout though, you probably don’t want to just kill your websocket connection. You could move it up to 60 minutes, but then deploys would take a long time. You’d probably want some way to reconnect the websocket on the client side in case it disconnects

  7. Great article, very easy to understand and follow. Thanks!

  8. Todd Werelius

    You might want to mention that ( as of today ) fedora on Digital Ocean uses firewalld that by default has all ports closed, the following command will open a port so your tutorial will work.

    firewall-cmd --zone=public --add-port=####/tcp

    The public zone will make it accessible to the net in general for the Load Balancer, to secure the other servers you will of course need to configure this all out ( sure wish digital ocean had a real private network, but hey starting $5 a month you can’t complain!)

  9. Excellent overview, clear, very well presented — Thank you!

  10. Victor Kurauchi

    Thanks for sharing ! Great article.

  11. Hey thanks for putting these steps together. This is excellent information.

  12. Doug Topalovic

    Hmm. For some reason I cannot get haproxy to forward any requests to the app servers. Getting 503. Any ideas?

  13. Victor Kurauchi

    Hey man, great article ! Thanks for sharing. I’ll be testing soon this environment in production. I had a doubt:

    To do that, I'll need 3 servers:
    – app1 server
    – app2 server
    – loadbalance server (using HAProxy to balance)

    ‘This script will run 2 instances of the app, restarting each one if it dies. It will also allow you to perform a zero-downtime restart by sending SIGHUP.’

    The boot.js file creates 2 instances of the app in each server ? So the result would be like:

    – app1 server (2 instances of the app)
    – app2 server (2 instances of the app)
    – loadblance server (using HAProxy to balance)

    Is that it ? Thanks!!

  14. chris_gunawardena

    Thanks for the great article, I wish this was there a few months ago. I ended up using heroku free hosting but hope to follow this for my next project.

  15. Thanks, great tutorial!

  16. Just another handy primer for node production..

  17. Thanks, Jeff,

    Excellent tutorial. A bit off topic, but perhaps you can help. I'm using third-party modules (express, e.g.) in my service and am having trouble getting them loaded when running under systemd. I can't seem to set the NODE_PATH.

    I tried Environment=NODE_PATH=/usr/local/lib/node_modules in my service definition, but found my first require() failing as it could not resolve the file name.

    I worked around it by beginning my service script with …

    process.env.NODE_PATH = '/usr/local/lib/node_modules';

    Any thoughts?

  18. How can I add an environment variable on the systemd config?
    You have NODE_ENV=production in there but what if I need to add more environment variables?

  19. Thanks for writing this informative blog.

    I am new to node.js

    After having tried out a REST API framework (Loopback) on Windows, I could get one application [REST Loopback service] associated with one running node instance [PID associated]. Starting another application on a different port starts another node process with a different PID.

    Is it the behavior of node.js that all app instances [REST APIs in my case] start as isolated processes?
    Or can we have multiple app instances running on a single node.js instance?

    Many Thanks,

  20. How do I set up an event trigger for each worker to do a "cleanup" when restarting?

  21. Great article ! Really easy to understand. Thank you for the effort 🙂

  22. Great Article! Thanks.

    Why would my Ansible deploy fail for the two hostnames?

    Do I need to setup ssh for these both when I have a private repo?

    • I suspect there is some trickery around the “SSH Agent Forwarding” & Ansible which isn’t in the tutorial …

      • Okay! Back again.

        I've managed to get this ssh thing from a private repo to work for the root user. (Need to progress to the web user.)

        So let’s say Jeff’s repo was a private bitbucket repo. If you were to ssh into it you might use the following address:

        To deploy to your digital ocean fedora server using ansible you:

        1. Need to generate a ssh-keygen for the root user on the digital ocean server.
        2. Need to add the public key to the bitbucket repo, which you can do under the settings for the individual repo; click on the deployment keys menu item and add the web user key, which you can get from cat ~/.ssh/ from within the Digital Ocean droplet
        3. In Dickey’s vars.yml file you would change the value for project_repo to
        4. In deploy.yml add accept_hostkey and key_file parameters to the git section. All on one line it would end up looking like:
        git: repo={{ project_repo }} version=master dest={{ project_root }} accept_hostkey=yes key_file=~/.ssh/id_rsa

        So yeah. I only got this going for root user, but managed to get a private repo to deploy from bitbucket using Dickey’s ansible setup

        The following was most helpful in figuring out this much.


    • have you tried using ssh keys?

  23. Great tutorial – just what I needed, thanks. I’ve struggled with Ansible before but this was just what I needed to get the hang of it.

    I tried it all out on Amazon AWS. Couple of things that threw me.
    – Amazon Linux doesn’t have nodejs by default so you need to add the EPEL repository. There are a few ways to do this – but I just added “yum-config-manager --enable epel” into my app.yml to fix that.
    – You also don’t get root on AWS, so you have to add remote_user: ec2-user and sudo: yes
    – Amazon Linux doesn’t use systemd, so I then switched to Fedora to confirm it all worked… it worked great apart from the fact that my Fedora has SELinux on by default and port 3000 is not open. I added this to my load balancer tasks:

    – name: open selinux for port 3000
    shell: semanage port --add --type http_port_t --proto tcp 3000

    and all fixed.

    The SELinux problem was especially hard to find as I couldn’t get haproxy logging. In the end I worked out how to get to the haproxy stats page at /haproxy?stats and that gave me some more clues (privilege problem on a port).

    Anyway -just posting this in case it’s useful for others following this.

  24. Nice article, there’s very little on deploying node apps out there. I’d like to see it taken a step further, however, since git pulling is no way to deploy apps, although many do this. The app really should be prepped and packaged and simply unpacked on the server, ready to run, through whatever package mechanism you like. And you can talk about this in an abstract way without specifically going into rpm or deb or msi or whatever. That would be really nice, as it is confusing how to package this stuff. Also in deployments a huge consideration is just where the heck do you install all the node packages? The system? the repo? What are the pitfalls of each, and so forth.




  27. Great information!!
    What is your opinion on using nginx instead of haproxy. I read some articles about serving static content with nginx

  28. Great high level overview – thanks

  29. Peter Hanneman

    PM2 + Nginx accomplishes the same effect with considerably less work. Very prone to restart loops bringing your server to its knees as well. Also your log configuration would saturate disk I/O under any real load…

  30. Thanks for your helpful article.

    I have couple of questions. I’m using Google Compute Engine to deploy my app.
    I’ve followed your suggestion of using a user “web”

    I’m sshing into the machine using gcloud ssh , which is allowing me to access using my username.

    then i sudo su web to run the node app

    I’m using pm2 to run the app under web user.

    seems like this variation is creating some issue …

    So what i am trying to achieve is as following: each user of my team would ssh using gcloud ssh
    applications will run under a separate user which is “web”

    what would be your suggestion to get this setup achieved ?


Your feedback