One of the most difficult challenges you may face as a software engineer is scaling your applications, whether manually or automatically. A few years ago this was a genuinely hard task; with Heroku, it no longer has to be.

Read on to see how we approached this problem at Typisch Touristik Cloud Platform, one of the products of Typisch Touristik GmbH, a large quality-software provider in Germany, in cooperation with Schmetterling International GmbH & Co. KG, where we work as contractors.

A bit about the business concepts

Typisch Touristik Cloud Platform is a stateless, scalable web application intended to be accessed from anywhere in the world by any individual, group, or tour agency. Customers can publish their travel packages, vacation rentals, and hotels, accept bookings from other customers, and, most importantly, delegate the configuration of the whole platform to the customers themselves.

Typisch Touristik Cloud Platform is composed of two main applications. The first is the Mid/Back office, which handles all the configuration of every customer's packages, payment methods, prices, seasons, and hotels, so our customers can edit their IBEs online with just a rich, cutting-edge HTML editor (see Figure 1 below).

Figure 1, Typisch Touristik Cloud Platform Mid/Backoffice (Sails + ExtJS)

Once the customer is done setting up the Mid/Backoffice, he can point his domain to our IBE, which automatically gets a new Heroku domain assigned with a brand-new IBE app running on a container (see Figure 2). Alternatively, he can simply keep his own existing website, built on any CMS, and use our API (see Figure 3).

Figure 2, Typisch Touristik Cloud Platform IBE
Figure 3, Customer custom WordPress IBE using the Typisch Touristik Cloud Platform API to fetch the data/book/pay online

At this point I have walked you through a bit of our business model. As you can see, we require that our app can be easily auto-scaled, so that in theory we can support millions of concurrent users. We have N clients, each with N customers of their own; therefore, our app can't be down for any reason.

So, enough talk, let’s get to the tech stack

  • Heroku (build, deliver, monitor, and scale apps)
  • Node.js
  • Sails.js (Node.js framework with an embedded REST and WebSocket API, the Waterline ORM, and much more; very simple, flexible, and extensible)
  • Socket.io (shipped within Sails.js)
  • GitHub
  • Azure Redis Cache server
  • Kue (a priority job queue backed by Redis, built for Node.js)
  • MongoLab for the MongoDB database
  • Express HTTP server
  • Throng (a one-liner for clustered Node.js apps: it runs X workers, respawns them if they go down, and correctly handles signals from the OS)
  • CircleCI for continuous integration and delivery
  • Mocha, Sinon, Chai, and Nightwatch with Selenium for the test suite
  • and other cool technologies/frameworks.

You may say, wow, that's a lot to take in, but I am sure most of you have tried some of these out at some point in your career.

The Sails app structure

To create your scalable app, you first need an existing Sails.js app, or you can create a new one.

A Sails.js app uses an MVC architecture and is therefore basically composed of:

Figure 4, Sails.js app structure
  • Models (class representations of database tables that, through a driver, can run queries against any database engine)
  • Controllers (containing all the logic for processing user requests)
  • Services (letting you move database queries and other heavy work out of the controllers)
  • and Views, which can be rendered with any configured templating engine, such as Swig, EJS, or Handlebars, and served/cached as static content via Express.

To give you an idea of our app structure, see Figure 4 for a screenshot of our folder structure.

The biggest advantage that Sails.js brings to the game is organization: it more or less forces you to respect that frame, although everything is configurable. Therefore, in the future you can change the complete app by just switching to a different config file or installing a new npm package, and your app remains the same.

Heroku

In Heroku, applications run in a collection of lightweight Linux containers called dynos. Heroku allows you to deploy your application in different environments completely for free, with logs, a database, hosting, and many other cool features, until your dyno goes to sleep. However, if you want to get serious, you will have to pay 😉

I am not going to cover all the details of getting your app onto Heroku, but it should be pretty simple: Heroku has complete integration with Git and allows you to import any repository at any time, as well as detect new changes and auto-deploy every time you merge your changes into master or the selected branch for your app in Heroku.

In Figure 5 below you can see how Heroku manages different environments of your app using different pipelines. Each pipeline is connected to a branch in Git and detects changes and automatically deploys if desired by the user.

Figure 5, Heroku pipelines.

At this point you should have an understanding of how Heroku manages apps and be ready for the next step.

From stateful to stateless

This is the biggest challenge, so please keep reading and you will find out how we dealt with it.

Before scaling with Heroku, you will need to convert your app from stateful to stateless, so that you can run it on different servers/processes/dynos/workers sharing the same session, database, and other resources.
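To make the problem concrete, here is a tiny illustration (ours, not from the app) of why in-process state breaks once you scale out:

```javascript
// Each dyno/cluster worker gets its own copy of module-level state, so any
// value kept in process memory diverges between processes.
function makeInMemoryCounter() {
  let count = 0; // lives only inside one process
  return () => ++count;
}

// Simulate two dynos, each with its own counter:
const dynoA = makeInMemoryCounter();
const dynoB = makeInMemoryCounter();
dynoA(); dynoA(); // dyno A has served 2 requests
dynoB();          // dyno B has served 1: the "request count" now depends on
                  // which dyno the load balancer picked. A shared store
                  // (Redis, MongoDB) is what keeps processes consistent.
```

The same reasoning applies to sessions, caches, and queues: anything the processes must agree on has to live outside the process.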

First, our app needs different processes

You may skip this step if your app does not need to run cron tasks or any heavy processing that may hurt the performance of your web application. But if it does, then you need to consider splitting your app into small sub-apps: Clock, Worker, and Web.

Heroku has a very cool feature for this called "The Process Model", which allows you to separate your app into small sub-processes so that your web app's performance is not affected by a heavy cron task running in the background. Heroku's engineers say that your apps should not be executed as monolithic entities; instead, you should run them as one or more lightweight processes. During development this can be a single process launched via the command line. In production, you can run as many processes as you wish (see Figure 6).

Figure 6, Heroku process type diagrams

In our case, our app is configured to run three different scalable process types: 2+ workers, 1 clock, and 2 web processes, plus one additional release process that runs only on every release to make changes in the database, send deployment emails, etc. Below is the Procfile configuration file in our app that tells Heroku how to deploy our processes.

release: yarn run after-release
web: yarn run start
worker: yarn run worker
clock: yarn run clock 

Therefore, we need four differently configured Sails apps to separate the concerns of each process. See Figure 7 for details on how the processes communicate with each other.

  1. release: The release phase enables us to run certain tasks before a new release of your app is deployed. It can be useful for tasks such as:
    – Sending CSS, JS, and other assets from your app’s slug to a CDN or S3 bucket
    – Running database schema migrations
  2. web: Runs a Sails application with only the web process, no cron tasks or workers. Any heavy work that the app needs to do is passed on to the worker.
  3. worker: The worker is also a Sails app, but without the main hooks: only the database connection, with no HTTP, no pub/sub mechanism, no cronjobs hook, etc.
    The worker takes the jobs that have been sent to it for processing through the Redis cache server as Kue jobs, does the heavy work, and responds to the requester by changing the job status to “complete”.
  4. clock: The clock.js app loads a Sails application without any hooks: no database connection, no HTTP, no web app, no views, except the configured cron tasks and Kue with a valid connection to the Redis server.
Figure 7, Process communicating through Redis using Kue (abstract)
// clock.js
process.chdir(__dirname);

const _ = require('lodash');

// Ensure `sails` can be located:
(function () {
  var sails;
  try {
    sails = require('sails');
  } catch (e) {
    console.error(e);
    return;
  }

  // Try to get the `rc` dependency
  var rc;
  try {
    rc = require('rc');
  } catch (e0) {
    try {
      rc = require('sails/node_modules/rc');
    } catch (e1) {
      console.error('Could not find dependency: `rc`.');
      console.error('Your `.sailsrc` file(s) will be ignored.');
      console.error('To resolve this, run:');
      console.error('npm install rc --save');
      rc = function () {
        return {};
      };
    }
  }

  // Load the default Sails configuration for the configured environment
  let defaultConfig = rc('sails');
  let config = _.merge(defaultConfig, {
    bootstrapTimeout: 10000,

    generators: {
      modules: {}
    },

    // As we only want to run the clock, we lift Sails with only the
    // cronjobs hook enabled: no views, no http, nothing else, just a clock
    hooks: {
      cronjobs: true,
      blueprints: false,
      controllers: false,
      endpoints: false,
      cors: false,
      csrf: false,
      grunt: false,
      http: false,
      i18n: false,
      logger: false,
      policies: false,
      pubsub: false,
      request: false,
      responses: false,
      session: false,
      sockets: false,
      views: false
    }
  });

  // Load Sails with the clock environment
  sails.load(config, function (err) {
    if (err) {
      sails.log.error('Error, could not load clock.js');
      sails.log.error(err);
      return;
    }

    sails.emit('loaded');
    sails.log.info('Sails running from clock.js to send cron tasks via Redis to worker.js');
  });
})();

You can also see an example of one of our cron tasks, in the file updateFollowups.js:

/**
 * @class cron.updateFollowups
 */

// With this configuration, clock.js publishes a job every 2 minutes to Kue,
// so worker.js can process it
exports.schedule = '*/2 * * * *';
exports.report = false;

// Method executed by worker.js once a job of type updateFollowups arrives
exports.process = async () => {
  let followups = await Followups.find({
    date: {'<': new Date()},
    sent: false
  }).populate('user').populate('createdBy');

  return Promise.all(_.map(followups, exports.sendFollowupEmail));
};
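Under the hood, all the cronjobs hook has to do for a task like this is publish a Kue job on each tick of the schedule, rather than running the task in-process. A minimal sketch of that idea (the publishCronJob helper is ours, for illustration; the real hook's internals are not shown in this post):

```javascript
// Illustrative helper (ours): publish a `cron` job to Kue on a schedule
// tick; worker.js picks it up by job type and task name.
function publishCronJob(queue, taskName) {
  return queue.create('cron', { name: taskName })
    .removeOnComplete(true) // don't keep completed jobs around in Redis
    .save();
}
```

With `queue` being the Kue queue, worker.js then receives `{ name: 'updateFollowups' }` as `job.data` and looks the task up in `sails.config.cron`.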

Then the worker.js takes care of tasks coming from Kue:

/**
 * worker.js
 *
 * Use `worker.js` to run your app without `sails lift`.
 * To start the worker, run: `node worker.js`.
 *
 * This is handy in situations where the sails CLI is not relevant or useful.
 */

// Ensure we're in the project directory, so relative paths work as expected
// no matter where we actually lift from.
process.chdir(__dirname);

const throng = require('throng');
const os = require('os');
const _ = require('lodash');
const WORKERS = process.env.WEB_CONCURRENCY || os.cpus().length;

// Ensure `sails` can be located:
function boot() {
  var sails;
  try {
    sails = require('sails');
  } catch (e) {
    console.error(e);
    return;
  }

  // Try to get the `rc` dependency
  var rc;
  try {
    rc = require('rc');
  } catch (e0) {
    try {
      rc = require('sails/node_modules/rc');
    } catch (e1) {
      console.error('Could not find dependency: `rc`.');
      console.error('Your `.sailsrc` file(s) will be ignored.');
      console.error('To resolve this, run:');
      console.error('npm install rc --save');
      rc = function () {
        return {};
      };
    }
  }

  let defaultConfig = rc('sails');

  // @todo: as we scale up in number of workers, reduce the number of concurrent jobs
  const CONCURRENT_JOBS = process.env.CONCURRENT_JOBS || 3; // process up to 3 concurrent jobs

  let config = _.merge(defaultConfig, {
    bootstrapTimeout: 20000,

    // Indicates to Sails that this is the worker, so we can skip some things at runtime
    isWorker: true,

    generators: {
      modules: {}
    },

    hooks: {
      // No cronjobs hook
      cronjobs: false,
      blueprints: false,
      controllers: false,
      endpoints: false,
      cors: false,
      csrf: false,
      grunt: false,
      logger: false,
      policies: false,
      pubsub: false,
      request: false,
      responses: false,
      session: false,
      sockets: false
    }
  });

  // Load Sails with the worker environment
  sails.load(config, function (err) {
    if (err) {
      sails.log.error('Error, could not load worker.js');
      sails.log.error(err);
      return;
    }

    sails.log.info('Sails running from worker.js to receive and run cron tasks and other events incoming from Kue');

    QueueService.getQueue().process('cron', CONCURRENT_JOBS, async (job, done) => {
      let data = job.data || {};
      let cronTask = sails.config.cron[data.name] || {};
      let report = true === cronTask.report;

      if (job.state() === 'active' && _.isFunction(cronTask.process)) {
        report && sails.log.debug(`Processing cron task '${data.name}' with job id: '${job.id}'`);
        try {
          await cronTask.process(job);
        } catch (err) {
          sails.log.error(`Error found while processing job '${data.name}' with id: '${job.id}'`);
          sails.log.error(err);
        }
        // Mark the job complete
        job.complete();
        done();
      } else {
        done();
      }
    });
  });
}

// Run multiple workers per app using throng
throng({
  workers: WORKERS,
  lifetime: Infinity
}, boot);
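The QueueService used above is not shown in this post; a minimal sketch of what such a service could look like (makeQueueService and its parameters are ours, for illustration; in the real app, createQueue would be kue.createQueue and config would come from sails.config.connections.kue):

```javascript
// Hypothetical sketch (ours): create a single Kue queue per process, lazily,
// configured with the environment's Redis URL and job-type prefix.
function makeQueueService(createQueue, config) {
  let queue = null;
  return {
    getQueue: function () {
      if (!queue) {
        queue = createQueue({ prefix: config.prefix, redis: config.redisUrl });
      }
      return queue; // every caller in this process shares the same queue
    }
  };
}
```

Keeping one queue per process matters: each Kue queue holds its own Redis connections, so creating a new one per job would leak connections.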

It is important to point out that all these processes load the same Sails application but require different hooks, improving performance while still using all the capabilities of the framework, like access to the ORM, services, controllers, etc.

Next, for every process in Heroku we need a dyno. The basic dynos will initially be enough, but if your application grows in number of users, this may require an auto-scale strategy; how to auto-scale Heroku apps is a topic for a blog post of its own.

Figure 8, Heroku Dynos per process distribution
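For reference, a dyno formation like the one in Figure 8 can be set from the Heroku CLI (the app name and dyno counts below are illustrative, not ours):

```shell
# Scale each process type declared in the Procfile (counts are examples;
# the release process runs on its own at deploy time and is not scaled)
heroku ps:scale web=2 worker=2 clock=1 --app your-app-name
```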

Last but not least, scale the web process

To use Heroku containers to the fullest, you can include the Throng library so that it runs your web app clustered:

const throng = require('throng');
const os = require('os');

// Specify how many times your app should be lifted
const WORKERS = process.env.WEB_CONCURRENCY || os.cpus().length;

throng({
  workers: WORKERS,
  lifetime: Infinity
}, liftSailsApp);

function liftSailsApp() {
  // Lift your app normally
}

Make the rest of the app stateless

Take, for example, the following configuration file for the staging environment. You can see that the session is stored in MongoDB, and that Sails socket messages (whether via long polling or WebSockets) are relayed through Redis, so we can notify any connected client from the server, even when the server is spawned across multiple worker processes.

/**
 * Staging environment settings
 */

module.exports = {

  port: process.env.PORT || 1337,
  skipFixtures: false,
  liftTimeout: 12000,

  orm: {
    _hookTimeout: 50000
  },

  // For the web process we don't want to run cronjobs
  hooks: {
    cronjobs: false
  },

  sockets: {
    adapter: 'socket.io-redis',
    url: process.env.REDIS_URL,
    transports: ['websocket'],
    _hookTimeout: 50000
  },

  models: {
    migrate: 'safe'
  },

  // Tell Sails to lift session data from a clustered MongoDB database
  session: {
    adapter: 'connect-mongo',
    url: process.env.MONGODB_URI,
    collection: 'sessions',
    saveUninitialized: false
  },

  connections: {
    // Setup your MongoDB cluster url
    mongodb: {
      adapter: 'sails-mongo',
      url: process.env.MONGODB_URI
    },

    // And initialize Kue with the redisUrl and the prefix for the given
    // environment; note the prefix is added to every job type
    kue: {
      prefix: 'staging',
      redisUrl: process.env.REDIS_URL
    }
  }
};

Give it a try

Create all the required Heroku settings or environment variables on your local PC and run the app:

set WEB_CONCURRENCY=4
yarn run app
Figure 9, Web Process running locally in 4 different threads.

Note that on Heroku this will run your app that many times over (4 threads * dyno count), so you can multiply your app easily.

Note: with the provided Procfile, Heroku will lift all the configured processes per dyno for you on every deployment.

Figure 10, worker.js app running locally to process events incoming from Redis through Kue
Figure 11, clock.js running to send cron tasks to the worker using the configured Redis server through Kue

Conclusions

With this document I hope to have cleared up the doubts of the many developers who may be struggling to set up a stateless, scalable application with Heroku and Sails in an easy way.

Looking forward to hearing your comments, suggestions, questions, and feedback.

Thanks for reading!
