How to add Readiness Checks for Google App Engine on Django

23 Jun 2020

Readiness Checks

A while ago I was tasked with adding readiness checks for a Django app. This wasn't as straightforward as I had hoped, so here's a tutorial for the next person who has to do this.

The very first thing you need to do is add a new middleware, say HealthCheckMiddleware, to your Django app. Exposing an API through a middleware rather than a view might seem strange at first, but it's important that you do this, because your readiness check must be executed before anything else, including Django's built-in middleware chain. The skeleton for the middleware will look like:

class HealthCheckMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        return self.get_response(request)

In your settings.py, make sure this middleware is executed before everything else by adding it to the top of the MIDDLEWARE list:

MIDDLEWARE = [
    'path.to.HealthCheckMiddleware',
    'django.middleware.common.CommonMiddleware',
    # ... the rest of your middleware
]
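To see why the position in the list matters, here is a framework-free sketch of how a middleware chain short-circuits. The names (health_check_middleware, logging_middleware, the dict-based request) are made up for illustration and are not Django APIs; the assumption is only that the first entry in the list wraps everything after it, which is how Django nests its middleware.

```python
calls = []  # records which layers actually ran


def view(request):
    calls.append("view")
    return "200 from view"


def logging_middleware(get_response):
    # Hypothetical stand-in for any middleware further down the list.
    def middleware(request):
        calls.append("logging")
        return get_response(request)
    return middleware


def health_check_middleware(get_response):
    def middleware(request):
        calls.append("health_check")
        if request["path"] == "/readiness_check/":
            # Answer right here; nothing below this layer runs.
            return "200 from health check"
        return get_response(request)
    return middleware


# First in the list means outermost, so it is applied last when wrapping.
handler = health_check_middleware(logging_middleware(view))

print(handler({"path": "/readiness_check/"}), calls)
calls.clear()
print(handler({"path": "/home/"}), calls)
```

For the readiness path only the health check layer runs, while any other path passes through every layer down to the view.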

The next step is to write code to handle the request itself in the middleware. Readiness usually means that your app is up and can connect to the database. Since this API is going to be called pretty frequently by App Engine, we need to make sure the operation is as light as possible:

from django.http import HttpResponse  # goes at the top of the middleware's module

def __call__(self, request, conn=None):  # conn lets tests pass in a custom db connection to simulate failures
    if request.method == "GET":
        if request.path == "/readiness_check/":
            try:  # try to connect to the database
                if not conn:
                    from django.db import connection as conn
                cursor = conn.cursor()
                cursor.execute("SELECT 1;")
                row = cursor.fetchone()
                if row is None:  # query ran but returned nothing, treat as not ready
                    return HttpResponse(status=503)
                return HttpResponse(status=200)  # No errors, return 200 OK

            except Exception:  # Any error while connecting, return 503
                return HttpResponse(status=503)

    # Not a readiness check: hand the request to the rest of the chain
    return self.get_response(request)
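If you want to poke at the database probe on its own, here is a minimal sketch of the same "SELECT 1" logic pulled into a plain function. The function name database_is_ready and the BrokenConnection class are hypothetical, and sqlite3 is used only as a stand-in so the snippet runs without Django or PostgreSQL.

```python
import sqlite3


def database_is_ready(conn):
    """Return True if a trivial query succeeds on conn, False otherwise."""
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT 1;")
        return cursor.fetchone() is not None
    except Exception:
        # Any failure to connect or query counts as "not ready".
        return False


class BrokenConnection:
    """Hypothetical stand-in for a connection whose database is down."""

    def cursor(self):
        raise RuntimeError("database unavailable")


print(database_is_ready(sqlite3.connect(":memory:")))
print(database_is_ready(BrokenConnection()))
```

The middleware does the same thing, just mapping the two outcomes to 200 and 503 responses.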

The final step is to tell Google App Engine how and when to call this API. Add the following to your app.yaml:

readiness_check:
  path: "/readiness_check/"
  check_interval_sec: 5
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
  app_start_timeout_sec: 300

You should consider tweaking these values. For an explanation of what they mean, check out the readiness_check section of the App Engine app.yaml reference docs.
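As a rough feel for what these numbers imply, here is a back-of-the-envelope calculation. It assumes the thresholds count consecutive checks spaced check_interval_sec apart, which is my reading of the docs rather than a guarantee, and it ignores where in the interval a failure starts.

```python
# Values copied from the app.yaml snippet in this post.
check_interval_sec = 5
failure_threshold = 2
success_threshold = 2

# Consecutive failed checks needed before the instance stops receiving traffic.
seconds_to_unready = failure_threshold * check_interval_sec

# Consecutive passing checks needed before it receives traffic again.
seconds_to_ready = success_threshold * check_interval_sec

print(seconds_to_unready, seconds_to_ready)  # 10 10
```

Under those assumptions, a broken database connection is noticed after roughly 10 seconds, and a recovered instance is back in rotation about 10 seconds after it starts passing again.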

Unit Tests

Nothing is complete without unit tests. We'll add two: one that checks the API returns 200 OK when the database connection succeeds, and one that checks a 503 is returned when the connection fails. Simulating a broken PostgreSQL connection in Django was a little tricky: I couldn't find documentation online, and tools like destroy_test_db make the test run fall over because Django thinks something has gone wrong. I was stuck on this for quite a while until my mentor suggested passing a database connection object to the middleware, so that it uses that instead of the default connection object. Since we now control which connection the middleware uses, we can pass it a mock database object (and break it using a side_effect). Searching for mocks also turned up far more useful Google results than what I had been searching for (and struggling with) before.
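If side_effect is new to you, here is a tiny standalone sketch of the idea using only unittest.mock from the standard library. The variable names are illustrative; the point is that a Mock can be told to raise when one of its methods is called, or to hand back a canned row.

```python
from unittest.mock import Mock

# A "broken" connection: calling cursor() raises instead of returning a cursor.
broken = Mock()
broken.cursor.side_effect = Exception("Random Database Connection Error")

try:
    broken.cursor()
    raised = False
except Exception as exc:
    raised = True
    print(exc)

# A "healthy" connection: cursor().fetchone() returns a canned row.
healthy = Mock()
healthy.cursor.return_value.fetchone.return_value = (1,)
row = healthy.cursor().fetchone()
print(row)
```

Passing an object like broken as the conn argument is what drives the middleware into its except branch without touching a real database.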

Add the following code to your tests.py:

import psycopg2
from unittest.mock import patch

from django.test import RequestFactory, TestCase

from path.to import HealthCheckMiddleware  # adjust to wherever your middleware lives


class HealthCheckMiddlewareTestCase(TestCase):
    readiness_url = 'https://your-domain.com/readiness_check/'

    def test_readiness_okay(self):
        # Uses the real test database connection
        request = RequestFactory().get(self.readiness_url)
        health_check = HealthCheckMiddleware(None)
        response = health_check(request)
        self.assertEqual(response.status_code, 200)

    def test_readiness_failure(self):
        with patch.object(psycopg2, 'connect') as connect_method:
            # The mock doubles as the connection; calling cursor() on it raises
            connect_method.cursor.side_effect = Exception(
                'Random Database Connection Error')

            request = RequestFactory().get(self.readiness_url)
            health_check = HealthCheckMiddleware(None)
            response = health_check(request, connect_method)
            self.assertEqual(response.status_code, 503)

And that's it! You can check that it works by running python manage.py test tests.HealthCheckMiddlewareTestCase. You can now rest assured that App Engine will route traffic to your instance only when it can successfully connect to your database!