Introduction to request validation in Express.js

Introduction

In this guide we'll cover what request validation is and why it's needed, then create a few simple API endpoints which validate a number of parameters. The files and code are structured to keep logic isolated, which makes them easier to maintain.

Assumptions

  • Familiarity with JavaScript and/or TypeScript
  • Familiarity with Node.js and/or Express.js beneficial but not required
  • Familiarity with RESTful APIs and the HTTP request/response lifecycle beneficial but not required
  • If following along with the steps, an Express.js application already configured to listen for requests

What is request validation?

When an HTTP request hits an API, the server compares the requested resource with the routes defined in the web application's route table. Routing typically works by matching the request's method (GET, POST etc.) and URL path against the routes defined in this configuration. Any route parameters, query strings, or request body objects are then handed to whichever endpoint matches.

In Express applications, unlike strongly-typed frameworks like .NET where model binding happens automatically, things like parameters or JSON objects aren't type-validated out of the box, so if a request matches the rough structure of a defined route, then the endpoint assigned to that route will be executed. This behaviour can be problematic if the values provided in the request aren't of the correct type, or in the expected format to successfully process the request, as they will often throw an error further down the line.

Request validation helps ensure that these sorts of errors are identified as early as possible in the request lifecycle. Take, for example, the following endpoint signature for a GET request, which will return a user based on their ID:

// @desc Retrieve a single user.
// @route GET /api/user/:id
// @access Private
async function getUserById(req: Request, res: Response) {
  const userId = req.params.id as string

  // ...
}

Although I'll be writing my examples in TypeScript, it's all transpiled into plain JavaScript before it runs, and the type annotations are erased in the process. The API's consumer has no knowledge of the type expected for userId, and won't know the format is wrong until an error occurs.

await fetch('https://example.com/api/user/1')

Will this fetch request work? The endpoint above will be hit because the URL matches the route /api/user/:id, so technically yes, it will work. However, route parameters always arrive as strings, so the value received is the string "1", and the "as string" assertion is a compile-time hint which checks nothing at runtime. If the database expects a different kind of identifier, there's a good chance the query will error. Additionally, although userId is identified as a string in the function body, what sort of string will it be? An assortment of letters and numbers, or a UUID? These are the types of assertions we can define with request validation middleware.
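To make the question concrete, here is a minimal sketch of the kind of assertion a validator makes. The isUuid helper and its regular expression are my own illustrative assumptions rather than part of Express.js or any library; they only show that a perfectly valid string can still be in the wrong format.

```typescript
// Hypothetical helper: checks the canonical 8-4-4-4-12 hexadecimal UUID layout.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i

function isUuid(value: string): boolean {
  return UUID_PATTERN.test(value)
}

// A route parameter taken from /api/user/1 is the string '1':
// the right type, but not the expected format.
const idFromUrl: string = '1'
console.log(typeof idFromUrl, isUuid(idFromUrl))

// A well-formed identifier passes the same check.
console.log(isUuid('3f2b8c1e-9d4a-4e6b-8f1a-2c5d7e9b0a34'))
```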

Request validation should be restricted to simple checks, avoiding any complex logic which would normally be abstracted into a business logic layer or separate piece of middleware. These checks may include:

  • defining required fields
  • asserting the data type is correct
  • asserting values are of a predefined length
  • asserting values fall within an acceptable range of options
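As a rough illustration, the four kinds of check listed above could be hand-written as follows. The checkInput function, the field names and the ALLOWED_ROLES list are hypothetical and exist only for this sketch; a validation library expresses the same ideas with far less code.

```typescript
// Hypothetical set of allowed values, used only for this illustration.
const ALLOWED_ROLES = ['USER', 'ADMIN']

function checkInput(input: { name?: unknown; role?: unknown }): string[] {
  const errors: string[] = []
  const { name, role } = input

  if (name === undefined || name === null || name === '') {
    errors.push('name is required') // required field
  } else if (typeof name !== 'string') {
    errors.push('name must be a string') // data type
  } else if (name.length > 100) {
    errors.push('name cannot exceed 100 characters') // length
  }

  if (typeof role !== 'string' || ALLOWED_ROLES.indexOf(role) === -1) {
    errors.push('role must be one of: ' + ALLOWED_ROLES.join(', ')) // range of options
  }

  return errors
}

console.log(checkInput({ name: 'Joe', role: 'USER' }))
console.log(checkInput({ role: 'GUEST' }))
```

Each branch is a simple, self-contained assertion about the shape of the data, with no knowledge of business rules.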

Why do we need request validation?

As mentioned in the previous section, one of the primary reasons for request validation is to catch potential type-related errors before they occur, but there are a few more notable benefits, such as:

  • improved user experience
  • preventing invalid function execution
  • improved response times
  • heightened security

User experience

If an error is raised by a database or ORM because the query referenced a value of the wrong type or format, this error will bubble up to the consumer unless it is caught and handled, and will typically surface as an HTTP 500 Internal Server Error, or similar. This results in a poor user experience because the user doesn't know what the problem is, or how to remedy it. It also implies that there's something wrong with the application, which could impact consumer confidence and brand reputation.

However, if the request was validated before reaching the data access layer, the user would see a 400 Bad Request, ideally with information explaining what the problem is. Even without additional context, it's clear from the status code that the issue lies with the data provided in the request, rather than a fault in the application.
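The difference can be sketched without any framework at all. Everything in this example is a hypothetical stand-in: findUser plays the part of a data access layer which throws on a malformed ID, and the two handler functions simply return the status code a consumer would see.

```typescript
interface SketchResult {
  status: number
  message: string
}

// Loose, illustrative ID format check (36 hexadecimal characters and hyphens).
const ID_PATTERN = /^[0-9a-f-]{36}$/i

// Stand-in for a data access layer which throws on a malformed ID.
function findUser(id: string): string {
  if (!ID_PATTERN.test(id)) {
    throw new Error('invalid input syntax for type uuid')
  }
  return 'user ' + id
}

// No validation: the low-level failure surfaces as a generic server error.
function handleUnvalidated(id: string): SketchResult {
  try {
    return { status: 200, message: findUser(id) }
  } catch {
    return { status: 500, message: 'Internal Server Error' }
  }
}

// Validation first: the consumer is told what needs fixing.
function handleValidated(id: string): SketchResult {
  if (!ID_PATTERN.test(id)) {
    return { status: 400, message: 'id must be a UUID' }
  }
  return handleUnvalidated(id)
}

console.log(handleUnvalidated('1'))
console.log(handleValidated('1'))
```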

Efficiency

The previous section identified issues whereby poor or insufficient request validation may result in errors as the request works its way through the application layers, but this also presents a problem with efficiency.

If a request which we know won't provide the expected result makes its way through one or more service layers, a data access layer, or to external cloud services, we've wasted time processing the request, opening and closing connections between services, along with any associated physical resources in the form of server load on CPU and RAM, all of which could have been prevented before any execution took place.

Performance

Controllers and services within an application typically make use of packages and other services, each of which needs to be imported or instantiated. Since validation is performed prior to a request reaching a controller, an error response is returned almost immediately because no time or resources are expended constructing the objects required by the controller. On the other hand, if a request needs to work its way through numerous layers of architecture before realising there's a fault, it will take longer to process, and in high volume systems, put unnecessary strain on the resources available.

Security

As the old adage goes, "never trust user input". Any value which comes from a consumer should be validated not just to prevent genuine mistakes and handle unexpected data, but also to prevent or reduce the threat of malicious intent.

Common attack vectors which originate from improperly validated user input include cross-site scripting (XSS) and SQL injection. Ensuring all inputs are validated for things like data types and text content helps sanitise the inputs and reduces the likelihood of these attacks succeeding.
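As a small example of sanitising text content, the sketch below replaces the characters which are significant in HTML with their entity equivalents, so that a submitted script tag is stored and displayed as inert text. The escapeHtml helper is a hand-rolled assumption written for this article, not a library function.

```typescript
// Characters which have special meaning in HTML, mapped to safe entities.
const HTML_ESCAPES: Record<string, string> = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
}

// Hypothetical helper: returns the input with HTML-significant characters escaped.
function escapeHtml(value: string): string {
  return value.replace(/[&<>"']/g, (char) => HTML_ESCAPES[char] ?? char)
}

console.log(escapeHtml('<script>alert("x")</script>'))
```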

A practical example using express-validator

In the following sections we'll go through the few simple steps required to configure request validation for a single resource type, using a directory and coding structure which aims to help maintain a logical separation of concerns.

Note: This section will only cover the steps required to add validation to the request pipeline. Anything else, such as Express.js or TypeScript configuration, is beyond the scope of this article. There are plenty of resources available online which cover these topics in detail already, examples of which can be found in the Resources section in the sidebar.

Define the project structure

Install the required packages

npm i express-validator

and create a folder structure to isolate controllers, routes, middleware and validators.

Example directory structure for request validation

Note: the express-validator package can only be used to validate properties contained within the Express.js Request object. If you use a custom request type which includes additional properties, for example a user object, these cannot be validated using this library. However, this type of validation should ideally be performed either by separate middleware which handles authentication and authorisation logic, or a service layer which handles business logic.

Add a controller

The resource we'll be using in this example will be /user. We will have two simple methods: a GET endpoint which retrieves a single user by their ID, and a POST endpoint which creates a new user. Begin by adding a new file called userController.ts in the controllers directory, and adding the following functions, the first of which you may recognise from the beginning of the article.

import { Request, Response } from 'express'

async function getUserById(req: Request, res: Response) {
  const userId = req.params.id as string

  console.log('Hit endpoint for getUserById')
}

async function createUser(req: Request, res: Response) {
  const { name, email, dateOfBirth, role, comments } = req.body;

  console.log('Hit endpoint for createUser')
}

export {
  getUserById,
  createUser,
}

These functions accept the request and response arguments typical for an Express.js endpoint. The former assumes the presence of a route parameter named id, and the latter a selection of properties formatted as a JSON body. Since we're not including any unnecessary logic in this example, we'll just write to the console to indicate when each endpoint is hit.

Add request validation rules

Now for the "fun" bit. The methods available with express-validator allow you to define rules in a fluent manner, making them fairly simple to read and understand. All parts of a standard Express.js request can be validated, including:

  • body
  • cookies
  • headers
  • query
  • params

Additionally, the check function looks for the specified keys across all of these locations, which is useful if, for whatever reason, the location can vary, or is unknown ahead of time.

For each method which requires validation, create an array to group the fields which form the request data. These arrays live in a file called userValidators.ts in the middleware/validators directory, which is the path the route file imports from later on. For our GET method, all we need is the user's ID from the route parameters. The POST request requires several fields from the request's body, some of which may need to be validated for numerous conditions.

The following example makes use of several of the more common validation options. There are plenty more available, which are detailed in the package documentation, but the validators used here are summarised below:

import { body, param } from 'express-validator'

const getUserByIdValidator = [
    param('id')
        .isUUID()
        .withMessage('Valid user ID is required (expects: UUID)'),
]

const createUserValidator = [
    body('name')
        .trim()
        .notEmpty().withMessage('Name is required')
        .isAlpha().withMessage('Name can only contain letters')
        .isLength({ max: 100 }).withMessage('Name cannot exceed 100 characters'),

    body('email')
        .isEmail().withMessage('Valid email is required')
        .isLength({ max: 320 }).withMessage('Email cannot exceed 320 characters')
        .normalizeEmail(),

    body('dateOfBirth')
        .isISO8601({ strict: true })
        .withMessage('Valid date of birth is required (expected: ISO8601 yyyy-MM-dd)')
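        .custom((value) => {
            // Latest date of birth which still makes the user 21 today
            const cutoff = new Date()
            cutoff.setFullYear(cutoff.getFullYear() - 21)
            return new Date(value).getTime() <= cutoff.getTime()
        })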
        .withMessage('User must be at least 21 years of age'),

    body('role')
        .isIn(["USER", "MANAGER", "ADMIN", "SYSTEM_ADMIN"])
        .withMessage('Valid role is required'),

    body('comments')
        .optional()
        .isLength({ max: 500 })
        .withMessage('Comments cannot exceed 500 characters in length'),
]

export {
    getUserByIdValidator,
    createUserValidator,
}

 

  • id:
    • must be a valid UUID, so any other value, such as an integer or an arbitrary string, will fail validation
  • name:
    • treats whitespace as invalid by trimming the value before checking for empty
    • accepts only alphabetical characters, thus excluding digits, symbols and other special characters (by default this also excludes spaces)
    • is limited in character length to match a database column's limit to avoid truncating data
  • email:
    • matches the standardised email format
    • cannot exceed the maximum standardised allowable length for an email address
    • manipulates the address by normalising the input, which helps avoid duplicates caused by differences in casing (e.g. joe.bloggs@acme.com, Joe.Bloggs@acme.com)
  • dateOfBirth:
    • ensures the date format matches the ISO8601 standard
    • implements a custom validator function to determine if the user is at least 21 years of age
  • role:
    • ensures the value matches an option from a pre-defined range of values
  • comments:
    • identifies the field as optional, but if populated with a value, limited to 500 characters
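The age rule is the easiest of these to get subtly wrong, because it compares dates rather than numbers. A minimal standalone sketch of the comparison is shown below. The isAtLeastAge helper and its parameters are hypothetical names of my own, and the optional today argument exists purely so the logic can be checked against a fixed date.

```typescript
// Hypothetical helper: true when the ISO 8601 date of birth is at least
// minimumAge years before `today`.
function isAtLeastAge(isoDate: string, minimumAge: number, today: Date = new Date()): boolean {
  const birth = new Date(isoDate)
  if (isNaN(birth.getTime())) {
    return false // not a parseable date
  }

  // Latest date of birth which still satisfies the minimum age.
  const cutoff = new Date(today.getTime())
  cutoff.setUTCFullYear(cutoff.getUTCFullYear() - minimumAge)

  return birth.getTime() <= cutoff.getTime()
}

const fixedToday = new Date('2021-01-15T12:00:00Z')
console.log(isAtLeastAge('2000-01-15', 21, fixedToday)) // 21st birthday is today
console.log(isAtLeastAge('2000-01-16', 21, fixedToday)) // 21st birthday is tomorrow
```

Comparing full dates, rather than just the year, avoids treating someone as old enough months before their birthday.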

Add request validation middleware

The rules defined in the previous section are just that, definitions. They don't do anything on their own, so we need to add some middleware which collects any validation errors and generates a response when it finds them.

In the middleware directory, create a new file called requestValidationMiddleware.ts with the following code:

import { NextFunction, Request, Response } from 'express'
import { validationResult } from 'express-validator'

export default function requestValidationMiddleware(req: Request, res: Response, next: NextFunction) {
    const errors = validationResult(req)

    if (!errors.isEmpty()) {
        return res.status(400).json({ errors: errors.array() })
    }

    next()
}

If you have experience with Express.js middleware then the signature of this function will look familiar. If not, I recommend having a quick look at the official documentation, which is referenced in the Resources section.

The logic contained within the function creates an object of type Result<ValidationError>, which is populated based on the outcome of the rules defined in the previous section, and returns a 400 Bad Request to the consumer if any errors are found. The response body contains an array of details which helps identify the affected fields, an example of which may look something like this:

{
  "errors": [
    {
      "type": "field",
      "value": "",
      "msg": "Valid email is required",
      "path": "email",
      "location": "body"
    }
  ]
}
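A consumer will often want these errors grouped by field, for example to display messages next to form inputs. The sketch below shows one way of doing that. The FieldErrorSketch interface is an assumption which simply mirrors the properties visible in the example response above, and groupMessagesByField is a hypothetical helper rather than part of the library.

```typescript
// Assumed shape, mirroring the properties shown in the example response.
interface FieldErrorSketch {
  type: string
  value: unknown
  msg: string
  path: string
  location: string
}

// Hypothetical helper: collects the messages for each field path.
function groupMessagesByField(errors: FieldErrorSketch[]): Record<string, string[]> {
  const grouped: Record<string, string[]> = {}
  for (const error of errors) {
    const messages = grouped[error.path] ?? []
    messages.push(error.msg)
    grouped[error.path] = messages
  }
  return grouped
}

const sampleErrors: FieldErrorSketch[] = [
  { type: 'field', value: '', msg: 'Valid email is required', path: 'email', location: 'body' },
  { type: 'field', value: '', msg: 'Name is required', path: 'name', location: 'body' },
]

console.log(groupMessagesByField(sampleErrors))
```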

If no errors are detected, the middleware triggers next(), which executes the next piece of middleware in the request pipeline. Our example doesn't include any additional middleware, so the next step would be to execute the controller function.
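If the role of next() is unfamiliar, the following sketch may help. It is a deliberately simplified stand-in for a middleware pipeline, not how Express.js is implemented internally, and every name in it (runPipeline, validationStep, controllerStep and the two Sketch interfaces) is hypothetical. It shows only that a handler which returns without calling next() stops the chain.

```typescript
interface SketchRequest {
  validationErrors: string[]
}

interface SketchResponse {
  statusCode: number
  trace: string[]
}

type Next = () => void
type Handler = (req: SketchRequest, res: SketchResponse, next: Next) => void

// Runs each handler in order, moving on only when the current one calls next().
function runPipeline(handlers: Handler[], req: SketchRequest): SketchResponse {
  const res: SketchResponse = { statusCode: 0, trace: [] }
  let index = 0
  const next: Next = () => {
    const handler = handlers[index]
    index += 1
    if (handler) {
      handler(req, res, next)
    }
  }
  next()
  return res
}

const validationStep: Handler = (req, res, next) => {
  if (req.validationErrors.length > 0) {
    res.statusCode = 400
    res.trace.push('validation rejected the request')
    return // next() is never called, so the controller does not run
  }
  next()
}

const controllerStep: Handler = (_req, res) => {
  res.statusCode = 200
  res.trace.push('controller executed')
}

console.log(runPipeline([validationStep, controllerStep], { validationErrors: ['bad email'] }))
console.log(runPipeline([validationStep, controllerStep], { validationErrors: [] }))
```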

Define the route

To bring all of this together, we need to define the route, which tells Express.js to run validation once the request URL has been matched, but before our controller function is executed. Create a file under the routes directory called userRoutes.ts with the following code:

import express from 'express'
const router = express.Router()

import { getUserById, createUser } from '../controllers/userController.js'
import requestValidationMiddleware from '../middleware/requestValidationMiddleware.js'
import { getUserByIdValidator, createUserValidator } from '../middleware/validators/userValidators.js'

router.route('/')
    .post(createUserValidator, requestValidationMiddleware, createUser)

router.route('/:id')
    .get(getUserByIdValidator, requestValidationMiddleware, getUserById)

export default router

Ignoring the standard Express.js routing syntax for brevity, each route and method (POST on / and GET on /:id) identifies the validator ruleset applicable to that function, followed by the middleware which actually performs the validation, which then either returns a 400 Bad Request, or allows the application to proceed to controller execution. The POST method is registered on the collection path rather than /:id because a user which doesn't exist yet has no ID to supply.

If you send a few requests to these endpoints with valid parameters, using something like Postman, curl, or PowerShell's Invoke-RestMethod, you should see messages appear in the app console confirming the endpoints executed successfully. Sending invalid data should return errors detailing the offending fields.
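Rules can also be exercised without an HTTP client, because each validation chain exposes a run method which accepts a request-like object. The sketch below assumes express-validator is installed and uses a plain object in place of a real Express.js request. The validateEmail wrapper and emailRule are illustrative names of my own.

```typescript
import { body, validationResult } from 'express-validator'

// A single rule, equivalent in spirit to the email check used earlier.
const emailRule = body('email').isEmail().withMessage('Valid email is required')

// Hypothetical wrapper: runs the rule against a minimal request-like object
// and returns any resulting error messages.
async function validateEmail(email: string): Promise<string[]> {
  const req = { body: { email } }
  await emailRule.run(req)
  return validationResult(req).array().map((error) => error.msg)
}

validateEmail('not-an-email').then((messages) => console.log(messages))
validateEmail('joe.bloggs@acme.com').then((messages) => console.log(messages))
```

This approach lends itself to unit tests, since each ruleset can be checked in isolation from routing and controllers.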

Conclusion

In summary, we've learned what request validation involves and why it's important, and implemented a couple of examples using the express-validator package with a few of the more common validation methods.