Schema Validations

Schema Validations in AWS API Gateway

Dave Svatora - Lead Software Engineer

Validating inputs is a common problem that application software must address. As with many problems in information technology, several patterns offer a solution. This article focuses on one of them for Application Programming Interfaces (APIs) deployed to AWS API Gateway. We will walk through setting up JSON Schema Validations, a good pattern for applications that use API Gateway and serverless architectures. It follows OpenAPI 3 specifications and makes validation logic part of an API’s interface rather than custom logic that lives with your application code.

Getting Started

Let’s start out by looking at an example JSON Schema Validation.

"myAttribute": {
    "type": "string",
    "enum": ["FOO"],
    "description": "My attribute is a string input, FOO is the only option"
}

In this example, an attribute named myAttribute is defined as a string with a single allowed value, FOO. We also provide an informative description to help consumers understand the purpose of the attribute. If you send a request whose myAttribute value is not in the enum array, you get this error message.

{
  "message":"[instance value (\"BAR\") not found in enum (possible values: [\"FOO\"])]"
} 

What are other types of JSON Schema Validations?

Pattern

"myId": {
    "type": "string",
    "pattern": "^(.{13}|.{10}|.{6})$",
    "default": "SOMEID",
    "description": "Id associated to another fictitious attribute in my api. 13, 10, and 6 character Ids are allowed"
}

In this example, myId is defined as a string that must match the pattern, which is a regular expression. Our pattern requires myId to be exactly 13, 10, or 6 characters long. Also notice the default attribute, which names SOMEID as the value to assume when myId is not part of the request payload. In JSON Schema, default is an annotation rather than a validation rule, so test whether the value is actually populated before relying on it.

Caveat: Using a regular expression pattern for a string fulfills the validation requirement, BUT the resulting error message is a bit ugly.

{
  "message":"[ECMA 262 regex \"^(.{13}|.{10}|.{6})$\" does not match input string \"FOOBARBAZ\"]"
} 

If your use case requires a more descriptive error message, you would be better off not using JSON Schema Validations.

Enum Array

"myEnum": {
    "type": "string",
    "enum": [
      "FOO",
      "BAR",
      "BAZ"
    ],
    "description": "The allowed values example"
}

In this example, an attribute named myEnum is defined as a string with an array of allowed values. If you send a request whose myEnum value is not in the array, you get this error message.

{
  "message":"[instance value (\"QUX\") not found in enum (possible values: [\"FOO\",\"BAR\",\"BAZ\"])]"
} 

Max Min

"myThirteenId": {
    "type": "string",
    "maxLength": 13,
    "minLength": 13,
    "default": "ID12345678901",
    "description": "My Identifier that must be exactly 13 characters."
}

In this example, an attribute named myThirteenId is defined as a string with both maxLength and minLength set to 13. If you send a request whose myThirteenId value is not exactly 13 characters long, you get one of two error messages.

{
  "status": 400,
  "message": "body.myThirteenId should NOT be shorter than 13 characters"
}

or

{
  "status": 400,
  "message": "body.myThirteenId should NOT be longer than 13 characters"
}

Additional Schema References

Required

Another helpful schema validation is making an input attribute required. This is done directly in the schema object by listing the required properties in its required array.

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type": "object",
    "properties": {
        "myAttribute": {
            ...
        }
    },
    "required": [
        "myAttribute",
        ...
    ]
}

Based on the example above, if you tried sending a request where myAttribute was not included, you would get this error message back.

{
  "message":"[object has missing required properties ([\"myAttribute\"])]"
} 

API Gateway

So how does this apply to AWS API Gateway? In API Gateway, you associate Models with an API Method and then enable a Request Validator for that method. The AWS documentation includes guides that walk through these steps in the AWS Console.

Defining the model

We are almost ready to dig into Terraform, but before we do, let’s map our examples so far to a model schema. At the time of writing, AWS API Gateway supports model schemas that reference JSON Schema draft 4.

/* my_schema.json */

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "myAttribute": {
      "type": "string",
      "enum": ["FOO"],
      "description": "My attribute is a string input, FOO is the only option"
    },
    "myId": {
      "type": "string",
      "pattern": "^(.{13}|.{10}|.{6})$",
      "default": "SOMEID",
      "description": "Id associated to another fictitious attribute in my api. 13, 10, and 6 character Ids are allowed"
    },
    "myEnum": {
      "type": "string",
      "enum": [
        "FOO",
        "BAR",
        "BAZ"
      ],
      "description": "The allowed values example"
    },
    "myThirteenId": {
      "type": "string",
      "maxLength": 13,
      "minLength": 13,
      "default": "ID12345678901",
      "description": "My Identifier that must be exactly 13 characters."
    }
  },
  "required": [
    "myAttribute",
    "myId",
    "myEnum",
    "myThirteenId"
  ]
}

Terraform

Reading through the above docs helped me understand what is required to set up the validations in API Gateway. Now we need to build it using Terraform. The following shows the specific Terraform resources needed for JSON schema validations.

Terraform Resource Structure

| Resource | Description |
| --- | --- |
| api_gateway_rest_api | Defines the REST API |
| api_gateway_resource | Defines the REST resource, more specifically the resource path. For example, in https://myapi/foo, foo is the resource. |
| api_gateway_gateway_response | Defines responses that come directly from the gateway rather than integration (lambda) responses |
| api_gateway_model | Defines the JSON Schema |
| api_gateway_request_validator | Defines a validator and what types of validations to enable (Body and Parameters) |
| api_gateway_method | Defines the REST API method (GET, POST, etc.) and ties together the validator and model |
| api_gateway_method_response | Defines the method responses (200, 400, etc.) and models associated with each response |

Rest API

We will not go into depth on how to define an api_gateway_rest_api, other than to note its central role: each Terraform resource needed for JSON schema validations is tied to its rest_api_id.

resource "aws_api_gateway_rest_api" "my_api" {
  name = "myapi"
  ...
}

Resource

Define your REST resource paths. Here we create the foo resource.

resource "aws_api_gateway_resource" "my_resource" {
  parent_id   = aws_api_gateway_rest_api.my_api.root_resource_id
  path_part   = "foo"
  rest_api_id = aws_api_gateway_rest_api.my_api.id
}

Model

Next, let’s create the Model. The api_gateway_model has attributes that define each model. JSON schemas always use a content type of application/json, and the schema for the model can be read in from a file. The model is tied to the REST API through rest_api_id.

resource "aws_api_gateway_model" "my_model" {
  rest_api_id  = aws_api_gateway_rest_api.my_api.id
  name         = "myModelName" # API Gateway model names must be alphanumeric
  description  = "my_model_description"
  content_type = "application/json"

  schema = file("${path.root}/my_schema.json")
}
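
If you would rather keep a small schema next to the Terraform that uses it, one alternative to file() is to build the schema with Terraform's jsonencode function. The sketch below is only an illustration: the resource name my_inline_model and the model name myInlineModel are hypothetical, and the schema is trimmed to the single myAttribute property from the earlier example.

```hcl
# Hypothetical alternative to reading my_schema.json from disk.
# The resource and model names are made up for illustration.
resource "aws_api_gateway_model" "my_inline_model" {
  rest_api_id  = aws_api_gateway_rest_api.my_api.id
  name         = "myInlineModel"
  description  = "Schema built with jsonencode instead of file()"
  content_type = "application/json"

  # jsonencode converts the HCL object into the JSON string that the
  # schema argument expects.
  schema = jsonencode({
    "$schema" = "http://json-schema.org/draft-04/schema#"
    type      = "object"
    properties = {
      myAttribute = {
        type        = "string"
        enum        = ["FOO"]
        description = "My attribute is a string input, FOO is the only option"
      }
    }
    required = ["myAttribute"]
  })
}
```

Reading from a file keeps the schema reusable by other tools, while the inline form keeps everything in one place. Which one fits better depends on how large the schema is and who else needs to read it.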

Validator

The api_gateway_request_validator resource is basically a toggle for what types of validations to perform for a rest api. You can validate a method’s request body or request parameters. The validator is associated to the dependent rest_api_id.

resource "aws_api_gateway_request_validator" "my_validator" {
  name                        = "my_validator_name"
  rest_api_id                 = aws_api_gateway_rest_api.my_api.id
  validate_request_body       = true
  validate_request_parameters = false
}

Method

The api_gateway_method resources of a REST API define its behaviors, such as authorization, request parameters, HTTP method (GET, POST, etc.), and request models. Each method can also be associated with a request validator, an API resource, and a REST API.

resource "aws_api_gateway_method" "my_method" {
  rest_api_id          = aws_api_gateway_rest_api.my_api.id
  resource_id          = aws_api_gateway_resource.my_resource.id
  http_method          = "POST"
  request_parameters   = {}
  authorization        = "NONE"
  request_models       = tomap({ "application/json" = aws_api_gateway_model.my_model.name })
  request_validator_id = aws_api_gateway_request_validator.my_validator.id
}
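
The validator can also check request parameters instead of (or in addition to) the body. As a minimal sketch, assuming a GET method on the same foo resource and a query string parameter named myParam (both hypothetical, as are the resource names), parameter validation could look like this:

```hcl
# Hypothetical validator that checks request parameters only.
resource "aws_api_gateway_request_validator" "my_params_validator" {
  name                        = "my_params_validator_name"
  rest_api_id                 = aws_api_gateway_rest_api.my_api.id
  validate_request_body       = false
  validate_request_parameters = true
}

# Hypothetical GET method on the foo resource.
resource "aws_api_gateway_method" "my_get_method" {
  rest_api_id   = aws_api_gateway_rest_api.my_api.id
  resource_id   = aws_api_gateway_resource.my_resource.id
  http_method   = "GET"
  authorization = "NONE"

  # A value of true marks the parameter as required.
  # myParam is a made-up parameter name.
  request_parameters = {
    "method.request.querystring.myParam" = true
  }

  request_validator_id = aws_api_gateway_request_validator.my_params_validator.id
}
```

With this in place, a GET request to foo without the myParam query string parameter would be rejected by the gateway before reaching the integration.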

At this point we have the Terraform to create an API Gateway that validates requests against a JSON schema. When a request comes in to a resource method, each attribute in the request body must conform to its validations before the request moves on to the next step (integration). However, we also need to handle the responses when validations fail. The resources below allow detailed validation error messages to be returned to API consumers.

Method Responses

An api_gateway_method_response describes a possible status code and response model used when returning that response. In our case we will just use the default Empty and Error models.

resource "aws_api_gateway_method_response" "my_method_response_200" {
  rest_api_id     = aws_api_gateway_rest_api.my_api.id
  resource_id     = aws_api_gateway_resource.my_resource.id
  http_method     = aws_api_gateway_method.my_method.http_method
  status_code     = "200"
  response_models = { "application/json" = "Empty" }
}

resource "aws_api_gateway_method_response" "my_method_response_400" {
  rest_api_id     = aws_api_gateway_rest_api.my_api.id
  resource_id     = aws_api_gateway_resource.my_resource.id
  http_method     = aws_api_gateway_method.my_method.http_method
  status_code     = "400"
  response_models = { "application/json" = "Error" }
}

Gateway Response

An api_gateway_gateway_response is needed to map any validation error messages. Without it, you get the default message of Invalid request body. Each schema validation provides a detailed error message, which is preferable to the default. The api-gateway-mapping-template-reference is a great reference document for the $context variables available in API Gateway. In this case we map the detailed $context.error.validationErrorString to the Error model’s message attribute. Also note the response_type of BAD_REQUEST_BODY, which comes from the list of valid gateway responseTypes.

resource "aws_api_gateway_gateway_response" "my_gateway_response_400" {
  rest_api_id   = aws_api_gateway_rest_api.my_api.id
  status_code   = "400"
  response_type = "BAD_REQUEST_BODY"

  response_templates = {
    "application/json" = "{ \"message\": \"$context.error.validationErrorString\" }"
  }
}

Apply

We now have the resources ready to apply schema validations to our api gateway using terraform apply. Once the AWS infrastructure is deployed, automated tests can be created to exercise the validations as part of a continuous delivery pipeline.
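
One piece the resources above do not cover is publishing the API: API Gateway serves a configuration only after it has been deployed. The sketch below shows one common way to do that with aws_api_gateway_deployment. The resource name my_deployment is hypothetical, and the sketch assumes the method also has an integration (for example a Lambda), which this article does not show.

```hcl
# Hypothetical deployment resource; the name my_deployment is made up.
resource "aws_api_gateway_deployment" "my_deployment" {
  rest_api_id = aws_api_gateway_rest_api.my_api.id

  # Force a new deployment whenever the schema, the method, or the
  # gateway response template changes.
  triggers = {
    redeployment = sha1(jsonencode([
      aws_api_gateway_model.my_model.schema,
      aws_api_gateway_method.my_method.id,
      aws_api_gateway_gateway_response.my_gateway_response_400.response_templates,
    ]))
  }

  # Create the replacement deployment before destroying the old one.
  lifecycle {
    create_before_destroy = true
  }
}
```

A stage (aws_api_gateway_stage) that points at this deployment is what gives the API an invokable URL.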

Manual Test

A simple curl command can suffice for manually testing.

Invalid input

Request

curl -H "Content-type: application/json" \
-d @my_json.json \
-X POST https://myapi/foo 

# my_json.json (myAttribute is set to an invalid value)
{
  "myAttribute": "BAR",
  "myId": "SOMEID",
  "myEnum": "FOO",
  "myThirteenId": "ID12345678901"
}

Response

{ "message": "[instance value (\"BAR\") not found in enum (possible values: [\"FOO\"])]" }

Missing required

Request

# Send an empty JSON object as the request body
curl -H "Content-type: application/json" \
-d '{}' \
-X POST https://myapi/foo

Response

{ "message": "[object has missing required properties ([\"myAttribute\",\"myId\",\"myEnum\",\"myThirteenId\"])]" }

Conclusions

  • JSON Schema Validations are an effective pattern to validate request input in AWS API Gateway.
  • These validations help shield your function code from malicious attackers by stopping harmful input before it reaches your functions.
  • Schema Validations are standardized and conform to OpenAPI 3 specifications.
  • Schema Validation error messages can be somewhat cryptic, especially for regex patterns.

Additional notes

An optional setting of the api_gateway_stage resource configures logging of API Gateway messages to CloudWatch. This can be useful if you want to see logs related to schema validation errors. The api_gateway_stage is also used to enable X-Ray.
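
As a rough sketch of what that stage could look like, the example below enables access logging and X-Ray tracing. Everything named here is an assumption for illustration: the log group name, the retention period, the dev stage name, and a deployment resource called my_deployment. It also assumes API Gateway already has permission to write to CloudWatch Logs, and the choice of $context variables in the log format should be checked against the AWS documentation.

```hcl
# Hypothetical log group for access logs; name and retention are examples.
resource "aws_cloudwatch_log_group" "my_api_access_logs" {
  name              = "/aws/apigateway/myapi-access"
  retention_in_days = 14
}

resource "aws_api_gateway_stage" "my_stage" {
  rest_api_id = aws_api_gateway_rest_api.my_api.id
  # Assumes an aws_api_gateway_deployment named my_deployment exists.
  deployment_id = aws_api_gateway_deployment.my_deployment.id
  stage_name    = "dev" # hypothetical stage name

  # Turns on X-Ray tracing for this stage.
  xray_tracing_enabled = true

  # Writes one JSON log line per request to the log group above.
  access_log_settings {
    destination_arn = aws_cloudwatch_log_group.my_api_access_logs.arn
    format = jsonencode({
      requestId       = "$context.requestId"
      status          = "$context.status"
      errorMessage    = "$context.error.message"
      validationError = "$context.error.validationErrorString"
    })
  }
}
```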

Testing

After deploying the API Gateway through Terraform, we need to verify that each of our schema validations works as expected. Note that these validations previously lived in the application code and were covered by unit tests; now they must be covered by integration or acceptance tests.

Clean up

One final piece of work is to clean up the obsolete validations and unit tests in the lambda functions.


To learn more about technology careers at State Farm, or to join our team, visit https://www.statefarm.com/careers.