Jul 26, 2021

Actuators in Action

A Site Reliability Engineer's guide to using actuators effectively

Billy Malone

Senior Technology Engineer

Note: Some content in this article could potentially be outdated due to age of the submission. Please refer to newer articles and technical documentation to validate whether the content listed below is still current and/or best practice.

Want to learn about actuators and which ones may benefit you from a Site Reliability Engineering (SRE) perspective? This article is for you!

After reading, you will have:

The ability to set up actuators in your Spring Boot project
A basic understanding of the most commonly used endpoints provided by Spring
A short list of actuator endpoints that can help with your application’s resiliency

What is an actuator?

An actuator is a component of a machine that is responsible for moving and controlling a mechanism or system. Spring uses the term ‘actuator’ to label a feature set that provides “a number of additional features to help you monitor and manage your application when you push it to production.” These are also commonly referred to as “Production-ready Features.” Let’s dive in!

Security

Make sure the endpoints are secured and not open to the public if you are going to proceed with exposing actuator endpoints in your application. The information provided by the actuator endpoints should only be exposed on a ‘need to know’ basis. A best practice is to limit access to a group of identified users.

For Starters

Begin by adding the spring-boot-starter-actuator dependency to your pom.xml:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

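By bringing in that dependency, you get two actuator endpoints (/actuator/health and /actuator/info) exposed over HTTP without any additional configuration on Spring Boot 2.0 through 2.4. Starting with Spring Boot 2.5, only /actuator/health is exposed over HTTP by default.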

What is available?

To change which endpoints are exposed, override the “management” stanza in your application.yml (or set the equivalent management.endpoints.web.exposure.include property in application.properties):

management:
  endpoints:
    web:
      exposure:
        include: "*"  # NEVER DEPLOY WITH THIS WILDCARD. ONLY EXPOSE THE ENDPOINTS YOU NEED.

Take a moment to look at some of the endpoints available in actuator. Running locally with the wildcard configuration above exposes all of them so you can explore. DO NOT deploy your application to production with all endpoints exposed, given the sensitive nature of some of the information they return. For convenience, the table below provides the name and description of endpoints you may find useful; examples of how to use them follow.

endpoint	description
beans	Displays a complete list of all the Spring beans in your application.
conditions	Shows the conditions that were evaluated on configuration and auto-configuration classes, and the reasons why they did or did not match.
health	Shows application health information.
info	Displays arbitrary application info.
loggers	Shows and modifies the configuration of loggers in the application.
metrics	Shows ‘metrics’ information for the current application.
prometheus	Exposes metrics in a format that can be scraped by a Prometheus server. Requires a dependency on micrometer-registry-prometheus.
mappings	Displays a collated list of all @RequestMapping paths.
threaddump	Performs a thread dump.
heapdump	Returns an hprof heap dump file.
httptrace	Displays HTTP trace information (by default, the last 100 HTTP request-response exchanges). Requires an HttpTraceRepository bean.

Big deal. What do these really do for me?

Here are examples of the information that is provided with each endpoint.

beans

An HTTP GET to /actuator/beans gives you detailed insight into the application’s Spring context. For each bean it reports helpful information such as its scope, the resource it was loaded from, and the other beans it depends on.

{
  "processingController": {
    "aliases": [],
    "scope": "singleton",
    "type": "org.example.resiliencyengineering.controller.ProcessingController$$EnhancerBySpringCGLIB$$ab316cf4",
    "resource": "file [/Users/someUserName/code/sample-rest-api/target/classes/org/example/resiliencyengineering/controller/ProcessingController.class]",
    "dependencies": [
      "demoNotificationService",
      "sreManifestProperties",
      "tracing"
    ]
  },
  "apiListingReader": {
    "aliases": [],
    "scope": "singleton",
    "type": "springfox.documentation.spring.web.scanners.ApiListingReader",
    "resource": "URL [jar:file:/Users/someUserName/.m2/repository/io/springfox/springfox-spring-web/3.0.0/springfox-spring-web-3.0.0.jar!/springfox/documentation/spring/web/scanners/ApiListingReader.class]",
    "dependencies": []
  }
}

conditions

An HTTP GET to /actuator/conditions tells you why each auto-configuration class or bean was or was not loaded. This can help when troubleshooting the configuration of a running application.

{
    "contexts": {
        "sample-rest-api-1": {
            "positiveMatches": {
                "AbstractBulkheadConfigurationOnMissingBean#bulkheadAspect": [{
                    "condition": "OnBeanCondition",
                    "message": "@ConditionalOnMissingBean (types: io.github.resilience4j.bulkhead.configure.BulkheadAspect; SearchStrategy: all) did not find any beans"
                }],
                "AbstractBulkheadConfigurationOnMissingBean#bulkheadRegistry": [{
                    "condition": "OnBeanCondition",
                    "message": "@ConditionalOnMissingBean (types: io.github.resilience4j.bulkhead.BulkheadRegistry; SearchStrategy: all) did not find any beans"
                }]
            },
            "negativeMatches": {
                "BulkheadMetricsAutoConfiguration#registerBulkheadMetrics": {
                    "notMatched": [{
                        "condition": "OnPropertyCondition",
                        "message": "@ConditionalOnProperty (resilience4j.bulkhead.metrics.legacy.enabled=true) did not find property 'resilience4j.bulkhead.metrics.legacy.enabled'"
                    }],
                    "matched": []
                }
            }
        }
    }
}

health

The health endpoint, when exposed, reports the status of every health indicator (both built-in and custom) that Spring is aware of. Exposing this endpoint on your web service also lets you use it as an HTTP health check endpoint!

{
    "status": "UP",
    "components": {
        "clientConfigServer": {
            "status": "UNKNOWN",
            "details": {
                "error": "no property sources located"
            }
        },
        "discoveryComposite": {
            "description": "Discovery Client not initialized",
            "status": "UNKNOWN",
            "components": {
                "discoveryClient": {
                    "description": "Discovery Client not initialized",
                    "status": "UNKNOWN"
                }
            }
        },
        "diskSpace": {
            "status": "UP",
            "details": {
                "total": 499963174912,
                "free": 351552659456,
                "threshold": 10485760,
                "exists": true
            }
        },
        "ping": {
            "status": "UP"
        },
        "refreshScope": {
            "status": "UP"
        },
        "threadHealthCheck": {
            "status": "UP",
            "details": {
                "CurrentCount": 20
            }
        }
    }
}
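
The implementation of the custom threadHealthCheck shown above is not included in this article, so what follows is only a guess at the kind of logic such a check might contain, written with nothing but the JDK. The class name, the MAX_THREADS threshold of 200, and the UP/DOWN rule are all assumptions for illustration. In a real Spring Boot application you would typically put this logic in a bean implementing Spring’s HealthIndicator interface and return something like Health.up().withDetail("CurrentCount", count).build().

```java
import java.lang.management.ManagementFactory;
import java.util.Map;

// Hypothetical sketch of a thread-count health check; not the sample app's actual code.
final class ThreadHealthSketch {

    // Assumed threshold; pick a value that matches your own thread pools.
    static final int MAX_THREADS = 200;

    // Maps a live thread count to a health status string.
    static String status(int currentCount, int maxThreads) {
        return currentCount <= maxThreads ? "UP" : "DOWN";
    }

    // Reads the JVM's live thread count and reports it alongside the status.
    static Map<String, Object> check() {
        int count = ManagementFactory.getThreadMXBean().getThreadCount();
        return Map.of("status", status(count, MAX_THREADS), "CurrentCount", count);
    }

    public static void main(String[] args) {
        System.out.println(check());
    }
}
```

Keeping the rule in a small pure method such as status(...) makes it easy to unit test without starting the application.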

Health check example for Pivotal Cloud Foundry (PCF)

An example of providing an HTTP health check endpoint through your app’s manifest YAML is below. Consider pointing the check at a custom health check rather than at the aggregate /health. Each key under “components” in the JSON above is a named health check that can be addressed individually, for example /actuator/health/threadHealthCheck. For additional information on health checks in PCF, consult the product documentation.

---
applications:
  - name: sample-rest-api
    buildpacks:
      - java_buildpack_offline
    health-check-type: http
    health-check-http-endpoint: /sample-rest-api/actuator/health/threadHealthCheck
    path: target/sample-rest-api-1.0.0-SNAPSHOT.jar
    services:
      - my-scs3-config-server

loggers

This endpoint provides a list of all configured loggers along with data regarding their configured or inherited level.

A GET on /actuator/loggers will give you a list of loggers.
A GET on /actuator/loggers/«namedLogger» will give you current configuration data for that logger.
A POST to /actuator/loggers/«namedLogger» allows you to change that named logger’s level.

For example, a GET request to /actuator/loggers/org.example.resiliencyengineering yields the following JSON:

{
    "configuredLevel": null,
    "effectiveLevel": "INFO"
}

This JSON tells us that no level is explicitly configured for this logger, so it inherits its effective level from the ROOT logger configuration. Statements at INFO and above will be logged.

In addition to viewing levels, you can change the logging level of your running application by posting a JSON payload with a configured level to the named logger.

For example, send a POST to /actuator/loggers/org.example.resiliencyengineering with the following payload:

{
    "configuredLevel": "ERROR"
}
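
As a rough sketch of making that call from Java, the snippet below builds the POST with the JDK’s java.net.http client (Java 11+). The base URL https://localhost:8443/sample-rest-api is borrowed from the httptrace sample later in this article and is an assumption about where your app is running. The class and method names are illustrative only, and whatever authentication your secured endpoints require is left out.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper; not part of Spring Boot.
final class LoggerLevelRequest {

    // Builds (but does not send) the POST that sets a logger's configured level.
    static HttpRequest build(String baseUrl, String loggerName, String level) {
        String body = "{\"configuredLevel\":\"" + level + "\"}";
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/actuator/loggers/" + loggerName))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("https://localhost:8443/sample-rest-api",
                "org.example.resiliencyengineering", "ERROR");
        System.out.println(request.method() + " " + request.uri());
        // To send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding());
    }
}
```

Separating request construction from sending keeps the sketch easy to inspect locally before pointing it at a real instance.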

If you do a subsequent GET to loggers after the post, you see:

• org.example.resiliencyengineering has a configured and effective level of ERROR
• anything starting with org.example.resiliencyengineering inherits from the parent scope and gets an effective level of ERROR
• the parent org.example logger retains its ‘effectiveLevel’ from ROOT logger

        "org.example": {
            "configuredLevel": null,
            "effectiveLevel": "INFO"
        },
        "org.example.resiliencyengineering": {
            "configuredLevel": "ERROR",
            "effectiveLevel": "ERROR"
        },
        "org.example.resiliencyengineering.DemoApplication": {
            "configuredLevel": null,
            "effectiveLevel": "ERROR"
        },
        "org.example.resiliencyengineering.ThreadHealthCheck": {
            "configuredLevel": null,
            "effectiveLevel": "ERROR"
        },
        "org.example.resiliencyengineering.alerts": {
            "configuredLevel": null,
            "effectiveLevel": "ERROR"
        },
        "org.example.resiliencyengineering.alerts.service": {
            "configuredLevel": null,
            "effectiveLevel": "ERROR"
        },
        "org.example.resiliencyengineering.alerts.service.DemoNotificationService": {
            "configuredLevel": null,
            "effectiveLevel": "ERROR"
        },
        "org.example.resiliencyengineering.config": {
            "configuredLevel": null,
            "effectiveLevel": "ERROR"
        },
        "org.example.resiliencyengineering.config.TracingAspect": {
            "configuredLevel": null,
            "effectiveLevel": "ERROR"
        },
        "org.example.resiliencyengineering.controller": {
            "configuredLevel": null,
            "effectiveLevel": "ERROR"
        },
        "org.example.resiliencyengineering.controller.ProcessingController": {
            "configuredLevel": null,
            "effectiveLevel": "ERROR"
        }
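
One way to reason about the inheritance shown above is to walk up the dotted logger name until a configured level is found, falling back to ROOT. The sketch below models that rule in plain Java. It is a simplified illustration, not how Logback or Spring actually implement it, and the class and method names are made up for this example.

```java
import java.util.Map;

// Illustrative model of logger-level inheritance; names are hypothetical.
final class EffectiveLevel {

    // 'configured' maps logger names to explicitly configured levels and must contain "ROOT".
    static String resolve(Map<String, String> configured, String loggerName) {
        String name = loggerName;
        while (true) {
            String level = configured.get(name);
            if (level != null) {
                return level; // the logger itself or its nearest configured ancestor wins
            }
            int lastDot = name.lastIndexOf('.');
            if (lastDot < 0) {
                return configured.get("ROOT"); // no more parents: fall back to ROOT
            }
            name = name.substring(0, lastDot); // step up to the parent logger
        }
    }

    public static void main(String[] args) {
        Map<String, String> configured = Map.of(
                "ROOT", "INFO",
                "org.example.resiliencyengineering", "ERROR");
        // Child of the reconfigured logger inherits ERROR.
        System.out.println(resolve(configured, "org.example.resiliencyengineering.alerts.service"));
        // Parent of the reconfigured logger still inherits INFO from ROOT.
        System.out.println(resolve(configured, "org.example"));
    }
}
```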

Consult “Changing the Logging Level at the Runtime for a Spring Boot Application” for more context on log levels.

metrics and prometheus

The metrics endpoint will give you a list of named metrics that are provided for spring-boot apps:

{
    "names": [
        "http.server.requests",
        "jvm.buffer.count",
        "jvm.buffer.memory.used",
        "jvm.buffer.total.capacity"
    ]
}

You can also view the current value of that metric by issuing a GET to /actuator/metrics/«metric-name».

For example, an HTTP GET to /actuator/metrics/http.server.requests yields the following:

{
    "name": "http.server.requests",
    "description": null,
    "baseUnit": "seconds",
    "measurements": [
        {
            "statistic": "COUNT",
            "value": 4.0
        },
        {
            "statistic": "TOTAL_TIME",
            "value": 0.111817032
        },
        {
            "statistic": "MAX",
            "value": 0.004718371
        }
    ],
    "availableTags": [
        {
            "tag": "exception",
            "values": [
                "None"
            ]
        },
        {
            "tag": "method",
            "values": [
                "GET"
            ]
        },
        {
            "tag": "uri",
            "values": [
                "/actuator",
                "/actuator/loggers",
                "/actuator/metrics",
                "/actuator/httptrace"
            ]
        },
        {
            "tag": "outcome",
            "values": [
                "SUCCESS"
            ]
        },
        {
            "tag": "status",
            "values": [
                "200"
            ]
        }
    ]
}
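
A quick way to make these measurements useful is to derive an average: TOTAL_TIME divided by COUNT gives the mean time per request. The sketch below does that arithmetic on the sample values above; the class and method names are hypothetical.

```java
// Illustrative arithmetic on the sample response above; names are hypothetical.
final class MeanLatency {

    // Average seconds per request = TOTAL_TIME / COUNT (0 when nothing was recorded).
    static double meanSeconds(double count, double totalTimeSeconds) {
        return count == 0 ? 0.0 : totalTimeSeconds / count;
    }

    public static void main(String[] args) {
        double mean = meanSeconds(4.0, 0.111817032);
        // Roughly 28 ms per request for the sample data.
        System.out.println("mean ms per request: " + (mean * 1000));
    }
}
```

Note that the MAX statistic in the sample is lower than this mean. As far as I know, Micrometer tracks MAX over a recent time window rather than over the lifetime of the application, so treat it as a recent peak rather than an all-time one.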

The prometheus endpoint is a special endpoint that requires the micrometer-registry-prometheus dependency on your classpath. Think of it as a convenience endpoint that takes the metrics exposed by Spring and formats them for scraping into a Prometheus time-series database.

An HTTP GET to /actuator/prometheus yields the same metrics as above, formatted for Prometheus. Exposing the metrics in this format makes it easy to feed them to a Prometheus stack and to report and alert on them with dashboard visualization software such as Grafana. For more information, check out “Actuator Metrics Monitoring with Prometheus and Grafana”.


# HELP http_server_requests_seconds  
# TYPE http_server_requests_seconds summary
http_server_requests_seconds_count{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/httptrace",} 1.0
http_server_requests_seconds_sum{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/httptrace",} 0.02820752
http_server_requests_seconds_count{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/metrics",} 1.0
http_server_requests_seconds_sum{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/metrics",} 0.004718371
http_server_requests_seconds_count{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/loggers",} 1.0
http_server_requests_seconds_sum{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/loggers",} 0.02013252
http_server_requests_seconds_count{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator",} 1.0
http_server_requests_seconds_sum{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator",} 0.058758621
http_server_requests_seconds_count{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/metrics/{requiredMetricName}",} 1.0
http_server_requests_seconds_sum{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/metrics/{requiredMetricName}",} 0.025782743
# HELP http_server_requests_seconds_max  
# TYPE http_server_requests_seconds_max gauge
http_server_requests_seconds_max{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/httptrace",} 0.0
http_server_requests_seconds_max{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/metrics",} 0.0
http_server_requests_seconds_max{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/loggers",} 0.0
http_server_requests_seconds_max{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator",} 0.0
http_server_requests_seconds_max{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/metrics/{requiredMetricName}",} 0.025782743
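
If you want to eyeball this output from a script or a test, each sample line breaks down into a metric name, an optional label set in braces, and a numeric value. The sketch below parses one such line with a regular expression. It is a deliberately simplified helper with made-up names: it does not handle timestamps or special values such as +Inf, and the caller is expected to skip the comment lines that start with #. For real monitoring, let Prometheus do the scraping.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting scrape output; not a full parser of the format.
final class PrometheusSample {

    // Metric name, optional {labels}, whitespace, then the value.
    private static final Pattern LINE =
            Pattern.compile("^([a-zA-Z_:][a-zA-Z0-9_:]*)(\\{.*\\})?\\s+(\\S+)$");

    final String name;
    final String labels; // raw label text including braces, or "" when absent
    final double value;

    private PrometheusSample(String name, String labels, double value) {
        this.name = name;
        this.labels = labels;
        this.value = value;
    }

    // Parses a single sample line; throws IllegalArgumentException for anything else.
    static PrometheusSample parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("not a sample line: " + line);
        }
        String labels = m.group(2) == null ? "" : m.group(2);
        return new PrometheusSample(m.group(1), labels, Double.parseDouble(m.group(3)));
    }

    public static void main(String[] args) {
        PrometheusSample s = parse(
                "http_server_requests_seconds_sum{method=\"GET\",status=\"200\",uri=\"/actuator/metrics\",} 0.004718371");
        System.out.println(s.name + " = " + s.value);
    }
}
```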

mappings

This endpoint gives you a JSON description of the request mappings your application exposes, including the actuator endpoints you have exposed. For each mapping it describes which class and method in your application handles the request and lists the conditions (HTTP methods, patterns, media types) under which the mapping applies:

[
  {
    "handler": "Actuator web endpoint health",
    "predicate": "{GET /actuator/health, produces [application/vnd.spring-boot.actuator.v3+json || application/vnd.spring-boot.actuator.v2+json || application/json]}",
    "details": {
      "handlerMethod": {
        "className": "org.springframework.boot.actuate.endpoint.web.servlet.AbstractWebMvcEndpointHandlerMapping.OperationHandler",
        "name": "handle",
        "descriptor": "(Ljavax/servlet/http/HttpServletRequest;Ljava/util/Map;)Ljava/lang/Object;"
      },
      "requestMappingConditions": {
        "consumes": [],
        "headers": [],
        "methods": [
          "GET"
        ],
        "params": [],
        "patterns": [
          "/actuator/health"
        ],
        "produces": [
          {
            "mediaType": "application/vnd.spring-boot.actuator.v3+json",
            "negated": false
          },
          {
            "mediaType": "application/vnd.spring-boot.actuator.v2+json",
            "negated": false
          },
          {
            "mediaType": "application/json",
            "negated": false
          }
        ]
      }
    }
  },
  {
    "handler": "org.example.controller.ProcessingController#hello(String)",
    "predicate": "{GET /retrieve}",
    "details": {
      "handlerMethod": {
        "className": "org.example.controller.ProcessingController",
        "name": "hello",
        "descriptor": "(Ljava/lang/String;)Lorg/springframework/http/ResponseEntity;"
      },
      "requestMappingConditions": {
        "consumes": [],
        "headers": [],
        "methods": [
          "GET"
        ],
        "params": [],
        "patterns": [
          "/retrieve"
        ],
        "produces": []
      }
    }
  }
]

Proceed With Caution

The next three endpoints can provide useful troubleshooting information.
Due to the administrative functionality being exposed by these endpoints, remember to use caution when using them on a deployed application. Ensure access is limited to those who need it.

threaddump

An HTTP GET to /actuator/threaddump returns a JSON representation of a thread dump. Additionally, if you pass an ‘Accept’ header set to text/plain, you get the thread dump in a ‘tdump’ format that can be interpreted and analyzed by tooling. Keep in mind that if your application runs as multiple instances, you may need to target the specific instance you are troubleshooting.
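
To make the role of the ‘Accept’ header concrete, here is a minimal sketch that builds the request with the JDK’s java.net.http client (Java 11+). The base URL is an assumption taken from the httptrace sample later in this article, the class and method names are hypothetical, and authentication is omitted.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper; not part of Spring Boot.
final class ThreadDumpRequest {

    // Builds (but does not send) a GET for the thread dump in the requested format.
    static HttpRequest build(String baseUrl, boolean plainText) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/actuator/threaddump"))
                .header("Accept", plainText ? "text/plain" : "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("https://localhost:8443/sample-rest-api", true);
        System.out.println(request.method() + " " + request.uri()
                + " Accept=" + request.headers().firstValue("Accept").orElse(""));
        // To send it and save the text form for tooling:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```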

Here’s an example of the JSON format:

{
    "threads": [
        {
            "threadName": "Reference Handler",
            "threadId": 2,
            "blockedTime": -1,
            "blockedCount": 16,
            "waitedTime": -1,
            "waitedCount": 0,
            "lockName": null,
            "lockOwnerId": -1,
            "lockOwnerName": null,
            "daemon": true,
            "inNative": false,
            "suspended": false,
            "threadState": "RUNNABLE",
            "priority": 10,
            "stackTrace": [
                {
                    "classLoaderName": null,
                    "moduleName": "java.base",
                    "moduleVersion": "11.0.2",
                    "methodName": "waitForReferencePendingList",
                    "fileName": "Reference.java",
                    "lineNumber": -2,
                    "className": "java.lang.ref.Reference",
                    "nativeMethod": true
                },
                {
                    "classLoaderName": null,
                    "moduleName": "java.base",
                    "moduleVersion": "11.0.2",
                    "methodName": "processPendingReferences",
                    "fileName": "Reference.java",
                    "lineNumber": 241,
                    "className": "java.lang.ref.Reference",
                    "nativeMethod": false
                },
              ...

Here’s an example of the tdump format:

2021-02-15 13:48:46
Full thread dump OpenJDK 64-Bit Server VM (11.0.2+9 mixed mode):

"Reference Handler" - Thread t@2
   java.lang.Thread.State: RUNNABLE
	at java.base@11.0.2/java.lang.ref.Reference.waitForReferencePendingList(Native Method)
	at java.base@11.0.2/java.lang.ref.Reference.processPendingReferences(Reference.java:241)
	at java.base@11.0.2/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:213)

   Locked ownable synchronizers:
	- None

"Finalizer" - Thread t@3
   java.lang.Thread.State: WAITING
	at java.base@11.0.2/java.lang.Object.wait(Native Method)
	- waiting on <68c9a538> (a java.lang.ref.ReferenceQueue$Lock)
	at java.base@11.0.2/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
	at java.base@11.0.2/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:176)
	at java.base@11.0.2/java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:170)

   Locked ownable synchronizers:
	- None

"Signal Dispatcher" - Thread t@4
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
	- None

"Common-Cleaner" - Thread t@11
   java.lang.Thread.State: TIMED_WAITING
	at java.base@11.0.2/java.lang.Object.wait(Native Method)
	- waiting on <21e77d48> (a java.lang.ref.ReferenceQueue$Lock)
	at java.base@11.0.2/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
	at java.base@11.0.2/jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:148)
	at java.base@11.0.2/java.lang.Thread.run(Thread.java:834)
	at java.base@11.0.2/jdk.internal.misc.InnocuousThread.run(InnocuousThread.java:134)

   Locked ownable synchronizers:
	- None

heapdump

An HTTP GET to /actuator/heapdump returns an entire heap dump of the application as an hprof file. Be advised that these can be quite large and will likely require loading into a specialized tool for analysis. Keep in mind that if your application runs as multiple instances, you may need to target the specific instance you are troubleshooting.

httptrace

If you have an HttpTraceRepository bean in your application context, Spring provides information about your last 100 HTTP request-response exchanges via the httptrace endpoint. Here’s an example of the type of information you may get:

{
            "timestamp": "2021-02-15T19:19:53.754776Z",
            "principal": null,
            "session": null,
            "request": {
                "method": "GET",
                "uri": "https://localhost:8443/sample-rest-api/actuator/loggers",
                "headers": {
                    "cookie": [
                        "Cookie_1=value"
                    ],
                    "host": [
                        "localhost:8443"
                    ],
                    "connection": [
                        "keep-alive"
                    ],
                    "accept-encoding": [
                        "gzip, deflate, br"
                    ],
                    "user-agent": [
                        "PostmanRuntime/7.26.8"
                    ],
                    "accept": [
                        "*/*"
                    ]
                },
                "remoteAddress": null
            },
            "response": {
                "status": 200,
                "headers": {
                    "X-Frame-Options": [
                        "DENY"
                    ],
                    "Transfer-Encoding": [
                        "chunked"
                    ],
                    "Keep-Alive": [
                        "timeout=60"
                    ],
                    "Strict-Transport-Security": [
                        "max-age=31536000 ; includeSubDomains"
                    ],
                    "Cache-Control": [
                        "no-cache, no-store, max-age=0, must-revalidate"
                    ],
                    "X-Content-Type-Options": [
                        "nosniff"
                    ],
                    "Connection": [
                        "keep-alive"
                    ],
                    "Pragma": [
                        "no-cache"
                    ],
                    "Expires": [
                        "0"
                    ],
                    "X-XSS-Protection": [
                        "1; mode=block"
                    ],
                    "Date": [
                        "Mon, 15 Feb 2021 19:19:53 GMT"
                    ],
                    "Content-Type": [
                        "application/vnd.spring-boot.actuator.v3+json"
                    ]
                }
            },
            "timeTaken": 19
        }

What does it all mean?

If your application is deployed as a Spring Boot API, use a subset of these endpoints to help support it. This article highlighted the endpoints Spring provides that have been useful in my experience; refer to the Spring.io documentation for the other available endpoints.

I hope this post leaves you with a better understanding of actuator endpoints provided by Spring and provides a good starting point for utilizing them.

Resources

To learn more about technology careers at State Farm, or to join our team, visit https://www.statefarm.com/careers.
