Triage AWS Lambda Error Rate LAMBDA_HIGH_ERROR_RATE

Triage Guide: Lambda Function with High Error Rate

0) Understand

A high error rate in an AWS Lambda function drives up costs through retries and usually signals a problem in the function's logic or in the code it calls. Finding and fixing these errors keeps the function performant and cost-efficient.

1) Validate

  • How to Validate: In CloudWatch, look up the function's Errors and Invocations metrics for the past 7 days and compute the error rate as Errors divided by Invocations.
  • Data Source: The policy reads the AWS CloudWatch Errors and Invocations metrics for the Lambda function.
  • Scan Schedule: The policy scans once every 24 hours.
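
The manual check above can be sketched in Python with boto3. This is a minimal sketch under stated assumptions, not the policy's own implementation: the helper names (error_rate, metric_sum, report) are hypothetical, the rate is assumed to be total Errors divided by total Invocations over the window, and the policy's exact formula and threshold are not documented here.

```python
from datetime import datetime, timedelta, timezone


def error_rate(errors: float, invocations: float) -> float:
    """Return errors / invocations as a percentage (0.0 if there were no invocations)."""
    if invocations <= 0:
        return 0.0
    return 100.0 * errors / invocations


def metric_sum(cloudwatch, function_name: str, metric_name: str, days: int = 7) -> float:
    """Sum one AWS/Lambda metric for a function over the last `days` days."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/Lambda",
        MetricName=metric_name,  # "Errors" or "Invocations"
        Dimensions=[{"Name": "FunctionName", "Value": function_name}],
        StartTime=start,
        EndTime=end,
        Period=86400,  # one datapoint per day
        Statistics=["Sum"],
    )
    return sum(point["Sum"] for point in response["Datapoints"])


def report(function_name: str, days: int = 7) -> float:
    """Fetch both metrics with boto3 and return the error rate in percent."""
    import boto3  # imported here so the helpers above work without the AWS SDK

    cloudwatch = boto3.client("cloudwatch")
    errors = metric_sum(cloudwatch, function_name, "Errors", days)
    invocations = metric_sum(cloudwatch, function_name, "Invocations", days)
    return error_rate(errors, invocations)
```

With AWS credentials configured, calling report("my-function") (a placeholder function name) returns the 7-day error rate as a percentage, which can be compared against the value reported by the alert.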

2) Triage

  • Impact: Increased costs due to excessive retries, potential unavailability of services relying on this function, and possible underlying logic/code issues.
  • Worsening: Yes. Left unaddressed, the error rate can grow over time, leading to higher costs and service disruptions.
  • Remediation Impact: Remediation may involve debugging and updating code, which could temporarily affect the function's availability.

3) Act

  • Steps:
    1. Review the code and logic implemented in the Lambda function.
    2. Fix the bugs or inefficient code segments that are driving up the error rate.
    3. Test the function thoroughly to ensure the issue is resolved.
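
One way to approach the testing step is to invoke the handler locally with sample events before redeploying. The sketch below is illustrative only: handler is a hypothetical function (not taken from any real deployment) that returns a 400 response for malformed input instead of raising, since unhandled exceptions are counted in the Errors metric, and run_cases is a hypothetical helper that reports which events still raise.

```python
import json


def handler(event, context):
    """Hypothetical handler: rejects bad input with a 400 instead of raising."""
    try:
        body = json.loads(event.get("body") or "{}")
        quantity = int(body["quantity"])
    except (KeyError, TypeError, ValueError):
        # Malformed input is answered with an error response rather than an
        # unhandled exception, so it does not count as a function error.
        return {"statusCode": 400, "body": json.dumps({"error": "quantity must be an integer"})}
    return {"statusCode": 200, "body": json.dumps({"quantity": quantity})}


def run_cases(fn, events):
    """Invoke fn locally on each event and collect the events that raise."""
    failures = []
    for event in events:
        try:
            fn(event, None)  # context is unused in this sketch
        except Exception as exc:
            failures.append((event, repr(exc)))
    return failures


# Sample events: a valid request, invalid JSON, and a missing body.
SAMPLE_EVENTS = [
    {"body": '{"quantity": "3"}'},
    {"body": "not json"},
    {},
]
```

Calling run_cases(handler, SAMPLE_EVENTS) returns an empty list when no sample event raises; any entries it returns identify the inputs that would still produce function errors.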

4) Reflect

  • Evaluate Thresholds: If this alert fires frequently for error rates that are not a real problem, consider adjusting the error rate threshold.
  • Disable Policy: If the policy is not relevant for certain functions, consider disabling it for those functions.