CloudWatch Log Parser

A Python 3.12 AWS Lambda that splits PHP-FPM-bundled CloudWatch log events into individual, queryable Wonolog JSON entries.


Background

Ymir runs WordPress on AWS Lambda using PHP-FPM. PHP-FPM buffers all stderr output for an entire request and flushes it as one blob when the request ends. This means a single CloudWatch log event may contain dozens of Wonolog JSON objects concatenated together, making per-entry searching impossible.

This Lambda subscribes to the Ymir website log group via a CloudWatch Logs subscription filter. For every incoming event it:

  1. Decodes the base64+gzip CloudWatch Logs subscription payload.
  2. Scans each raw log event for embedded Wonolog JSON objects using brace-depth tracking.
  3. Re-publishes each individual JSON entry as its own event to a clean destination log group (/scottsdalemint/{environment}/structured-logs).
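
The first two steps can be sketched as follows. This is a minimal illustration, not the contents of lambda_function.py: the function names `decode_subscription_payload` and `extract_json_objects` are hypothetical, and the real handler may treat edge cases differently. The scanner tracks brace depth and skips braces that appear inside JSON string literals, so a message such as `"}{"` does not split an object early.

```python
import base64
import gzip
import json


def decode_subscription_payload(event: dict) -> dict:
    """Decode the base64-encoded, gzip-compressed body at event['awslogs']['data']."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def extract_json_objects(raw: str) -> list[dict]:
    """Return every top-level JSON object embedded in raw, in order of appearance."""
    objects = []
    depth = 0          # current brace nesting depth
    start = 0          # index of the '{' that opened the current top-level object
    in_string = False  # True while inside a JSON string literal
    escaped = False    # True when the previous character in a string was a backslash

    for i, ch in enumerate(raw):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"' and depth > 0:
            # Only track strings inside an object; quotes in surrounding text are ignored.
            in_string = True
        elif ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                try:
                    parsed = json.loads(raw[start : i + 1])
                except json.JSONDecodeError:
                    continue  # balanced braces, but not valid JSON: skip it
                if isinstance(parsed, dict):
                    objects.append(parsed)
    return objects
```

Each dictionary returned for a raw event would then become one event in the destination log group.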

AWS Resources

| Resource | Name |
| --- | --- |
| Lambda function (development) | scottsdalemint-development-cloudwatch-log-parser |
| Lambda function (staging) | scottsdalemint-staging-cloudwatch-log-parser |
| Lambda function (production) | scottsdalemint-production-cloudwatch-log-parser |
| IAM role (shared) | scottsdalemint-cloudwatch-log-parser-role |
| Destination log group (development) | /scottsdalemint/development/structured-logs |
| Destination log group (staging) | /scottsdalemint/staging/structured-logs |
| Destination log group (production) | /scottsdalemint/production/structured-logs |

Prerequisites

  • AWS CLI installed and configured (aws configure) with a user/role that has:
    • lambda:* on the target functions
    • iam:CreateRole, iam:PutRolePolicy, iam:GetRole
    • logs:PutSubscriptionFilter, logs:DescribeSubscriptionFilters
  • Bash (WSL is fine on Windows)
  • Python 3.12 is not required locally: boto3 is built into the Lambda runtime and there are no extra pip dependencies.

Deploying / Re-deploying

Run from the tools/cloudwatch-log-parser/ directory:

# Deploy or update staging
./deploy.sh staging

# Deploy or update production
./deploy.sh production

The script is fully idempotent: it creates resources on the first run and updates them on subsequent runs.

What the script does

  1. Zips lambda_function.py into build/function.zip.
  2. Creates the shared IAM role (scottsdalemint-cloudwatch-log-parser-role) if it does not exist, with a policy allowing:
    • Write to /aws/lambda/scottsdalemint-* log groups (Lambda execution logs).
    • Write to /scottsdalemint/* log groups (parsed structured output).
  3. Creates or updates the Lambda function with a 60-second timeout and 256 MB memory.
  4. Grants CloudWatch Logs resource-based permission to invoke the function.
  5. Prints the aws logs put-subscription-filter command you need to run manually (see below).
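
For readers who prefer boto3 to the shell script, step 4 might look roughly like the sketch below. It is not a transcription of deploy.sh: the `StatementId` value and both function names are hypothetical, and scoping `SourceArn` to the Ymir website log group is an assumption about how the permission could be restricted. Catching `ResourceConflictException` is one way to keep the call safe to repeat.

```python
def build_invoke_permission(environment: str, account_id: str, region: str = "us-west-2") -> dict:
    """Build add_permission arguments letting CloudWatch Logs invoke the parser."""
    return {
        "FunctionName": f"scottsdalemint-{environment}-cloudwatch-log-parser",
        "StatementId": f"allow-cloudwatch-logs-{environment}",  # hypothetical statement ID
        "Action": "lambda:InvokeFunction",
        "Principal": "logs.amazonaws.com",
        # Assumption: only the Ymir website log group for this environment may invoke.
        "SourceArn": (
            f"arn:aws:logs:{region}:{account_id}:log-group:"
            f"/aws/lambda/ymir-scottsdalemint-{environment}-website:*"
        ),
        "SourceAccount": account_id,
    }


def grant_invoke_permission(lambda_client, environment: str, account_id: str) -> bool:
    """Add the permission; return False if the statement already exists."""
    try:
        lambda_client.add_permission(**build_invoke_permission(environment, account_id))
    except lambda_client.exceptions.ResourceConflictException:
        return False  # already granted on an earlier run
    return True
```

Here `lambda_client` would be the result of `boto3.client("lambda", region_name="us-west-2")`.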

Attaching the Subscription Filter

After deploying, attach a subscription filter on the Ymir website log group. The Ymir log group name follows this pattern:

/aws/lambda/ymir-scottsdalemint-{environment}-website

Find the exact name in the CloudWatch Log Groups console or via CLI:

aws logs describe-log-groups \
--log-group-name-prefix '/aws/lambda/ymir-scottsdalemint' \
--region us-west-2 \
--query 'logGroups[*].logGroupName' \
--output table

Then attach the filter (replace the log group name as needed):

# Development
aws logs put-subscription-filter \
--log-group-name '/aws/lambda/ymir-scottsdalemint-development-website' \
--filter-name 'scottsdalemint-development-parser' \
--filter-pattern '' \
--destination-arn 'arn:aws:lambda:us-west-2:253490776993:function:scottsdalemint-development-cloudwatch-log-parser' \
--region us-west-2

# Staging
aws logs put-subscription-filter \
--log-group-name '/aws/lambda/ymir-scottsdalemint-staging-website' \
--filter-name 'scottsdalemint-staging-parser' \
--filter-pattern '' \
--destination-arn 'arn:aws:lambda:us-west-2:253490776993:function:scottsdalemint-staging-cloudwatch-log-parser' \
--region us-west-2

# Production
aws logs put-subscription-filter \
--log-group-name '/aws/lambda/ymir-scottsdalemint-production-website' \
--filter-name 'scottsdalemint-production-parser' \
--filter-pattern '' \
--destination-arn 'arn:aws:lambda:us-west-2:253490776993:function:scottsdalemint-production-cloudwatch-log-parser' \
--region us-west-2

Note: each log group can have at most 2 subscription filters. Check existing filters before adding:

aws logs describe-subscription-filters \
--log-group-name '/aws/lambda/ymir-scottsdalemint-production-website' \
--region us-west-2 \
--query 'subscriptionFilters[*].{Filter:filterName,Destination:destinationArn}' \
--output table
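
The check-then-attach sequence can also be scripted. The sketch below is one possible boto3 equivalent of the CLI commands in this section, assuming the naming patterns shown there; the function name `attach_parser_filter` is hypothetical. It refuses to add a new filter when the log group already holds two others, and otherwise creates or overwrites the filter with the same name.

```python
MAX_FILTERS_PER_LOG_GROUP = 2  # limit stated in this section


def attach_parser_filter(logs_client, environment: str, account_id: str,
                         region: str = "us-west-2") -> bool:
    """Attach the parser's subscription filter; return False if the quota blocks it."""
    log_group = f"/aws/lambda/ymir-scottsdalemint-{environment}-website"
    filter_name = f"scottsdalemint-{environment}-parser"

    existing = logs_client.describe_subscription_filters(
        logGroupName=log_group
    )["subscriptionFilters"]
    existing_names = {f["filterName"] for f in existing}

    # Re-applying a filter with the same name is an update, so it never hits the quota.
    if filter_name not in existing_names and len(existing) >= MAX_FILTERS_PER_LOG_GROUP:
        return False

    logs_client.put_subscription_filter(
        logGroupName=log_group,
        filterName=filter_name,
        filterPattern="",  # empty pattern forwards every event
        destinationArn=(
            f"arn:aws:lambda:{region}:{account_id}:function:"
            f"scottsdalemint-{environment}-cloudwatch-log-parser"
        ),
    )
    return True
```

Here `logs_client` would be `boto3.client("logs", region_name="us-west-2")`.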

Verifying It Works

1. Check the Lambda's own logs

Trigger a page load on the environment, then:

aws logs tail '/aws/lambda/scottsdalemint-production-cloudwatch-log-parser' \
--since 5m \
--region us-west-2

If the log group does not exist yet, the Lambda has never been invoked. Check that the subscription filter is attached.

2. Check the structured output

aws logs tail '/scottsdalemint/production/structured-logs' \
--since 5m \
--region us-west-2

Each event should be an individual Wonolog JSON object, for example:

{
"message": "Fatal error: ...",
"level": 500,
"level_name": "CRITICAL",
"channel": "SUMA",
"datetime": "2026-04-27T12:00:00+00:00",
"extra": {}
}
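
To illustrate how entries like this could be shaped for `put_log_events`, here is a hedged sketch. Whether lambda_function.py takes the timestamp from the entry's `datetime` field or from the original CloudWatch event is not stated in this README, so the approach below (prefer `datetime`, fall back to the original event timestamp) is an assumption, and `to_log_events` is a hypothetical name. The sort is there because `put_log_events` expects a batch in chronological order.

```python
import json
from datetime import datetime, timezone


def to_log_events(entries: list[dict], fallback_timestamp_ms: int) -> list[dict]:
    """Convert parsed Wonolog entries into put_log_events input records."""
    events = []
    for entry in entries:
        try:
            parsed = datetime.fromisoformat(entry["datetime"])
            if parsed.tzinfo is None:
                # Assumption: a datetime without an offset is treated as UTC.
                parsed = parsed.replace(tzinfo=timezone.utc)
            timestamp_ms = int(parsed.timestamp() * 1000)
        except (KeyError, TypeError, ValueError):
            # Missing or unparseable datetime: reuse the original event's timestamp.
            timestamp_ms = fallback_timestamp_ms
        events.append({"timestamp": timestamp_ms, "message": json.dumps(entry)})

    # Stable sort keeps the original order for entries sharing a timestamp.
    events.sort(key=lambda event: event["timestamp"])
    return events
```

The returned list could be passed as `logEvents` to `put_log_events` together with a log group and log stream name.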

3. Sanity-test Lambda invocation

No real payload is needed:

aws lambda invoke \
--function-name 'scottsdalemint-production-cloudwatch-log-parser' \
--region us-west-2 \
--payload '{"awslogs":{"data":"H4sIAAAAAAAAE6tWKkktLlGyUlIqS00uLUpVslIqLU4tykvMTQUA4fENRiEAAAA="}}' \
--cli-binary-format raw-in-base64-out \
/tmp/lambda-response.json && cat /tmp/lambda-response.json

A BadGzipFile error is expected with this dummy payload. It confirms that the function was invoked and that your credentials are allowed to invoke it. A real CloudWatch Logs payload should decompress cleanly.
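
If you want a payload that decompresses, the sketch below builds one in the shape CloudWatch Logs sends to Lambda subscribers. The function name `build_test_event`, the `owner` account ID, and the log stream name are placeholders, not values from this project. Invoking the function with such a payload would presumably republish the embedded entries to the structured log group, so it is best tried against a non-production environment.

```python
import base64
import gzip
import json
import time


def build_test_event(messages: list[str],
                     log_group: str = "/aws/lambda/ymir-scottsdalemint-staging-website") -> dict:
    """Wrap raw log messages in a gzip+base64 subscription payload."""
    now_ms = int(time.time() * 1000)
    body = {
        "messageType": "DATA_MESSAGE",
        "owner": "123456789012",       # placeholder account ID
        "logGroup": log_group,
        "logStream": "sanity-test",    # placeholder stream name
        "subscriptionFilters": ["scottsdalemint-staging-parser"],
        "logEvents": [
            {"id": str(index), "timestamp": now_ms + index, "message": message}
            for index, message in enumerate(messages)
        ],
    }
    compressed = gzip.compress(json.dumps(body).encode("utf-8"))
    return {"awslogs": {"data": base64.b64encode(compressed).decode("ascii")}}


if __name__ == "__main__":
    # Two Wonolog-style objects bundled into one message, as PHP-FPM would flush them.
    bundled = '{"message": "first", "level": 200}{"message": "second", "level": 400}'
    print(json.dumps(build_test_event([bundled])))
```

The printed JSON can be saved to a file and supplied to aws lambda invoke through --payload file://.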


IAM Role

The shared role scottsdalemint-cloudwatch-log-parser-role is used by all Lambda functions across environments. The inline policy uses wildcards so no updates are needed when adding a new environment.

| Statement | Resources |
| --- | --- |
| WriteLambdaLogs | arn:aws:logs:us-west-2:*:log-group:/aws/lambda/scottsdalemint-*:* |
| WriteStructuredLogs | arn:aws:logs:us-west-2:*:log-group:/scottsdalemint/* and …:* |

Inspect the live policy

aws iam get-role-policy \
--role-name 'scottsdalemint-cloudwatch-log-parser-role' \
--policy-name 'scottsdalemint-cloudwatch-log-parser-role-policy'

Update the policy manually

Use this if the role already exists and you need to fix permissions without re-running deploy.sh:

aws iam put-role-policy \
--role-name 'scottsdalemint-cloudwatch-log-parser-role' \
--policy-name 'scottsdalemint-cloudwatch-log-parser-role-policy' \
--policy-document '{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "WriteLambdaLogs",
"Effect": "Allow",
"Action": ["logs:CreateLogGroup","logs:CreateLogStream","logs:PutLogEvents"],
"Resource": "arn:aws:logs:us-west-2:253490776993:log-group:/aws/lambda/scottsdalemint-*:*"
},
{
"Sid": "WriteStructuredLogs",
"Effect": "Allow",
"Action": ["logs:CreateLogGroup","logs:CreateLogStream","logs:PutLogEvents","logs:DescribeLogStreams"],
"Resource": [
"arn:aws:logs:us-west-2:253490776993:log-group:/scottsdalemint/*",
"arn:aws:logs:us-west-2:253490776993:log-group:/scottsdalemint/*:*"
]
}
]
}'

Querying Logs in CloudWatch Insights

Navigate to CloudWatch → Logs Insights in the AWS console and select the structured log group:

/scottsdalemint/production/structured-logs

Example queries

# All errors and above
fields @timestamp, level_name, message, channel
| filter level >= 400
| sort @timestamp desc
| limit 50

# Fatal errors only
fields @timestamp, message
| filter level_name = "CRITICAL"
| sort @timestamp desc
| limit 20

# All entries from a specific channel
fields @timestamp, level_name, message
| filter channel = "SUMA"
| sort @timestamp desc
| limit 50
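
The same queries can be run outside the console. The following is a minimal boto3 sketch, assuming a one-hour lookback window; the function name `run_insights_query` and its default arguments are illustrative. It starts the query, polls until it reaches a terminal status, and flattens each result row into a dictionary.

```python
import time


def run_insights_query(logs_client, query: str,
                       log_group: str = "/scottsdalemint/production/structured-logs",
                       lookback_seconds: int = 3600,
                       poll_seconds: float = 1.0) -> list[dict]:
    """Run a Logs Insights query and return rows as {field: value} dictionaries."""
    now = int(time.time())
    query_id = logs_client.start_query(
        logGroupName=log_group,
        startTime=now - lookback_seconds,  # epoch seconds
        endTime=now,
        queryString=query,
    )["queryId"]

    # Poll until the query leaves the Scheduled/Running states.
    while True:
        response = logs_client.get_query_results(queryId=query_id)
        if response["status"] in ("Complete", "Failed", "Cancelled", "Timeout"):
            break
        time.sleep(poll_seconds)

    if response["status"] != "Complete":
        raise RuntimeError(f"Logs Insights query ended with status {response['status']}")

    return [{cell["field"]: cell["value"] for cell in row} for row in response["results"]]
```

For example, passing the "All errors and above" query text with `boto3.client("logs", region_name="us-west-2")` would return up to 50 rows containing @timestamp, level_name, message, and channel.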

Adding a New Environment

  1. Run ./deploy.sh {environment}. The IAM role already exists and will be reused automatically.
  2. Attach the subscription filter on the new Ymir log group (see Attaching the Subscription Filter above).

No IAM changes are needed.


Files

| File | Purpose |
| --- | --- |
| lambda_function.py | Lambda handler: extracts and re-publishes Wonolog JSON entries |
| deploy.sh | Deploy/update script: creates the IAM role and Lambda function, and prints the subscription filter command |
| README.md | Source README for this tool |