Fastly - Platform for Developers

This post builds on the Project honeypots.work kick-off and describes the first utility site instrumented to feed the honeypot data pipeline.

The Itch

Google Cloud Platform has thousands of IAM permissions. Finding the right one is tedious.

Google’s documentation works, but it requires multiple clicks to navigate. They have a permission reference with search, but it still feels slow when you need to quickly check which roles grant compute.instances.setMetadata or find all permissions containing bigquery.tables.

I wanted something faster. Type a few characters, see results. No extra page loads.

So I built gcpiam.com.

Architecture

The site is simple. No backend servers, no databases to maintain, no containers. Just compiled Rust running at the edge.

Fastly Compute

The application runs on Fastly Compute - their edge computing platform. The code is Rust, compiled to WebAssembly, deployed globally.

Why Fastly Compute?

  • Fast: Sub-50ms responses globally. The search index is compiled into the WASM binary.
  • Free tier: Generous (10 million requests per month).
  • Rust: Type safety, performance, low memory footprint.
  • Rich logging: Client telemetry including JA4 fingerprints, latencies, request/response sizes, headers.

The search index contains every GCP role and permission. Rather than loading JSON and parsing it on each request, the index is pre-built and serialized with bincode at build time, then embedded in the binary. Fastly Compute starts a fresh instance for each request, so avoiding JSON deserialization on the hot path matters.
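To illustrate why a compact binary layout is cheap to load, here is a minimal std-only sketch of a length-prefixed encoding. It is a stand-in for bincode, not bincode's actual wire format, and the Entry struct and its fields are hypothetical since the real index schema is not shown here. In a real build the encoded bytes would presumably be embedded with something like include_bytes!, which is also an assumption.

```rust
// Hypothetical index entry; the real site's schema is not shown in the post.
#[derive(Debug, PartialEq)]
struct Entry {
    name: String,
    role_count: u32,
}

// Build step: [entry count: u32 LE], then per entry
// [name length: u32 LE][name bytes][role_count: u32 LE].
fn encode(entries: &[Entry]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(entries.len() as u32).to_le_bytes());
    for e in entries {
        out.extend_from_slice(&(e.name.len() as u32).to_le_bytes());
        out.extend_from_slice(e.name.as_bytes());
        out.extend_from_slice(&e.role_count.to_le_bytes());
    }
    out
}

// Read one little-endian u32 at `pos` and advance the cursor.
// Returns None if fewer than four bytes remain.
fn read_u32(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let end = pos.checked_add(4)?;
    let s = bytes.get(*pos..end)?;
    *pos = end;
    Some(u32::from_le_bytes([s[0], s[1], s[2], s[3]]))
}

// Request path: a single linear pass over the bytes, no text parsing.
// Returns None on truncated or non-UTF-8 input.
fn decode(bytes: &[u8]) -> Option<Vec<Entry>> {
    let mut pos: usize = 0;
    let count = read_u32(bytes, &mut pos)?;
    let mut entries = Vec::new();
    for _ in 0..count {
        let len = read_u32(bytes, &mut pos)? as usize;
        let end = pos.checked_add(len)?;
        let name = std::str::from_utf8(bytes.get(pos..end)?).ok()?.to_owned();
        pos = end;
        let role_count = read_u32(bytes, &mut pos)?;
        entries.push(Entry { name, role_count });
    }
    Some(entries)
}
```

Decoding only copies bytes and reads fixed-width integers, which is the property that makes a binary format attractive when every request starts from a cold instance.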

Fastly VCL for Logging

Here’s where it connects to honeypots.work.

Fastly Compute (WASM) services don’t have native access to rich VCL variables like TLS fingerprints. VCL services do. The solution is to chain them:

Client → VCL Service → Compute Service
              ↓
          BigQuery

The VCL service sits in front of the Compute service and captures request telemetry:

  • JA4 fingerprint - TLS fingerprinting that identifies client implementations
  • JA3 MD5 - Classic TLS fingerprint hash
  • TLS protocol and cipher - TLSv1.3, CHACHA20_POLY1305_SHA256, etc.
  • Geolocation - Country, city, region
  • Edge metadata - Which Fastly POP served the request

This streams directly to BigQuery via Fastly’s native integration. No intermediate services.

Example log entry:

{
  "timestamp": "2026-01-31T14:24:12",
  "client_ip": "2a01:4b00:...",
  "request_uri": "/api/v1/search?q=compute",
  "response_status": 200,
  "edge_location": "LHR",
  "client_country": "GB",
  "client_city": "tower hamlets",
  "tls_ja4": "t13d4907h2_0d8feac7bc37_7395dae3b2f3",
  "tls_ja3_md5": "375c6162a492dfbf2795909110ce8424",
  "tls_protocol": "TLSv1.3",
  "tls_cipher": "TLS_CHACHA20_POLY1305_SHA256"
}

Every request - bots, crawlers, scanners - gets its TLS fingerprint captured for later analysis.
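For the later analysis, the JA4 string can be split into its documented parts. The sketch below assumes the standard JA4 layout: a 10-character prefix (transport, TLS version, SNI flag, cipher count, extension count, ALPN) followed by two truncated hashes. The Ja4 struct and parse_ja4 function are hypothetical names for illustration, not part of the pipeline described here.

```rust
// Hypothetical parsed form of a JA4 fingerprint such as
// "t13d4907h2_0d8feac7bc37_7395dae3b2f3".
#[derive(Debug, PartialEq)]
struct Ja4 {
    transport: char,        // 't' = TLS over TCP, 'q' = QUIC
    tls_version: String,    // e.g. "13" for TLS 1.3
    sni_present: bool,      // 'd' = SNI (domain) present, 'i' = absent
    cipher_count: u8,       // number of cipher suites offered
    extension_count: u8,    // number of extensions offered
    alpn: String,           // e.g. "h2"
    cipher_hash: String,    // truncated hash over the cipher list
    extension_hash: String, // truncated hash over the extensions
}

// Split "a_b_c" and decode the fixed-width fields of part a.
// Returns None for anything that does not match the expected shape.
fn parse_ja4(s: &str) -> Option<Ja4> {
    let mut parts = s.split('_');
    let a = parts.next()?;
    let b = parts.next()?;
    let c = parts.next()?;
    if parts.next().is_some() || a.len() != 10 || !a.is_ascii() {
        return None;
    }
    let sni_present = match &a[3..4] {
        "d" => true,
        "i" => false,
        _ => return None,
    };
    Some(Ja4 {
        transport: a.chars().next()?,
        tls_version: a[1..3].to_string(),
        sni_present,
        cipher_count: a[4..6].parse().ok()?,
        extension_count: a[6..8].parse().ok()?,
        alpn: a[8..10].to_string(),
        cipher_hash: b.to_string(),
        extension_hash: c.to_string(),
    })
}
```

Applied to the tls_ja4 value in the log entry above, this yields TLS 1.3 over TCP with SNI present, 49 cipher suites, 7 extensions, and an ALPN of h2.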

The Pipeline

GitHub Actions for Builds

The deployment workflow:

  1. Push to main triggers a build
  2. GitHub Actions compiles Rust to wasm32-wasip1
  3. Manual deployment via fastly compute publish

Deployment is manual. The site serves real traffic and I prefer to review changes before they go live.

Nightly Data Updates

GCP permissions change regularly. New services launch, permissions get added, roles evolve. A scheduled GitHub Action runs nightly at 2 AM UTC:

  1. Authenticates to GCP using Workload Identity Federation
  2. Fetches all roles via gcloud iam roles list
  3. Fetches all permissions from the IAM API
  4. Commits the updated JSON to the repository

The Rust application generates individual permission and role pages dynamically. URLs like /permissions/compute.instances.create and /roles/roles/editor are rendered on request. The sitemap lists all 15,000+ pages for search engines to crawl.
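The URL scheme can be sketched as a small path matcher. This is an illustration only: the Route enum and route function are hypothetical, and the rules (permission names contain no slash, role names keep their roles/ prefix) are assumptions inferred from the two example URLs rather than the site's actual routing code.

```rust
// Hypothetical route type; each variant borrows the name from the path.
#[derive(Debug, PartialEq)]
enum Route<'a> {
    Permission(&'a str), // e.g. "compute.instances.create"
    Role(&'a str),       // e.g. "roles/editor"
    NotFound,
}

// Map a request path to a page. Lookup of the name in the index and
// HTML rendering would happen after this step.
fn route<'a>(path: &'a str) -> Route<'a> {
    if let Some(name) = path.strip_prefix("/permissions/") {
        // Assumed: permission names are dotted identifiers without slashes.
        if !name.is_empty() && !name.contains('/') {
            return Route::Permission(name);
        }
    }
    if let Some(name) = path.strip_prefix("/roles/") {
        // The role ID keeps its own "roles/" prefix, hence /roles/roles/editor.
        if !name.is_empty() {
            return Route::Role(name);
        }
    }
    Route::NotFound
}
```

Because the variants borrow from the path, routing itself allocates nothing.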

The Journey

Part of the motivation was learning Fastly. I’d used their CDN at work, but never built out Compute or VCL from scratch for a personal project. Building something real is the fastest way to learn a platform.

The first version was naive. Load a 3MB JSON file, parse it, search through arrays. Most response time went to JSON deserialization.

The current version pre-computes everything at build time:

  • Search indices for permission and role names
  • Lowercase variants for case-insensitive matching
  • Pre-computed role summaries to avoid nested lookups
  • The entire index serialized with bincode and embedded in the WASM binary

That is the difference between “acceptable” and “fast”.
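The lowercase-variant idea from the list above can be sketched as follows. This is a minimal illustration with hypothetical names (SearchIndex, build, search) and a plain substring scan; the site's real index structure and matching strategy may differ.

```rust
// Hypothetical index: two parallel vectors built once, ahead of time.
struct SearchIndex {
    names: Vec<String>,     // original casing, returned to the caller
    lowercase: Vec<String>, // same order as `names`, used for matching
}

impl SearchIndex {
    // Build step: lowercase every name once so requests never have to.
    fn build(names: &[&str]) -> SearchIndex {
        SearchIndex {
            names: names.iter().map(|n| n.to_string()).collect(),
            lowercase: names.iter().map(|n| n.to_lowercase()).collect(),
        }
    }

    // Request step: lowercase only the query, then do a substring scan
    // and return at most `limit` matches in index order.
    fn search(&self, query: &str, limit: usize) -> Vec<&str> {
        let needle = query.to_lowercase();
        self.lowercase
            .iter()
            .zip(self.names.iter())
            .filter(|(lower, _)| lower.contains(needle.as_str()))
            .map(|(_, name)| name.as_str())
            .take(limit)
            .collect()
    }
}
```

Per request, the only string allocation is the lowercased query; everything else was paid for at build time.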

The entire project - Rust code, Terraform infrastructure, VCL configuration, GitHub Actions - was built using Claude Code without manually editing a single line.

What’s Next

gcpcost.com is next for an update. Same approach - solve a real problem, make it useful, instrument it for honeypots.work.

These sites serve a dual purpose. They address problems I encounter daily working with GCP. They also generate traffic from developers with similar needs. That traffic becomes data for bot detection research.


Try it: gcpiam.com