Python / FastAPI / Cloud Run / PostgreSQL / Redis

CryptoPrism API

FastAPI microservices backend deployed on Cloud Run. Serves analytics data, portfolio management, and trading endpoints.

<50ms

p99 latency

Leadership Lens

01 The Call

Chose to build a unified, stateless FastAPI backend on Cloud Run rather than coupling each product feature to a dedicated service or BaaS, consolidating 40+ endpoints across 15 router modules into a single auto-scaling container.

02 The Bet

Bet that a two-tier Redis + in-memory-fallback caching strategy with tuned TTLs per data category would achieve sub-50ms p99 latency without a dedicated CDN or query-level optimisation pass — and committed to that architecture from day one.

03 The Trade-off

Set min-instance=0 (scale-to-zero) in dev/staging, accepting cold-start latency in non-production environments in exchange for zero idle cost and a simpler infrastructure footprint.

04 The Outcome

40+ live endpoints covering prices, on-chain analytics, DMV scoring, ML inference, and AI-assisted features — all from a single stateless container serving the React frontend at sub-50ms p99, with 99.9% uptime on the Cloud Run SLA.

05 Coordinated

Sole engineer-of-record; coordinated GCP project config, Cloud SQL connection pooling, Memorystore Redis provisioning, Firebase auth integration, and GitHub Actions CI/CD pipeline.

06 Where this goes next

Expand ML inference endpoints, wire portfolio management routes to the on-chain pipeline, and add Arbitrum/Base chain coverage through the on-chain service module.

01 Chapter 1

A Unified Backend for Real-Time Crypto Intelligence

CryptoPrism.io needed a single API layer to serve its React frontend with live prices, on-chain analytics, proprietary DMV scoring, AI-generated signals, and portfolio management. The requirements were demanding: sub-50ms p99 latency under burst traffic, horizontal auto-scaling during market volatility events, a layered caching strategy, and zero-downtime deployments via CI/CD.

P99 Latency

<50ms

cached responses

API Endpoints

40+

across 15 routers

Chains Covered

on-chain metrics

Uptime

99.9%

Cloud Run SLA

Design Constraint

The API must remain stateless and read-only against PostgreSQL so that Cloud Run can scale instances from 0 to N without coordination overhead or write conflicts.

02 Chapter 2

Layered Microservices on Cloud Run

The codebase follows a clean three-layer architecture: routes (HTTP interface), services (business logic + caching), and core (config, DB, auth, Redis). Each layer has a single responsibility, making the system testable and independently deployable.

Request Flow

React Client → Cloud Run (auto-scale) → FastAPI + Uvicorn → Service Layer → PostgreSQL, Redis Cache, In-Memory Fallback

Project Structure

src/api/                # HTTP layer
  app.py                # FastAPI app, middleware, router registration
  middleware.py         # Request logging (X-Response-Time header)
  routes/               # 15 router modules

src/core/               # Shared infrastructure
  config.py             # Pydantic Settings (env-driven)
  database.py           # Async SQLAlchemy engine + session factory
  redis.py              # get_or_compute pattern + in-memory fallback
  auth.py               # Firebase token verification dependency
  constants.py          # Chain/token mappings

src/services/           # Business logic
  screener.py           # Dynamic SQL builder + cached queries
  signals.py            # Heatmap, consensus, divergences
  onchain.py            # Cross-chain metrics, whale alerts
  prices.py             # Live prices via CoinGecko
  scores.py             # CryptoScore leaderboard
  ml.py                 # ML model inference endpoints
  ... + 6 more service modules

Middleware Stack

MIDDLEWARE 1 — CORS: Whitelist-based origin control for app.cryptoprism.io, localhost:3000, and Firebase hosting domains. Credentials enabled for auth cookies.

MIDDLEWARE 2 — Request Logging: Custom Starlette middleware injects X-Response-Time header and logs method, path, status, and latency in ms for every request.

MIDDLEWARE 3 — Firebase Auth: FastAPI dependency (get_current_user) verifies Bearer tokens via firebase-admin SDK. Returns uid, email, name. Optional variant for public endpoints.
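The request-logging step in the stack above can be illustrated with a framework-free sketch. The project describes a custom Starlette middleware; the version below is written as plain ASGI so it runs with the standard library alone. The class name `ResponseTimeMiddleware`, the `demo_app` stand-in, and the exact header format (`12.34ms`) are assumptions for illustration, not the project's actual code.

```python
import asyncio
import time


class ResponseTimeMiddleware:
    """Plain ASGI middleware that adds an X-Response-Time header (in ms)."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass WebSocket and lifespan traffic through untouched.
            await self.app(scope, receive, send)
            return

        start = time.perf_counter()

        async def send_with_timing(message):
            # Headers can only be added on the response-start message.
            if message["type"] == "http.response.start":
                elapsed_ms = (time.perf_counter() - start) * 1000
                headers = list(message.get("headers", []))
                headers.append((b"x-response-time", f"{elapsed_ms:.2f}ms".encode()))
                message = {**message, "headers": headers}
                # Log method, path, status, and latency for every request.
                print(f'{scope["method"]} {scope["path"]} {message["status"]} {elapsed_ms:.2f}ms')
            await send(message)

        await self.app(scope, receive, send_with_timing)


async def demo_app(scope, receive, send):
    """Hypothetical stand-in for the FastAPI app: always answers 200 OK."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def run_demo():
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/health"}
    await ResponseTimeMiddleware(demo_app)(scope, receive, send)
    return sent


messages = asyncio.run(run_demo())
```

Because FastAPI applications are ASGI applications, a class of this shape could be registered with `app.add_middleware(ResponseTimeMiddleware)`.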

03 Chapter 3

Sub-50ms at Scale

Every performance decision cascades from one principle: never hit the database on a hot path if the data hasn't changed. The API implements a two-tier caching strategy with tuned TTLs per data category, plus connection pooling to eliminate cold-connect overhead.

Cache Architecture: Get-or-Compute

async def get_or_compute(key, ttl, compute_fn):
    # Layer 1: Try Redis
    # Compare against None so cached falsy values (0, empty list) still count
    # as hits; this assumes the helpers return None on a miss.
    cached = await cache_get(key)
    if cached is not None:
        return cached

    # Layer 2: In-memory fallback (Redis down)
    mem = _mem_get(key)
    if mem is not None:
        return mem

    # Layer 3: Compute + cache in both layers
    result = await compute_fn()
    _mem_set(key, result, ttl)
    await cache_set(key, result, ttl)
    return result

TTL Strategy by Data Type

Data Category	TTL	Rationale
Live Prices	60s	Market data refreshes every minute from CoinGecko
Market Overview	300s	Global stats (total cap, BTC dominance) change slowly
Screener	300s	DMV scores recomputed every 5 minutes upstream
CryptoScores	1800s	Composite scores updated twice per hour
On-Chain Metrics	21600s	Daily chain data; 6-hour TTL balances freshness vs load
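The TTL table can be read as a bound on recompute load: with a shared cache, each key is recomputed at most once per TTL window. The sketch below encodes the table and derives that bound per day. The dictionary keys and function names are hypothetical, and the bound ignores concurrent misses and periods when Redis is unavailable.

```python
SECONDS_PER_DAY = 86_400

# TTLs in seconds, taken from the table above; the key names are illustrative.
TTL_SECONDS = {
    "live_prices": 60,
    "market_overview": 300,
    "screener": 300,
    "crypto_scores": 1800,
    "onchain_metrics": 21600,
}


def ttl_for(category: str) -> int:
    """Look up the TTL for a data category; raises KeyError if unknown."""
    return TTL_SECONDS[category]


def max_recomputes_per_day(category: str) -> int:
    """Upper bound on cache refills per key per day for a shared cache."""
    return SECONDS_PER_DAY // ttl_for(category)


bounds = {name: max_recomputes_per_day(name) for name in TTL_SECONDS}
```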

Connection Pooling

Pool Size

persistent connections

Max Overflow

burst capacity

Pool Recycle

30m

prevent stale connections

Pre-Ping

validate before use
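The four pool settings correspond to keyword arguments that SQLAlchemy's engine factory accepts: `pool_size`, `max_overflow`, `pool_recycle`, and `pool_pre_ping`. The sketch below collects them in one place. The pool size and overflow values are not stated here, so the numbers used (5 and 10, SQLAlchemy's own defaults) are placeholders, not the project's configuration; the function name and the commented connection URL are hypothetical.

```python
MAX_INSTANCES = 10  # Cloud Run max instances, as stated in the deployment section


def engine_pool_options(pool_size: int, max_overflow: int) -> dict:
    """Keyword arguments intended for SQLAlchemy's create_async_engine()."""
    return {
        "pool_size": pool_size,        # persistent connections
        "max_overflow": max_overflow,  # extra connections allowed under burst
        "pool_recycle": 30 * 60,       # recycle after 30 minutes
        "pool_pre_ping": True,         # validate a connection before use
    }


# Placeholder sizes (SQLAlchemy defaults), NOT the project's real values.
options = engine_pool_options(pool_size=5, max_overflow=10)

# Each container can open at most pool_size + max_overflow connections, so the
# database must tolerate that figure multiplied by the instance ceiling.
max_connections_per_instance = options["pool_size"] + options["max_overflow"]
worst_case_total = MAX_INSTANCES * max_connections_per_instance

# With SQLAlchemy and asyncpg installed, the options would be applied like so:
# from sqlalchemy.ext.asyncio import create_async_engine
# engine = create_async_engine("postgresql+asyncpg://user:pass@host/db", **options)
```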

Performance Result

With Redis cache hits, typical response times are 3-8ms. Even cache-miss queries with PostgreSQL round-trips stay under 50ms p99 thanks to connection pooling and indexed queries.
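For readers less familiar with the metric, a p99 figure can be computed from logged request latencies with the nearest-rank method. The sketch below assumes that method and uses made-up sample data; it does not reproduce the project's measurements or tooling.

```python
def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it. `pct` is an integer from 1 to 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Synthetic latencies in ms: mostly cache hits, a few misses, one outlier.
latencies_ms = [5.0] * 95 + [30.0] * 4 + [120.0]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
```

In this synthetic sample the single 120ms outlier sits above the 99th percentile, which is why p99 is a stricter target than the median but more forgiving than the maximum.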

04 Chapter 4

Containerized CI/CD to Cloud Run

The API ships as a minimal Docker image, deployed automatically to Google Cloud Run via GitHub Actions. The pipeline handles linting, testing, building, and deploying — all triggered on push to main.

CI/CD Flow

git push main → GitHub Actions → Lint + Test → Docker Build → Cloud Run Deploy

Docker Configuration

FROM python:3.12-slim
WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY src/ ./src/

# Run with Uvicorn
CMD ["uvicorn", "src.api.app:app", "--host", "0.0.0.0", "--port", "8080"]

Cloud Run Configuration

AUTO-SCALING — Scale to Zero: Min instances: 0 (dev/staging), 1 (prod). Max instances: 10. Scales based on concurrent requests per instance (80 target). Cold start mitigated by min-instance in production.

ENVIRONMENT MANAGEMENT — Three Environments: Development (local Docker), Staging (Cloud Run with test DB), Production (Cloud Run + Cloud SQL + Memorystore Redis). Secrets injected via GCP Secret Manager.
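The auto-scaling figures above (target concurrency 80, max instances 10, min instances 0 or 1) imply a rough capacity ceiling. The sketch below is back-of-envelope arithmetic only: Cloud Run's real autoscaler also weighs CPU utilization and does not follow a simple formula, and the function name is hypothetical.

```python
MAX_INSTANCES = 10
TARGET_CONCURRENCY = 80  # concurrent requests per instance


def instances_needed(concurrent_requests: int, min_instances: int) -> int:
    """Estimate instance count from concurrency alone, clamped to the limits."""
    needed = -(-concurrent_requests // TARGET_CONCURRENCY)  # ceiling division
    return min(max(needed, min_instances), MAX_INSTANCES)


# Concurrent requests the service can absorb at the instance ceiling.
peak_capacity = MAX_INSTANCES * TARGET_CONCURRENCY

idle_staging = instances_needed(0, min_instances=0)   # scale to zero
idle_prod = instances_needed(0, min_instances=1)      # one warm instance
burst = instances_needed(200, min_instances=1)
saturated = instances_needed(5000, min_instances=1)   # capped at the maximum
```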

Zero-Downtime Deploys

Cloud Run deploys each build as a new revision and splits traffic between revisions. A new revision receives a canary percentage of traffic before full cutover, so deploys cause no downtime and a rollback only requires routing traffic back to the previous revision.

05 Chapter 5

RESTful Endpoints with Pydantic Validation

The API exposes 15 route modules covering every data domain of the CryptoPrism platform. All endpoints follow consistent patterns: JSON responses, query-param filtering, Pydantic validation, and structured error handling with appropriate HTTP status codes.

Prices & Market

GET /api/v1/prices/live - Top tokens by market cap (paginated)
GET /api/v1/prices/market-overview - Global stats: total cap, BTC dominance
GET /api/v1/prices/{token_id} - Single token with 24h change, volume

On-Chain Analytics

GET /api/v1/on-chain/cross-chain - Compare metrics across 14 chains
GET /api/v1/on-chain/{chain}/summary - Latest metrics + WoW/MoM/QoQ trends
GET /api/v1/on-chain/{chain}/whale-alerts - Large transaction detection
GET /api/v1/on-chain/{chain}/exchange-flow - Inflow/outflow to exchanges

Signals & Scoring

GET /api/v1/signals/heatmap - Top N tokens x 6 signal categories
GET /api/v1/signals/divergences - On-chain vs TA divergence detection
GET /api/v1/scores/leaderboard - CryptoScore ranking (top N)
GET /api/v1/screener - Dynamic filter by DMV scores + presets

Auth & System

POST /api/v1/auth/verify - Verify Firebase token
GET /api/v1/auth/me - Current user profile
GET /health - Service health (PG + Redis status)

Error Handling Pattern

# 401: Auth failure
{"detail": "Token expired"}

# 404: Resource not found
{"detail": "Token 'xyz' not found"}

# 422: Validation (auto from Pydantic)
{"detail": [{"loc": ["query","limit"], "msg": "...", "type": "..."}]}

Dynamic Screener Query Builder

The screener endpoint builds parameterized SQL dynamically based on filter combinations (momentum, durability, valuation, bullish signal count), with results cached by MD5 hash of the filter set for 5 minutes.
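The two ideas in this paragraph, parameterized SQL assembled from whichever filters are present and a cache key derived from an MD5 hash of the filter set, can be sketched as follows. The filter names, column names, table name, and `screener:` key prefix are all hypothetical, and the real builder may differ; the `:name` placeholders follow the bind-parameter style used by SQLAlchemy's `text()`.

```python
import hashlib
import json

# Hypothetical whitelist: filter name -> (column, comparison operator).
FILTER_COLUMNS = {
    "min_momentum": ("momentum_score", ">="),
    "min_durability": ("durability_score", ">="),
    "min_valuation": ("valuation_score", ">="),
    "min_bullish_signals": ("bullish_signal_count", ">="),
}


def build_screener_query(filters: dict):
    """Return (sql, params). Values never enter the SQL string itself."""
    clauses, params = [], {}
    for name in sorted(filters):
        if name not in FILTER_COLUMNS:
            raise ValueError(f"unknown filter: {name}")
        column, op = FILTER_COLUMNS[name]
        clauses.append(f"{column} {op} :{name}")  # bind placeholder only
        params[name] = filters[name]
    where = " AND ".join(clauses) if clauses else "TRUE"
    return f"SELECT * FROM screener_scores WHERE {where}", params


def cache_key(filters: dict) -> str:
    """Order-independent cache key: MD5 of the canonical JSON filter set."""
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
    digest = hashlib.md5(canonical.encode(), usedforsecurity=False).hexdigest()
    return f"screener:{digest}"


sql, params = build_screener_query({"min_momentum": 70, "min_bullish_signals": 3})
key_a = cache_key({"min_momentum": 70, "min_bullish_signals": 3})
key_b = cache_key({"min_bullish_signals": 3, "min_momentum": 70})
```

Sorting the keys before hashing means two requests with the same filters in a different order share one cache entry. MD5 is used here only as a fast fingerprint, not for security.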

06 Chapter 6

Production Dependencies

Every dependency was chosen for async performance, type safety, and minimal cold-start footprint in a containerized environment.

Core Components

FRAMEWORK — FastAPI + Uvicorn: Async Python framework with automatic OpenAPI docs. Uvicorn ASGI server for high-concurrency request handling.

DATABASE — PostgreSQL + asyncpg: Cloud SQL PostgreSQL with async driver via SQLAlchemy 2.0. Connection pooling with pool_pre_ping for reliability.

CACHE — Redis (Memorystore): GCP Memorystore Redis for distributed caching. 20 max connections. In-memory dict fallback if Redis is unreachable.

AUTH — Firebase Admin SDK: Server-side ID token verification. FastAPI dependency injection for both required and optional auth flows.

VALIDATION — Pydantic v2: Type-safe settings management (BaseSettings) and request validation. Regex-constrained query parameters.

INFRA — Cloud Run + Docker: Serverless containers with auto-scaling. GitHub Actions CI/CD. Python 3.12-slim base image for minimal footprint.

Python 3.12 / FastAPI / Uvicorn / PostgreSQL / SQLAlchemy 2.0 / asyncpg / Redis / aioredis / Firebase Admin / Pydantic v2 / Docker / Cloud Run / GitHub Actions / Cloud SQL / GCP Memorystore / Ruff (linting) / pytest (async)

Key Metric

The full application serves 40+ endpoints across prices, on-chain analytics, proprietary scoring, ML inference, and AI-assisted features — all from a single stateless container with sub-50ms p99 latency.