Troubleshooting
API Returns 503 REVOCATION_UNAVAILABLE
Symptom:
{"code": "REVOCATION_UNAVAILABLE", "message": "Revocation check unavailable"}
Root cause: Redis is unreachable, and the AuthStrategy is configured with failOpen: false (the safe default).
Solution:
- Check Redis health:
    kubectl get pod -n velocity-system -l app=redis
- If the Redis pod is not running, restart it:
    kubectl rollout restart statefulset/redis -n velocity-system
- Check Redis logs for errors:
    kubectl logs -n velocity-system -l app=redis --tail=100
- Verify network connectivity from the API pod to Redis:
    kubectl exec -n velocity-system deployment/velocity-api -- \
      redis-cli -h redis -p 6379 ping
  Expected:
    PONG
- If the issue persists, check your AuthStrategy configuration:
    apiVersion: velocity.sh/v1
    kind: AuthStrategy
    metadata:
      name: jwt-internal
    spec:
      revocation:
        failOpen: false # This is correct (safe default)
  To allow requests when Redis is down (dangerous), set failOpen: true. Only do this in non-production environments.
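The effect of failOpen can be summarised as a small decision function. This is only an illustrative sketch of the behaviour described in this section, not Velocity's actual code; the function name, its arguments, and the outcome labels are hypothetical.

```shell
# Hypothetical helper that models the outcome of the revocation check.
# $1 = "up" or "down" (Redis reachability)
# $2 = "true" or "false" (value of spec.revocation.failOpen)
revocation_outcome() {
  redis_state=$1
  fail_open=$2
  if [ "$redis_state" = "up" ]; then
    # Redis answers, so the normal revocation check can run.
    echo "check-revocation"
  elif [ "$fail_open" = "true" ]; then
    # Dangerous: the request proceeds without a revocation check.
    echo "allow-unchecked"
  else
    # Safe default: reject while Redis is unreachable.
    echo "503 REVOCATION_UNAVAILABLE"
  fi
}
```

For example, `revocation_outcome down false` prints `503 REVOCATION_UNAVAILABLE`, which is the symptom this section troubleshoots.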
API Returns 401 Invalid Bearer Token
Symptom:
{"code": "AUTH_INVALID_TOKEN", "message": "Invalid or expired token"}
Root cause:
- Token is expired
- Token signature is invalid
- Token issuer is not configured in AuthStrategy
- JWKS endpoint returned a key that doesn’t match
Solution:
- Check token expiration:
    # Decode the JWT payload (the middle section is base64url-encoded;
    # if base64 reports invalid input, append "=" padding first)
    echo "<token>" | awk -F'.' '{print $2}' | tr '_-' '/+' | base64 -d | jq .
  Look for the "exp" field. If exp < now(), the token is expired.
- Verify AuthStrategy is configured for your issuer:
    kubectl get authstrategy -A
    kubectl describe authstrategy jwt-internal -n velocity-system
  Check that spec.issuer matches your token’s iss claim.
- Test token verification:
    curl -H "Authorization: Bearer <token>" \
      https://api.velocity.acme.com/version
  If it returns 401, the token is invalid. If it returns 200, auth is working.
- Check the JWKS cache:
    kubectl logs -n velocity-system -l app=velocity-api | grep -i jwks
  The logs should show a refresh every 5 minutes. If you see errors, the JWKS endpoint may be down.
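The expiration check can be scripted so that missing base64 padding does not get in the way. The following is a minimal sketch, assuming a `base64` that accepts `-d` (GNU coreutils and recent macOS) and a numeric `exp` claim; the function names are hypothetical and the token signature is not verified.

```shell
# Decode the payload (second segment) of a JWT.
# JWTs use base64url without padding, so translate the alphabet and re-pad.
jwt_payload() {
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  pad=$(( (4 - ${#seg} % 4) % 4 ))
  while [ "$pad" -gt 0 ]; do
    seg="${seg}="
    pad=$((pad - 1))
  done
  printf '%s' "$seg" | base64 -d
}

# Print the numeric "exp" claim (no jq required).
jwt_exp() {
  jwt_payload "$1" |
    grep -o '"exp"[[:space:]]*:[[:space:]]*[0-9][0-9]*' |
    grep -o '[0-9][0-9]*$'
}

# Exit 0 if the token is expired, 1 if still valid, 2 if "exp" is missing.
jwt_is_expired() {
  exp=$(jwt_exp "$1") || return 2
  [ -n "$exp" ] || return 2
  now=$(date +%s)
  [ "$exp" -lt "$now" ]
}
```

Usage: `jwt_is_expired "$TOKEN" && echo "token expired"`. This only inspects the claim; a token that is not expired can still be rejected for the signature or issuer reasons listed above.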
Reconciler Hot-loops / Operator CPU High
Symptom:
- Operator pod CPU usage > 50% continuously
- Logs show repeated reconciliation of the same CRD:
{"message": "reconciling", "schema": "acme/supply-chain/procurement/purchase-order/v1", "attempt": 1234}
Root cause: Schema has drifted from its spec. The operator reconciles, makes changes, detects a diff, reconciles again, etc.
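To spot which schemas are looping, the attempt counter in the operator log can be filtered. This is a rough sketch that assumes every log line has the same shape as the sample above (a "schema" key followed later by an "attempt" key on one line); the function name and the threshold are hypothetical.

```shell
# Read operator log lines on stdin and print each schema whose
# "attempt" counter exceeds the threshold given as $1.
hot_schemas() {
  threshold=$1
  # Rewrite each matching line to "<attempt> <schema>", then filter numerically.
  sed -n 's/.*"schema": *"\([^"]*\)".*"attempt": *\([0-9][0-9]*\).*/\2 \1/p' |
    awk -v t="$threshold" '$1 > t { print $2 }' |
    sort -u
}
```

Usage: `kubectl logs -n velocity-system -l app=velocity-operator --tail=2000 | hot_schemas 100`.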
Solution:
- Check the operator logs for the error:
    kubectl logs -n velocity-system -l app=velocity-operator --tail=200 | \
      grep -A5 "error"
- Common causes:
    - Postgres table has uncommitted changes (e.g., a manual ALTER TABLE by a DBA)
    - Operator role lacks permission to create/modify a resource
    - A required RLS policy is missing
    - Archive or history table is out of sync
- Run the drift check:
    velocity drift check \
      --schema acme/supply-chain/procurement/purchase-order/v1
  This reports what has drifted.
- Quarantine the schema to stop reconciliation:
    kubectl annotate schemadefinition purchase-order \
      -n acme-supply-chain-procurement \
      velocity.sh/quarantine=true \
      --overwrite
- Fix the drift manually (with a DBA):
    -- Example: add a missing RLS policy
    CREATE POLICY acme_supply_chain_procurement_policy
      ON acme_supply_chain_procurement.purchase_order_v1
      FOR SELECT USING (...);
- Remove the quarantine:
    kubectl annotate schemadefinition purchase-order \
      -n acme-supply-chain-procurement \
      velocity.sh/quarantine- \
      --overwrite
- Trigger a manual reconcile:
    velocity reconcile --schema acme/supply-chain/procurement/purchase-order/v1
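The quarantine and release steps can be wrapped in two small functions so they are harder to mistype during an incident. This is a sketch only: the `run` wrapper and the `DRY_RUN` variable are hypothetical conveniences of this example, while the `kubectl annotate` arguments mirror the commands above.

```shell
# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1 = SchemaDefinition name, $2 = namespace
quarantine() {
  run kubectl annotate schemadefinition "$1" -n "$2" \
    velocity.sh/quarantine=true --overwrite
}

# A trailing "-" on the key removes the annotation.
unquarantine() {
  run kubectl annotate schemadefinition "$1" -n "$2" \
    velocity.sh/quarantine-
}
```

Running with `DRY_RUN=1` first, for example `DRY_RUN=1 quarantine purchase-order acme-supply-chain-procurement`, shows the exact command before anything is changed in the cluster.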
Outbox Table Grows Unbounded
Symptom:
SELECT COUNT(*) FROM acme_supply_chain_procurement.purchase_order_v1_outbox
WHERE published_at IS NULL;
Returns thousands of unpublished rows.
Root cause: CDC worker is not running or has crashed. Outbox rows are not being sent to Typesense.
Solution:
- Check CDC worker health:
    kubectl logs -n velocity-system -l app=velocity-api -c cdc-worker --tail=100
  Look for panic, error, or connection issues.
- Verify Typesense is reachable:
    kubectl exec -n velocity-system deployment/velocity-api -- \
      curl https://typesense.example.com:8108/health
  Should return status 200 with health info.
- Check if the CDC worker is running:
    kubectl exec -n velocity-system deployment/velocity-api -- \
      ps aux | grep cdc
  Should show a running process.
- Restart the API deployment:
    kubectl rollout restart deployment/velocity-api -n velocity-system
- Monitor the outbox shrinking:
    watch -n 5 'psql -U velocity_api -d velocity -c "SELECT COUNT(*) FROM acme_supply_chain_procurement.purchase_order_v1_outbox WHERE published_at IS NULL;"'
  The count should decrease as rows are processed.
- If the count doesn’t shrink, check Typesense logs:
    kubectl logs -n typesense -l app=typesense --tail=100
  The collection may be read-only, or there may be a permission issue.
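Instead of watching the count by eye, two samples can be compared in a script. This is a minimal sketch: the function names are hypothetical, and `count_unpublished` assumes the same psql connection settings as the monitoring command above (`-At` prints the bare number).

```shell
# Print the number of unpublished outbox rows.
count_unpublished() {
  psql -U velocity_api -d velocity -At -c \
    "SELECT COUNT(*) FROM acme_supply_chain_procurement.purchase_order_v1_outbox WHERE published_at IS NULL;"
}

# $1 = name of a command or function that prints the current count
# $2 = seconds to wait between the two samples (default 5)
# Exit 0 if the count went down, 1 if it did not, 2 if sampling failed.
outbox_is_draining() {
  count_cmd=$1
  interval=${2:-5}
  first=$("$count_cmd") || return 2
  sleep "$interval"
  second=$("$count_cmd") || return 2
  [ "$second" -lt "$first" ]
}
```

Usage: `outbox_is_draining count_unpublished 30 || echo "outbox is not shrinking"`. A count that stays flat at zero also reports "not draining", so check the absolute number as well.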
Schema Apply Succeeds but Table Not Created
Symptom:
kubectl apply -f schema.yaml
# Output: schemadefinition.velocity.sh/purchase-order created
# Status shows "Ready"

# But the table doesn't exist:
psql -U velocity_api -d velocity -c \
  "SELECT to_regclass('acme_supply_chain_procurement.purchase_order_v1');"
Returns NULL.
Root cause: Operator does not have permission to CREATE TABLE in the domain schema.
Solution:
- Check operator logs:
    kubectl logs -n velocity-system -l app=velocity-operator --tail=200 | \
      grep -i "permission\|denied"
- Verify the operator’s Postgres role has CREATE permission on the domain schema:
    -- As superuser
    SELECT has_schema_privilege('velocity_operator', 'acme_supply_chain_procurement', 'CREATE');
  Should return t.
- Grant the permission:
    -- As superuser
    GRANT CREATE ON SCHEMA acme_supply_chain_procurement TO velocity_operator;
- Retry the apply:
    velocity reconcile --schema acme/supply-chain/procurement/purchase-order/v1
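When checking several schemas, it helps to derive the Postgres table name from the schema path. The mapping below is inferred from the examples on this page (hyphens become underscores, the first three path segments form the Postgres schema, the last two form the table); it is an assumption, not documented behaviour, and the function name is hypothetical.

```shell
# Map org/app/domain/schema/version to <pg_schema>.<table>.
schema_table() {
  path=$(printf '%s' "$1" | tr '-' '_')
  db_schema=$(printf '%s' "$path" | cut -d/ -f1-3 | tr '/' '_')
  table=$(printf '%s' "$path" | cut -d/ -f4-5 | tr '/' '_')
  printf '%s.%s\n' "$db_schema" "$table"
}
```

Usage: `psql -U velocity_api -d velocity -c "SELECT to_regclass('$(schema_table acme/supply-chain/procurement/purchase-order/v1)');"`.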
Time Machine Shows Empty History
Symptom:
velocity history list --schema ... --id PO-00000001
Returns 0 events, even though records were created and updated.
Root cause: History trigger is not firing, or history table is missing.
Solution:
- Check if the history table exists:
    SELECT to_regclass('acme_supply_chain_procurement.purchase_order_v1_history');
  If NULL, the operator did not create it. Reconcile the schema:
    velocity reconcile --schema acme/supply-chain/procurement/purchase-order/v1
- Check if the trigger exists:
    SELECT trigger_name FROM information_schema.triggers
    WHERE event_object_table = 'purchase_order_v1'
      AND trigger_schema = 'acme_supply_chain_procurement';
  Should show a trigger like purchase_order_v1_history_trigger.
- Check if the trigger is enabled:
    SELECT tgenabled FROM pg_trigger
    WHERE tgrelid = 'acme_supply_chain_procurement.purchase_order_v1'::regclass
      AND tgname = 'purchase_order_v1_history_trigger';
  Should return O (enabled). If it returns D, the trigger is disabled:
    -- Enable it
    ALTER TABLE acme_supply_chain_procurement.purchase_order_v1
      ENABLE TRIGGER purchase_order_v1_history_trigger;
- Manually verify the trigger works by creating a test record:
    velocity record create \
      --schema acme/supply-chain/procurement/purchase-order/v1 \
      --data '{"id": "TEST-001", "status": "draft"}'
- Check the history table:
    SELECT COUNT(*) FROM acme_supply_chain_procurement.purchase_order_v1_history
    WHERE entity_id = 'TEST-001';
  Should show 1 (the CREATE event).
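The tgenabled column can hold values other than O and D. As a quick reference, the sketch below translates the single-letter codes PostgreSQL stores in pg_trigger.tgenabled; the function name is hypothetical.

```shell
# Translate a pg_trigger.tgenabled code into a description.
describe_tgenabled() {
  case "$1" in
    O) echo "enabled (fires in origin and local modes)" ;;
    D) echo "disabled" ;;
    R) echo "enabled in replica mode only" ;;
    A) echo "enabled always" ;;
    *) echo "unknown code: $1"; return 1 ;;
  esac
}
```

If the history trigger reports R, it will not fire for ordinary sessions, which produces the same empty history as D.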
Archive Worker Silently Skips Policy
Symptom:
- ArchivePolicy is applied and shows Ready
- Archive worker pod is running
- No errors in logs
- But records are not being archived to S3
Root cause:
- Archive policy type is cel (deferred to Phase 10)
- S3 destination is not configured
- ARCHIVE_S3_BUCKET environment variable is not set
Solution:
- Check the archive policy:
    kubectl get archivepolicy -A
    kubectl describe archivepolicy <name>
  Look at spec.trigger.type. If it’s cel, this is deferred:
    spec:
      trigger:
        type: cel # Deferred; only 'age' and 'size' are implemented
- Use age or size instead:
    spec:
      trigger:
        type: age
        days: 30
- Verify the S3 bucket is configured:
    kubectl get deployment velocity-archive-worker -n velocity-system -o yaml | \
      grep -A5 "ARCHIVE_S3"
- If not set, update the Helm values:
    archiveWorker:
      env:
        ARCHIVE_S3_BUCKET: velocity-archives
        ARCHIVE_S3_REGION: us-east-1
- Reapply Helm:
    helm upgrade velocity velocity/velocity \
      --namespace velocity-system \
      --values values.yaml
- Check archive run status:
    SELECT schema_path, status, started_at, completed_at FROM platform.archive_runs
    ORDER BY started_at DESC LIMIT 10;
  Should show recent runs with status success.
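Because the worker skips the policy without logging an error, it can help to run the checks above as one preflight. The sketch below is hypothetical: it takes the trigger type you read from the policy as an argument and reads ARCHIVE_S3_BUCKET from the local environment, so export the value you found on the worker deployment before calling it.

```shell
# $1 = value of spec.trigger.type from the ArchivePolicy
# Prints one line per problem; exit 0 only when nothing is wrong.
archive_preflight() {
  status=0
  case "$1" in
    age|size) ;;
    cel)
      echo "trigger type cel is deferred; use age or size"
      status=1
      ;;
    *)
      echo "unrecognised trigger type: $1"
      status=1
      ;;
  esac
  if [ -z "${ARCHIVE_S3_BUCKET:-}" ]; then
    echo "ARCHIVE_S3_BUCKET is not set"
    status=1
  fi
  return "$status"
}
```

Usage: `ARCHIVE_S3_BUCKET=velocity-archives archive_preflight age` exits 0, while `archive_preflight cel` prints the reason the policy is being skipped.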
velocity context add Fails with “Invalid Bearer Token”
Symptom:
velocity context add \
  --name prod \
  --api-url https://api.velocity.acme.com \
  --bearer-token eyJhbGc...
Error:
Error: invalid bearer token (contains CRLF)
Root cause: The token contains newlines (common when copying from email or docs).
Solution:
- Remove newlines from the token:
    TOKEN=$(tr -d '\n\r' < token.txt)
    velocity context add --name prod --api-url ... --bearer-token "$TOKEN"
- Verify the token has a valid JWT structure (three dot-separated segments):
    echo "$TOKEN" | awk -F'.' '{if(NF!=3) exit 1}' && echo "Valid JWT structure"
- Retry the context add.
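The two cleanup steps can be combined into reusable helpers. This is a small sketch with hypothetical function names; it checks only the shape of the token, not its signature or expiry.

```shell
# Strip carriage returns and newlines from a token.
clean_token() {
  printf '%s' "$1" | tr -d '\r\n'
}

# Exit 0 only if the token is exactly three non-empty, dot-separated segments.
has_jwt_shape() {
  case "$1" in
    *.*.*.*) return 1 ;;   # more than three segments
    ?*.?*.?*) return 0 ;;  # header.payload.signature
    *) return 1 ;;
  esac
}
```

Usage: `TOKEN=$(clean_token "$(cat token.txt)") && has_jwt_shape "$TOKEN" && velocity context add --name prod --api-url ... --bearer-token "$TOKEN"`.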
velocity api-key create Times Out Waiting for Secret
Symptom:
velocity api-key create --name prod --ttl 30d
Hangs for 5+ minutes, then times out:
Error: timeout waiting for secret to appear
Root cause: Operator is not running, or API server cannot create Kubernetes secrets.
Solution:
- Check if the operator is running:
    kubectl get deployment velocity-operator -n velocity-system
  Should show 1 replica.
- Check operator logs:
    kubectl logs -n velocity-system -l app=velocity-operator --tail=100
  Look for FATAL or ERROR.
- Check API server RBAC:
    kubectl describe rolebinding velocity-api -n velocity-system
  The binding should reference a Role that grants permission to create secrets.
- If RBAC is missing, add it:
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: velocity-api
      namespace: velocity-system
    rules:
      - apiGroups: [""]
        resources: ["secrets"]
        verbs: ["get", "list", "watch", "create", "update", "patch"]
- Retry with a longer timeout:
    velocity api-key create \
      --name prod \
      --ttl 30d \
      --wait-secs 120
API Server Crashes with “RLS will not work”
Symptom:
API server logs show:
FATAL velocity_api role has BYPASSRLS=true — RLS will not work. Fix the role.
Then the pod restarts continuously.
Root cause: The database role velocity_api has BYPASSRLS=true, which lets it bypass RLS.
Solution:
- Connect to Postgres as superuser:
    psql -h <pg-host> -U <superuser> -d velocity
- Fix the role:
    ALTER ROLE velocity_api NOBYPASSRLS;
- Verify:
    SELECT rolname, rolbypassrls FROM pg_roles WHERE rolname = 'velocity_api';
  Should show rolbypassrls = f.
- Restart the API deployment:
    kubectl rollout restart deployment/velocity-api -n velocity-system
- Tail logs to verify startup:
    kubectl logs -f -n velocity-system -l app=velocity-api
  Should show “schema informer ready” when healthy.
Validating Webhook Rejects Valid Manifest
Symptom:
Applying a manifest fails with:
Error from server: error when creating "schema.yaml": admission webhook "schemadefinition.velocity.sh" denied the request: ...
But the manifest looks correct.
Solution:
- Get more details:
    kubectl apply -f schema.yaml -v=6 2>&1 | grep -A10 "denied"
- Common webhook rejections:
    - Namespace mismatch: the {org}-{app}-{domain} namespace must match the CRD metadata
        # Must be in namespace acme-supply-chain-procurement
        metadata:
          namespace: acme-supply-chain-procurement
          labels:
            velocity.sh/org: acme
            velocity.sh/app: supply-chain
            velocity.sh/domain: procurement
    - CEL syntax error: fix the rule
        validation:
          rules:
            - rule: "self.amount >" # Missing right side
    - Quota exceeded: too many SchemaDefinitions in the Application
        SELECT COUNT(*) FROM platform.schema_definitions
        WHERE org = 'acme' AND app = 'supply-chain';
      If you need more, update the quota in the Application CRD.
    - Cross-org reference (multi-tenant): in multi-tenant mode, you cannot reference a schema from another org
        # This is forbidden in multi-tenant:
        spec:
          refs:
            - path: other-org/app/domain/schema/v1
- Check webhook configuration:
    kubectl get validatingwebhookconfigurations | grep velocity
    kubectl describe validatingwebhookconfigurations velocity.sh-schemadefinition
- If the webhook is misconfigured, restart the webhook deployment:
    kubectl rollout restart deployment/velocity-webhook -n velocity-system
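The namespace rule from the list of common rejections can be checked locally before applying a manifest. This sketch assumes the namespace is the three label values joined with hyphens, as in the example above; the function names are hypothetical and the webhook may enforce more than this.

```shell
# Build the namespace expected for an org/app/domain triple.
expected_namespace() {
  printf '%s-%s-%s\n' "$1" "$2" "$3"
}

# $1 = metadata.namespace, $2 = org, $3 = app, $4 = domain
namespace_matches() {
  [ "$1" = "$(expected_namespace "$2" "$3" "$4")" ]
}
```

Usage: `namespace_matches acme-supply-chain-procurement acme supply-chain procurement || echo "namespace does not match labels"`.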
Next Steps
If you can’t find your issue here, check:
- Security — Auth and RLS-related issues
- Hardening — CEL, input validation
- API Reference — REST endpoint issues
- Operator logs: kubectl logs -f -n velocity-system -l app=velocity-operator
- API logs: kubectl logs -f -n velocity-system -l app=velocity-api
Or reach out to the team.