Per-task daily spend caps, graceful deferral when the cap hits, and prompt caching that cuts batch job costs 80–90%.
An AI integration with no cost controls is a liability waiting to express itself. A batch job that loops over 200 listings, a bug that sends the same request in a loop, a staff member clicking a button repeatedly — any of these can generate an unexpected bill. Cost discipline means the system can’t spend more than you’ve decided it should, and the decisions about what runs on which model are explicit.
The description generation batch job was the first AI feature that could generate significant cost. 166 listings, each requiring one API call. At Haiku pricing, that batch costs about $0.40. At Sonnet pricing, it costs about $12. Using the wrong model by accident — or having someone add a batch job that calls Sonnet without thinking — was a real risk.
Daily caps eliminate the category of “I didn’t know that was happening” cost overruns. Haiku vs. Sonnet routing ensures the expensive model is reserved for tasks that justify it.
The cap is checked before every AI call in the central wrapper. The daily spend is tracked in a transient keyed by the current date, so the running total effectively resets at midnight:
function [client]_ai_budget_available( string $task ): bool {
    $caps = [
        'description'  => 2.00, // $2/day max on description generation
        'nl_query'     => 1.00, // $1/day max on search queries
        'lead_summary' => 0.50, // $0.50/day max on lead summaries
        'general'      => 1.00,
    ];
    $daily_limit = $caps[ $task ] ?? 0.50;
    $spent_today = [client]_get_daily_spend( $task );
    return $spent_today < $daily_limit;
}
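function [client]_get_daily_spend( string $task ): float {
    // The date in the key gives each day its own running total.
    // Note: date() is UTC inside WordPress, while CURDATE() below uses the
    // database time zone; keep the two aligned so "today" means the same thing.
    $cache_key = '[client]_ai_spend_' . $task . '_' . date( 'Y-m-d' );
    $cached    = get_transient( $cache_key );
    if ( $cached !== false ) {
        return (float) $cached;
    }
    // Recompute from log table (SUM() over zero rows is NULL, which casts to 0.0)
    global $wpdb;
    $spend = (float) $wpdb->get_var( $wpdb->prepare(
        "SELECT SUM(cost_usd) FROM [client]_ai_call_log
         WHERE task = %s AND called_at >= CURDATE() AND status = 'ok'",
        $task
    ) );
    // Cache for 1 hour. Spend logged during that hour is only visible to the
    // cap check if the cached total is also updated after each call.
    set_transient( $cache_key, $spend, 3600 );
    return $spend;
}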
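Because the cached total lives for an hour, the cap check only sees new spend if something updates the transient after each call. The wrapper that would do this is not shown in this section, so what follows is a minimal sketch under assumptions: `client_` stands in for the `[client]_` prefix, `client_record_spend()` is a hypothetical helper, and the two transient functions are stubbed with an in-memory array so the snippet runs outside WordPress.

```php
<?php
// In-memory stand-ins for WordPress transients so the sketch runs standalone.
// Inside WordPress these stubs are skipped and the real functions are used.
if ( ! function_exists( 'get_transient' ) ) {
    $GLOBALS['client_fake_transients'] = [];
    function get_transient( string $key ) {
        return $GLOBALS['client_fake_transients'][ $key ] ?? false;
    }
    function set_transient( string $key, $value, int $ttl = 0 ): bool {
        $GLOBALS['client_fake_transients'][ $key ] = $value;
        return true;
    }
}

// Hypothetical helper: add one call's cost to today's cached total for a task.
// Returns the new total, or null when there is no cached total to update.
function client_record_spend( string $task, float $cost_usd ): ?float {
    $cache_key = 'client_ai_spend_' . $task . '_' . date( 'Y-m-d' );
    $current   = get_transient( $cache_key );
    if ( false === $current ) {
        // Nothing cached: the next read rebuilds the total from the log table,
        // which is assumed to already contain this call.
        return null;
    }
    $total = (float) $current + $cost_usd;
    set_transient( $cache_key, $total, 3600 ); // note: this also restarts the 1-hour TTL
    return $total;
}

// Usage: a task sitting just under its $2.00 cap crosses it after one more call.
set_transient( 'client_ai_spend_description_' . date( 'Y-m-d' ), 1.95, 3600 );
$total   = client_record_spend( 'description', 0.06 );
$skipped = client_record_spend( 'nl_query', 0.01 ); // no cached total yet
```

With the cached total kept current this way, a runaway loop is stopped on the first budget check after the cap is crossed rather than up to an hour later.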
When a cap is hit, the system returns a clear signal that the caller can handle without crashing:
// In the description generation caller:
$result = [client]_ai_request( [
    'task'     => 'description',
    'messages' => [ [ 'role' => 'user', 'content' => $prompt ] ],
] );
if ( ! $result['ok'] && $result['error'] === 'daily_cap' ) {
    // Schedule for tomorrow instead of failing hard
    wp_schedule_single_event(
        strtotime( 'tomorrow midnight' ) + rand( 0, 3600 ),
        '[client]_generate_description',
        [ $unit_id ]
    );
    return; // Don't alert — this is expected behavior
}
if ( ! $result['ok'] ) {
    [client]_alert_ai_failure( $unit_id, $result['error'] );
    return;
}
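The caller above relies on the wrapper returning `ok => false` with `error => 'daily_cap'`. The wrapper itself is not shown here, so this is only a sketch of what that gate might look like, assuming the result shape used above. `client_ai_request_gate()`, the injected `$budget_check` callable, and the `'general'` default task are hypothetical choices that let the snippet run without WordPress.

```php
<?php
/**
 * Hypothetical gate run at the top of the AI wrapper.
 * Returns null when the call may proceed, or an error result shaped
 * like the one the caller checks for.
 */
function client_ai_request_gate( array $args, callable $budget_check ): ?array {
    $task = $args['task'] ?? 'general'; // assumed default for untagged calls
    if ( ! $budget_check( $task ) ) {
        // No API call is made, so nothing is spent and nothing needs logging as cost.
        return [ 'ok' => false, 'error' => 'daily_cap', 'content' => null ];
    }
    return null;
}

// Usage with a stand-in budget check: 'description' is over budget, others are not.
$over_budget = [ 'description' => true ];
$check       = fn( string $task ): bool => empty( $over_budget[ $task ] );

$blocked = client_ai_request_gate( [ 'task' => 'description' ], $check );
$allowed = client_ai_request_gate( [ 'task' => 'nl_query' ], $check );
```

Returning a structured result instead of throwing keeps the cap an expected condition: callers branch on `error` and decide for themselves whether to defer, skip, or alert.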
The wrapper needs to log cost without making an extra API call. Estimated cost is computed from token count and known pricing:
function [client]_estimate_cost( string $model, int $tokens ): float {
    // Prices in USD per 1M tokens (input + output blended estimate)
    $rates = [
        'claude-haiku-4-5-20251001' => 0.80,  // $0.80 / 1M tokens
        'claude-sonnet-4-6'         => 12.00, // $12 / 1M tokens
        'claude-opus-4-7'           => 75.00, // $75 / 1M tokens
    ];
    // Unknown models fall back to a mid-range rate so they are never logged as free
    $rate = $rates[ $model ] ?? 10.00;
    return round( ( $tokens / 1000000 ) * $rate, 6 );
}
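As a sanity check on the blended-rate arithmetic, here is a standalone sketch that takes the per-million rate as a parameter. The $0.80 rate and the 166-unit batch come from the text above; the 3,000 tokens per call is an assumed figure chosen for illustration, not a measured one, and `client_cost_for_tokens()` is a hypothetical name.

```php
<?php
// Hypothetical helper: cost in USD for a token count at a given rate per 1M tokens.
function client_cost_for_tokens( int $tokens, float $usd_per_million ): float {
    return round( ( $tokens / 1000000 ) * $usd_per_million, 6 );
}

$per_call = client_cost_for_tokens( 3000, 0.80 ); // assumed 3,000 tokens per call
$batch    = round( $per_call * 166, 4 );          // 166 listings, one call each
```

Under that assumption each call costs $0.0024 and the batch comes to about $0.40, in line with the Haiku figure quoted earlier. If an estimate lands orders of magnitude away from the invoice, a units mismatch between the rate table and the divisor is the first thing to check.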
When running a description batch job, the system prompt is identical for every unit. Anthropic’s prompt caching bills cached input tokens at about 10% of the normal input rate, a 90% discount on the repeated portion of each request:
function [client]_generate_description_batch( array $unit_ids ): void {
    $system_prompt = [client]_get_description_system_prompt();
    foreach ( $unit_ids as $unit_id ) {
        $specs  = [client]_get_unit_specs_for_description( $unit_id );
        $result = [client]_ai_request( [
            'task'         => 'description',
            'system'       => $system_prompt,
            'cache_system' => true, // Cache the system prompt across batch calls
            'messages'     => [
                [
                    'role'    => 'user',
                    'content' => "Write a listing description for this boat:\n\n" . $specs,
                ],
            ],
        ] );
        if ( $result['ok'] ) {
            update_post_meta( $unit_id, '[client]_ai_description', $result['content'] );
            update_post_meta( $unit_id, '[client]_ai_description_date', current_time( 'mysql' ) );
        }
    }
}
The cache_control marker is not an HTTP header; it is a field on the system block in the request body, which the wrapper sets when cache_system is true:
if ( ! empty( $args['cache_system'] ) && ! empty( $args['system'] ) ) {
    // Send the system prompt as a content block so it can carry cache_control
    $request_body['system'] = [
        [
            'type'          => 'text',
            'text'          => $args['system'],
            'cache_control' => [ 'type' => 'ephemeral' ],
        ],
    ];
}
A daily cap converts an open-ended cost exposure into a predictable one. With the caps listed above, the system can spend at most about $4.50/day ($2 descriptions + $1 search + $0.50 summaries + $1 general). That’s roughly $135/month maximum, regardless of what the batch jobs do or how many leads come in.
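Prompt caching on batch jobs is the highest-leverage cost optimization available. A 166-unit description batch without caching costs $0.40. With caching (system prompt cached after the first call), it costs under $0.05. The system prompt is ~800 tokens; reading it from cache at a 90% discount on every subsequent call in the batch produces most of the savings. Anthropic only caches prompts above a model-specific minimum length, so confirm that hits are actually happening by checking cache_read_input_tokens in the response usage.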
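The savings on the system-prompt portion can be checked with a little arithmetic. This sketch uses the figures above (166 calls, a ~800-token system prompt, cached reads at 10% of the normal input rate) and assumes a 25% surcharge on the first call's cache write and a cache that stays warm for the whole batch. It counts normal-rate token equivalents for the system prompt only; per-unit specs and output tokens are billed as usual and are left out. `client_system_prompt_billing()` is a hypothetical name.

```php
<?php
// Hypothetical helper: billable system-prompt tokens, expressed as
// normal-rate input-token equivalents, with and without prompt caching.
function client_system_prompt_billing(
    int $calls,
    int $prompt_tokens,
    float $write_multiplier = 1.25, // assumed cache-write surcharge
    float $read_multiplier = 0.10   // cached reads at 10% of the input rate
): array {
    $uncached = (float) ( $calls * $prompt_tokens );
    // The first call writes the cache; every later call reads from it.
    $cached = ( $prompt_tokens * $write_multiplier )
            + ( ( $calls - 1 ) * $prompt_tokens * $read_multiplier );
    return [
        'uncached' => $uncached,
        'cached'   => $cached,
        'savings'  => 1 - ( $cached / $uncached ),
    ];
}

$billing = client_system_prompt_billing( 166, 800 );
// uncached: 132,800 token equivalents; cached: about 14,200; savings: roughly 89%
```

The write surcharge is paid once and amortized across the batch, which is why the saving on the shared prefix approaches the full 90% as the batch grows.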
The daily cap has been in production since the wrapper shipped. It fired three times: once when a staff member clicked “Generate All Descriptions” twice in the same day (second batch deferred to the next day, correct behavior), once during a code bug that would have sent 400 identical requests (cap stopped it after $2), and once during a period when inbound leads spiked and the lead summary task hit its cap by 3pm.
The log table answered all three incidents: what ran, how much it cost, when the cap hit.