Cost-optimization patterns for LLM API usage, combining model routing, budget tracking, retry logic, and prompt caching.
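The four patterns can be sketched together in one small client. This is a minimal illustration, not a real provider SDK: `_call_model`, the model names, and the per-token prices are all hypothetical placeholders, and the routing heuristic (prompt length) is deliberately trivial.

```python
import hashlib
import time

# Hypothetical per-1K-token prices; real provider pricing differs.
MODEL_PRICES = {"small": 0.0005, "large": 0.01}


class CostOptimizedClient:
    """Sketch combining model routing, budget tracking,
    retry logic, and prompt caching. All names are illustrative."""

    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0
        self.cache: dict[str, str] = {}

    def _route(self, prompt: str) -> str:
        # Model routing: a trivial heuristic sends short prompts to the
        # cheap model and long ones to the more capable model.
        return "small" if len(prompt) < 200 else "large"

    def _estimate_cost(self, model: str, prompt: str) -> float:
        # Rough token estimate: ~4 characters per token.
        tokens = max(1, len(prompt) // 4)
        return tokens / 1000 * MODEL_PRICES[model]

    def _call_model(self, model: str, prompt: str) -> str:
        # Stand-in for a real provider API call.
        return f"[{model}] echo: {prompt[:30]}"

    def complete(self, prompt: str, max_retries: int = 3) -> str:
        # Prompt caching: identical prompts are served from the cache
        # and incur no additional cost.
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key in self.cache:
            return self.cache[key]

        model = self._route(prompt)
        cost = self._estimate_cost(model, prompt)

        # Budget tracking: refuse calls that would exceed the budget.
        if self.spent_usd + cost > self.budget_usd:
            raise RuntimeError("budget exhausted")

        # Retry logic: exponential backoff on transient network errors.
        for attempt in range(max_retries):
            try:
                reply = self._call_model(model, prompt)
                break
            except ConnectionError:
                if attempt == max_retries - 1:
                    raise
                time.sleep(2 ** attempt * 0.01)  # short delays for the demo

        self.spent_usd += cost
        self.cache[key] = reply
        return reply
```

In practice each piece would be swapped for the real thing: routing by a classifier or task label, prices from the provider's rate card, token counts from a tokenizer, and retries keyed to the provider's rate-limit and server-error responses.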