Adds InceptionLLMService, an OpenAI-compatible service for Inception's Mercury-2 diffusion-based reasoning model. Supports reasoning_effort (instant/low/medium/high) and realtime mode for reduced TTFT.
144 B
144 B
- Added
InceptionLLMServicefor Inception's Mercury 2 diffusion reasoning model, with support forreasoning_effortandrealtimesettings.