CVE-2026-34755 is a low-severity vulnerability with a CVSS score of 0.0. There are no known exploits at present, and patches are available.
Very low probability of exploitation
EPSS predicts the probability of exploitation in the next 30 days based on real-world threat data, complementing CVSS severity scores with actual risk assessment.
The VideoMediaIO.load_base64() method at vllm/multimodal/media/video.py:51-62 splits video/jpeg data URLs by comma to extract individual JPEG frames, but does not enforce a frame count limit. The num_frames parameter (default: 32), which is enforced by the load_bytes() code path at lines 47-48, is completely bypassed in the video/jpeg base64 path. An attacker can send a single API request containing thousands of comma-separated base64-encoded JPEG frames, causing the server to decode all frames into memory and crash with an out-of-memory (OOM) error.
```python
# video.py:51-62
def load_base64(self, media_type: str, data: str) -> tuple[npt.NDArray, dict[str, Any]]:
    if media_type.lower() == "video/jpeg":
        load_frame = partial(self.image_io.load_base64, "image/jpeg")
        return np.stack(
            [np.asarray(load_frame(frame_data)) for frame_data in data.split(",")]
            #                                                     ^^^^^^^^^^
            #                                                     Unbounded split: no frame count limit
        ), {}
    return self.load_bytes(base64.b64decode(data))
```
The load_bytes() path (lines 47-48) properly delegates to a video loader that respects self.num_frames (default 32). The load_base64("video/jpeg", ...) path bypasses this limit entirely: data.split(",") produces an unbounded list, and every frame is decoded into a numpy array.
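One way to close this gap is to cap the number of frames before any decoding happens. The following is a minimal sketch under stated assumptions, not the actual vLLM patch: split_frames_bounded and its max_frames parameter are hypothetical names, and rejecting an oversized payload (rather than subsampling it) is an assumed policy.

```python
def split_frames_bounded(data: str, max_frames: int = 32) -> list[str]:
    """Split a comma-separated frame payload, rejecting more than max_frames.

    Hypothetical helper for illustration; not part of vLLM.
    """
    if max_frames <= 0:
        raise ValueError("max_frames must be positive")
    # With maxsplit=max_frames, str.split returns at most max_frames + 1 parts,
    # so an oversized payload is detected without building an unbounded list.
    parts = data.split(",", max_frames)
    if len(parts) > max_frames:
        raise ValueError(f"video/jpeg payload has more than {max_frames} frames")
    return parts
```

Because the check runs on the still-encoded strings, an oversized request would be refused before a single JPEG is decoded into a numpy array.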
video/jpeg is a vLLM-specific MIME type, not IANA-registered. However, it is part of the public API surface:
- encode_video_url() at vllm/multimodal/utils.py:96-108 generates data:video/jpeg;base64,... URLs
- tests/entrypoints/openai/test_video.py:62 and tests/entrypoints/test_chat_utils.py:153 both use this format

Each JPEG frame decodes to a full numpy array. For 640x480 RGB images, each frame is ~921 KB decoded, so 5000 frames come to ~4.6 GB. np.stack() then creates an additional copy. The compressed JPEG payload is small (~100 KB for 5000 frames) but decompresses to gigabytes.
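The memory figures for 640x480 RGB frames can be reproduced with simple arithmetic. The sketch below assumes uint8 pixels (one byte per channel); decoded_frames_bytes is a hypothetical helper introduced only for this illustration.

```python
def decoded_frames_bytes(width: int, height: int, channels: int, num_frames: int) -> int:
    """Bytes needed to hold num_frames decoded uint8 frames (one byte per channel)."""
    return width * height * channels * num_frames


per_frame = decoded_frames_bytes(640, 480, 3, 1)   # 921_600 bytes, about 921 KB
total = decoded_frames_bytes(640, 480, 3, 5000)    # 4_608_000_000 bytes, about 4.6 GB
print(per_frame, total)
```

Since the stacked array is allocated while the per-frame arrays are still referenced by the list, peak usage during the stacking step is plausibly close to double the total figure.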
```
POST /v1/chat/completions
  → chat_utils.py:1434 video_url type → mm_parser.parse_video()
  → chat_utils.py:872 parse_video() → self._connector.fetch_video()
  → connector.py:295 fetch_video() → load_from_url(url, self.video_io)
  → connector.py:91 _load_data_url(): url_spec.path.split(",", 1)
      → media_type = "video/jpeg"
      → data = "<frame1>,<frame2>,...,<frame10000>"
  → connector.py:100 media_io.load_base64("video/jpeg", data)
      → video.py:54 data.split(",") ← UNBOUNDED
      → video.py:55-57 all frames decoded into numpy arrays
      → video.py:56 np.stack([...]) ← massive combined array → OOM
```
connector.py:91 uses split(",", 1), which splits on only the first comma. All remaining commas stay in data and are later split by video.py:54.
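The two-stage split can be seen in isolation with plain strings. This is a simplified stand-in, not the connector code itself: split_data_url_path is a hypothetical helper, and the payload values are placeholders rather than real base64 JPEG data.

```python
def split_data_url_path(path: str) -> tuple[str, str]:
    """Split a data URL path into (media_type, data) at the first comma only."""
    header, data = path.split(",", 1)     # first stage: maxsplit=1
    media_type = header.split(";", 1)[0]  # drop the ";base64" suffix
    return media_type, data


media_type, data = split_data_url_path("video/jpeg;base64,AAAA,BBBB,CCCC")
frames = data.split(",")                  # second stage: no maxsplit, unbounded
print(media_type, data, frames)
```

The first stage leaves every comma after the header inside data, so the number of elements produced by the second stage is controlled entirely by the request body.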
| Code Path | Frame Limit | File |
|-----------|-------------|------|
| load_bytes() (binary video) | Yes — num_frames (default 32) | video.py:46-49 |
| load_base64("video/jpeg", ...) | No — unlimited data.split(",") | video.py:51-62 |