June 18, 2026

Multi-modal datasets

Tobias Wochinger

Create Langfuse dataset items with images, audio, video, documents, and other attachments for SDK-based multi-modal experiments.

You can now add media attachments to Langfuse dataset items and use them in SDK-based multi-modal experiments. The input, expectedOutput, and metadata fields of a dataset item can each include media uploaded from the UI or via the Python and JS/TS SDKs.

Use this to build visual QA datasets, compare generated images against reference files, or run evaluations over audio, documents, and other multi-modal inputs. The SDKs can resolve dataset media references back into signed media handles so your experiment code can pass bytes, base64, or data URIs to model providers.
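As a rough illustration of the three forms mentioned above, the sketch below converts raw bytes into base64 and a data URI using only the Python standard library. It does not call the Langfuse SDK: `to_base64`, `to_data_uri`, and the placeholder `image_bytes` are hypothetical names for this example, and in a real experiment the bytes would come from the media handle the SDK resolves for you.

```python
import base64


def to_base64(data: bytes) -> str:
    """Encode raw bytes as an ASCII base64 string."""
    return base64.b64encode(data).decode("ascii")


def to_data_uri(data: bytes, content_type: str) -> str:
    """Wrap raw bytes in a base64 data URI with the given MIME type."""
    return f"data:{content_type};base64,{to_base64(data)}"


# Placeholder bytes standing in for media fetched from a dataset item
# (assumption: real code would obtain these from the resolved media handle).
image_bytes = b"\x89PNG\r\n\x1a\n"

encoded = to_base64(image_bytes)
uri = to_data_uri(image_bytes, "image/png")

print(uri)
```

Which form to use depends on the model provider: some accept raw bytes, others expect a base64 string plus a separate MIME type, and others take a single data URI.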

Multi-modal datasets are supported for SDK-based experiments with Python SDK >= 4.10.0 and JS/TS SDK @langfuse/client >= 5.5.0. UI-based experiments do not yet support dataset items with media attachments.
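If you want to guard experiment code against an older SDK, a version check along the following lines may help. This is a minimal sketch using only the standard library; `parse_version` and `supports_multimodal_datasets` are hypothetical helpers, not part of the Langfuse SDK, and the parser deliberately ignores pre-release suffixes.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum Python SDK version stated in this announcement.
MIN_PYTHON_SDK = (4, 10, 0)


def parse_version(text: str) -> tuple:
    """Parse 'major.minor.patch' into a tuple of three ints.

    Non-numeric suffixes (e.g. 'rc1') are dropped, so pre-releases are
    treated like the corresponding final release.
    """
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '5.0' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_multimodal_datasets(installed: str) -> bool:
    """Return True if the given version string meets the minimum."""
    return parse_version(installed) >= MIN_PYTHON_SDK


try:
    installed = version("langfuse")
    print(installed, supports_multimodal_datasets(installed))
except PackageNotFoundError:
    print("langfuse is not installed")
```

Comparing parsed tuples rather than raw strings matters here: as strings, "4.9.0" sorts after "4.10.0", which would give the wrong answer.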

Get started

Datasets

Experiments via SDK