Multi-modal datasets
Create Langfuse dataset items with images, audio, video, documents, and other attachments for SDK-based multi-modal experiments.
You can now add media attachments to Langfuse dataset items and use them in SDK-based multi-modal experiments. Dataset item input, expectedOutput, and metadata can include media uploaded from the UI or via the Python and JS/TS SDKs.
Use this to build visual QA datasets, compare generated images against reference files, or run evaluations over audio, documents, and other multi-modal inputs. The SDKs can resolve dataset media references back into signed media handles so your experiment code can pass bytes, base64, or data URIs to model providers.
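As a rough illustration of the last step, the sketch below converts raw media bytes into the base64 and data URI forms that model providers commonly accept. It uses only the Python standard library. The helper names `to_base64` and `to_data_uri` are hypothetical and are not part of the Langfuse SDK, and the sample bytes stand in for whatever your SDK version returns once a dataset media reference has been resolved.

```python
import base64


def to_base64(data: bytes) -> str:
    """Encode raw media bytes as an ASCII base64 string."""
    return base64.b64encode(data).decode("ascii")


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Wrap raw media bytes in a base64 data URI for the given MIME type."""
    return f"data:{mime_type};base64,{to_base64(data)}"


# Stand-in payload (the 8-byte PNG file signature). In a real experiment these
# bytes would come from the resolved dataset media, not from a literal.
sample_bytes = b"\x89PNG\r\n\x1a\n"

print(to_base64(sample_bytes))
print(to_data_uri(sample_bytes, "image/png"))
```

Which form to pass depends on the provider: some accept raw bytes, others expect a base64 string or a full data URI in the message payload.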
Multi-modal datasets are supported for SDK-based experiments with Python SDK >= 4.10.0 and JS/TS SDK @langfuse/client >= 5.5.0. UI-based experiments do not yet support dataset items with media attachments.
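If you want to guard experiment code against an older Python SDK, a minimal sketch of a version check follows. It assumes the SDK is installed from the `langfuse` package and reads the version through the standard library's `importlib.metadata`. The helper names `parse_release` and `supports_media_datasets` are hypothetical, and pre-release suffixes such as `rc1` are ignored in the comparison.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum Python SDK version stated above for multi-modal datasets.
MIN_PYTHON_SDK = (4, 10, 0)


def parse_release(version_string: str) -> tuple:
    """Parse 'major.minor.patch' into a 3-tuple of ints, ignoring suffixes."""
    numbers = []
    for piece in version_string.split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    # Pad short versions such as '4.10' to three components.
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def supports_media_datasets(installed: str) -> bool:
    """Return True if the installed version meets the stated minimum."""
    return parse_release(installed) >= MIN_PYTHON_SDK


try:
    print(supports_media_datasets(version("langfuse")))
except PackageNotFoundError:
    print("langfuse is not installed")
```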