New • Mosaico Alchemy is out

The Data Platform for Robotics and Physical AI

Mosaico is the open source infrastructure that transforms
petabytes of robotic sensor data into training-ready assets.

Overview

Unlock the full potential of your data

A purpose-built data platform for robotics. Stop building fragile workarounds and handle petabyte-scale sensor data on your own infrastructure. From ingestion to retrieval, everything your team needs in one place.

Mosaico data pipeline: ingestion, querying, and retrieval

Your data, on your own infrastructure

Mosaico runs entirely on-premises. Every layer of the platform, from ontology to ingestion to query, runs on your servers, your cloud account, or your edge hardware. No data ever leaves your infrastructure.

Querying sensor data across recordings with timestamp windows

Find any event, across every recording

Filter by physical sensor values across your entire dataset: acceleration spikes, GPS drops, joint overloads, you name it. Mosaico returns exact timestamp windows, not files to scrub through manually.
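
To make "timestamp windows" concrete, here is a minimal plain-Python sketch of the idea. It is not Mosaico's query API: the function name `find_windows`, the `(timestamp, value)` sample layout, and the sample data are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Optional, Tuple


def find_windows(
    samples: Iterable[Tuple[float, float]],
    threshold: float,
) -> Iterator[Tuple[float, float]]:
    """Yield (start, end) timestamps of consecutive runs where value > threshold."""
    start: Optional[float] = None
    last: Optional[float] = None
    for timestamp, value in samples:
        if value > threshold:
            if start is None:
                start = timestamp  # a new window opens here
            last = timestamp       # extend the current window
        elif start is not None:
            yield (start, last)    # the run ended on the previous sample
            start = None
    if start is not None:
        yield (start, last)        # close a window still open at end of data


# Hypothetical (timestamp in seconds, acceleration magnitude) samples.
samples = [(0.0, 0.1), (0.1, 9.8), (0.2, 12.5), (0.3, 0.2), (0.4, 15.0)]
windows = list(find_windows(samples, threshold=9.0))
```

Each window is a pair of timestamps you can jump to directly, which is the difference between a query result and a list of files to scrub through.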

Ingesting MCAP, DB3, HDF5, ROS bag, and binary formats into structured topics

Write code for new ideas, not format converters

ROS bags, MCAP, custom sensors: Mosaico ingests them all through a single structured interface. Schema translation is automatic. One integration, any source, for any project.

Mosaico SDK

From raw sensor data to structured insight

Mosaico's Python SDK covers the full data lifecycle without SQL, schemas, or boilerplate. Three primitives: push, find, stream.

Ingest data at speed, worry-free

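from mosaicolabs import MosaicoClient, IMU, SessionLevelErrorPolicy

# Connect to the Mosaico server (host and port as used in this example).
with MosaicoClient.connect("localhost", 6726) as client:
    # Create a sequence; with the Delete policy it is removed on failure.
    with client.sequence_create(
        sequence_name="imu_session_01",
        metadata={"source": "sensor_rig_v3"},
        on_error=SessionLevelErrorPolicy.Delete,
    ) as swriter:
        # Create a typed topic for the IMU stream.
        imu_writer = swriter.topic_create("sensors/imu", ontology_type=IMU)
        # stream_imu is a user-supplied generator, not part of the SDK.
        for msg in stream_imu("imu.csv"):
            imu_writer.push(message=msg)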
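
The snippet leaves `stream_imu` to you. As a rough sketch of what such a generator could look like, the version below reads a CSV row by row and yields plain dictionaries. The column names and the dictionary output are assumptions: this page does not show what `push` expects, so in practice you would convert each row into the `IMU` model before pushing it.

```python
import csv
from typing import Dict, Iterator


def stream_imu(path: str) -> Iterator[Dict[str, float]]:
    """Yield one sample per CSV row without loading the whole file into memory."""
    with open(path, newline="") as f:
        # DictReader reads lazily, so only one row is held at a time.
        for row in csv.DictReader(f):
            # Assumes every column is numeric, e.g. timestamp, ax, ay, az.
            yield {name: float(value) for name, value in row.items()}
```

Because the function is a generator, the ingestion loop pulls rows on demand and the file size never dictates memory use.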

Strongly-typed Ontology

Built-in models for IMU, GPS, Image, and Pressure. Extend with your own in a few lines of Python.

Chunked Streaming

Generator-based I/O from CSV, MCAP, or ROS bags. Ingest datasets that exceed RAM without loading a single full file.
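
For intuition, chunking a stream takes only a few lines of standard-library Python. This is a generic sketch of the pattern, not Mosaico's internal implementation, and the helper name `chunked` is an assumption.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Group any iterable into lists of at most `size` items, lazily."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))  # pull at most `size` items
        if not chunk:
            return  # source exhausted
        yield chunk


# Works on any generator; only one chunk is in memory at a time.
batches = list(chunked(range(7), 3))
```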

Atomic Commits

Session and topic-level error policies define what happens on failure. Delete cleanly or retain partial data for recovery.
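
The delete-or-retain choice can be pictured with a small context manager. This is a conceptual sketch on a local file, not how mosaicod stores data: `ErrorPolicy` and `session` are hypothetical stand-ins for the SDK's own policy types.

```python
import os
from contextlib import contextmanager
from enum import Enum


class ErrorPolicy(Enum):  # hypothetical stand-in, not the SDK's enum
    DELETE = "delete"
    RETAIN = "retain"


@contextmanager
def session(path: str, on_error: ErrorPolicy):
    """Open `path` for writing and apply the chosen policy if the body fails."""
    f = open(path, "w")
    try:
        yield f
    except Exception:
        f.close()
        if on_error is ErrorPolicy.DELETE:
            os.remove(path)  # delete cleanly: no partial data left behind
        raise                # RETAIN keeps the partial file for recovery
    else:
        f.close()
```

With `DELETE`, a failed write leaves nothing on disk; with `RETAIN`, whatever was written before the error stays available for inspection.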

Stack

The Data Engine for Robotics, explained

Mosaico is built around two components: the SDK, which your application uses directly for ingestion, retrieval, and querying, and mosaicod, the daemon that sits between your code and your data layer, handling storage, indexing, state, and retrieval. Data moves between the two without serialization overhead or format conversion.

Your application

Ingestion · Retrieval · Querying

Mosaico SDK

mosaicod

Your data layer

Database · Object store

Platform

One platform. Entire data lifecycle.

Mosaico handles every stage of your robotics data lifecycle, so your team can focus on building robots, not data infrastructure.

Data Ontology

Standardized representation for the most common robotics data models. Speak one language across all sensors.

Multimodal Search

Search complex scenarios using simple text queries. Find the needle in petabytes of sensor data.

Data Lineage

True root cause debugging with full traceability. Track every transformation from raw sensor to training set.

Orchestrator

Coordinate data containers and third-party tools. Automate labeling, segmentation, and inference pipelines.

Certifiability

Build certifiable data pipelines against rigorous industry standards. Enterprise-grade compliance from day one.

Data Containers

Native support for ROS, ROS 2, and MCAP formats. Middleware-agnostic, with optimal data compression.

Pricing

Start free. Scale without limits.

From open source to enterprise, we've got you covered.

Open Source

For community and internal projects

AGPL licensed · Community support


  • Core Mosaico engine
  • Zero vendor lock-in
  • Middleware-agnostic
  • Open source integrations
  • Full ROS compatibility
  • Community support (Discord)

Commercial license

For enterprise and commercial projects

Dedicated support and development available.


  • AGPL bypass
  • Custom DB and data connectors
  • Mosaico Link (online acquisition)
  • IAM for end-clients
  • Dashboard toolkit
  • Custom pipeline certification

Team

A decade of autonomous driving experience

Researchers from SISSA, Pisa, and Parma. Engineering experience at Ambarella and Magneti Marelli. At some point we all ended up debugging the same broken data pipelines, so we decided to fix them once and for all.

Francesco Di Corato

CSO & CO-FOUNDER

Ambarella · Magneti Marelli · Univ. Pisa

Federico Cabassi

CTO & CO-FOUNDER

Ambarella · Magneti Marelli · Univ. Parma

Open source is the foundation

We didn't think it was right to put a paywall on something this fundamental. Data management isn't a premium feature of robotics development; it's the foundation. It felt wrong to have teams lose months of work, or entire datasets, just because they couldn't justify a license fee at an early stage.

Ready to tame your data chaos?

Join disruptive companies and robotics leaders.
Start building with Mosaico in minutes.

Open source · Deploy in < 5 min