GDPR compliance for robotics video datasets.

Compliance is not a checkbox at EgoVista; it is the architecture. Every dataset we deliver was collected, processed, and stored under EU law from day one, and the audit trail ships with the data.

Why GDPR matters for robotics datasets.

In 2026 the regulatory pressure on AI training data is no longer theoretical. The European AI Act is in application, GDPR enforcement has matured, and compliance audits of robotics products and software systems are now a standard step in enterprise procurement. For an ML team, using a dataset that was collected or processed in a non-compliant way creates a chain of risk: dataset recall, model retraining, deployment blockage, and in the worst case, fines that scale with company revenue.

The risk is not abstract. A robotics startup that ships a manipulation policy trained on a non-compliant dataset may have to retire that policy before launch, either because the dataset cannot be re-licensed under acceptable conditions or because a contributor's withdrawal request cannot be honored in the trained model. EgoVista was designed so the dataset side never becomes that bottleneck.

Face anonymization before any external processing.

The cornerstone of the EgoVista pipeline is a simple invariant: no frame containing an identifiable third-party face ever crosses an external boundary. Anonymization runs locally, before the frame reaches any cloud service, any LLM, or any GPU inference endpoint.

[Figure: GDPR-safe annotation pipeline flow. A horizontal pipeline of five stages on EU infrastructure, from raw contributor capture to client delivery: (1) raw capture on the contributor device (personal data); (2) face anonymization with MediaPipe, on-premise, marked as the GDPR pivot point; (3) annotation on Vertex AI EU and RunPod EU; (4) quality check with schema and QA scoring; (5) delivery by signed URL from R2 EU. Raw frames stop at the local anonymization step, which runs before any external API call; the downstream stages run on EU-only infrastructure and see anonymized frames only.]
Anonymization runs before any external API call in the EgoVista pipeline.
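
One way to make that invariant enforceable in code is a guard at the export boundary. The following is a minimal sketch, not EgoVista's actual implementation: the `Frame` type, the `anonymized` flag, and `assert_safe_to_export` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Hypothetical frame record; only the local anonymization step sets the flag."""
    frame_id: str
    anonymized: bool


class BoundaryViolation(Exception):
    """Raised when a raw frame is about to leave the local environment."""


def assert_safe_to_export(frames):
    """Return the frames as a list, or raise if any of them is still raw."""
    raw = [f.frame_id for f in frames if not f.anonymized]
    if raw:
        raise BoundaryViolation(f"raw frames blocked at boundary: {raw}")
    return list(frames)
```

Calling the guard immediately before every external upload or API request means a missed anonymization step fails loudly instead of leaking a raw frame.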

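The technical implementation in plain terms: faces are anonymized on-premise using MediaPipe, blurred with a conservative Gaussian filter, and the result is verified before any frame reaches an external API.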

European-first infrastructure.

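Every component that touches personal data runs in the European Union. The choice is not branding; it is the simplest way to satisfy GDPR transfer rules and the data sovereignty expectations of enterprise clients in the EU. The full infrastructure stack: storage on Cloudflare R2 in the EU region, segmentation inference on RunPod data centres in Amsterdam and Frankfurt, action labelling on Vertex AI in europe-west4 (Netherlands), and the database on Supabase EU. The one exception is email notification through Resend, a US sub-processor that receives only the email address, covered by Standard Contractual Clauses.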

The full transfer mechanism for each sub-processor is documented in the privacy policy, so a compliance officer can audit the chain without having to ask.
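
A residency rule like this can also be checked mechanically, for example in CI. The sketch below is illustrative only: the service names come from this page, but the inventory format, the region strings, and the function name are assumptions.

```python
# Region labels treated as "inside the EU" for this sketch (illustrative strings).
EU_REGIONS = {"eu", "europe-west4", "amsterdam", "frankfurt"}

# Hypothetical sub-processor inventory; the structure is an assumption.
SUB_PROCESSORS = [
    {"name": "Cloudflare R2", "region": "eu"},
    {"name": "RunPod (Amsterdam)", "region": "amsterdam"},
    {"name": "RunPod (Frankfurt)", "region": "frankfurt"},
    {"name": "Vertex AI", "region": "europe-west4"},
    {"name": "Supabase", "region": "eu"},
    # Outside the EU, so a documented transfer mechanism is required.
    {"name": "Resend", "region": "us", "transfer_mechanism": "SCC"},
]


def non_eu_without_safeguard(processors):
    """Names of processors outside the EU that list no transfer mechanism."""
    return [
        p["name"]
        for p in processors
        if p["region"] not in EU_REGIONS and not p.get("transfer_mechanism")
    ]
```

An empty result means every entry is either in the EU or carries a named safeguard; anything else is a sub-processor to review before data flows to it.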

Legal basis for each processing operation.

GDPR requires that every processing operation has a clear legal basis. The table below summarises the basis we rely on per step, with the GDPR article reference. The full version, including retention periods and sub-processor names, sits in the privacy policy.

| Processing | Legal basis | Justification |
| --- | --- | --- |
| Contributor video capture | Explicit consent (Art. 6.1.a) | Each contributor signs a mission-specific consent form before recording. Consent is recorded, dated, and revocable. |
| Face anonymization | Legal obligation (Art. 6.1.c) and data protection by design (Art. 25) | Anonymisation is mandated to make subsequent processing safe and to honour data protection by design. |
| Hand pose, depth, segmentation | Legitimate interest (Art. 6.1.f) | The data is already anonymized at this point. The processing is necessary to produce a usable dataset, the impact on data subjects is minimal, and the balancing test documents this. |
| Action labelling on Vertex AI | Legitimate interest (Art. 6.1.f) | Processing runs on EU infrastructure (europe-west4), only on anonymized frames. The balancing test and the sub-processor agreement are documented. |
| Delivery to client | Contract (Art. 6.1.b) | The delivery is the core of the contractual relationship between EgoVista and the client. The dataset is produced and shipped for that purpose. |
| Post-delivery retention | Legitimate interest (Art. 6.1.f) | Retention windows are short, documented, and serve a defined purpose: re-packing into another format, quality re-review on dispute, contractual support. |

A DPIA (Data Protection Impact Assessment) covering the full pipeline is available on request for enterprise clients under NDA.

Contributor rights and data subject access.

Contributors keep the standard set of GDPR data subject rights. The way each right is honoured on the EgoVista side:

AI Act and high-risk system data requirements.

Under the EU AI Act, robotics products that perform safety-critical functions can fall under the high-risk system category. Such systems require structured data governance, transparency about training data, and risk management. EgoVista contributes to that governance by shipping each dataset with:

The dataset card is intended to slot into your AI Act compliance documentation without rework. We do not certify the downstream system; that remains your team's responsibility. What we do ensure is that the dataset side does not become the missing piece.
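
To give a sense of what a machine-readable dataset card could look like, here is a hedged sketch. The schema is hypothetical: the class and field names are assumptions, chosen to cover what was collected, the legal basis per processing step, and the retention window. It is not the format EgoVista actually ships.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ProcessingStep:
    name: str
    legal_basis: str
    gdpr_article: str


@dataclass
class DatasetCard:
    """Hypothetical card layout; every field name here is an assumption."""
    dataset_id: str
    collected: str
    raw_retention_days: int
    steps: list = field(default_factory=list)

    def to_json(self):
        # asdict() converts nested dataclasses, including those inside lists.
        return json.dumps(asdict(self), indent=2)


card = DatasetCard(
    dataset_id="example-001",  # placeholder identifier
    collected="egocentric manipulation video",
    raw_retention_days=90,
    steps=[
        ProcessingStep("Contributor video capture", "Explicit consent", "Art. 6.1.a"),
        ProcessingStep("Delivery to client", "Contract", "Art. 6.1.b"),
    ],
)
```

`card.to_json()` produces a document that can be attached to a compliance file or validated against a schema on the client side.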

How EgoVista handles client confidentiality.

On the client side, the same posture applies. Datasets are produced for the commissioning client and are not reused for another client, with the contractual exclusivity terms agreed at engagement. Raw footage used to build a delivered dataset is purged ninety days after delivery, unless an extended retention is part of the engagement. NDAs are available before any technical conversation, and most enterprise engagements start with a mutual NDA. The storage and processing cost during the project window is absorbed in the delivery fee, with no per-gigabyte surprise on the invoice.

GDPR and compliance frequently asked questions.

Is your dataset legal to use in EU production deployments?

Yes, under standard conditions. Every processing step in our pipeline has a documented legal basis under GDPR, and the data is collected with informed contributor consent. The dataset card we ship documents what was collected, on which legal basis, with which retention. For deployment in a high-risk AI system under the EU AI Act, your team is responsible for the broader governance (risk management, transparency, post-market monitoring), but the dataset side is built to slot into that governance without rework.

Can you provide a DPIA for our compliance review?

Yes. A Data Protection Impact Assessment covering the EgoVista capture and annotation pipeline is available on request for enterprise clients, under a mutual NDA. The DPIA describes the data flows, the legal bases per processing operation, the risks identified and the mitigations applied, including the local anonymization step and the EU-only compute path. Your DPO or compliance team can use the document as a starting point for your own DPIA.

What happens if a contributor withdraws consent after delivery?

Contributors can withdraw consent at any time. For data still in our pipeline that has not been delivered, withdrawal triggers deletion within thirty days. For data already delivered to a client and contractually owned by that client, the contractual chain explains what the client can and cannot do, and we facilitate a deletion request to the client when withdrawal applies. The contributor agreement documents both paths in plain language so no party is surprised.
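
The two withdrawal paths described above can be expressed as a small routing function. This is a sketch under assumptions: the function name, the action labels, and the returned structure are hypothetical; only the thirty-day window comes from the text.

```python
from datetime import date, timedelta

# "Within thirty days" for data that has not been delivered yet.
DELETION_WINDOW = timedelta(days=30)


def handle_withdrawal(delivered, received_on):
    """Pick the withdrawal path for one contributor's data."""
    if not delivered:
        # Still in the pipeline: delete within the window.
        return {
            "action": "delete_from_pipeline",
            "deadline": received_on + DELETION_WINDOW,
        }
    # Already delivered: the request is relayed to the client, and no
    # internal deadline is assumed in this sketch.
    return {"action": "forward_deletion_request_to_client", "deadline": None}
```

For example, `handle_withdrawal(False, date(2026, 1, 1))` yields a deletion deadline of 31 January 2026.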

Are facial features completely removed or just blurred?

Faces are blurred with a conservative Gaussian filter that prevents identification while keeping the body context intact for hand and object segmentation. The blur radius is calibrated to defeat off-the-shelf face recognition models on the anonymized frame, and we verify the result before any external API call. For projects with stronger anonymization needs, we can apply a stronger filter or a mask-and-fill technique, with documented impact on downstream annotation quality.
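
For readers who want to see the mechanics, the sketch below applies a separable Gaussian blur to a rectangular face region of a grayscale image, using only the Python standard library. It is an illustration, not the production filter: face detection is out of scope, so the box is taken as given, the `sigma` default is arbitrary, and the function names are assumptions. A real pipeline would more likely use an image library.

```python
import math


def gaussian_kernel(sigma):
    """1-D Gaussian weights covering about three sigma per side, summing to 1."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def _blur_rows(rows, kernel):
    """Blur each row horizontally, repeating the edge pixel past the borders."""
    radius = len(kernel) // 2
    out = []
    for row in rows:
        last = len(row) - 1
        out.append([
            sum(k * row[min(max(x + i - radius, 0), last)]
                for i, k in enumerate(kernel))
            for x in range(len(row))
        ])
    return out


def blur_box(image, box, sigma=2.0):
    """Return a copy of the image with box (top, left, bottom, right) blurred.

    The image is a list of rows of floats; bottom and right are exclusive.
    Only pixels inside the box are read or written.
    """
    top, left, bottom, right = box
    kernel = gaussian_kernel(sigma)
    crop = [row[left:right] for row in image[top:bottom]]
    crop = _blur_rows(crop, kernel)                            # horizontal pass
    cols = _blur_rows([list(c) for c in zip(*crop)], kernel)   # vertical pass
    crop = [list(r) for r in zip(*cols)]                       # transpose back
    result = [row[:] for row in image]
    for dy, row in enumerate(crop):
        result[top + dy][left:right] = row
    return result
```

Because the blur is confined to the crop, pixels outside the face box are left untouched, which is what keeps hands and objects usable for segmentation.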

Do you process any data outside the EU?

No. Every step that handles personal data runs in the EU: storage on Cloudflare R2 in the EU region, segmentation inference on RunPod data centres in Amsterdam and Frankfurt, action labelling on Vertex AI in europe-west4 (Netherlands), database on Supabase EU. Email notifications are sent via Resend, with the email address being the only piece of personal data exposed to a US sub-processor, covered by Standard Contractual Clauses. The full data flow is documented in the privacy policy.

Can you sign a DPA (Data Processing Agreement)?

Yes. We provide a standard DPA aligned with GDPR article 28, covering the scope of processing, the sub-processors involved, the location of processing, the security measures applied, and the procedure for handling data subject requests. The DPA is signed before any client data crosses into our pipeline. Custom amendments to the standard DPA are accepted on request, within the limits of our compliance posture.

How long is raw footage retained before deletion?

Raw footage that was used for a delivered dataset is purged ninety days after delivery, unless the engagement explicitly requires longer retention for re-export or quality re-review. Anonymized frames are retained alongside the dataset for thirty days post-delivery to allow re-packing into a different format. Logs and metadata used to reconstruct the annotation provenance are retained longer, but they do not contain identifiable content.
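
These retention windows translate directly into purge dates. A minimal sketch, assuming hypothetical names (`RETENTION_DAYS`, `purge_date`, `due_for_purge`) and an `extended_days` parameter standing in for any contractually agreed extension:

```python
from datetime import date, timedelta

# Windows stated above, in days after delivery.
RETENTION_DAYS = {"raw_footage": 90, "anonymized_frames": 30}


def purge_date(kind, delivered_on, extended_days=0):
    """Date on which data of the given kind becomes due for deletion."""
    return delivered_on + timedelta(days=RETENTION_DAYS[kind] + extended_days)


def due_for_purge(kind, delivered_on, today, extended_days=0):
    """True once the retention window for this kind of data has elapsed."""
    return today >= purge_date(kind, delivered_on, extended_days)
```

For a delivery on 1 January 2026, raw footage would be due for purge on 1 April 2026 and anonymized frames on 31 January 2026.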

Request a compliance brief.

Your compliance team can review the EgoVista pipeline before any data is exchanged. We can send a DPA template, a high-level architecture description, and a DPIA summary under NDA. For related material, see the product overview, the LeRobot format details, and the RLDS format details.
