IEEE FG 2026

Privacy-Compliant Human Data Synthesis in Images for GDPR

with Rendering Refined Stable Diffusion (RefSD)

1University of California, Davis     2Karlsruhe Institute of Technology
3Stanford     4Sony AI, Sony Research

*Equal Contribution. Partial work done during internship at Sony AI.

RefSD teaser figure

RefSD pseudonymizes humans in commercially reusable public datasets by fully removing the original person, inserting a pose-aligned identity-free 3D avatar, and refining that avatar with text-guided diffusion. The result preserves scene layout and posture while pushing privacy closer to full masking and keeping the image useful for downstream learning.

Abstract

Privacy regulations (e.g., GDPR, CCPA) mandate that public datasets with permissive commercial licenses (e.g., CC BY 4.0) containing humans be pseudonymized before use. However, existing anonymization methods have notable limitations: blurring and masking degrade downstream utility, GAN-based synthesis offers limited control and photorealism, and diffusion editors may retain identity traces. To overcome these limitations, we propose Rendering Refined Stable Diffusion (RefSD), a three-stage pipeline that (1) removes real humans via segmentation and inpainting, (2) reconstructs pose-aligned, identity-free avatars through SMPL-based 3D rendering, and (3) refines appearance with text-guided diffusion for photorealism. By using rendering, RefSD provides explicit control over body shape, clothing, and pose, enabling diverse yet structured avatar generation. To validate human alignment, we introduce HumanGenAI, a human-annotation suite for evaluating privacy preservation, perceptual satisfaction, and attribute-generation fidelity. Beyond HumanGenAI, we conduct re-identification and downstream task benchmarks, demonstrating that RefSD matches the re-ID performance of complete masking while achieving competitive utility relative to real images. Together, RefSD and HumanGenAI establish a scalable pipeline and benchmark for privacy-compliant human synthesis in image datasets.

Highlights

  • RefSD is a three-stage, training-free pseudonymization pipeline that sanitizes the scene, inserts a pose-aligned synthetic avatar, and refines it with diffusion for photorealism.
  • The rendering stage gives explicit control over pose, body shape, clothing, and attribute diversity while avoiding reuse of source-human textures or identity cues.
  • HumanGenAI provides a unified benchmark for privacy, pose preservation, perceptual satisfaction, prompt controllability, and fine-grained attribute fidelity.
  • RefSD reaches re-ID privacy nearly identical to full masking, outperforms prior methods in human evaluation, and improves downstream classification and detection when used as synthetic training data.

RefSD Pipeline

RefSD pipeline figure

RefSD is designed as a privacy-first full-body pseudonymization pipeline that replaces humans with pose-aligned synthetic counterparts rather than editing the original subject in place. The paper frames this as a privacy-utility bridge: the original human is completely removed, posture is preserved through 3D body estimation, and diffusion is used only after a synthetic avatar has already been inserted into the scene.

This separation between structure and appearance matters. Rendering gives RefSD explicit geometric control, while text-guided diffusion improves realism and attribute controllability without copying source identity. The resulting images stay closer to the original scene layout than pure masking, while avoiding the identity leakage risks of direct image editing.

1. Pose Estimation & Sanitization

Detect the person, recover 3D body parameters, then fully remove the original human with segmentation and inpainting so no identifiable traces remain in the background.
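To make the sanitization step concrete, here is a minimal sketch of preparing a person mask before inpainting. It is not the paper's implementation: the helper names `dilate_mask` and `blank_person` are hypothetical, the dilation step is an assumption (a common way to cover soft boundaries that a segmentation mask may miss), and the inpainting model itself is left out.

```python
from typing import List

Mask = List[List[int]]   # 1 = person pixel, 0 = background
Image = List[List[int]]  # toy single-channel image


def dilate_mask(mask: Mask, radius: int = 1) -> Mask:
    """Grow a binary mask by `radius` pixels using a square neighbourhood.

    Assumption: a small safety margin helps ensure that hair, shadows, or
    soft edges of the original person are also handed to the inpainter.
    """
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                        out[ny][nx] = 1
    return out


def blank_person(image: Image, mask: Mask, fill: int = 0) -> Image:
    """Overwrite every masked pixel so no original person pixel survives."""
    return [
        [fill if m else px for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]
```

The blanked image and the dilated mask would then be passed to whichever inpainting model fills the hole with background content.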

2. Rendering-Based Synthesis

Build a pose-aligned SMPL avatar, sample from a diverse bank of base bodies and textures, and composite that identity-free avatar back into the sanitized scene.
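The sampling and compositing in this stage can be illustrated with a small sketch. Everything here is an assumption for illustration: `sample_avatar_spec` and `composite` are hypothetical names, the "bank" is a pair of plain lists, and the images are toy single-channel grids rather than rendered SMPL output. The blend itself is standard per-pixel alpha compositing.

```python
import random


def sample_avatar_spec(bodies, textures, seed=None):
    """Pick one base body and one texture from the bank.

    A seeded generator keeps the choice reproducible; because the choice
    never looks at the source image, it cannot copy source identity cues.
    """
    rng = random.Random(seed)
    return {"body": rng.choice(bodies), "texture": rng.choice(textures)}


def composite(background, avatar, alpha):
    """Per-pixel alpha blend of a rendered avatar over a sanitized scene.

    alpha = 1.0 keeps the avatar pixel, alpha = 0.0 keeps the background.
    """
    return [
        [a * av + (1.0 - a) * bg for bg, av, a in zip(bg_row, av_row, a_row)]
        for bg_row, av_row, a_row in zip(background, avatar, alpha)
    ]
```

In a real renderer the alpha channel would come from the avatar's silhouette, so only pixels covered by the synthetic body replace the inpainted background.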

3. Diffusion-Based Refinement

Use Canny edges from the rendered avatar plus text prompts for demographics, clothing, and context to refine the synthetic person into a more photorealistic yet still privacy-compliant human.
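As a rough illustration of the text side of this stage, the sketch below assembles a prompt from the three attribute groups named above. The function name `build_prompt`, its parameters, and the prompt template are all hypothetical; the page does not specify the wording RefSD actually uses, and the edge-conditioned diffusion call is omitted.

```python
def build_prompt(demographics: str = "", clothing: str = "", context: str = "") -> str:
    """Join optional attribute phrases into one text prompt.

    Empty fields are skipped, so the same helper covers simple prompts
    (demographics only) and more complex ones (all three groups).
    """
    subject = demographics.strip() or "a person"
    parts = ["a photorealistic photo of " + subject]
    if clothing.strip():
        parts.append("wearing " + clothing.strip())
    if context.strip():
        parts.append(context.strip())
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_prompt("an elderly woman", "a red raincoat", "walking on a city street"))
```

Because the prompt describes only target attributes and the edge map comes from the rendered avatar, neither conditioning signal carries pixels from the original person.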

HumanGenAI Evaluation Framework

HumanGenAI evaluation framework figure

HumanGenAI benchmarks whether pseudonymized humans are not only private but also aligned with human expectations for realism, posture, and controllable attribute generation. The framework combines structured human evaluation with downstream task analysis so that privacy, quality, and utility are judged together rather than in isolation.

Generated Attribute Fidelity (φ)

Tests how well RefSD follows prompts and preserves attribute intent.

  • φA: Prompt complexity across simple, medium, and complex prompts.
  • φB: Facial attribute fidelity over 50 single-attribute face prompts.
  • φC: Fine-grained attribute translation for close pairs such as ethnicity, age, emotion, and skin tone.
  • φD: Full-body attribute representation across 100 subcategories including clothing and occupation.

Generic Property Assurance (ψ)

Measures whether pseudonymized images remain private, aligned, and visually acceptable.

  • ψA: Privacy assessment of how distinguishable the synthesized human is from the original.
  • ψB: Pose preservation using human side-by-side comparisons.
  • ψC: Human satisfaction for realism, quality, and preference.
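One plausible way to turn annotator judgments on criteria like these into per-method scores is a simple agreement rate. The sketch below assumes each judgment is a binary vote (1 = criterion satisfied, 0 = not); that voting format, the function name `aggregate_votes`, and the toy records in the usage lines are illustrative assumptions, not the HumanGenAI protocol or its data.

```python
from collections import defaultdict


def aggregate_votes(records):
    """Average binary votes per (method, criterion) pair.

    records: iterable of (method, criterion, vote) with vote in {0, 1}.
    Returns a dict mapping (method, criterion) to the fraction of 1-votes.
    """
    totals = defaultdict(lambda: [0, 0])  # [positive votes, total votes]
    for method, criterion, vote in records:
        entry = totals[(method, criterion)]
        entry[0] += vote
        entry[1] += 1
    return {key: positives / count for key, (positives, count) in totals.items()}


if __name__ == "__main__":
    # Made-up votes, for illustration only.
    toy = [
        ("method_a", "privacy", 1),
        ("method_a", "privacy", 0),
        ("method_b", "privacy", 1),
    ]
    print(aggregate_votes(toy))
```

Keeping the criterion in the key lets privacy, pose preservation, and satisfaction be reported side by side for each method.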

HumanGenAI Results

These plots summarize the HumanGenAI findings for facial fidelity, fine-grained attribute translation, and generic human judgments over privacy, pose, and overall satisfaction.

HumanGenAI facial attribute fidelity results
φB facial attribute fidelity. RefSD achieves the strongest facial attribute-following scores across the 50 single-attribute prompts compared with SG-GAN, TriA-GAN, and Mask-SD.
HumanGenAI fine-grained attribute translation results
φC fine-grained translation. Human evaluators find ethnicity and emotion transfers easiest, while age and nearby skin-tone or ethnicity pairs remain the hardest distinctions.
HumanGenAI generic property assurance results
ψ generic property assurance. RefSD leads on privacy, pose preservation, and user satisfaction in both the face-focused and full-body HumanGenAI studies.

Quantitative Results

RefSD pairs strong privacy with usable synthetic data. In the paper’s human evaluation, RefSD scores highest across privacy, pose preservation, and perceptual satisfaction, and in the re-identification benchmark it nearly matches the privacy of completely masking the person.

Privacy person re-identification results
Privacy via person re-identification. On Market-1501, RefSD nearly matches the privacy of full masking while staying stronger than the prior synthesis baselines.
Utility classification results on RAF-DB
Utility for classification. Pre-training with RefSD synthetic people and then fine-tuning on real RAF-DB improves emotion and gender recognition over real-only training.
Utility detection results on OpenImages
Utility for detection. RefSD synthetic pre-training improves OpenImages detection, yielding gains over both synthetic-only and real-only setups.
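The re-identification privacy check above can be approximated with a rank-1 matching sketch: pseudonymized query embeddings are matched against a gallery of original identities, and a lower hit rate suggests stronger privacy. This is a simplified illustration under assumptions: embeddings are plain lists, `rank1_accuracy` is a hypothetical helper, and protocol details of the real Market-1501 evaluation (such as camera-based gallery filtering and mAP) are left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank1_accuracy(queries, gallery):
    """Fraction of queries whose most similar gallery entry shares their ID.

    queries, gallery: lists of (identity_id, embedding) pairs.
    """
    hits = 0
    for query_id, query_vec in queries:
        best_id, _ = max(gallery, key=lambda item: cosine(query_vec, item[1]))
        hits += int(best_id == query_id)
    return hits / len(queries)


if __name__ == "__main__":
    # Toy embeddings, for illustration only.
    gallery = [("a", [1.0, 0.0]), ("b", [0.0, 1.0])]
    original_queries = [("a", [0.9, 0.1]), ("b", [0.1, 0.9])]
    pseudonymized_queries = [("a", [0.1, 0.9]), ("b", [0.2, 0.8])]
    print(rank1_accuracy(original_queries, gallery))
    print(rank1_accuracy(pseudonymized_queries, gallery))
```

Under this reading, full masking sets a reference point for how low the matching rate can go, and a pseudonymization method is judged by how close it gets to that level.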

Qualitative Results

Qualitative comparisons across Original, Mask-SD, DP2: SG-GAN, DP2: TriA-GAN, and RefSD (Ours)

Side-by-side qualitative comparisons across all five methods.

Three-column qualitative comparison across Original, DP2: TriA-GAN, and RefSD (Ours)

Additional close-up comparisons across the original image, DP2: TriA-GAN, and RefSD.

Wide row-wise qualitative comparison across Original, Mask-SD, DP2: TriA-GAN, zoomed crops, and RefSD (Ours)

Rows: Original, Mask-SD, DP2: TriA-GAN, zoomed crops, and RefSD (Ours).

RefSD preserves pose and scene layout better than Mask-SD and DP2 while producing more natural-looking people. The hardest remaining cases are subtle attribute edits, occlusions, and occasional body mismatches.

BibTeX

@inproceedings{patwari2026refsd,
  title     = {Privacy-Compliant Human Data Synthesis in Images for GDPR},
  author    = {Patwari, Kartik and Schneider, David and Sun, Xiaoxiao and Chuah, Chen-Nee and Lyu, Lingjuan and Sharma, Vivek},
  booktitle = {IEEE International Conference on Automatic Face and Gesture Recognition (FG)},
  year      = {2026}
}