CURATE is a facial region-aware editing pipeline that combines facial landmark detection and Stable Diffusion inpainting to enable targeted image manipulation (e.g., adding eyeliner, adjusting skin tone, or modifying lips) while preserving realism.
While AI models such as ChatGPT or Gemini offer capable image generation and manipulation, they, like most diffusion-based systems, tend to alter the entire image rather than a selected portion. Minor fluctuations are relatively tolerable in abstract images, but even the slightest fluctuation in a human face is immediately noticeable.
Targeted manipulation, modifying only the relevant pixels, therefore becomes essential.
CURATE leverages face alignment, region mask generation, and diffusion-based inpainting to precisely edit specific parts of the face.
This project demonstrates a workflow where:
- Facial landmarks are detected using `face-alignment`.
- A binary mask is generated for specific facial regions (e.g., eyes, lips, skin).
- A Stable Diffusion inpainting model edits only the masked region based on a user prompt (e.g., "add eyeliner").
The result is a photorealistic facial edit that leaves the rest of the face unchanged.
- Region-aware editing — precisely target eyes, lips, nose, or full face.
- Facial landmark detection — powered by `face-alignment`.
- Dynamic mask generation — creates convex hull-based masks with optional dilation.
- Prompt interpretation — maps natural language prompts (e.g., “add eyeliner”) to facial regions.
- Diffusion-based inpainting — high-quality edits using the `StableDiffusionInpaintPipeline`.
- GPU acceleration — runs on CUDA for faster processing.
Run the following command to install all dependencies:

```bash
pip install face-alignment diffusers transformers accelerate safetensors pillow opencv-python matplotlib
```

First, 68 facial keypoints are detected with `face-alignment`:
```python
import numpy as np
from face_alignment import FaceAlignment, LandmarksType

fa = FaceAlignment(LandmarksType.TWO_D, device='cuda')
preds = fa.get_landmarks(np.array(img_pil))[0]  # 68 (x, y) points for the first face
```

Next, a binary mask is created for a chosen region such as `eyes`, `lips`, or `face_skin`:
```python
mask = create_region_mask(img_pil.size, preds, region="face_skin", expand_pixels=5)
```

The user's prompt is then parsed to identify which region and attribute to edit:
```python
user_prompt = "add eyeliner"
intent = interpret_prompt(user_prompt)
# Output: {'action': 'add', 'attribute': 'eyeliner', 'region': 'eyes'}
```

Finally, the edit is applied with the inpainting model from Hugging Face:
```python
import torch
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

result = pipe(
    prompt=edit_prompt,
    image=img_pil,
    mask_image=mask,
    strength=0.8,
    guidance_scale=0.75,
).images[0]
```
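Because the pipeline decodes the full image, unmasked pixels can drift slightly through the VAE round trip. A common safeguard is to composite the edited output back onto the original through the mask. Below is a minimal PIL sketch; `composite_edit` is a hypothetical helper, not part of the project:

```python
from PIL import Image

def composite_edit(original, edited, mask):
    """Take edited pixels inside the mask, original pixels everywhere else."""
    edited = edited.resize(original.size)
    mask = mask.convert("L").resize(original.size)
    # Where the mask is white, use `edited`; elsewhere keep `original`
    return Image.composite(edited, original, mask)
```

Calling `composite_edit(img_pil, result, mask)` guarantees that pixels outside the mask are bit-exact copies of the input.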
| Region Keyword | Landmarks Used | Description |
|---|---|---|
| `eyes` | 36–47 | Both eyes |
| `lips` | 48–59 | Outer lips |
| `nose` | 27–35 | Nose ridge and base |
| `eyebrows` | 17–26 | Both eyebrows |
| `face_skin` | Convex hull excluding eyes, nose, lips, and eyebrows | Full face surface |
| `jawline` | 0–16 | Chin and jaw outline |
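The project's `create_region_mask` is not shown here. Below is a minimal sketch of a convex hull mask with optional dilation, using the index ranges from the table above (standard 68-point layout). It uses pure NumPy/PIL as a stand-in for the project's OpenCV implementation, and omits `face_skin`, which subtracts the feature regions from the full-face hull:

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFilter

# Landmark index ranges from the table above
REGION_INDICES = {
    "jawline":  range(0, 17),
    "eyebrows": range(17, 27),
    "nose":     range(27, 36),
    "eyes":     range(36, 48),
    "lips":     range(48, 60),
}

def create_region_mask(size, landmarks, region, expand_pixels=0):
    """White-on-black mask covering the convex hull of a region's landmarks."""
    pts = sorted({(float(x), float(y)) for x, y in landmarks[list(REGION_INDICES[region])]})

    def half_hull(seq):  # Andrew's monotone chain, one side
        out = []
        for p in seq:
            while len(out) >= 2 and (
                (out[-1][0] - out[-2][0]) * (p[1] - out[-2][1])
                - (out[-1][1] - out[-2][1]) * (p[0] - out[-2][0])
            ) <= 0:
                out.pop()
            out.append(p)
        return out[:-1]

    hull = half_hull(pts) + half_hull(pts[::-1])
    mask = Image.new("L", size, 0)
    ImageDraw.Draw(mask).polygon(hull, fill=255)
    if expand_pixels:  # dilation via a max filter (kernel size must be odd)
        mask = mask.filter(ImageFilter.MaxFilter(2 * expand_pixels + 1))
    return mask
```

The signature mirrors the call shown earlier, `create_region_mask(img_pil.size, preds, region=..., expand_pixels=...)`, so it can be dropped in for experimentation.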
| User Prompt | Interpreted Region | Example Edit |
|---|---|---|
| "add eyeliner" | eyes | Adds dark eyeliner |
| "apply lipstick" | lips | Changes lip color |
| "darken skin tone" | face_skin | Adjusts skin color |
| "highlight nose bridge" | nose | Brightens nose region |
- Python 3.8+
- CUDA-compatible GPU (for `torch.float16` acceleration)
- Internet access for model download (first run only)
This project is open-source under the MIT License.



