
Simplify the transformation skill #7

Open
carlevison wants to merge 2 commits into main from simplify_transformation_skill


@carlevison
Contributor

Simplified cloudinary-transformations into a compact agent playbook that treats Cloudinary docs as the source of truth for exact syntax, current limitations, and costs.

Key changes:

  • Reduced the skill from large duplicated reference content to a concise SKILL.md.
  • Replaced eight bulky reference files with three focused playbooks:
    - agent-playbook.md
    - debugging-playbook.md
    - cost-and-caching.md
  • Preserved core agent guidance: requirement gathering, defaults, validation, debugging order, common footguns, and cost/caching heuristics.
  • Added guidance to present transformation parameters in SDK-style alphabetical order within components.
  • Validated the skill with quick_validate.py.
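The alphabetical-ordering guidance can be illustrated with a minimal sketch. It assumes that "alphabetical" means sorting each component's parameters by the key before the first underscore, and that components are separated by `/`; the helper names `sort_component` and `sort_chain` are hypothetical and not part of the skill or any Cloudinary SDK.

```python
def sort_component(component: str) -> str:
    """Sort the comma-separated parameters of one component by their key."""
    params = [p for p in component.split(",") if p]
    # Assumption: the key is the text before the first underscore ("w" in "w_400").
    return ",".join(sorted(params, key=lambda p: p.split("_", 1)[0]))


def sort_chain(chain: str) -> str:
    """Sort parameters inside each component; the order of components is kept."""
    return "/".join(sort_component(c) for c in chain.split("/"))


if __name__ == "__main__":
    print(sort_chain("w_400,h_300,g_auto,c_fill/f_auto/q_auto"))
```

Only the order inside a component changes; the order of components is left alone because it affects the result of the chain.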

Collaborator

@njb90 njb90 left a comment


One comment, but otherwise this is definitely an improvement.

Comment thread skills/cloudinary-transformations/references/agent-playbook.md Outdated
@strausr strausr self-requested a review June 5, 2026 18:12
Collaborator

@strausr strausr left a comment


I ran 15 prompts against the new skill (resize, face crop, bg removal, gen fill / 16:9, w_auto, named transforms, baseline strategy, video trim + mute, debugging, g_auto): first using only SKILL.md + references, then checking whether the docs-first path (llms.txt) fills the gaps.

Result: path works well (resize, fill crop, overlays, bg removal, f_auto:video, debugging). Without a doc fetch, several common cases were partial or failed: face-focused crops (no g_face), gen fill, w_auto + Client Hints setup, named/baseline URL shape, video trim, and invalid g_auto + c_scale.

I left comments where I found issues.

- Use automatic gravity for varied content.
- Use face/person/object-specific gravity only when it matches the asset and task.
- Use compass gravity or explicit offsets for predictable layouts and overlays.
- Verify `g_auto` compatibility in the docs for the chosen crop mode.
Collaborator


Agents often produce the invalid c_scale,g_auto,w_800. Can we add an explicit pitfall here?
g_auto only works with c_fill, c_lfill, c_crop, c_thumb, and c_auto, not with c_scale, c_fit, c_limit, or c_pad.
“Verify in docs” is correct, but it doesn’t stop the most common mistake.
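A pitfall like this could also be caught mechanically. The following is a rough sketch, not part of the skill: the allowed crop modes are copied from the comment above and should be confirmed against the current Cloudinary docs, and `find_g_auto_problems` is a hypothetical helper name.

```python
# Crop modes taken from the review comment above; confirm against current docs.
G_AUTO_CROP_MODES = {"c_fill", "c_lfill", "c_crop", "c_thumb", "c_auto"}


def find_g_auto_problems(chain: str) -> list[str]:
    """Return each component that uses g_auto without a listed crop mode."""
    problems = []
    for component in chain.split("/"):
        params = component.split(",")
        # g_auto may carry a suffix (for example "g_auto:..."), so match both forms.
        uses_g_auto = any(p == "g_auto" or p.startswith("g_auto:") for p in params)
        if not uses_g_auto:
            continue
        # Assumption: a component with g_auto and no crop mode at all is also flagged.
        if not any(p in G_AUTO_CROP_MODES for p in params):
            problems.append(component)
    return problems


if __name__ == "__main__":
    print(find_g_auto_problems("c_scale,g_auto,w_800/f_auto/q_auto"))
    print(find_g_auto_problems("c_fill,g_auto,h_300,w_400/f_auto/q_auto"))
```

The first call flags the `c_scale,g_auto,w_800` component, and the second returns an empty list.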

l_logo/c_scale,w_120/fl_layer_apply,g_south_east,x_20,y_20/f_auto/q_auto
co_white,l_text:Arial_48:Hello%20World/fl_layer_apply,g_south/f_auto/q_auto
e_background_removal/f_png/q_auto
c_scale,w_720/vc_auto/f_auto:video/q_auto
Collaborator


The simulation failed on several common asks without a doc fetch. Consider adding skeletons for:

Gen fill: c_pad,ar_16:9,b_gen_fill,w_/f_auto/q_auto
Responsive: c_limit,w_auto/f_auto/q_auto (+ Client Hints prerequisite)
Named: t_/f_auto/q_auto
Video trim + mute: so_,du_/ac_none/vc_auto/f_auto:video/q_auto
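As an illustration of how these skeletons might be filled in, here is a small sketch. The function names and the sample values are hypothetical, and the parameter order mirrors the skeletons as written above rather than any confirmed SDK output; exact syntax should still be checked against the Cloudinary docs.

```python
def gen_fill(width: int, aspect_ratio: str = "16:9") -> str:
    """Pad to a new aspect ratio and fill the added area generatively."""
    return f"c_pad,ar_{aspect_ratio},b_gen_fill,w_{width}/f_auto/q_auto"


def named(name: str) -> str:
    """Apply a named transformation, keeping f_auto and q_auto outside it."""
    return f"t_{name}/f_auto/q_auto"


def video_trim_muted(start_offset: float, duration: float) -> str:
    """Trim a video by start offset and duration (seconds) and drop the audio."""
    # ":g" prints 2.0 as "2" and 2.5 as "2.5".
    return f"so_{start_offset:g},du_{duration:g}/ac_none/vc_auto/f_auto:video/q_auto"


# w_auto depends on Client Hints being enabled on the page; see the docs for setup.
RESPONSIVE = "c_limit,w_auto/f_auto/q_auto"


if __name__ == "__main__":
    print(gen_fill(1200))
    print(named("thumb"))  # "thumb" is a made-up transformation name
    print(video_trim_muted(2, 10))
    print(RESPONSIVE)
```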

- Suggest `ac_none` for autoplay or silent preview use cases.
- Prefer automatic codec selection unless the user has a compatibility requirement.

Responsive images:
Collaborator


This section warns about Client Hints but doesn’t show w_auto syntax or setup. Agents invented URLs without the HTML prerequisite.
Suggest adding:

Example: c_limit,w_auto/f_auto/q_auto

- Transformation reference: exact parameter names, actions, qualifiers, values.
- Image transformations, resizing/cropping, layers, effects, background removal, generative AI transformations.
- Video manipulation/delivery, video resizing/cropping, trimming/concatenating, video layers, audio transformations.
- Responsive images and Client Hints docs for `w_auto`, `dpr_auto`, and breakpoints.
Collaborator


Name the doc page explicitly.

**Example:** `c_fill,g_auto,w_400,h_300/f_auto/q_auto`
1. Identify asset type: image, video, raw, animated image, or fetched media.
2. Clarify only blocking requirements: dimensions, crop behavior, focal point, transparency, output format, video duration/audio, and whether AI edits are acceptable.
3. Look up exact syntax in Cloudinary docs when using anything beyond stable, common patterns.
Collaborator


The workflow doesn’t say when to open the reference files. The simulation showed that agents often read only SKILL.md and miss the intent mapping in agent-playbook.md.
Suggest step 3 or 4: “For resize/crop, overlays, AI, responsive, named/baseline, or video trim → read agent-playbook.md before building the chain.”

- Whether the crop should favor faces, a known subject, center, a compass position, or Cloudinary automatic gravity.

## Gathering Requirements
For AI requests, determine:
Collaborator


The old skill proactively suggested AI for aspect-ratio changes without cropping. The new skill only lists questions, so agents may suggest c_pad,b_white instead of b_gen_fill.
Suggest: when the user needs a new aspect ratio without cropping, mention generative fill as an option (with a cost warning).

- The referenced named transformation includes a concrete output format.
- Automatic runtime parameters such as `f_auto`, `w_auto`, and `dpr_auto` remain outside the named/baseline transformation.
- Baselines may not apply to every delivery type or transformation type.

Collaborator


The simulation struggled to write baseline URLs. Consider adding a skeleton.
The old named-transformations.md had worked examples, so one minimal pattern is worth including here.


3 participants