Add classical code generation workflow for Java SDK #70

Draft
Copilot wants to merge 4 commits into main from copilot/add-classical-code-gen-workflow

Contributor

Copilot AI commented Apr 12, 2026

Implements the classical code-gen pipeline for the Java SDK, analogous to the dotnet/nodejs/python generators in copilot-sdk/scripts/codegen/. Consumes session-events.schema.json and api.schema.json from the @github/copilot npm package to generate typed Java source files.

Before the change?

  • All Java event and RPC types were hand-authored and manually kept in sync with upstream schema changes — no automated generation or drift detection.

After the change?

Codegen script (scripts/codegen/java.ts)

  • Reads session-events.schema.json → generates com.github.copilot.sdk.generated (76 files): SessionEvent base with full @JsonTypeInfo/@JsonSubTypes polymorphism, one typed event class per variant, UnknownSessionEvent fallback
  • Reads api.schema.json → generates com.github.copilot.sdk.generated.rpc (111 files): params/result DTOs for all 61 RPC methods
  • All generated files carry // AUTO-GENERATED FILE - DO NOT EDIT, schema-source comment, and @javax.annotation.processing.Generated("copilot-sdk-codegen")
  • Nested objects, enums, arrays, UUID, and date-time types are handled; inner classes are emitted inline

Maven (pom.xml)

  • build-helper-maven-plugin adds src/generated/java as a source root
  • Spotless excludes src/generated/java/**

Infrastructure

  • config/checkstyle/checkstyle.xml — excludes generated and rpc packages from Javadoc enforcement
  • .gitattributes: src/generated/java/** eol=lf linguist-generated=true
  • .github/workflows/codegen-check.yml — installs deps, re-runs npm run generate, fails on any diff (runs on push/pull_request paths that touch codegen or generated files)
  • .github/workflows/update-copilot-dependency.yml: workflow_dispatch with version input; updates @github/copilot in scripts/codegen (via npm install to update package.json and lock file, then npm ci for a clean reproducible install), regenerates, opens a PR automatically

Pull request checklist

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been reviewed and added / updated if needed (for bug fixes / features)
  • mvn spotless:apply has been run to format the code
  • mvn clean verify passes locally

Does this introduce a breaking change?

  • Yes
  • No

Copilot AI linked an issue Apr 12, 2026 that may be closed by this pull request
1 task
- Add scripts/codegen/java.ts - Java code generator from session-events and api schemas
- Add scripts/codegen/package.json with @github/copilot dependency
- Add scripts/codegen/package-lock.json
- Generate src/generated/java/com/github/copilot/sdk/generated/ - 76 session event files
- Generate src/generated/java/com/github/copilot/sdk/generated/rpc/ - 111 RPC type files
- Update pom.xml: add build-helper-maven-plugin, exclude generated from Spotless
- Update config/checkstyle/checkstyle.xml: exclude generated packages
- Update .gitattributes: mark src/generated/java/** as generated
- Add .github/workflows/codegen-check.yml - CI check workflow
- Add .github/workflows/update-copilot-dependency.yml - update workflow
- Update docs/WORKFLOWS.md with new workflow documentation
- Update .gitignore to exclude scripts/codegen/node_modules/

Agent-Logs-Url: https://github.com/github/copilot-sdk-java/sessions/aee4deda-40d4-4ecd-a831-2af9cb9461e7

Co-authored-by: edburns <75821+edburns@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add classical code generation to workflow" to "Add classical code generation workflow for Java SDK" Apr 12, 2026
Copilot AI requested a review from edburns April 12, 2026 20:48
- name: Update @github/copilot in scripts/codegen
  env:
    VERSION: ${{ inputs.version }}
  working-directory: ./scripts/codegen
Collaborator

@copilot add a comment explaining the necessity of the unusual step of running npm install before npm ci

Contributor Author

Added a comment in 6e3c50c explaining the two-step approach: npm install updates package.json and package-lock.json to pin the new version, then npm ci does a clean reproducible install from the updated lock file. Running npm ci alone would fail because it cannot change a pinned version, and npm install alone leaves behind stale packages.

Collaborator

@edburns edburns left a comment

@copilot

Epic #69: Improve Quality of Java Code Generator

Context

The file scripts/codegen/java.ts is a TypeScript code generator that reads JSON Schema files from the @github/copilot npm dependency and emits Java source files under src/generated/java/. Its purpose is to replace hand-written Java classes in src/main/java/com/github/copilot/sdk/events/ and src/main/java/com/github/copilot/sdk/json/ with auto-generated equivalents.

The current generator produces Java code that is significantly less idiomatic than the hand-written classes it is meant to replace. This prompt describes the required improvements.

Reference files

Before making any changes, read and understand these files:

  • scripts/codegen/java.ts — the generator to modify
  • src/main/java/com/github/copilot/sdk/events/AbstractSessionEvent.java — the hand-written sealed event hierarchy (the gold standard)
  • src/main/java/com/github/copilot/sdk/events/AssistantMessageEvent.java — example of record-based data payload inside a final event class
  • src/main/java/com/github/copilot/sdk/events/SessionIdleEvent.java — minimal event with empty record data
  • src/main/java/com/github/copilot/sdk/events/UnknownSessionEvent.java — forward-compatibility fallback
  • src/main/java/com/github/copilot/sdk/events/SessionEventParser.java — manual type-map-based deserialization
  • src/main/java/com/github/copilot/sdk/json/ToolDefinition.java — top-level record DTO
  • src/main/java/com/github/copilot/sdk/json/Attachment.java — top-level record DTO
  • src/main/java/com/github/copilot/sdk/json/SessionConfig.java — mutable config class with fluent setters
  • src/main/java/com/github/copilot/sdk/json/ModelInfo.java — mutable bean with fluent setters
  • src/main/java/com/github/copilot/sdk/json/PermissionRequestResultKind.java — extensible enum pattern (string-backed value type with static constants and @JsonCreator)
  • src/main/java/com/github/copilot/sdk/json/ElicitationResultAction.java — classic enum with string backing value

Also read .github/copilot-instructions.md for project conventions.

Requirements

1. Use sealed class for the session event hierarchy

The generated SessionEvent.java base class must be declared as:

public abstract sealed class SessionEvent permits
    AssistantMessageEvent, SessionIdleEvent, ...
    UnknownSessionEvent {

The generator must collect all variant class names and emit them in the permits clause. UnknownSessionEvent must always be the last entry.

The hand-written AbstractSessionEvent.java is the reference for this pattern.
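
The permits-clause rule can be sketched as a small generator helper. This is an illustration only: renderSealedDeclaration and FALLBACK_CLASS are hypothetical names, not functions known to exist in scripts/codegen/java.ts, and the alphabetical sort is an added assumption meant to keep the output stable for the diff-based CI check.

```typescript
// Hypothetical helper; not taken from scripts/codegen/java.ts.
const FALLBACK_CLASS = "UnknownSessionEvent";

function renderSealedDeclaration(variantClassNames: string[]): string {
  // De-duplicate and drop the fallback so it is appended exactly once, last.
  const variants = Array.from(new Set(variantClassNames))
    .filter((name) => name !== FALLBACK_CLASS)
    .sort(); // assumption: sorted output keeps regenerated files diff-stable
  const permitted = variants.concat([FALLBACK_CLASS]);
  const indent = "    ";
  return [
    "public abstract sealed class SessionEvent permits",
    permitted.map((name) => indent + name).join(",\n") + " {",
  ].join("\n");
}
```

Appending the fallback after filtering means the ordering requirement holds even if the caller already included UnknownSessionEvent in its list.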

2. Use record types for event data payloads

Every event variant's inner Data class must be generated as a Java record, not a mutable bean. For example, ToolExecutionStartEvent should contain:

@JsonIgnoreProperties(ignoreUnknown = true)
public record ToolExecutionStartData(
    @JsonProperty("toolCallId") String toolCallId,
    @JsonProperty("toolName") String toolName,
    @JsonProperty("arguments") Object arguments
) {}

Not:

public static class ToolExecutionStartData {
    private String toolCallId;
    public String getToolCallId() { return toolCallId; }
    public void setToolCallId(String toolCallId) { this.toolCallId = toolCallId; }
    // ...
}

Records are appropriate here because event data payloads are immutable value types deserialized from JSON — they are never constructed by SDK users or mutated after creation.
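
On the generator side, emitting such a record might look roughly like the sketch below. RecordComponent and renderRecord are hypothetical names introduced for illustration, and the sketch assumes each JSON property name is already a valid Java identifier (no reserved words, no hyphens) and that its Java type has been resolved elsewhere.

```typescript
// Hypothetical input shape: a JSON property name plus an already-resolved Java type.
interface RecordComponent {
  jsonName: string;
  javaType: string;
}

function renderRecord(recordName: string, components: RecordComponent[]): string {
  // One annotated component per line, matching the layout shown above.
  const lines = components.map(
    (c) => `    @JsonProperty("${c.jsonName}") ${c.javaType} ${c.jsonName}`
  );
  // An event with no payload fields still gets a (component-less) record.
  const body = lines.length > 0 ? `(\n${lines.join(",\n")}\n) {}` : "() {}";
  return [
    "@JsonIgnoreProperties(ignoreUnknown = true)",
    `public record ${recordName}${body}`,
  ].join("\n");
}
```

Called with the three ToolExecutionStartData fields, this reproduces the record shown above; called with an empty list, it yields a declaration of the form `public record X() {}`.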

3. Use record types for RPC params and result classes

The RPC ...Params and ...Result classes generated from api.schema.json should also be records. These are request/response DTOs that are deserialized from JSON and not mutated.

4. Distinguish integer from number in type mapping

The current generator maps both "integer" and "number" to Double. This is incorrect.

  • JSON Schema "integer" → Java Long (or Integer if the schema constrains the range, but default to Long)
  • JSON Schema "number" → Java Double

Using floating point for integer values is dangerous and unacceptable. Fix schemaTypeToJava() to distinguish these cases.
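
A minimal sketch of the intended scalar mapping follows. ScalarSchema and scalarToBoxedJava are hypothetical names; the real schemaTypeToJava() signature is not shown in this thread, and the rule for choosing Integer (both minimum and maximum present and within the 32-bit range) is one possible reading of "if the schema constrains the range".

```typescript
// Hypothetical, simplified view of a JSON Schema node.
interface ScalarSchema {
  type?: string;
  minimum?: number;
  maximum?: number;
}

const INT_MIN = -2147483648; // Integer.MIN_VALUE
const INT_MAX = 2147483647; // Integer.MAX_VALUE

function scalarToBoxedJava(schema: ScalarSchema): string | undefined {
  switch (schema.type) {
    case "integer": {
      // Assumption: only a fully bounded 32-bit range justifies Integer.
      const fitsInt =
        schema.minimum !== undefined &&
        schema.maximum !== undefined &&
        schema.minimum >= INT_MIN &&
        schema.maximum <= INT_MAX;
      return fitsInt ? "Integer" : "Long";
    }
    case "number":
      return "Double";
    case "boolean":
      return "Boolean";
    case "string":
      return "String";
    default:
      return undefined; // not a scalar; the caller decides what to do
  }
}
```

The distinction matters because a double has a 53-bit significand, so not every integer above 2^53 can be represented exactly, whereas Long covers the full signed 64-bit range.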

5. Use primitive types for required fields, boxed types for optional

The required parameter is already computed in schemaTypeToJava() but not used to inform the Java type. Change this:

  • Required boolean field → boolean (primitive)
  • Optional boolean field → Boolean (boxed, nullable)
  • Required integer field → long (primitive)
  • Optional integer field → Long (boxed, nullable)
  • Required number field → double (primitive)
  • Optional number field → Double (boxed, nullable)

Note: this applies to mutable bean classes only. Records always use boxed types because record components are nullable by default and there is no way to distinguish "field absent from JSON" vs "field present with default value" with primitives.
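
The primitive-versus-boxed rule, including the record exception, could be expressed as a post-processing step over an already-boxed type name. This is a sketch under assumptions: applyRequiredness and the Target type are hypothetical, and only the three type pairs listed above are unboxed.

```typescript
type Target = "bean" | "record";

// Only the pairs listed in the requirement; any other type is returned unchanged.
const UNBOXED = new Map<string, string>([
  ["Boolean", "boolean"],
  ["Long", "long"],
  ["Double", "double"],
]);

function applyRequiredness(boxedType: string, required: boolean, target: Target): string {
  // Records stay boxed regardless of the required flag.
  if (target === "record" || !required) return boxedType;
  return UNBOXED.get(boxedType) ?? boxedType;
}
```

Keeping the decision in one place makes the record exception explicit instead of relying on every call site to pass required=false.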

6. Make event subclasses final

Every generated event variant class must be declared public final class ...Event extends SessionEvent. This cooperates with the sealed hierarchy and prevents accidental subclassing.

7. Reduce fallback to Object

The current generator falls back to Object for:

  • Multi-type anyOf unions with more than one non-null branch
  • Unrecognized schema patterns
  • Untyped schemas

For each fallback site, add a console.warn() logging the schema path and context so that silent type erasure is visible during generation. Do not silently swallow schema information.

Additionally, for anyOf unions of exactly two concrete types where one is string, prefer String over Object — strings are the most common wire-level fallback.
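
One way to combine the warning and the String preference is sketched below. resolveMultiBranchUnion, Branch, and the message format are hypothetical; the sketch assumes null branches have already been stripped and that single-branch unions are resolved elsewhere. The warn callback defaults to console.warn and exists only so the behaviour can be observed in isolation.

```typescript
// Hypothetical, simplified anyOf branch: only the "type" keyword is inspected.
interface Branch {
  type?: string;
}

type Warn = (message: string) => void;

function resolveMultiBranchUnion(
  nonNullBranches: Branch[],
  schemaPath: string,
  warn: Warn = (m) => console.warn(m)
): string {
  const hasString = nonNullBranches.some((b) => b.type === "string");
  // Exactly two concrete types, one of them string: prefer String.
  if (nonNullBranches.length === 2 && hasString) {
    return "String";
  }
  // Everything else is erased to Object, but never silently.
  const kinds = nonNullBranches.map((b) => b.type ?? "untyped").join(" | ");
  warn(`[codegen] ${schemaPath}: anyOf (${kinds}) falls back to Object`);
  return "Object";
}
```

With this shape, a four-branch union still maps to Object but leaves a line in the generation log naming the schema path.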

8. Remove dead code

The function generateDataClass (around line 215) appears to be unused — it is not called from generateSessionEvents, generateRpcTypes, or main. Verify this claim:

  • Search for all call sites of generateDataClass in java.ts and any other files in scripts/codegen/.
  • If truly dead, remove it entirely.
  • If it is used, fix its return type (currently returns an object but is typed as returning string, cast via as any).

9. Improve enum generation

Current enums use @JsonValue on a getValue() method but lack @JsonCreator for deserialization. Add a static @JsonCreator factory method to each generated enum:

@JsonCreator
public static MyEnum fromValue(String value) {
    for (MyEnum v : values()) {
        if (v.value.equals(value)) return v;
    }
    throw new IllegalArgumentException("Unknown value: " + value);
}
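
A sketch of how the generator might emit a complete enum, including the factory above, is given here. renderEnum and toConstantName are hypothetical names, the constant-naming rule is an assumption, and JSON.stringify is used on the assumption that the wire values are plain strings for which a JSON string literal is also a valid Java string literal.

```typescript
function toConstantName(wireValue: string): string {
  // Assumed naming rule: split camelCase, replace other separators, upper-case.
  const name = wireValue
    .replace(/([a-z0-9])([A-Z])/g, "$1_$2")
    .replace(/[^A-Za-z0-9]+/g, "_")
    .toUpperCase();
  // Java identifiers cannot start with a digit.
  return /^[0-9]/.test(name) ? `_${name}` : name;
}

function renderEnum(enumName: string, wireValues: string[]): string {
  const constants = wireValues
    .map((v) => `    ${toConstantName(v)}(${JSON.stringify(v)})`)
    .join(",\n");
  return `public enum ${enumName} {
${constants};

    private final String value;

    ${enumName}(String value) {
        this.value = value;
    }

    @JsonValue
    public String getValue() {
        return value;
    }

    @JsonCreator
    public static ${enumName} fromValue(String value) {
        for (${enumName} v : values()) {
            if (v.value.equals(value)) return v;
        }
        throw new IllegalArgumentException("Unknown value: " + value);
    }
}`;
}
```

The constant name is derived from the wire value, while the wire value itself is preserved verbatim as the constructor argument, so @JsonValue and @JsonCreator round-trip the original string.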

10. Generate Javadoc

  • Each generated class must have a Javadoc comment. If the schema provides a description, use it. Otherwise, generate a sensible default (e.g., /** The {@code session.idle} session event. */).
  • Each record component or field with a schema description must have a Javadoc comment.
  • Add @since tags based on the current SDK version.
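
The class-level Javadoc rule might be sketched as follows. renderClassJavadoc is a hypothetical name, the since value is passed in by the caller because this thread does not say where the generator reads the SDK version from, and the escaping of a stray comment terminator is an added precaution rather than a stated requirement.

```typescript
function renderClassJavadoc(
  description: string | undefined,
  eventType: string,
  since: string
): string {
  // Default text used when the schema has no (or an empty) description.
  const fallback = `The {@code ${eventType}} session event.`;
  const text = (description ?? "").trim() || fallback;
  // Keep a stray "*/" in schema text from closing the comment early.
  const safe = text.replace(/\*\//g, "*&#47;");
  const body = safe
    .split(/\r?\n/)
    .map((line) => ` * ${line}`.replace(/\s+$/, ""));
  return ["/**", ...body, " *", ` * @since ${since}`, " */"].join("\n");
}
```

For the session.idle event with no schema description and a since value of 1.0.0, this produces a three-part block: the default sentence, a blank comment line, and the @since tag.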

11. Add @JsonInclude(NON_NULL) consistently

The hand-written records use @JsonInclude(JsonInclude.Include.NON_NULL) at the class level. The generated code should do the same on all generated types.

Constraints

  • Do not modify any hand-written files in src/main/java/. This task is only about improving the generator.
  • Do not change the output directory (src/generated/java/com/github/copilot/sdk/generated/).
  • Do not change the package name (com.github.copilot.sdk.generated).
  • After making changes, run the generator and verify the output compiles: cd scripts/codegen && npm ci && npx tsx java.ts && cd ../.. && mvn compile -pl .
  • Run mvn spotless:apply before committing.
  • The generated code must be compatible with Java 17 (the project baseline).
  • Jackson 2.x must be able to deserialize the generated records and sealed class hierarchy correctly. Test this by running mvn verify.

Verification checklist

After completing the changes, verify:

  • SessionEvent.java is sealed with a permits clause listing all event classes plus UnknownSessionEvent
  • All event ...Data inner classes are record types
  • All RPC ...Params and ...Result classes are record types
  • "integer" schema type maps to Long, "number" maps to Double
  • Required primitive fields use unboxed types in mutable classes
  • All event subclasses are public final class
  • generateDataClass is removed (or fixed if actually used)
  • Enums have @JsonCreator factory methods
  • No silent Object fallbacks — each one logs a warning during generation
  • mvn verify passes with all existing tests
  • mvn spotless:apply produces no changes after a clean generation

Contributor Author

Copilot AI commented Apr 13, 2026

@copilot

Epic #69: Improve Quality of Java Code Generator...

All eleven improvements have been implemented in commit 75e5dba:

  1. Sealed abstract hierarchy: SessionEvent.java is now abstract sealed class SessionEvent permits ...all variants..., UnknownSessionEvent; each variant overrides public String getType().
  2. Records for event data — every ...Data inner class is now a Java record; empty records (SessionIdleEventData()) are handled too.
  3. Records for RPC types — all ...Params and ...Result classes, plus their nested objects, are records.
  4. integer → Long, number → Double: the old conflation of both to Double is fixed.
  5. Primitive vs boxed: schemaTypeToJava honours the required flag for primitive/boxed choice; record component generation always passes required=false so record components are uniformly boxed.
  6. final variant classes — already generated as public final class; now confirmed sealed-compatible.
  7. anyOf String preference — for exactly-two-branch unions that include string, the generator now returns String instead of Object. Wider unions (e.g. the elicitation.completed content field with 4 branches) still fall back to Object, which is safe.
  8. console.warn() for Object fallbacks — every silent Object fallback now logs the schema path and context during generation.
  9. Dead generateDataClass removed — confirmed no call sites; function deleted.
  10. @JsonCreator on enums — every generated enum has a static fromValue(String) factory.
  11. Javadoc with @since 1.0.0 — all top-level generated types carry a Javadoc block; @JsonInclude(NON_NULL) is applied consistently.

Copilot AI requested a review from edburns April 13, 2026 23:13

Development

Successfully merging this pull request may close these issues.

[FEAT]: Add classical code-gen to workflow