🥐 BioCroissant — Metadata Sharing for (Bio)Medical AI

BioCroissant is a sub‑working group of the Croissant initiative with strong ties to the MLCommons Medical working group, which develops benchmarks and best practices. BioCroissant explores and extends the Croissant standard for the biomedical domain, strengthening the technical link between the AI and biomedical communities.

Its goal is to make medical and life‑science data standards work seamlessly with the AI ecosystem—across ML frameworks, platforms, and Responsible AI tooling.

Built with real‑world clinical, research, and regulatory needs in mind, BioCroissant bridges the gap between ML engineers, healthcare professionals, and policy makers working to build trustworthy health data ecosystems (e.g. academic health data consortia or international initiatives such as the European Health Data Space, EHDS).

🚀 Why BioCroissant?

Modern biomedical AI depends on high‑quality, interoperable data—yet many datasets today remain siloed, inconsistently documented, or difficult to reuse.

BioCroissant provides:

A simple, structured metadata schema for biomedical datasets
Built‑in FAIR alignment (Findable, Accessible, Interoperable, Reusable)
Compatibility with national and cross‑border health data initiatives
Seamless integration with ML workflows through clean, machine‑readable metadata
Governance‑friendly transparency, supporting reproducibility, auditability, and clinical trust

Whether you are building models, conducting clinical research, or shaping data policy, BioCroissant helps ensure your data is ready for science, ready for AI, and ready for regulation.

Who Is This For?

ML Developers & Data Scientists

Standardized, machine‑readable metadata
Faster onboarding of new datasets
Clear documentation of modalities, provenance, consent, and known limitations
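
As a rough illustration of what machine‑readable documentation of modalities, provenance, consent, and known limitations could look like, here is a minimal sketch in Python. The core keys (`name`, `description`, `license`) follow the general shape of schema.org `Dataset` metadata, which Croissant builds on. Every `bio:`‑prefixed key, the `https://example.org/biocroissant#` namespace, the `REQUIRED` list, and the `missing_fields` helper are hypothetical placeholders chosen for this example; they are not taken from the BioCroissant schema or any published specification.

```python
import json

# Illustrative record only. The "bio:" keys and their namespace are
# HYPOTHETICAL placeholders, not part of any published BioCroissant schema.
metadata = {
    "@context": {
        "@vocab": "https://schema.org/",
        "bio": "https://example.org/biocroissant#",  # assumed namespace
    },
    "@type": "Dataset",
    "name": "example-chest-xray-cohort",
    "description": "Illustrative record only; not a real dataset.",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "bio:modalities": ["imaging/x-ray", "clinical-notes"],
    "bio:provenance": "Collected at a hypothetical hospital, 2020-2022",
    "bio:consent": "research-use-only",
    "bio:knownLimitations": ["single site", "adult patients only"],
}

# Assumed set of fields a reviewer might want filled in before reuse.
REQUIRED = (
    "name",
    "description",
    "license",
    "bio:modalities",
    "bio:provenance",
    "bio:consent",
    "bio:knownLimitations",
)


def missing_fields(record, required=REQUIRED):
    """Return the required keys that are absent or empty in the record."""
    return [key for key in required if not record.get(key)]


if __name__ == "__main__":
    # Serialize to JSON so the record can be shared or checked by other tools.
    print(json.dumps(metadata, indent=2))
    print("Missing fields:", missing_fields(metadata))
```

A check like this could run when a new dataset is onboarded, so that gaps in consent or provenance documentation surface before any training starts.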

Healthcare & Life Science Professionals

Improved dataset documentation across clinical and research settings
Reduced ambiguity when collaborating across institutions
A shared language for describing multimodal biomedical data

Policy Makers & Regulators

A practical tool aligned with FAIR principles and health data space policies
Increased trustworthiness and auditability of AI training data
A foundation for responsible, transparent biomedical AI ecosystems

🎯 Our Mission

Create a global, open, and interoperable foundation for biomedical AI—one that accelerates scientific discovery while upholding privacy, ethics, and clinical rigor.

BioCroissant is community‑driven, open‑source, and designed to evolve with the ecosystem.
You can help shape both the standard and its tooling.

Contributing

We welcome contributions of all kinds, including:

Schema extensions and refinements
Validator implementations
Dataset examples
Tooling (parsers, generators, linters)
Documentation improvements
Policy and governance feedback

Get Involved

Open an issue or discussion around your use case
Submit a pull request

Get Even More Involved

Join MLCommons: https://mlcommons.org/user-profile-settings-entrypoint/
Register for the BioCroissant group: https://groups.google.com/a/mlcommons.org/g/croissant-bio
Join community calls (announced via the BioCroissant group)

Let’s make biomedical data FAIR, transparent, and AI‑ready—for everyone.
