OAuth application intelligence for defense teams. Hosted at oauthsentry.github.io.
OAuthSentry is a static, search-first inventory of OAuth Application IDs across identity platforms (Microsoft Entra to start, with Google Workspace and GitHub in beta). Every app is bucketed into one of three defender-oriented categories:
| Category | Meaning |
|---|---|
| Compliance | Legitimate first-party or vetted third-party apps. Reference data for allowlists and hunt-tuning. |
| Risky | Legitimate apps that are repeatedly seen in attacker tradecraft - mailbox sync clients, cloud-storage sync tools, first-party apps abused via AADInternals/EvilProxy. |
| Malicious | Confirmed in-the-wild malicious apps: consent phishing, AiTM lures, homoglyph impersonations, threat-actor redirect apps. |
The site is a single static page (no backend, no build step) that fetches the upstream community-curated CSV at load time and falls back to a snapshot mirrored in this repo. Every list is also exported as plain-text and CSV feeds under feeds/ for direct ingestion into a SIEM/EDR.
oauthsentry.github.io/
├── index.html single-page search UI
├── assets/
│ ├── css/style.css
│ └── js/app.js vanilla JS, no framework
├── data/
│ ├── sources.json registry: per-service curated + compliance_fill sources
│ ├── entra/
│ │ ├── curated_mthcht.csv mirrored from mthcht/awesome-lists (the opinion)
│ │ ├── compliance_fill_merill.csv mirrored from merill/microsoft-info (the catalogue)
│ │ └── oauth_apps.csv merged source-of-truth (rebuilt every run)
│ └── google/curated_mthcht.csv mirrored from mthcht/awesome-lists
├── feeds/ auto-generated outputs
│ ├── entra/{compliance,risky,malicious}.{txt,csv}
│ ├── entra/all.json
│ ├── google/{compliance,risky,malicious}.{txt,csv}
│ ├── all/{compliance,risky,malicious}.{txt,csv}
│ ├── all/index.json
│ └── summary.json
├── scripts/build_feeds.py merges curated + compliance_fill, regenerates feeds/
└── .github/workflows/update.yml pulls every source daily and rebuilds
Once published at https://oauthsentry.github.io, the following stable URLs are available:
Each service ships its own three category feeds in both .txt (one app id per line) and .csv (full schema), plus a per-service _all.json. Feed filenames include the service name so multiple feeds can be downloaded into the same directory without colliding.
# Microsoft Entra
https://oauthsentry.github.io/feeds/entra/entra_compliance.txt
https://oauthsentry.github.io/feeds/entra/entra_compliance.csv
https://oauthsentry.github.io/feeds/entra/entra_risky.txt
https://oauthsentry.github.io/feeds/entra/entra_risky.csv
https://oauthsentry.github.io/feeds/entra/entra_malicious.txt
https://oauthsentry.github.io/feeds/entra/entra_malicious.csv
https://oauthsentry.github.io/feeds/entra/entra_all.json
# Google Workspace (beta)
https://oauthsentry.github.io/feeds/google/google_compliance.txt
https://oauthsentry.github.io/feeds/google/google_compliance.csv
https://oauthsentry.github.io/feeds/google/google_risky.txt
https://oauthsentry.github.io/feeds/google/google_risky.csv
https://oauthsentry.github.io/feeds/google/google_malicious.txt
https://oauthsentry.github.io/feeds/google/google_malicious.csv
https://oauthsentry.github.io/feeds/google/google_all.json
# GitHub (beta)
https://oauthsentry.github.io/feeds/github/github_compliance.txt
https://oauthsentry.github.io/feeds/github/github_compliance.csv
https://oauthsentry.github.io/feeds/github/github_risky.txt
https://oauthsentry.github.io/feeds/github/github_risky.csv
https://oauthsentry.github.io/feeds/github/github_malicious.txt
https://oauthsentry.github.io/feeds/github/github_malicious.csv
https://oauthsentry.github.io/feeds/github/github_all.json
Note: GitHub _malicious.txt contains OAuth application names (lower-cased), not numeric IDs. GitHub's audit log emits oauth_application_name on OAuth-lifecycle events but does not include the numeric OAuth App ID, so the matchable IOC is the name.
# Cross-service (all services combined)
https://oauthsentry.github.io/feeds/all/all_compliance.txt
https://oauthsentry.github.io/feeds/all/all_compliance.csv
https://oauthsentry.github.io/feeds/all/all_risky.txt
https://oauthsentry.github.io/feeds/all/all_risky.csv
https://oauthsentry.github.io/feeds/all/all_malicious.txt
https://oauthsentry.github.io/feeds/all/all_malicious.csv
https://oauthsentry.github.io/feeds/all/all_index.json
https://oauthsentry.github.io/feeds/summary.json
.txt feeds are one App ID (or app name, for GitHub) per line with # comment headers - drop straight into a SIEM lookup or a watchlist.
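For pipelines that consume a feed in code rather than through a lookup file, the parsing rule is small enough to sketch. The snippet below assumes only what is stated above (one entry per line, `#` comment headers). The name `parse_feed` and the sample text are illustrative, and lower-casing each entry is a choice made here so that later lookups are case-insensitive.

```python
def parse_feed(text: str) -> set[str]:
    """Return the entries of a .txt feed, skipping '#' comments and blank lines."""
    entries = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Lower-casing is an assumption of this sketch, not a documented feed rule.
        entries.add(line.lower())
    return entries


# Illustrative sample only; these are not real feed contents.
sample = """# hypothetical header line
# another comment
00000000-0000-0000-0000-000000000001

Example App Name
"""

malicious = parse_feed(sample)
print("example app name" in malicious)
```

In practice the `text` argument would be the body of one of the feed URLs listed above, fetched with whatever HTTP client the pipeline already uses.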
# Pull the cross-service malicious feed directly into a Splunk lookup
curl -s https://oauthsentry.github.io/feeds/all/all_malicious.txt | grep -v '^#' > malicious_oauth_apps.csv
# Or pull just the Entra malicious list when your SIEM is Microsoft-only
curl -s https://oauthsentry.github.io/feeds/entra/entra_malicious.txt | grep -v '^#' > entra_malicious.csv

// Sentinel/Defender - flag OAuth consent grants for known-malicious app IDs
let malicious = externaldata(appid:string)[
    "https://oauthsentry.github.io/feeds/entra/entra_malicious.txt"
] with (format="txt");
AuditLogs
| where OperationName == "Consent to application"
| extend appid = tostring(parse_json(TargetResources)[0].id)
| where appid in (malicious | project appid)

Every app in the catalog is also exposed as a callable JSON endpoint, hosted on GitHub Pages with Access-Control-Allow-Origin: * and Content-Type: application/json. No authentication, no rate limit beyond GitHub's defaults, no custom infrastructure - just static JSON files generated by scripts/build_feeds.py at build time.
GET /feeds/api/v1/apps/{slug}.json single-app record (404 = not in catalog)
GET /feeds/api/v1/lookup.json { slug: record } - bulk lookup keyed by slug
GET /feeds/api/v1/lookup_by_appid.json { appid: record } - bulk lookup keyed by raw id
GET /feeds/api/v1/meta.json dataset metadata (counts, generated_at, version)
The {slug} in the single-app URL is computed from the appid with a stable, documented transform. Defenders can apply the same rule client-side and request the right URL directly:
Lower-case the appid, then replace any run of characters outside [a-z0-9._-] with a single hyphen. Strip leading and trailing hyphens.
| Service | Example raw appid | Slug |
|---|---|---|
| Entra | c5393580-f805-4401-95e8-94b7a6ef2fc2 | c5393580-f805-4401-95e8-94b7a6ef2fc2 (unchanged) |
| Google | 1084253493764-ipb2ntp4...apps.googleusercontent.com | (unchanged - already URL-safe) |
| GitHub | Heroku Dashboard | heroku-dashboard |
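The slug rule is short enough to reproduce client-side. The following is a minimal sketch of a literal reading of that rule, not the code in scripts/build_feeds.py. The names `slugify`, `app_url` and `BASE_URL` are introduced here for illustration.

```python
import re

# Path taken from the endpoint list above; the constant name is illustrative.
BASE_URL = "https://oauthsentry.github.io/feeds/api/v1/apps"


def slugify(appid: str) -> str:
    """Lower-case, replace runs outside [a-z0-9._-] with '-', strip edge hyphens."""
    slug = re.sub(r"[^a-z0-9._-]+", "-", appid.lower())
    return slug.strip("-")


def app_url(appid: str) -> str:
    """Build the single-app endpoint URL for a raw appid."""
    return f"{BASE_URL}/{slugify(appid)}.json"


print(slugify("Heroku Dashboard"))
print(app_url("c5393580-f805-4401-95e8-94b7a6ef2fc2"))
```

One caveat of the literal reading: a hyphen already present in the input is an allowed character, so an input such as "a - b" becomes "a---b" rather than "a-b". Whether the build script collapses such runs is not stated, so check a known record before relying on this for names that contain hyphens surrounded by spaces.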
Every endpoint that returns an app record uses the same schema:
{
"appid": "c5393580-f805-4401-95e8-94b7a6ef2fc2",
"appname": "Office 365 Management APIs",
"service": "entra",
"category": "compliance",
"severity": "info",
"comment": "Microsoft first-party app, used by SIEM connectors and Compliance Center.",
"references": [
"https://learn.microsoft.com/...",
"https://github.com/mthcht/awesome-lists/blob/main/Lists/OAuth/entra_oauth_apps.csv"
],
"slug": "c5393580-f805-4401-95e8-94b7a6ef2fc2"
}

# Single-app lookup - the right pattern for SOAR alert enrichment
curl -s https://oauthsentry.github.io/feeds/api/v1/apps/heroku-dashboard.json | jq .category
# Bulk lookup - fetch once, cache in memory, query many times (SIEM-side enrichment)
curl -s https://oauthsentry.github.io/feeds/api/v1/lookup_by_appid.json > oauthsentry.json
# Dataset metadata - useful for dashboards
curl -s https://oauthsentry.github.io/feeds/api/v1/meta.json | jq .by_category

# Python: enrich an alert with OAuthSentry classification
import requests

# Fetch the bulk lookup once at startup and keep it in memory
CATALOG = requests.get(
    "https://oauthsentry.github.io/feeds/api/v1/lookup_by_appid.json", timeout=30
).json()

def classify(appid: str) -> dict | None:
    """Return the catalog record, or None if appid is uncategorized (worth investigating!)."""
    return CATALOG.get(appid.lower())

# In your alert pipeline (consent_event and page_oncall are placeholders for your own code)
record = classify(consent_event["app_id"])
if record and record["category"] == "malicious":
    page_oncall(record)

The same lookup is also exposed as an interactive paste-and-classify tool at https://oauthsentry.github.io/#/triage.
OAuthSentry uses category labels different from the upstream CSV. The mapping is fixed in code:
| Upstream label | OAuthSentry label |
|---|---|
| legitimate | compliance |
| risky | risky |
| malicious | malicious |
severity is preserved from upstream (info, low, medium, high, critical) and shown alongside the category.
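For consumers who read the upstream CSV directly and want OAuthSentry's labels, the mapping can be sketched as a lookup table. This is an illustration of the table above, not the project's code. The names `CATEGORY_MAP` and `map_category` are hypothetical, and raising on an unknown label is a choice made for this sketch.

```python
# Upstream label -> OAuthSentry label, copied from the mapping table.
CATEGORY_MAP = {
    "legitimate": "compliance",
    "risky": "risky",
    "malicious": "malicious",
}


def map_category(upstream_label: str) -> str:
    """Translate an upstream category label into the OAuthSentry label."""
    key = upstream_label.strip().lower()
    if key not in CATEGORY_MAP:
        # Failing loudly is an assumption of this sketch; a real build might skip the row instead.
        raise ValueError(f"unknown upstream category: {upstream_label!r}")
    return CATEGORY_MAP[key]


print(map_category("legitimate"))
```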
Each service composes one or more upstream sources, declared in data/sources.json with a role field:
| Role | What it does |
|---|---|
| curated | The opinion list. Each row carries an explicit category, severity, comment and reference. mthcht's awesome-lists is the curated source for both Entra and Google today. |
| compliance_fill | Catalogue-style first-party app inventory. Every row that the curated source has not already classified is added to compliance, with metadata indicating its provenance. merill's microsoft-info plays this role for Entra. |
| planned | Service is on the roadmap but no upstream source is wired up yet. |
Curated wins on every conflict. If mthcht classifies an AppId as risky (e.g. Microsoft Azure CLI - it's part of the FOCI family and abused in token-theft chains), that classification holds even though merill lists the same AppId as a Microsoft first-party app. Fill rows only fill gaps.
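The precedence rule can be illustrated with a small merge function. This is a sketch under stated assumptions rather than the logic in scripts/build_feeds.py: rows are plain dicts using the field names from the record schema (appid, appname, category, severity), the `source` key is a hypothetical stand-in for the provenance metadata, and the AppIds and app names are made-up placeholders.

```python
def merge_sources(curated: list[dict], fill: list[dict]) -> dict[str, dict]:
    """Merge rows so that curated classifications always win over fill rows."""
    merged: dict[str, dict] = {}
    for row in curated:
        merged[row["appid"].lower()] = {**row, "source": "curated"}
    for row in fill:
        key = row["appid"].lower()
        if key in merged:
            continue  # already classified by the curated source; fill rows only fill gaps
        merged[key] = {
            "appid": row["appid"],
            "appname": row.get("appname", ""),
            "category": "compliance",
            "severity": "info",
            "source": "compliance_fill",  # hypothetical provenance marker
        }
    return merged


# Placeholder rows, not real AppIds.
curated_rows = [
    {"appid": "11111111-1111-1111-1111-111111111111", "appname": "Example CLI",
     "category": "risky", "severity": "medium"},
]
fill_rows = [
    {"appid": "11111111-1111-1111-1111-111111111111", "appname": "Example CLI"},
    {"appid": "22222222-2222-2222-2222-222222222222", "appname": "Example Portal"},
]

result = merge_sources(curated_rows, fill_rows)
for appid, record in result.items():
    print(appid, record["category"], record["source"])
```

The first placeholder app stays risky even though the fill source also lists it, and the second is added as compliance/info because the curated source has no opinion on it.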
Currently active:
- Microsoft Entra - curated by mthcht/awesome-lists, compliance-filled from merill/microsoft-info (~600+ Microsoft first-party AppIds).
- Google Workspace - curated by mthcht/awesome-lists.
Planned: Slack, GitHub, Salesforce, Okta. See CONTRIBUTING.md.
The update.yml workflow runs daily at 04:17 UTC:
- Iterates every source in data/sources.json whose role is curated or compliance_fill.
- Pulls each remote URL and writes the body to its local path. Logs which sources changed, which were unchanged, and which failed.
- Runs python3 scripts/build_feeds.py, which:
  - Loads each curated source as the source of truth (mthcht's combined categories like Phishing - compliance are normalized to the last token).
  - Loads each compliance_fill source and inserts only AppIds the curated source has not classified, defaulting them to compliance/info with a Microsoft first-party app (via merill/microsoft-info) comment.
  - Writes the merged result to data/<service>/oauth_apps.csv and regenerates everything under feeds/.
- Commits any diff back to main with [skip ci] to avoid loops.
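Two of the steps above, selecting sources by role and normalizing combined categories, can be sketched as follows. The layout of data/sources.json is not shown in this README, so the flat list of dicts with role, remote and local keys is an assumption, the remote URLs are placeholders, and the function names are hypothetical.

```python
ACTIVE_ROLES = {"curated", "compliance_fill"}


def active_sources(registry: list[dict]) -> list[dict]:
    """Keep only sources the workflow would pull (skips 'planned' entries)."""
    return [src for src in registry if src.get("role") in ACTIVE_ROLES]


def normalize_category(raw: str) -> str:
    """Reduce a combined label such as 'Phishing - compliance' to its last token."""
    tokens = raw.split()
    return tokens[-1].lower() if tokens else ""


# Assumed registry shape; the remote URLs are placeholders, not the real upstream locations.
registry = [
    {"role": "curated", "remote": "https://example.invalid/curated.csv",
     "local": "data/entra/curated_mthcht.csv"},
    {"role": "compliance_fill", "remote": "https://example.invalid/fill.csv",
     "local": "data/entra/compliance_fill_merill.csv"},
    {"role": "planned"},
]

for src in active_sources(registry):
    print(src["role"], "->", src["local"])
print(normalize_category("Phishing - compliance"))
```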
Run it locally:
python3 scripts/build_feeds.py

This repo is structured as a GitHub user/organization site:
- Create the GitHub organization or user oauthsentry.
- Create a public repo named exactly oauthsentry.github.io.
- Push this repo's contents to main.
- In repository Settings → Pages, set the source to main / root.
- The site is live at https://oauthsentry.github.io within a minute or two.
OAuthSentry is a thin, defender-oriented frontend over work done by:
- mthcht/awesome-lists - the primary Entra dataset
- randomaccess3/detections
- Cyera-Research-Labs/m365-malicious-app-iocs
- anak0ndah/EntraHunt
- merill/microsoft-info
- Wiz, Proofpoint/RH-ISAC, Huntress, ByteIntoCyber and many others for individual reports.
Defenders, not vendors. PRs welcome.
MIT for the code in this repo. The mirrored data preserves the upstream license/terms of each source.