MSC4421: Standardize the spec on US English #4421
|
Member
For interest/reference, I created a PR bringing the spec into line with the current documentation style (i.e. en_GB), as far as the word "authori[zs]ation" goes: matrix-org/matrix-spec#2351
Member
There was a discussion in the internal Spec Core Team room about this MSC. @richvdh was initially concerned about referencing terms from other specs with slightly different spelling (the OAuth spec defines "authorization server", "authorization grant", etc.). But that concern abated after reading https://auth0.com/fr/intro-to-iam/what-is-oauth-2, which just translates the terms to French (authorization code grant -> attribution de code d'autorisation). This appears to be fine in practice. @anoadragon453 initially said that they were indifferent on British English vs. US English being used for prose, but then conceded that US English would be better from a technical standpoint, as most other internet-defining specs are written in, or default to, US English (IETF RFCs, WHATWG, W3C, Khronos, etc.). So to avoid time-wasting footguns in the future, that was likely the easiest to work with for the Matrix spec as well. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,93 @@ | ||
| # MSC4421: Standardize the spec on US English 🇺🇸 | ||
|
Member
I'm weakly against this change. From what I can see, the rationale is:
Consistency could go either way so isn't a convincing argument. Searchability is a weird one: I personally do not CTRL+F "authorisation server" then get sad that it's actually "authorization server". How we relate Matrix to other technical standards would be a compelling argument were it not for the later comparison that ISO is British English and RFCs are either. ANSI and IEEE would obviously be American English because they were founded in the States. W3C is a weird one given Berners-Lee and CERN, but it was established... in the States. Matrix was notably not established in the States, so it's not unreasonable for it not to follow American English. The spec's recommendation of British English pretty much settles it for me; if anything, we should be more strongly enforcing it. Enforcement itself isn't an argument since, after all, even if we did follow American English we would still need to strongly enforce it for consistency. All considered, there isn't enough here to overcome inertia imo.
Contributor
Author
I hadn't looked at it this way before but if you think of the choice as an extension of heritage, that actually makes for a decent argument. This thing was invented in the UK. Therefore, it uses British English. Period.
The haziness is that the spec recommends it but we're actually doing the opposite in practice. There was a longer discussion in the Matrix Spec & Docs Authoring room when the OAuth APIs were introduced which ended with an en-US momentum which, at least in my experience, has then semi-officially been applied in spec PRs.
Member
...which is why we need to enforce it. :) A slight tangent: enforcement is unfortunately bureaucratic. I would strongly oppose having MSC authors or reviewers manually check for en-GB compliance because quite frankly, it's not a good use of human time imo. I'd much rather we enforced other things which can have a material impact on the protocol. When MSCs get converted into Spec prose, that seems like a good time to get out the en-GB spell checker and have a checklist item for conformity: this is still bureaucratic but it affects the spec writer rather than everyone.
Contributor
Author
I'd be totally fine with that option, too. My goal here is to force a decision to settle the current confusion and Americanization just seemed like the most likely outcome based on the previous chats. Assuming we FCP-close this proposal and (actually) stick to British English, we should reiterate the house rules to explicitly exclude non-localizable terms and identifiers inherited from other standards. I think that would qualify as a clarification and shouldn't require an MSC itself.
Yes, agreed. I'm only concerned with the spec text here. Not proposals.
Member
I'm aligned with kegan's position here. The use of UK English reflects the language of those who wrote the spec in the first place, and I don't see enough of a reason to change that; instead we should improve the consistency - at least in the spec itself.
Member
The main argument for US English is that the spec currently has 54 instances of […] Much of the problem here is that we are naturally constrained by other specifications, notably OAuth2. We have to talk about concepts like an "authorization server", which is a defined concept in OAuth2. If we were writing in, say, German, then (I gather from native German speakers) we'd probably still call it an "authorization server" rather than ein "Autorisierungsserver" or something, so by extension we should probably do the same even if the body of the doc is en_GB. And of course the […] So we get into this whole question of where exactly we draw the line, which makes authoring and reviewing tricky, and it just goes away if we settle on en_US across the board.
Contributor
Author
Maybe the question is whether we expect this to stay true in future. If we switch to en_US but then integrate with another standard that uses en_GB, we'll be back to the same problem. I cannot say how likely that is. It looks like the most probable cause would be an RFC that uses en_GB. There seem to be fairly few such RFCs around though. From a quick search I've only found RFC1484, RFC1781 and RFC2076 – all of which use "organisation". So maybe this is not very likely to happen after all.
Member
Yeah, for better or worse the majority of specs seem to be in en_US
Member
I've now come across https://auth0.com/fr/intro-to-iam/what-is-oauth-2. If they can talk about "Serveur d'autorisation" (for authorization server) and "attribution de code d'autorisation" (for authorization code grant), then I guess there's no reason we can't spell those terms with an s. |
||
|
|
||
| The spec's house style currently recommends British over American English[^1]. This has historically | ||
| not been strongly enforced, however. For instance, there are multiple instances of both "authorize" | ||
| (🇺🇸) and "authorise" (🇬🇧) in the spec text as well as in identifiers such as `M_UNAUTHORIZED` and | ||
| `m.unauthorised`. While this inconsistency usually doesn't hinder readability, it negatively impacts | ||
| searchability and general consistency of the spec. | ||
|
|
||
| Standardizing on British English is difficult though because many other technical standards use the | ||
| American spelling. For instance, RFC6749 defines the term "authorization server"[^2] as well as the | ||
| `authorization_code` grant type[^3] as an identifier. Using the British spelling when covering these | ||
| in the Matrix spec would be confusing for terms and impossible for identifiers. | ||
|
|
||
| For comparison, the following noteworthy standards use American English: | ||
|
|
||
| - W3C[^4] | ||
| - IEEE[^5] | ||
| - ANSI[^6] | ||
|
|
||
| In contrast, the following standards enforce British English: | ||
|
|
||
| - ISO[^7] | ||
|
|
||
| Lastly, these standards allow either of the two as long as they are used consistently within a | ||
| document: | ||
|
|
||
| - RFCs[^8] | ||
|
|
||
| Given the dominant use of US English in other standards and the unsolvable problem of localizing | ||
| identifiers, this proposal seeks to standardize Matrix on US English. | ||
|
|
||
| ## Proposal | ||
|
|
||
| The spec's house rules are updated to RECOMMEND the American over the British spelling. Existing | ||
| spec text and identifiers are not updated but MAY be migrated in future. Any new spec text or | ||
| identifiers SHOULD use the American spelling. | ||
|
|
||
| ## Potential issues | ||
|
|
||
| This proposal doesn't directly resolve the current inconsistency of both spellings being used in the | ||
| spec simultaneously. It paves the way for an eventually consistent spelling without the need for | ||
| busywork, however. | ||
|
|
||
| Many of the spec contributors and especially members of the core team have a British background. | ||
| Departing from their native spelling might feel odd for some. | ||
|
|
||
| Matrix has a huge center of mass in Europe. In a time of transatlantic tension, committing to the | ||
| American spelling might feel uncomfortable to some. Language and politics should not be conflated, | ||
| however. | ||
|
Member
"should not" and yet it is conflated, you can't avoid that. There's been plenty of high profile cases in the tech world:
All of these changes had to overcome inertia in order to happen. I wouldn't dismiss the impact of politics on choice of language, especially when there isn't a compelling reason to fall into one or the other. |
||
|
|
||
| ## Alternatives | ||
|
|
||
| We could enforce the British spelling in spec text and identifiers that are not inherited from other | ||
| standards. To aid searchability, a legend of common words that differ in spelling could be included | ||
| at the bottom of each page. | ||
|
Member
What do RFCs do, as it seems like they would hit this the most due to allowing both?
Contributor
Author
I think nothing. Searchability might not be as big of a problem for them given that RFCs get their own pages and search engines appear to be smart enough to even out the spelling differences. |
||
|
|
||
| authorise -> authorize | ||
| authorisation -> authorization | ||
| ... | ||
|
|
||
| A reader searching for "authorization" would at least land on the legend and receive a cue to search | ||
| for "authorisation" instead. This feels more complicated and less practical than standardizing on US | ||
| English, however. | ||
|
|
||
| ## Security considerations | ||
|
|
||
| None. | ||
|
|
||
| ## Unstable prefix | ||
|
|
||
| None. | ||
|
|
||
| ## Dependencies | ||
|
|
||
| None. | ||
|
|
||
| [^1]: <https://github.com/matrix-org/matrix-spec/blob/bb3daafe96cce7ec7e139223429d2ea93e087c08/meta/documentation_style.rst?plain=1#L50> | ||
|
|
||
| [^2]: <https://datatracker.ietf.org/doc/html/rfc6749#section-1.1> | ||
|
|
||
| [^3]: <https://datatracker.ietf.org/doc/html/rfc6749#section-4.1.3> | ||
|
|
||
| [^4]: <https://www.w3.org/guide/manual-of-style/#Spelling> | ||
|
|
||
| [^5]: <https://sagroups.ieee.org/1588/wp-content/uploads/sites/144/2020/05/2014-ieee-sa-standards-style-manual.pdf> | ||
| "under 19.2 c" | ||
|
|
||
| [^6]: <https://www.ansi.org/american-national-standards/ans-introduction/essential-requirements> | ||
| "4.0 only mentions "English" but it is the **American** National Standards Institute" | ||
|
|
||
| [^7]: <https://www.iso.org/ISO-house-style.html#spelling> | ||
|
|
||
| [^8]: <https://www.rfc-editor.org/rfc/rfc7322.html#section-3.1> | ||
Implementation requirements: