Skip to content

chore(l10n): extract strings, add docs, load locales on demand#452

Open
LeoMcA wants to merge 14 commits intomainfrom
fluent-ast
Open

chore(l10n): extract strings, add docs, load locales on demand#452
LeoMcA wants to merge 14 commits intomainfrom
fluent-ast

Conversation

@LeoMcA
Copy link
Member

@LeoMcA LeoMcA commented Jul 24, 2025

Gets things to a stage where all strings are - in theory - l10n-able. Next steps will be improving the l10n experience, and further optimising which strings we ship to the client.

The added README is a good source for some of what's going on here.

@LeoMcA LeoMcA requested review from a team and mdn-bot as code owners July 24, 2025 18:29
@github-actions
Copy link
Contributor

github-actions bot commented Jul 24, 2025

ba08f2e was deployed to: https://fred-pr452.review.mdn.allizom.net/

@caugner caugner marked this pull request as draft August 19, 2025 11:35
@caugner
Copy link
Contributor

caugner commented Aug 19, 2025

Converted to draft, as it has merge conflicts.

Copy link

@bcolsson bcolsson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that more work is happening around localization at some point in the future, but I noticed some major issues that could cause headaches down the road, so adding comments. I'd suggest making changes before asking localizers to translate these. (If the expectation is to use Pontoon, then the complicated Fluent logic wouldn't be supported well either.)

footer-copyright = Portions of this content are ©1998–{ $year } by individual mozilla.org contributors. Content available under <a data-l10n-name="cc">a Creative Commons license</a>.
search-modal-site-search = Site search for <em>{ $query }</em>
site-search-search-stats = Found { $results } documents.
site-search-suggestion-matches =
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is technically valid fluent it's best to avoid this pattern. This should be two strings - otherwise you can't guarantee that all patterns will have translations (reference).

Typically you want to remove the logic from the strings themselves, so it's preferable to have a -greaterthan and a -equals string then have the logic happening outside of fluent to choose the correct string.

Comment on lines 77 to 109
compat-support-flags =
{ NUMBER($has_added) ->
[one] From version { $version_added }
*[other] { "" }
}{ $has_last ->
[one]
{ NUMBER($has_added) ->
*[zero] Until { $versionLast } users
[one] { " " }until { $versionLast } users
}
*[zero]
{ NUMBER($has_added) ->
*[zero] Users
[one] { " " }users
}
}
{ " " }must explicitly set the <code data-l10n-name="name">{ $flag_name }</code>{ " " }
{ $flag_type ->
*[preference] preference
[runtime_flag] runtime flag
}{ NUMBER($has_value) ->
[one] { " " }to <code data-l10n-name="value">{ $flag_value }</code>
*[other] { "" }
}{ "." }
{ $flag_type ->
[preference] To change preferences in { $browser_name }, visit { $browser_pref_url }.
*[other] { "" }
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is untranslatable. Even in English I barely understand what a final string would look like, so compounding translations with different word order, you'll have a very difficult time to get something comprehensible out of this.

This should be split into multiple strings. E.g.
string-a = From version { version }
string-b = From version { version } until {version} users
ETc.

compat-experimental = Experimental
compat-nonstandard = Non-standard
compat-no = No
compat-support-full = Full support
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm finding this and multiple other strings with the exact same ID. You'll need to find a way to avoid duplication.

compat-legend-preview = In development. Supported in a pre-release version.
compat-legend-no = { compat-support-no }
compat-legend-unknown = Compatibility unknown
compat-legend-experimental = { compat-experimental }. Expect behavior to change in the future.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sort of understand why you want to use a reference here and below, but there is the potential for localization issues here. It'd be simpler for localizers to just have the word instead of a message reference.

@caugner caugner removed the request for review from a team October 22, 2025 10:20
@LeoMcA LeoMcA marked this pull request as draft January 30, 2026 16:54
@LeoMcA
Copy link
Member Author

LeoMcA commented Feb 3, 2026

@bcolsson thanks for your comments, once we've got the mechanics of our string scraping sorted out in this PR, we'll open others to resolve what you've noted, and make some of the auto-generated ids more meaningful.

@caugner I've updated this PR in line with what we discussed a little while ago in Berlin: removing the ability to add strings without defining an id. I temporarily used some of that logic in the migration script to update the current id-less instances to ones with ids.

Commits are mostly atomic - they may not all stand alone, because for some of them I simply split up my previous large commit more logically - but should be much nicer to review than before.

@LeoMcA LeoMcA marked this pull request as ready for review February 3, 2026 17:22
@LeoMcA LeoMcA requested a review from a team as a code owner February 3, 2026 17:22
Comment on lines +24 to +25
"l10n:extract": "node l10n/parser/extract.js",
"l10n:lint": "node l10n/parser/extract.js --lint"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

Suggested change
"l10n:extract": "node l10n/parser/extract.js",
"l10n:lint": "node l10n/parser/extract.js --lint"
"l10n": "node l10n/parser/extract.js",
"l10n:lint": "node l10n/parser/extract.js --lint"

Rather than extract.js maybe it could be called l10n/cli/index.js?

Comment on lines +228 to +231
const locale = req.path.split("/")[1];
if (locale && /^q[a-t][a-z]$/.test(locale)) {
req.path = req.path.replace(locale, "en-US");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what this does, can you explain in a code comment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentionally not alphabetically ordered?

* Runs on either client or server,
* and returns the client or server context respectively
* @returns {import("./types.js").SymmetricContext}
* @returns {import("./types.js").SymmetricContext | undefined}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the JSDoc comment explain when this is undefined?

* @import { TextElement } from "@fluent/syntax";
*/

class AccentTransformer extends Transformer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a JSDoc comment explaining what this transformer does?

Comment on lines 24 to 57
export async function extract() {
const manualStrings = await readFile(
fileURLToPath(import.meta.resolve("../locales/en-US.ftl")),
"utf8",
);
const fluentResource = parse(manualStrings, {});

const project = new Project({});
project.addSourceFilesAtPaths(
path.join(__dirname, "..", "..", "components", "**", "*.js"),
);

/** @type {Map<string, string>} */
const map = new Map();

for (const file of project.getSourceFiles()) {
for (const taggedTemplate of file.getDescendantsOfKind(
SyntaxKind.TaggedTemplateExpression,
)) {
const tagNode = taggedTemplate.getTag();
if (Node.isCallExpression(tagNode)) {
// e.g. this.l10n("foobar")`barfoo`
const expr = tagNode.getExpression();
if (Node.isPropertyAccessExpression(expr) && isL10nTag(expr)) {
const [arg] = tagNode.getArguments();
if (Node.isStringLiteral(arg)) {
const key = arg.getLiteralValue();
const value = getLiteralValue(taggedTemplate);
map.set(key, value);
}
}
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we split this up a bit, to make some parts of it unit-testable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing newline at EOF.

let message;

if (this.locale === "qa") {
if (this.locale === "qai") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's qa vs qai?

- `qaa`: "accented" locale: adds accents to all characters, duplicates some vowels to create longer strings, wraps string in square brackets to help detect truncation
- `qai`: "id" locale: replaces strings with their identifiers, wrapped in square brackets

The `qai` locale works all the time, the `qaa` locale must be manually generated with `node ./parser/transform.js`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make node ./parser/transform.js a script in package.json?

Comment on lines +4 to +19
article-footer-last-modified = This page was last modified on <time data-l10n-name="date">{ $date }</time> by <a data-l10n-name="contributors">MDN contributors</a>.
article-footer-source-title = Folder: { $folder } (Opens in a new tab)
baseline-asterisk = Some parts of this feature may have varying levels of support.
baseline-high-extra = This feature is well established and works across many devices and browser versions. It’s been available across browsers since { $date }.
baseline-low-extra = Since { $date }, this feature works across the latest devices and browser versions. This feature might not work in older devices or browsers.
baseline-not-extra = This feature is not Baseline because it does not work in some of the most widely-used browsers.
baseline-supported-in = Supported in { $browsers }
baseline-unsupported-in = Not widely supported in { $browsers }
baseline-supported-and-unsupported-in = Supported in { $supported }, but not widely supported in { $unsupported }
homepage-hero-title = Resources for Developers,<br> by Developers
homepage-hero-description = Documenting <a data-l10n-name="css">CSS</a>, <a data-l10n-name="html">HTML</a>, and <a data-l10n-name="js">JavaScript</a>, since 2005.
not-found-title = Page not found
not-found-description = Sorry, the page <code data-l10n-name="url">{ $url }</code> could not be found.
not-found-fallback-english = <strong data-l10n-name="strong">Good news:</strong> The page you requested exists in <em data-l10n-name="em">English</em>.
not-found-fallback-search = The page you requested doesn't exist, but you could try a site search for:
not-found-back = Go back to the home page
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be nice is if it detected multiple lines with common prefix and added a line between them, for readability.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants