fix: British English pronunciation corrections (NURSE vowel, stress, yod, French loanwords) #15

Open
yocontra wants to merge 2 commits into hans00:main from yocontra:fix/gb-pronunciation

yocontra (Contributor) commented Mar 7, 2026

Summary

Adds a fixBritishDict() post-processing function to scripts/build-dict.ts that corrects systematic RP (Received Pronunciation) errors when building the British English dictionary. Also adds a build step that produces data/en-gb/dict.json from the same en_US IPA source with these fixes applied.

Pronunciation Fixes

| # | Category | Problem | Fix | Example |
|---|---|---|---|---|
| 1 | NURSE vowel | əː used in dictionary entries | Replace all əː → ɜː | "nurse" /nəːs/ → /nɜːs/ |
| 2 | French -age loanwords | Final affricate /dʒ/ used | Replace final ʤ → ʒ for French borrowings | "garage" /ɡæɹɑːʤ/ → /ɡæɹɑːʒ/ |
| 3 | -arily adverb stress | AmE first-syllable stress pattern | Add secondary stress ˌ on penultimate -ɛɹ- | "primarily" /pɹaɪmɛɹɪli/ → /pɹaɪˌmɛɹɪli/ |
| 4 | -ormation derivatives | -form- vowel reduced to /ə/ | Preserve /ɔː/ in -form- syllable | "information" /ɪnfəɹmeɪʃən/ → /ɪnfɔːɹmeɪʃən/ |
| 5 | Yod insertion after /s/ | AmE yod-dropping (/suː/) | Insert /j/ before /uː/ after /s/ | "suit" /suːt/ → /sjuːt/ |

Rationale

1. NURSE vowel (əː → ɜː)

əː is not the conventional symbol for any English phoneme in the transcription scheme used here. The NURSE vowel is normally transcribed as /ɜː/, the open-mid central unrounded vowel, in standard RP transcriptions. The source dictionary's use of əː appears to be an error.
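A minimal sketch of this replacement, assuming the dictionary is a plain object mapping each word to a single IPA string (the real shape in scripts/build-dict.ts may differ). The name `fixNurseVowel` is hypothetical and is not taken from the PR.

```typescript
// Assumption: one IPA string per word; not confirmed by the PR.
type Dict = Record<string, string>;

// Return a copy of the dictionary with every əː rewritten as ɜː.
function fixNurseVowel(dict: Dict): Dict {
  const out: Dict = {};
  for (const word of Object.keys(dict)) {
    out[word] = dict[word].replace(/əː/g, "ɜː");
  }
  return out;
}

const fixedNurse = fixNurseVowel({ nurse: "nəːs", cat: "kæt" });
console.log(fixedNurse.nurse);
```

Because the rule is a plain substring replacement, it needs no word list and is safe to run more than once.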

2. French -age loanwords (final ʤ → ʒ)

RP consistently uses the fricative /ʒ/ (not the affricate /dʒ/) for French borrowings ending in -age: garage, massage, barrage, camouflage, sabotage, espionage, reportage, decoupage, persiflage, badinage. This is one of the most recognizable BrE/AmE pronunciation differences.
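One way to sketch this rule is a word-gated replacement of the final affricate. The word list below is copied from the paragraph above. The function name `fixFrenchAge` and the choice to accept both the ligature ʤ and the digraph dʒ are assumptions for illustration, not the PR's code.

```typescript
// Words named in the PR description as French -age borrowings.
const FRENCH_AGE_WORDS = new Set<string>([
  "garage", "massage", "barrage", "camouflage", "sabotage",
  "espionage", "reportage", "decoupage", "persiflage", "badinage",
]);

// Replace a word-final affricate with the fricative ʒ, but only for listed words.
function fixFrenchAge(word: string, ipa: string): string {
  if (!FRENCH_AGE_WORDS.has(word)) return ipa;
  return ipa.replace(/(?:ʤ|dʒ)$/, "ʒ");
}

console.log(fixFrenchAge("garage", "ɡæɹɑːʤ"));
```

Gating on the spelling keeps native words such as "page" or "village" untouched, since they keep the affricate in RP.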

3. -arily adverb stress

BrE places secondary stress on the penultimate syllable of -arily adverbs (e.g. primarily, necessarily), whereas AmE stresses the first syllable. The fix adds a secondary stress marker ˌ before the ɛɹ sequence for the 10 most common words in this class.
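The stress insertion could look roughly like the sketch below. It assumes the marker goes before the single consonant that opens the ɛɹ syllable, as in the "primarily" example. The PR does not enumerate its ten words, so `ARILY_WORDS` holds only the two named above, and `addArilyStress` is a hypothetical name.

```typescript
// Only the two words named in the PR description; the full list is not given there.
const ARILY_WORDS = new Set<string>(["primarily", "necessarily"]);

// Insert the secondary stress mark ˌ before the onset consonant of a final -ɛɹɪli.
function addArilyStress(word: string, ipa: string): string {
  if (!ARILY_WORDS.has(word)) return ipa;
  const m = /^(.*)(.)(ɛɹɪli)$/.exec(ipa);
  if (m === null) return ipa;
  const head = m[1];
  // Leave the entry alone if the syllable already carries a stress mark.
  if (head.endsWith("ˌ") || head.endsWith("ˈ")) return ipa;
  return head + "ˌ" + m[2] + m[3];
}

console.log(addArilyStress("primarily", "pɹaɪmɛɹɪli"));
```

The guard on an existing stress mark makes the function safe to apply twice. Words whose stressed syllable begins with a consonant cluster would need a more careful syllable split than this sketch provides.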

4. -ormation derivatives (/ə/ → /ɔː/)

BrE preserves the full /ɔː/ vowel in the -form- syllable of words like information, transformation, confirmation, rather than reducing it to schwa as in casual AmE.
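As a sketch, the vowel can be restored with a replacement that only fires when fəɹ is directly followed by meɪʃən. The function name `restoreFormVowel` is hypothetical, and matching on the IPA string alone (rather than on spelling) is an assumption about how the PR selects its targets.

```typescript
// Rewrite the reduced -form- syllable (fəɹ) to fɔːɹ when -meɪʃən follows.
function restoreFormVowel(ipa: string): string {
  // The lookahead keeps the match from touching other fəɹ sequences.
  return ipa.replace(/fəɹ(?=meɪʃən)/g, "fɔːɹ");
}

console.log(restoreFormVowel("ɪnfəɹmeɪʃən"));
```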

5. Yod insertion after /s/ (/suː/ → /sjuː/)

BrE retains the palatal glide /j/ before /uː/ after /s/ in words like sue, suit, super, superb, superior. This is part of the broader BrE yod-retention pattern.
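A possible shape for this rule is shown below. The word list is limited to the five examples named above and is illustrative only. `insertYod` is a hypothetical name, and gating on spelling is an assumption, made here so that words such as "soup" are not affected.

```typescript
// Illustrative list: the five words named in the PR description.
const YOD_WORDS = new Set<string>(["sue", "suit", "super", "superb", "superior"]);

// Insert /j/ between /s/ and /uː/ for listed words only.
function insertYod(word: string, ipa: string): string {
  if (!YOD_WORDS.has(word)) return ipa;
  return ipa.replace(/suː/g, "sjuː");
}

console.log(insertYod("suit", "suːt"));
```

The output no longer contains the sequence suː, so running the rule a second time changes nothing.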

Changes

  • scripts/build-dict.ts: Added fixBritishDict(dict) function (with detailed comments) and a new build block that produces data/en-gb/dict.json
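For orientation, a post-processing pass of this kind can be structured as an ordered list of per-entry fixes applied to every word. This is only a sketch of that pattern: the `Dict` and `Fix` types, the `applyFixes` helper, and the sample entries are all hypothetical and do not reproduce the actual fixBritishDict() implementation.

```typescript
// Hypothetical shapes: neither type is taken from scripts/build-dict.ts.
type Dict = Record<string, string>;
type Fix = (word: string, ipa: string) => string;

// Apply each fix in order to every entry and return a new dictionary.
function applyFixes(dict: Dict, fixes: Fix[]): Dict {
  const out: Dict = {};
  for (const word of Object.keys(dict)) {
    let ipa = dict[word];
    for (const fix of fixes) ipa = fix(word, ipa);
    out[word] = ipa;
  }
  return out;
}

// Two rules from the table, expressed as per-entry functions.
const nurseFix: Fix = (_word, ipa) => ipa.replace(/əː/g, "ɜː");
const formationFix: Fix = (_word, ipa) => ipa.replace(/fəɹ(?=meɪʃən)/g, "fɔːɹ");

const sampleSource: Dict = { nurse: "nəːs", information: "ɪnfəɹmeɪʃən" };
const enGbSample = applyFixes(sampleSource, [nurseFix, formationFix]);

// A build step would serialize the result, for example with JSON.stringify.
console.log(JSON.stringify(enGbSample));
```

Keeping each rule as a separate function makes it straightforward to test rules in isolation and to control the order in which they run.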

yocontra added 2 commits March 3, 2026 23:17
When ipa-dict provides multiple pronunciation variants for a word
(e.g. "caught" → /ˈkɑt/, /ˈkɔt/), prefer the variant containing ɔ
(THOUGHT vowel) over ɑ (LOT vowel). This better represents standard
American English pronunciation for THOUGHT-class words like caught,
bought, law, fall, walk, want, etc.
…yod, French loanwords)

Add fixBritishDict() post-processing function to build-dict.ts that
applies RP pronunciation fixes when building the en-gb dictionary:

1. NURSE vowel: əː → ɜː (not a real English phoneme)
2. French -age loanwords: final ʤ → ʒ (garage, massage, etc.)
3. -arily adverb stress: add secondary stress on penultimate -ɛɹ-
4. -ormation derivatives: preserve /ɔː/ in -form- syllable
5. Yod insertion after /s/: suː → sjuː (suit, super, etc.)

Also adds a GB dictionary build step that derives en-gb/dict.json
from the same en_US IPA source with these fixes applied.
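
The variant preference described in the first commit message (choose the ɔ variant over the ɑ variant when several pronunciations exist) can be sketched as follows. The function name `pickVariant` and the fallback to the first variant when none contains ɔ are assumptions, not details confirmed by the commit.

```typescript
// Prefer the first variant containing ɔ (THOUGHT vowel); otherwise keep the first variant.
function pickVariant(variants: string[]): string | undefined {
  for (const v of variants) {
    if (v.indexOf("ɔ") !== -1) return v;
  }
  return variants.length > 0 ? variants[0] : undefined;
}

console.log(pickVariant(["ˈkɑt", "ˈkɔt"]));
```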
yocontra added a commit to yocontra/phonemize that referenced this pull request Mar 15, 2026
Merges the following PRs into a single combined branch:
- hans00#11: fix: prefer THOUGHT vowel (ɔ) variant in dictionary
- hans00#12: fix: use STRUT vowel (ʌ) instead of schwa (ə) in stressed syllables
- hans00#13: fix: distinguish NURSE vowel (ɜː) from unstressed ɚ, add linking ɹ
- hans00#14: feat: add vowel length marks (ː) to IPA output
- hans00#15: fix: British English pronunciation corrections