fix: use STRUT vowel (ʌ) instead of schwa (ə) in stressed syllables #12

Open
yocontra wants to merge 1 commit into hans00:main from yocontra:fix/strut-vowel

yocontra commented Mar 4, 2026

Summary

The ipa-dict source data uses ə (schwa) for the STRUT vowel in words such as "but", "cut", "run", "come", "love", "other", "mother", and "nothing". In standard IPA for English, the STRUT vowel is /ʌ/, a distinct phoneme from schwa /ə/.

These are different vowels:

  • ə (schwa) — unstressed, reduced vowel: "the" = /ðə/, "about" = /əˈbaʊt/
  • ʌ (STRUT) — stressed, open-mid back vowel: "but" = /bʌt/, "cut" = /kʌt/, "love" = /lʌv/

The ARPABET mapping in src/consts.ts correctly maps AH → ʌ, but the ipa-dict source data uses ə where ʌ is expected, causing hundreds of common words to have incorrect transcriptions.

What this PR does

Adds a fixStrutVowel() post-processing step to scripts/build-dict.ts that runs after merging dictionaries but before trimming. It detects ə in stressed closed syllables and converts it to ʌ.

The regex logic:

  • Finds ə that is the first vowel after a stress mark (ˈ or ˌ), meaning it's in a stressed position
  • Requires ə to be followed by a consonant (closed syllable), which distinguishes STRUT from genuine schwa
  • Open syllable ə (like "the" = ðə) is correctly preserved
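
The rules above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the page does not show the actual regex, so the vowel inventory in `VOWELS`, the pattern `STRESSED_SCHWA`, and the function body are all assumptions. Only the name `fixStrutVowel` comes from the description.

```typescript
// Hypothetical sketch; the real regex in scripts/build-dict.ts may differ.

// Assumed vowel inventory for the dictionary's IPA output (an assumption,
// not taken from the repository).
const VOWELS = "aeiouæɑɒɔəɛɪʊʌɜɝɚ";

// Group 1: a stress mark (ˈ or ˌ) followed by any onset consonants, so the
// ə that comes next is the first vowel after the stress mark.
// Lookahead: the ə must be followed by a consonant, i.e. something that is
// not a vowel, a stress mark, a length mark, or whitespace.
const STRESSED_SCHWA = new RegExp(
  `([ˈˌ][^${VOWELS}ˈˌ]*)ə(?=[^${VOWELS}ˈˌː\\s])`,
  "gu",
);

function fixStrutVowel(ipa: string): string {
  // Keep the stress mark and onset, swap only the vowel.
  return ipa.replace(STRESSED_SCHWA, "$1ʌ");
}

console.log(fixStrutVowel("ˈməðɝ")); // ˈmʌðɝ
console.log(fixStrutVowel("ˈsoʊfə")); // ˈsoʊfə (unchanged)
```

Anchoring the match on the stress mark leaves transcriptions without a stress mark (such as ðə) and any ə that is not the first vowel after the mark untouched.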

Examples of fixes

| Word | Before (incorrect) | After (correct) |
| --- | --- | --- |
| but | ˈbət | ˈbʌt |
| cut | ˈkət | ˈkʌt |
| come | ˈkəm | ˈkʌm |
| love | ˈɫəv | ˈɫʌv |
| other | ˈəðɝ | ˈʌðɝ |
| mother | ˈməðɝ | ˈmʌðɝ |
| nothing | ˈnəθɪŋ | ˈnʌθɪŋ |

What is preserved (no false positives)

| Word | IPA | Reason |
| --- | --- | --- |
| the | ðə | No stress mark; not touched |
| about | əˈbaʊt | Initial ə is unstressed; not touched |
| sofa | ˈsoʊfə | Final ə is not the first vowel after stress; not touched |

The ipa-dict source uses ə (schwa) for the STRUT vowel in words like
"but", "cut", "come", "love", "other", "mother", "nothing", etc.
In standard IPA for English, STRUT is /ʌ/ — a distinct phoneme from
schwa /ə/. This adds a post-processing step to the dictionary build
that detects ə in stressed closed syllables and converts it to ʌ.
yocontra added a commit to yocontra/phonemize that referenced this pull request Mar 15, 2026
Merges the following PRs into a single combined branch:
- hans00#11: fix: prefer THOUGHT vowel (ɔ) variant in dictionary
- hans00#12: fix: use STRUT vowel (ʌ) instead of schwa (ə) in stressed syllables
- hans00#13: fix: distinguish NURSE vowel (ɜː) from unstressed ɚ, add linking ɹ
- hans00#14: feat: add vowel length marks (ː) to IPA output
- hans00#15: fix: British English pronunciation corrections