All notable changes to BrailleKit will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[0.8.0] — 2026-06-03
Added
Auto-update on Windows via WinSparkle 0.9.3. BrailleKit now checks for updates daily in the background and offers signed installer updates through a standard "Software Update" dialog. A "Check for Updates…" entry is also available in the Help menu. The update appcast is hosted at dl.slohmaier.com/appcast/braillekit-win.xml with EdDSA-signed release artifacts. WinSparkle.dll is bundled with the Inno installer.
Spanish G2 (es-g2): full coverage of CBE Documento Técnico B-16-1 (V3 2025), the authoritative standard for Spanish contracted braille (Estenografía española). Discovered and bundled the previously-missing standard PDF from the official ONCE website (CC BY-NC-ND). Programmatically extracted every contraction:
- Tables 1, 2, 3 (51 single-cell abbreviations)
- Tabla 5 (15 gender/number variants)
- Tabla 6 (5 number-plural variants)
- Tabla 8 (159 two-cell abbreviations)
- Tabla 9 (48 three-cell abbreviations)
- Tabla 10 (21 Iberoamerican country names, including Costa Rica + República Dominicana via the multi-word phrase_contractions engine path)
- §2.3 (10 standard Spanish abbreviation variants: doctora, señoras, ustedes, etc.)
- Apéndice C (full alphabetical listing including ~700 derivative gender/number/verbal/-al/-mente/-ble/-dad forms)
es-g2 grew from 128 word_contractions to 749 word_contractions + 2 phrase_contractions (5.85× larger), with 733/733 verified gold entries at 100.00% match rate. Promoted beta → STABLE.
Portuguese G2 (pt-g2): full coverage of IBC Estenografia Braille 2006 Section I — the authoritative Brazilian standard (ISBN 978-85-60331-05-5, 70-page PDF bundled). Programmatically extracted all 140 entries from Sections I.1 (Sinais Simples / Duplos / Triplos / Quádruplos by total and partial syllabic representation), I.2 (Contração Apoiada / Pura / Emergência), I.3 (Suspensão), and I.4 (Convenção Relativa).
pt-g2 grew from 81 word_contractions to 149, with 140/140 verified gold entries at 100.00% match rate. Promoted beta → STABLE.
Fixed
es-g2: 9 broken/noise entries removed (foreign-language paste error 'tak', 3 demonstrative collisions that silently truncated the 't' character, 5 literal-only entries with no compression value). 16 collisions against B-16-1 corrected (alguno/alguna/algunos/algunas, ninguno-family, otra-family, aquel-family, según, qué/cuál/dónde/quién question-form @ prefix, mediante, mientras).
pt-g2: 5 legacy mismatches against IBC standard corrected (jamais ⠚⠁⠍⠁⠮ → ⠚⠍, você ⠧⠕⠉⠣ → ⠧⠉, qual ⠟⠅ → ⠟⠇, quem ⠟⠲ → ⠟⠍, tudo ⠞⠥⠙⠕ → ⠞⠕).
Benchmark runner (scons benchmark): set BRAILLEKIT_CLI_ALLOW_NONPRO=1 so the Pro-tier CLI gate no longer produces empty CLI output. The symptom was every table showing 0% raw match in the summary even though translation worked.
Braille music from MusicXML (alpha): selecting a MusicXML score now produces international braille music notation following the New International Manual of Braille Music Notation (World Blind Union, 1996). The note/rhythm/interval signs are language-independent; the score's text (title, composer, part names, lyrics, directions) is still translated through your selected language table. New MusicTranslator API and batch-musicxml CLI command. v1 covers notes, rests, octave marks, dotted notes, accidentals, key/time signatures, bar lines and chords (via intervals); precise lyric-to-note alignment for vocal scores is planned for a later release.
[0.7.0] — 2026-04-29
Added
Inline-MathML auto-dispatch from prose tables (P1, commit aa869015). When a prose table declares mappings.math_table (e.g. en-ueb-g2 → en-nemeth) and the caller has not opted out, Translator::translate segments the input on <math>…</math> fragments and routes each math fragment through MathTranslator::translate_mathml. Switch indicators (BANA UEB-with-Nemeth 2014: open ⠸⠩, close ⠸⠱) are inserted at each prose↔math boundary. Opt out per call via TranslationOptions.disable_math_dispatch. See src/core/src/text_math_segmenter.{h,cpp} and the 8 integration tests in tests/core/test_translator_math_dispatch.cpp.
phrase_contractions engine path for cross-word phrase abbreviations (W3 / P2). Per-table top-level array of {pattern, braille, priority?, reference?, group?} mapping a whitespace-separated multi-word pattern to a single braille output (e.g. French il y a → ⠽⠁ per AVH AOÉ). Pre-translation scanner in src/core/src/phrase_scanner.{h,cpp} walks the input once and returns word-boundary-aligned, non-overlapping hits; the translator splices the phrase braille verbatim and translates the surrounding text segments via the normal pipeline. Match rules: longest-token-count wins, ties by descending priority, ASCII case-insensitive, wrapping punctuation stripped. Opt out per call via TranslationOptions.disable_phrase_dispatch. 30 new unit tests. Tables that don't define the array are byte-identical with prior behaviour.
settings.lookahead_strategy (P4): per-table opt-in that controls which overlapping peek matches may suppress the current greedy candidate when lookahead_window > 0. Two values: "suffix-only" (default, preserves canonical S2 behaviour) and "all" (any context-applicable peek with strictly higher priority may suppress). Empty / unknown values fall back to default.
NFB Nemeth lesson extraction infrastructure (P3). New benchmark/golden/extract_nfb_lessons.py (anchor-and-block parser walks all 15 lesson PDFs → 556 candidate Nemeth fragments) and benchmark/golden/reconstruct_nfb_mathml.py (conservative MathML reconstructor + batched MathTranslator verifier). Drove the en-nemeth gold expansion below.
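
The inline-MathML dispatch described above can be pictured as a split on <math>…</math> fragments. The Python sketch below is illustrative only: segment_math, dispatch and the two translator callbacks are hypothetical names, not BrailleKit API, and it wraps every math fragment in the switch indicators, which simplifies the real prose↔math boundary rule.

```python
import re

# Matches one <math>...</math> fragment (lazy, so adjacent fragments stay separate).
MATH_RE = re.compile(r"<math\b.*?</math>", re.DOTALL | re.IGNORECASE)

OPEN_NEMETH = "\u2838\u2829"   # ⠸⠩ (open indicator quoted in the changelog)
CLOSE_NEMETH = "\u2838\u2831"  # ⠸⠱ (close indicator quoted in the changelog)


def segment_math(text):
    """Split text into ("prose", chunk) and ("math", chunk) pairs, in order."""
    segments = []
    pos = 0
    for match in MATH_RE.finditer(text):
        if match.start() > pos:
            segments.append(("prose", text[pos:match.start()]))
        segments.append(("math", match.group(0)))
        pos = match.end()
    if pos < len(text):
        segments.append(("prose", text[pos:]))
    return segments


def dispatch(text, translate_prose, translate_math, disable_math_dispatch=False):
    """Route each segment to its translator (hypothetical callbacks)."""
    if disable_math_dispatch:
        # Opt-out: the whole input goes through the prose path unchanged.
        return translate_prose(text)
    out = []
    for kind, chunk in segment_math(text):
        if kind == "math":
            out.append(OPEN_NEMETH + translate_math(chunk) + CLOSE_NEMETH)
        else:
            out.append(translate_prose(chunk))
    return "".join(out)
```

Passing disable_math_dispatch=True mirrors the per-call opt-out: the input is never segmented, so a table without a math mapping would behave the same way.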
Changed
en-nemeth: BANA Nemeth Code 2022 §1 numeric indicator emit (numeric_indicator: "⠼"). MathTranslator now emits the indicator at the start of an expression and after spaces (the only positions BANA requires), via a new cleanup_pass rule that walks the post-emit braille and inserts ⠼ before any Nemeth lower-cell digit cell at byte position 0 or after a braille space ⠀. Inside operators / fractions / sub-/super-scripts there is no insertion. Output changed for every digit-leading expression: e.g. <math><mn>42</mn></math> now emits ⠼⠲⠆ (was ⠲⠆), <math><mn>2</mn><mo>=</mo><mn>22</mn></math> now emits ⠼⠆⠀⠨⠅⠀⠼⠆⠆ (lesson-correct, was ⠆⠀⠨⠅⠀⠆⠆). Other math tables are unaffected: Rule 5 only fires when the loaded table declares a non-empty numeric_indicator. Updated 28 fixtures + 28 verified gold entries + 5 unit-test expectations.
en-nemeth gold corpus: 69 → 84 verified entries at 100% match. 15 newly-merged lesson-derived Nemeth pairs from reconstruct_nfb_mathml.py. The remaining 538 NFB candidates await a richer reconstructor (currency, comma-lists, multi-operand), tracked in docs/NFB_LESSON_RECON.md.
Status counts: 25 stable / 45 beta / 23 alpha → 36 stable / 42 beta / 15 alpha. Promotions in this cycle (largely 2026-04-28): af-g2, ca-g1, eo-g1, fi-g1, fil-g1, fr-g1, ga-g1, ga-g2, hi-g1, hu-g1, id-g1, it-g1, mn-g1, nl-g1, sk-g1, sl-g1, ta-g1, te-g1, en-nemeth.
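
The numeric-indicator cleanup rule for en-nemeth can be sketched as a single pass over the emitted braille. This is a minimal Python illustration, not the shipped cleanup_pass: insert_numeric_indicator is a hypothetical name, it works on code points rather than bytes, and it ignores the operator / fraction / script contexts that the real rule treats specially.

```python
# Nemeth lower-cell digits 1..9 and 0, written as code points to avoid look-alike cells.
NEMETH_DIGITS = set("\u2802\u2806\u2812\u2832\u2822\u2816\u2836\u2826\u2814\u2834")
BLANK = "\u2800"  # braille space


def insert_numeric_indicator(braille, indicator="\u283c"):
    """Insert the indicator before a digit cell at position 0 or after a blank cell."""
    if not indicator:
        # Tables with an empty numeric_indicator keep their output unchanged.
        return braille
    out = []
    prev = None
    for cell in braille:
        if cell in NEMETH_DIGITS and (prev is None or prev == BLANK):
            out.append(indicator)
        out.append(cell)
        prev = cell
    return "".join(out)
```

Applied to the two examples in the entry above, the sketch turns ⠲⠆ into ⠼⠲⠆ and prefixes both digit runs in the 2 = 22 expression, while an empty indicator leaves the input as it was.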
Fixed
benchmark/golden/_pdf_extractor.BRF_TO_UNICODE BANA chart drift (commit 8d17567f). Three entries disagreed with the NLS spec: 0 was mapped to ⠼ (= the # cell) instead of ⠴; + was mapped to ⠴ (what 0 should have been) instead of ⠬; [ was mapped to ⠨ (= the . cell) instead of ⠪. The fix leaves output byte-identical for any extractor whose gold doesn't exercise the affected characters; verified by the full test suite + full gold benchmark.
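
For illustration, the corrected entries can be written as a small Python mapping. Only the five characters named in this entry are shown; the dictionary name and the helper are hypothetical and do not reproduce the real BRF_TO_UNICODE table.

```python
# Corrected cells for the characters discussed above (partial, illustrative table).
BRF_FIXED = {
    "0": "\u2834",  # ⠴ (was wrongly ⠼)
    "+": "\u282c",  # ⠬ (was wrongly ⠴)
    "[": "\u282a",  # ⠪ (was wrongly ⠨)
    "#": "\u283c",  # ⠼
    ".": "\u2828",  # ⠨
}


def brf_to_unicode(brf):
    """Map BRF characters to Unicode braille; unknown characters pass through."""
    return "".join(BRF_FIXED.get(ch, ch) for ch in brf)


def has_collisions(table):
    """True if two different BRF characters share one braille cell."""
    return len(set(table.values())) != len(table)
```

A collision check of this kind would have flagged the drift, since 0 and # (and [ and .) pointed at the same cell before the fix.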
Documentation
- New per-table plan: docs/tables/fr-g2/PLAN.md splits P5 fr-g2 contraction completion into three independent sub-projects.
- docs/tables/fr-g2/PHRASES.md lists the 34 candidate AVH §IV Locutions that block on extending the AVH BRF codelocal decoder.
- docs/NFB_LESSON_RECON.md Phase 2-4 outcome section explains the extractor pipeline + the 538-skipped breakdown by shape.
- docs/MULTIDAY_PROJECTS.md STATUS blocks updated for P1, P2, P3, P4. P5 fr-g2 has its own plan; es-g2 / pt-g2 plans deferred to follow the AVH decoder pattern.
- docs/IMPROVEMENT_ROADMAP.md W3 row promoted to "Resolved"; S2 row updated with the P4 evaluation outcome.
Notes for downstream consumers
Math output stability: Tables with numeric_indicator: "" (empty) are byte-identical with prior behaviour. Tables that declare numeric_indicator (currently only en-nemeth ships with "⠼") emit the new BANA-correct output. If you have callers locking on bare-digit Nemeth output and want the old behaviour, set the table's numeric_indicator back to "".
API additions: TranslationOptions::disable_phrase_dispatch (default false) matches the pre-existing disable_math_dispatch shape.
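
The phrase_contractions match rules listed under Added (longest token count wins, ties by descending priority, ASCII case-insensitive, wrapping punctuation stripped, non-overlapping hits) can be sketched as follows. This is a simplified Python model, not the phrase_scanner implementation: scan_phrases is a hypothetical name and it returns token-index spans instead of splicing braille.

```python
import string

# ASCII-only lowercasing, so non-ASCII letters keep their case (assumption about the rule).
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def _normalize(token):
    """Strip wrapping ASCII punctuation and lowercase ASCII letters."""
    return token.strip(string.punctuation).translate(_ASCII_LOWER)


def scan_phrases(text, phrases):
    """Return non-overlapping (start_token, end_token, braille) hits, left to right."""
    tokens = [_normalize(t) for t in text.split()]
    table = []
    for entry in phrases:
        pattern = tuple(_normalize(w) for w in entry["pattern"].split())
        if pattern:  # skip empty patterns so the scan always advances
            table.append((pattern, entry.get("priority", 0), entry["braille"]))
    hits = []
    i = 0
    while i < len(tokens):
        best = None
        for pattern, priority, braille in table:
            if tuple(tokens[i:i + len(pattern)]) == pattern:
                rank = (len(pattern), priority)  # longest first, then priority
                if best is None or rank > best[0]:
                    best = (rank, braille)
        if best is None:
            i += 1
        else:
            length = best[0][0]
            hits.append((i, i + length, best[1]))
            i += length  # consume the matched tokens: hits never overlap
    return hits
```

With a single entry {"pattern": "il y a", "braille": "⠽⠁"}, the input "Il y a, dit-elle" yields one hit covering the first three tokens; the trailing comma and the capital letter do not prevent the match.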
[0.6.2] — 2026-04-21
Added
Portable DOCX output with embedded Braille font. New DocxWriterOptions.embed_braille_font flag ships an OOXML-obfuscated copy of Braille CC0 (GGBotNet, CC0 1.0 public domain, full U+2800..U+28FF) inside the .docx so the document renders correctly on machines that don't have SimBraille/Tiger/Swell installed. Uses the spec-compliant XOR-twice obfuscation in OOXML §17.8.1, verified by byte-identical round-trip with the embedded GUID as the fontKey. Adds ~50 KB to the binary .docx size (compressed ~7 KB in the zip).
RTF writer (write_rtf, write_rtf_to_memory, RtfWriterOptions). Emits Microsoft-compatible RTF with the same embed_braille_font option, using RTF 1.9 \*\fontemb + \*\fontfile hex encoding. Unicode braille goes through \uN? escapes with \uc1 so the fallback "?" is consumed by strict readers. Documented compat note: the macOS Cocoa RTF parser (textutil, TextEdit) doesn't support \*\fontemb; use DOCX instead on those readers.
Document tab (GUI) gains:
- A fifth radio button for Rich Text (.rtf) output parallel to DOCX.
- A fifth font picker entry "Braille CC0 (embedded, portable)" that flips on embed_braille_font for whichever container is selected.
- Save-dialog filters for .rtf; persistent settings extended; tab order + EN/DE accessibility labels added.
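
The XOR-twice font obfuscation behind embed_braille_font can be sketched in a few lines of Python. This is an illustration under assumptions, not the DocxWriter code: obfuscate_font is a hypothetical name, and the key derivation (GUID hex digits, byte order reversed) follows one reading of OOXML §17.8.1 that should be checked against the specification before reuse.

```python
def obfuscate_font(font_bytes, guid):
    """XOR the first 32 bytes of a font with a 16-byte key derived from the GUID.

    The key is applied once to bytes 0-15 and once to bytes 16-31. XOR is its own
    inverse, so calling this function again with the same GUID restores the input.
    """
    # Assumption: key = GUID hex digits as bytes, in reversed order.
    key = bytes.fromhex(guid.strip("{}").replace("-", ""))[::-1]
    if len(key) != 16:
        raise ValueError("GUID must contain exactly 32 hex digits")
    head = bytes(b ^ key[i % 16] for i, b in enumerate(font_bytes[:32]))
    return head + font_bytes[32:]
```

Because only the first 32 bytes change, the obfuscated font has the same size as the original, and a round trip through the function is byte-identical.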
Performance documentation (docs/PERFORMANCE.md): measured throughput, startup cost, and memory ceiling for en-ueb-g2 and de-g2 on 12 MB prose input.
Engine
Allow capitalized prefix per-contraction flag (allow_capitalized_prefix). Enables shortforms like blind, braille, first, friend, good, great, letter to fire as a bare prefix of a capitalised word (Blindcraft → ⠠⠃⠇⠉⠗⠁⠋⠞, Greatford, Goodge, …). Approximates UEB §10.11 compound-word shortform expansion.
Proper-noun not_words lists for 11 groupsigns / initial-letter contractions in en-ueb-g2 (§10.11 bridging rule). 21 specific proper nouns (Boone, Chisholm, Esther, Jamestown, Hades, Hadrian, Dayan, Bighorn, Airedale, Newhaven, Sontheim, …) no longer get incorrect groupsigns applied mid-word.
Fixes
CLI --version / --help: were previously consumed as input text and translated to braille. They now short-circuit to print version / usage before the translate dispatch.
RTF \uc changed from \uc0 to \uc1 so strict readers don't render the ? fallback character verbatim.
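
The \uN? escape form used by the RTF writer can be illustrated with a short Python helper. It is a sketch, not the write_rtf implementation: rtf_escape is a hypothetical name, it handles only Basic Multilingual Plane characters, and it assumes \uc1 is in effect so that a reader skips exactly one fallback character after each \uN.

```python
def rtf_escape(text):
    """Escape text for an RTF body, writing non-ASCII characters as \\uN? escapes."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in "\\{}":
            out.append("\\" + ch)          # RTF control characters
        elif cp < 128:
            out.append(ch)
        elif cp <= 0xFFFF:
            # \uN takes a signed 16-bit decimal; values >= 0x8000 become negative.
            n = cp - 0x10000 if cp >= 0x8000 else cp
            out.append("\\u%d?" % n)       # "?" is the single fallback character
        else:
            raise ValueError("non-BMP characters are outside this sketch")
    return "".join(out)
```

Braille cells sit in U+2800..U+28FF, so they always produce positive values; ⠃ (U+2803) becomes \u10243?. Under \uc0 a reader would not skip the trailing ? and would show it as text, which is the symptom this fix addresses.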
[0.6.1] — 2026-04-20
Engine features
- Grade-2 number-mode persistence: punctuation listed in a table's number_mode_chars (by default , and .) now keeps the following digit run inside the same number segment. Grade 1 already honored this; Grade 2 was incorrectly resetting number mode after every non-word segment. Affects Norwegian, German, French, Italian etc.
- LOWERCASE_WORD shortform extensions: the extension-match path in apply_contractions now fires for LOWERCASE_WORD contexts too, not only WHOLE_WORD. Unlocks UEB in / enough followed by apostrophe clitics ('s, 't, 'd, 'll, 've, 're).
- Per-entry flag allow_adjacent_to_hyphen: escape hatch for suppress_lower_wordsigns_at_hyphens. UEB §10.6 allows in and enough adjacent to hyphens while other lower wordsigns stay suppressed.
- Per-entry flag not_as_whole_word: UEB §10.3.2 forbids the digraph strong-groupsigns (sh, th, ch, st, wh, gh, ar, ou) from standing as the entire word, since they presume surrounding letter context. Enables sh → ⠎⠓ (not ⠩) for standalone input.
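
The number-mode persistence rule from the first item can be modelled as a segmentation problem: a character from number_mode_chars stays inside the number only when a digit follows it. The Python sketch below is illustrative; number_segments is a hypothetical helper and does not reflect how the translator represents segments internally.

```python
def number_segments(text, number_mode_chars=",."):
    """Return the digit runs of text, joining runs separated by a number-mode char."""
    segments = []
    i = 0
    while i < len(text):
        if not text[i].isdigit():
            i += 1
            continue
        j = i + 1
        while j < len(text):
            if text[j].isdigit():
                j += 1
            elif (text[j] in number_mode_chars
                  and j + 1 < len(text) and text[j + 1].isdigit()):
                j += 2  # keep the punctuation and the digit after it in the segment
            else:
                break
        segments.append(text[i:j])
        i = j
    return segments
```

With the default characters, "1,000.5" is one segment and so needs one number indicator; with an empty number_mode_chars the same input splits into three, which corresponds to the old Grade 2 behaviour of resetting number mode at each punctuation mark.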
Table-level fixes
- en-ueb-g2: apostrophe-clitic extensions on all 29 alphabetic wordsigns and all 286 shortforms; n't on could/would/should/must; dis and con not_words pruned to just the genuine UEB exceptions (coney, coneys); ing context restricted to middle/final so it can't fire at word start post-hyphen (to-ing, fro-ing); several bogus indicator/math-symbol shortforms removed (terminator, division, proportion, acknowledge, o'clock; each was a PDF-extraction artifact that truncated the real translation).
- de-g2: resolved 6 of 9 BSKDL A1 residual mismatches via targeted dictionary entries + a context change (ihrige, lässt's, möcht's, Man, mrs/drs rejected as Luxembourgish).
- no-g1: 10 missing ASCII symbol mappings added (&, §, ©, π, °, |, =, +, @, _, $); the double-quote mapping was fixed (⠶ → ⠲).
[0.6.0] — 2026-04-15
Added
Desktop UI overhaul
- Wizard flow (WelcomeStage → InputStage → ConfigStage → ResultStage) for guided text/document conversion
- eBraille export from DocumentTab
- Live preview, async-feel conversion pipeline
- System-aware dark mode
- Table maturity indicator in the picker (stable / beta / alpha)
Table maturity classification: new status field in table JSON (stable / beta / alpha), surfaced in the UI picker
Engine features
- Recursive compound boundary detection (German compound handling, +77 words)
- check_remainder morphology engine infrastructure
- not_words morphology engine (blocks contractions in specific word forms)
- Grade 2 back-translation (BRF → text via inverse contraction lookup)
- Capital passage indicator (⠠⠠⠠) support
- UEB be- / con- / dis- prefix shortforms with proper overrides
Windows ARM64 cross-compilation support including ARM64 wxWidgets build and MSIX bundle auto-detection
Language coverage
- 591 Norwegian (no-g1/no-g2) gold entries + Scandinavian BRF decoder
- 76 Swedish (sv-g2) entries from MTM Kortskrift PDF
- 215 English (en-ueb-g2) entries bulk-imported from UEB Rules 2024
- 453 French (fr-g2) AVH abbreviations
- pt-g2, cy-g2 promoted to stable
C API expansion with documented error model, new entry points for table enumeration and status queries; new C API README
Documentation
- docs/GETTING_STARTED.md onboarding guide
- C API README with usage examples
- All diagrams converted from Graphviz to inline PlantUML (light + dark)
- Accessible Doxygen HTML with auto dark mode (new in this release)
Changed
Engine refactors
- Extracted capital_indicators.cpp, compound_boundary.cpp, grade1_translator.cpp from the monolithic translator
- ContractionGuards struct replaces 9 ad-hoc lambdas
- pass2_processor.cpp: named pass2 phase classes, documented responsibilities
- Removed 222-line dead fallback contraction path
- table_manager.cpp broken into focused modules
Table accuracy — continued quality push on major tables:
- de-g2: BSKDL A1 coverage 90% → 98% (multiple whole-word batches, Fugen-s compound boundary fix, proper-noun/loanword additions)
- en-ueb-g2: 14 broken shortforms removed, new entries from UEB Rules 2024
- fr-g2: +453 AVH Manuel abbreviations (+585 words correctly translated)
- Scandinavian tables promoted to stable
Bindings: Python / C# / Flutter versions synced from 0.1.0 → 0.6.0 to match the core library
Fixed
- Korean Hangul decomposition: MSVC silently re-encoded "\u3131" in CP1252; all tables now use explicit "\xe3\x84\xb1" UTF-8 byte escapes
- Added missing UTF-8 bounds checks on several translator paths; silent try/except handlers replaced with typed error propagation (review fixes)
- de-g2 compound boundary incorrectly blocked the st contraction via the Fugen-s guard
- Capital indicator is now suppressed correctly after hyphens in compounds (per BSKDL convention) and before the letter-prefix indicator ⠠
- en-ueb-g2 con- prefix false positives in words like console, contest
- en-ueb-g2: added Beatrice, Beatrix, Belinda, acknowledge shortforms previously missing
- CLI stdin handling for piped input in Grade 2 mode
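
The byte escapes in the Hangul fix are the UTF-8 encoding of the code point. The Python snippet below shows how such an escape string can be derived; utf8_byte_escapes is a hypothetical helper for illustration and does not reproduce the MSVC behaviour itself.

```python
def utf8_byte_escapes(text):
    """Return text as a sequence of \\xNN escapes, one per UTF-8 byte."""
    return "".join("\\x%02x" % b for b in text.encode("utf-8"))


# U+3131 (HANGUL LETTER KIYEOK) encodes to three UTF-8 bytes.
KIYEOK_ESCAPED = utf8_byte_escapes("\u3131")
```

Writing the bytes explicitly keeps the string literal independent of the compiler's source or execution character set, which is the property the fix relies on.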
Technical
- New central VERSION file (0.6.0): single source for all version stamping
- scons docs now produces accessible HTML with system-aware dark mode toggle, tree navigation, proper ARIA, and respects prefers-reduced-motion
- site_scons/build_tools.py SDK packager now bundles CHANGELOG.md, LICENSE, and the generated Doxygen documentation
- ~5 GB of old benchmark data removed (word lists, HTML reports, per-table corpus outputs)
[0.5.0] — 2026-03-24
Added
Accessible HTML benchmark report (scons benchmark → benchmark/results/benchmark_report.html)
- CDN-based Pico CSS, dark/light mode toggle
- Full screenreader support (ARIA labels, semantic HTML)
- Sorted by language, expandable per grade/table
- Per-table: Braille system, standard reference, multi-comparison results
- SDK disclaimer about independent implementation and verification status
Chinese Braille engine: Hanzi→Pinyin pre-processing via embedded Pinyin dictionary
- 44,348 CJK→Pinyin mappings from Unicode Unihan (Unicode License, permissive)
- New pinyin_dictionary table field in table_impl.h
- zh-g1 accuracy: 0% → 91.8% (vs pypinyin reference)
Latin letter indicator engine feature
- New latin_letter_indicator table field: emits indicator once before Latin text in non-Latin scripts
- ru-g1 accuracy: 94.0% → 98.8%
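
The "once before Latin text" behaviour of latin_letter_indicator can be sketched on plain text. This Python model is an assumption-laden illustration: mark_latin_runs is a hypothetical name, the indicator is passed in as a placeholder string, whitespace is treated as neutral so a multi-word Latin run gets a single indicator, and the real engine works on braille output rather than source text.

```python
def mark_latin_runs(text, indicator):
    """Insert indicator once at the start of each run of ASCII Latin letters."""
    out = []
    in_latin = False
    for ch in text:
        if ch.isspace():
            out.append(ch)  # whitespace neither starts nor ends a Latin run
            continue
        is_latin = ch.isascii() and ch.isalpha()
        if is_latin and not in_latin:
            out.append(indicator)
        in_latin = is_latin
        out.append(ch)
    return "".join(out)
```

For Russian text with an embedded Latin abbreviation, the indicator appears once before the abbreviation and not before each letter.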
Japanese Tenji table overhaul
- Added all Hiragana + Katakana to character map (152→237 characters)
- Fixed わ/を swap, added 33 youon combinations, 6 foreign sound combos
- ja-g1 accuracy: 0% → 97.3% (vs uhyo/tenji reference)
Changed
Benchmark accuracy improvements across 20+ tables:
| Language | Table | Before | After | Change |
|---|---|---|---|---|
| Spanish | es-g2 | 15.0% | 70.2% | +55.2pp |
| Hindi | hi-g1 | 77.8% | 98.0% | +20.2pp |
| Portuguese | pt-g2 | 40.3% | 52.1% | +11.8pp |
| French | fr-g2 | 28.9% | 40.7% | +11.8pp |
| Arabic | ar-g1 | 89.5% | 95.9% | +6.4pp |
| Dutch | nl-g1 | 91.3% | 97.7% | +6.4pp |
| Russian | ru-g1 | 94.0% | 98.8% | +4.8pp |
| Czech | cs-g1 | 95.6% | 99.8% | +4.2pp |
| Polish | pl-g1 | 93.6% | 98.1% | +4.5pp |
| Slovenian | sl-g1 | 96.0% | 98.3% | +2.3pp |
| Spanish | es-g1 | 96.9% | 99.3% | +2.4pp |
| Slovak | sk-g1 | 98.6% | 99.3% | +0.7pp |
| English | en-ueb-g1 | 98.4% | 99.3% | +0.9pp |
| English | en-ueb-g2 | 96.5% | 97.4% | +0.9pp |
| Finnish | fi-g1 | 99.7% | 99.9% | +0.2pp |
| French | fr-g1 | 99.4% | 99.8% | +0.4pp |
| Italian | it-g1 | 99.5% | 99.7% | +0.2pp |
| Portuguese | pt-g1 | 99.4% | 99.6% | +0.2pp |
| Norwegian | no-g1 | 99.4% | 99.6% | +0.2pp |
| German | de-g2 | 95.6% | 95.7% | +0.1pp |
Fixed
- _batch_translate() now passes the BRAILLEKIT_TABLES_DIR environment variable correctly
- Chinese CCB initials/finals sections now loaded into character_map by table_manager
Technical
- New files: benchmark/report_generator.py, benchmark/references/, benchmark/config/, core/data/pinyin_dict.json
- Engine changes: table_impl.h (2 new fields), translator.cpp (Latin indicator + Pinyin pre-processing), table_manager.cpp (CCB/Pinyin loading)
- All changes are table-configurable: no hardcoded language-specific code in engine