
Commit 7a12b19

Rework the complex condition finder doc.
The result of the analysis is not guaranteed to include all segments in the true condition, only that the found condition is a superset of the true condition composed only of segments from the true condition. This updates the doc to remove the claim that found conditions will contain all true condition segments. Additionally, this reworks the 'Why this works' section to be a bit more rigorous.
1 parent d135688 commit 7a12b19

5 files changed: +132 -124 lines changed


docs/experimental/closure_glyph_segmentation_complex_conditions.md

Lines changed: 50 additions & 42 deletions
@@ -6,49 +6,49 @@ Date: Dec 17, 2025
 
 ## Introduction
 
-Before reading this document is recommended to first review the [closure glyph
-segmentation](./closure_glyph_segmentation.md) document.
+Before reading this document it is recommended to first review the
+[closure glyph segmentation](./closure_glyph_segmentation.md) document. This document borrows concepts and terms from it.
 
 In closure glyph segmentation the closure analysis step is capable of locating glyph activation conditions that are
-either fully disjunctive or fully conjunctive (eg. A or B or C). It is not capable of finding conditions that are a mix
-of conjunction and disjunction (eg. (A and B) or (B and C)). These are referred to as complex conditions. By default
+either fully disjunctive or fully conjunctive (eg. `(A or B or C)`). It is not capable of finding conditions that are a mix
+of conjunction and disjunction (eg. `(A and B) or (B and C)`). These are referred to as complex conditions. By default
 glyphs with complex conditions are assigned to a patch that is always loaded, since the true conditions are not known.
 
-This document describes an algorithm which can be used to find the complete set of segments which are a part of the
-complex condition for a glyph. If the set of segments which are present in a condition is known, then we can form a
-purely disjunctive condition using those segments which is guaranteed to be a superset of the true condition. That is it
-will always activate at least when the true condition would. This property allows the superset condition to be used for
-a patch in place of the true condition without violating the closure requirement.
+This document describes an algorithm which can be used to find purely disjunctive conditions which are supersets of
+complex conditions. A superset condition is one that will activate at least whenever the true condition would. This
+property allows the superset condition to be used for a patch in place of the true condition without violating the
+closure requirement.
 
-For example if we had a glyph with a activation condition of ((A and B) or (B and C)) then this process will find the set
-of segments {A, B, C} which would form the superset condition (A or B or C). In a segmentation we could then have a
-patch with condition (A or B or C) which loads the glyph and this would satisfy the closure requirement. In the future
-if we decide to develop an analysis to find the true condition then the segment set found by this process could be used
-to narrow down the search space to only those segments involved in the condition.
+For a given complex condition there typically exists more than one possible superset disjunctive condition. The
+algorithm will find one of them, but not necessarily the smallest one. The found superset condition will always
+contain only segments which appear in the original condition.
+
+For example if we had a glyph with an activation condition of `((A and B) or (B and C))` then this process will find one
+of the possible superset conditions such as `(A or C)`, `(A or B)`, `(B or C)`, or `(B)`. In a segmentation we could
+then have a patch with the found condition which loads the glyph and this would satisfy the closure requirement.
 
 ## Foundations
 
 The algorithm is based on the following assertions:
 
-1. For any complex activation condition of a glyph, a disjunction over all segments appearing in that condition
-will always activate at least when the original condition does.
-
-2. Given some fully disjunctive condition, we can verify that condition is sufficient to meet the glyph closure
-requirement for a glyph by the following procedure: compute a glyph closure of the union of all segments except for
-those in the condition. If the glyph does not appear in this closure, then the condition satisfies the closure
-requirement for that glyph. This is called the “additional conditions” check.
+1. Given some fully disjunctive condition for a glyph, we can verify that the condition is a superset of the true
+condition for the glyph and meets the closure requirement by the following procedure: compute a glyph closure of the
+union of all segments except for those in the condition. If the glyph does not appear in this closure, then the
+condition satisfies the closure requirement for that glyph and is a superset of the true condition. This is called
+the “additional conditions” check.
 
-3. The glyph closure of all segments will include the glyphs that we are analyzing.
+2. The glyph closure of all segments includes the glyph that we are analyzing.
 
-4. We have a glyph which has some true activation condition. If we compute a glyph closure of some combination of
+3. We have a glyph which has some true activation condition. If we compute a glyph closure of some combination of
 segments, then adding or removing a segment, which is not part of the activation condition, to the glyph closure input
 will have no effect on whether or not the glyph appears in the closure output.
 
+4. The closure of no segments contains only glyphs from the initial font.
+
 ## The Algorithm
 
-For each glyph with a complex condition we can use the above to find the complete set of segments which are part of the
-glyph's complex condition. A condition which is a disjunction across these segments will satisfy the closure requirement
-for that glyph.
+For a glyph with a complex condition we can use the above to find a superset disjunctive condition for that
+glyph's complex condition. These conditions will satisfy the closure requirement for each glyph.
 
 ### Finding a Sub Condition
 
@@ -63,10 +63,10 @@ Inputs:
 Algorithm:
 
 1. Start with a set of all segments except those to be excluded, called `to_test`.
-2. Initialize a second set of segments, `required`, to the empty set.
-3. Remove a segment `s` from `to_test` and compute the glyph closure of `to_test U required`.
-4. If `glyph` is not found in the closure then add `s` to `required`.
-5. If `to_test` is empty, then return the sub condition `required`.
+2. Initialize a second set of segments, `sub_condition`, to the empty set.
+3. Remove a segment `s` from `to_test` and compute the glyph closure of `to_test U sub_condition`.
+4. If `glyph` is not found in the closure then add `s` to `sub_condition`.
+5. If `to_test` is empty, then return the sub condition `sub_condition`.
 6. Otherwise, go back to step 3.
 
 ### Finding the Complete Condition
@@ -83,32 +83,40 @@ Algorithm:
 2. Execute the `Finding a Sub Condition` algorithm with `condition` as the excluded set.
 3. Union the returned set into `condition`.
 4. Compute the glyph closure of all segments except those in `condition`.
-5. If `glyph` is found in the closure, then more conditions still exist. Go back to step 2.
+5. If `glyph` is found in the closure, then more sub conditions still exist. Go back to step 2.
 6. Return the complete condition, `condition`.
 
 ### Initial Font
 
-Any time a closure operation is executed by the above two algorithms it's necessary to union the subset definition
-for the initial font into the closure input. That's because the closure of the initial font affects what's reachable
-by the segments.
+Any time a closure operation is executed by the above two algorithms it's necessary to union the subset definition for
+the initial font into the closure input. This is required because the closure of the initial font affects what's
+reachable by the segments.
 
 ### Why this works
 
+Here we show this procedure is guaranteed to find a disjunctive superset of a glyph's true condition which includes
+only segments from the true condition, when the glyph is not already in the initial font:
+
+* For each call to `Finding a Sub Condition`, `glyph` will be in the closure of all non-excluded segments. For the first
+call this is guaranteed by assertion (2). For subsequent calls this is guaranteed by the "additional conditions" check
+which gates execution.
+
 * Any segments which are not part of the true condition will not impact the glyph's presence in the closure (assertion
-(4)). As a result they will never be moved into the `required` set and will not be returned by `Finding a Sub
-Condition`. Thus any segments returned by `Finding a Sub Condition` are part of the true condition.
+(3)). Further, by the previous point we know that at the start of `Finding a Sub Condition` the closure of all
+non-excluded segments will contain `glyph`. Thus testing a segment which is not part of the true condition will never
+result in `glyph` missing from the closure, so that segment won't be added to `sub_condition`. Therefore
+`Finding a Sub Condition` will only ever return segments that are part of the true condition.
 
-* Each iteration of `Finding a Sub Condition` is guaranteed to select at least one segment since we know that the
-initial closure always starts with the glyph in it, and the closure of no segments will not have the glyph in it. So
-at some point during the algorithm the glyph must be found to not be present. In the first iteration this is a result
-of assertion (3). For subsequent iterations this is guaranteed by the "additional conditions" check prior to starting
-the iteration.
+* `Finding a Sub Condition` will always return at least one segment: if `sub_condition` is still the empty set when the
+last segment is tested, then the closure will be on no segments and will not have `glyph` in it. This is a result of
+assertion (4) and the premise that `glyph` is not already in the initial font. As a consequence the returned
+`sub_condition` will always have at least one segment in it.
 
 * Since all returned segments from `Finding a Sub Condition` are excluded from future calls, there will be a finite
 number of `Finding a Sub Condition` executions which return only segments that are part of the true condition.
 
 * Lastly, the algorithm terminates only once the additional conditions check finds no additional conditions,
-guaranteeing we have found the complete superset disjunctive condition.
+guaranteeing we have found a superset disjunctive condition (assertion (1)).
 
 ## Making it More Performant
 
ift/encoder/complex_condition_finder.cc

Lines changed: 25 additions & 25 deletions
@@ -27,8 +27,8 @@ struct Task {
   // analysis.
   SegmentSet excluded;
 
-  // These segments have been determined to be required.
-  SegmentSet required;
+  // These segments have been determined to be part of a sub condition.
+  SegmentSet sub_condition;
 
   // These segments have not yet been tested.
   SegmentSet to_be_tested;
@@ -76,7 +76,7 @@ struct Context {
     // covered by a task with no excluded segments.
     queue.push_back(Task{
         .excluded = {},
-        .required = {},
+        .sub_condition = {},
         .to_be_tested = all_segments,
         .glyphs = glyphs,
     });
@@ -126,7 +126,7 @@ struct Context {
 
     queue.push_back(Task{
         .excluded = condition,
-        .required = {},
+        .sub_condition = {},
         .to_be_tested = except,
         .glyphs = glyphs_with_additional_conditions,
     });
@@ -136,21 +136,21 @@ struct Context {
   }
 
   // Each analysis step checks one segment to see for which glyphs that segment
-  // is required. The supplied task data structure gives the specific state
+  // is relevant. The supplied task data structure gives the specific state
   // around which the segment is tested.
   //
   // To test a segment a closure is run without the segment being tested:
   // - For inscope glyphs which appear in the closure the test segment is not
-  // required for these glyphs
+  // relevant for these glyphs
   // - For inscope glyphs which do not appear in the closure the test segment is
-  // required for these glyphs.
+  // relevant for these glyphs.
   //
   // Based on the analysis results up to two more analysis steps are spawned (one
-  // for glyphs where segment is required, the other where it is not required)
+  // for glyphs where segment is relevant, the other where it is not relevant)
   // to test the next segment.
   //
-  // Once all segments are tested the resulting minimal set of required segments
-  // is recorded in out. Lastly, the non-required segments are checked to see
+  // Once all segments are tested the resulting sub condition segments
+  // are recorded in out. Lastly, the non-relevant segments are checked to see
   // if additional conditions are present, if they are another analysis task is
   // queued to discover the additional conditions.
   Status RunAnalysisTask(
@@ -161,13 +161,13 @@ struct Context {
     }
 
     if (task.to_be_tested.empty()) {
-      return RecordMinimalCondition(task, glyph_to_conditions);
+      return RecordSubCondition(task, glyph_to_conditions);
     }
 
     segment_index_t test_segment = *task.to_be_tested.min();
     task.to_be_tested.erase(test_segment);
 
-    SegmentSet closure_segments = task.required;
+    SegmentSet closure_segments = task.sub_condition;
     closure_segments.union_set(task.to_be_tested);
     GlyphSet closure_glyphs = TRY(SegmentClosure(closure_segments));
 
@@ -178,42 +178,42 @@ struct Context {
 
     queue.push_back(Task{
         .excluded = task.excluded,
-        .required = task.required,
+        .sub_condition = task.sub_condition,
         .to_be_tested = task.to_be_tested,
         .glyphs = doesnt_need_test_segment,
     });
 
-    task.required.insert(test_segment);
+    task.sub_condition.insert(test_segment);
     queue.push_back(Task{
         .excluded = task.excluded,
-        .required = task.required,
+        .sub_condition = task.sub_condition,
         .to_be_tested = task.to_be_tested,
         .glyphs = needs_test_segment,
     });
 
     return absl::OkStatus();
   }
 
-  // A minimal condition has been found, record it and kick off any
+  // A sub condition has been found, record it and kick off any
   // further analysis needed for additional conditions.
-  Status RecordMinimalCondition(
+  Status RecordSubCondition(
       Task task, btree_map<glyph_id_t, SegmentSet>& glyph_to_conditions) {
     for (glyph_id_t gid : task.glyphs) {
-      glyph_to_conditions[gid].union_set(task.required);
+      glyph_to_conditions[gid].union_set(task.sub_condition);
     }
 
-    // We have identified a minimal set of required segments for glyphs,
-    // however as usual there may be remaining additional conditions which we
-    // need to check for
-    task.excluded.union_set(task.required);
+    // We have identified a sub condition for glyphs, however as usual
+    // there may be remaining additional conditions which we need to
+    // check for
+    task.excluded.union_set(task.sub_condition);
     auto [additional_condition_glyphs, remaining] =
         TRY(HasAdditionalConditions(task.excluded, task.glyphs));
 
     // Anything left in glyphs has additional conditions, recurse again to
     // analyze them further
     queue.push_back(Task{
         .excluded = task.excluded,
-        .required = {},
+        .sub_condition = {},
         .to_be_tested = remaining,
         .glyphs = additional_condition_glyphs,
     });
@@ -264,7 +264,7 @@ static SegmentSet NonEmptySegments(
   return segments;
 }
 
-StatusOr<btree_map<SegmentSet, GlyphSet>> FindMinimalDisjunctiveConditionsFor(
+StatusOr<btree_map<SegmentSet, GlyphSet>> FindSupersetDisjunctiveConditionsFor(
     const RequestedSegmentationInformation& segmentation_info,
     const GlyphConditionSet& glyph_condition_set,
     GlyphClosureCache& closure_cache, GlyphSet glyphs) {
@@ -275,7 +275,7 @@ StatusOr<btree_map<SegmentSet, GlyphSet>> FindMinimalDisjunctiveConditionsFor(
   // may interact with the GSUB table. Any segments which don't interact with
   // GSUB will already have relevant conditions discovered via the standard
   // closure analysis. Only segments which interact with GSUB may be part of
-  // complex conditions (since complex conditions required at least one 'AND'
+  // complex conditions (since complex conditions require at least one 'AND'
   // which only GSUB can introduce). As a result we can exclude any segments
   // with no GSUB interaction from this analysis which should significantly
   // speed things up.

ift/encoder/complex_condition_finder.h

Lines changed: 7 additions & 7 deletions
@@ -9,21 +9,21 @@
 
 namespace ift::encoder {
 
-// Finds the minimal purely disjunctive conditions that activate each
+// Finds superset purely disjunctive conditions that activate each
 // provided glyph. Returns a map from each condition to the activated
 // glyphs.
 //
 // Takes a glyph condition set which will be used as a starting point.
 //
-// A minimal purely disjunctive condition is the complete set of segments
-// which appear in the true activation condition for a glyph. The
-// purely disjunctive version is a super set of the true condition and
-// will activate at least whenever the true condition would.
+// A superset purely disjunctive condition will activate at least
+// whenever the true condition would. It will only ever include segments
+// that appear in the true condition. There are typically multiple
+// possible superset conditions. This will find one of them.
 //
 // For example if a glyph has the true condition (a and b) or (b and c)
-// this will find the condition (a or b or c).
+// this could find the condition (a or c).
 absl::StatusOr<absl::btree_map<common::SegmentSet, common::GlyphSet>>
-FindMinimalDisjunctiveConditionsFor(
+FindSupersetDisjunctiveConditionsFor(
     const RequestedSegmentationInformation& segmentation_info,
     const GlyphConditionSet& glyph_condition_set,
     GlyphClosureCache& closure_cache, common::GlyphSet glyphs);
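
To make the "superset" wording in the comment above concrete, the following brute-force check may help. It is a sketch under stated assumptions and not part of this commit: the true condition is given explicitly as a list of conjunctive clauses, segments are numbered `0..num_segments-1` (fewer than 32), and `Activates` and `IsDisjunctiveSuperset` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical toy types, not the ones used by ift::encoder.
using Segment = int;
using SegmentSet = std::set<Segment>;
// True condition as a disjunction of conjunctive clauses, e.g.
// {{0, 1}, {1, 2}} means (0 and 1) or (1 and 2).
using TrueCondition = std::vector<SegmentSet>;

// Returns true if the true condition is satisfied by the segments in `input`.
bool Activates(const TrueCondition& truth, const SegmentSet& input) {
  return std::any_of(truth.begin(), truth.end(), [&](const SegmentSet& clause) {
    return std::includes(input.begin(), input.end(), clause.begin(),
                         clause.end());
  });
}

// Returns true if the disjunction over `condition` activates at least
// whenever `truth` does. Enumerates every combination of the segments
// 0..num_segments-1, so it is only practical for small toy inputs.
bool IsDisjunctiveSuperset(const TrueCondition& truth,
                           const SegmentSet& condition, int num_segments) {
  for (std::uint32_t mask = 0; mask < (1u << num_segments); ++mask) {
    SegmentSet input;
    bool disjunction_active = false;
    for (int s = 0; s < num_segments; ++s) {
      if ((mask & (1u << s)) == 0) continue;
      input.insert(s);
      if (condition.count(s) > 0) disjunction_active = true;
    }
    // A counterexample: the true condition fires but the disjunction doesn't.
    if (Activates(truth, input) && !disjunction_active) return false;
  }
  return true;
}
```

With a=0, b=1, c=2 and the true condition `(a and b) or (b and c)`, this check accepts `(a or c)`, `(a or b)`, `(b or c)` and `(b)`, and rejects `(a)` because the input `{b, c}` activates the true condition without `a` being present.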
