Commit 2772ed6

kenhuuu and Cole-Greer committed
TINKERPOP-3200 Make repeat() act as a global parent
Co-authored-by: Cole-Greer <[email protected]>
1 parent 65a438a commit 2772ed6

13 files changed: +442 -47 lines changed

CHANGELOG.asciidoc

Lines changed: 1 addition & 0 deletions
@@ -99,6 +99,7 @@ This release also includes changes from <<release-3-7-XXX, 3.7.XXX>>.
 * Removed the `@RemoteOnly` testing tag in Gherkin as lambda tests have all been moved to the Java test suite.
 * Updated gremlin-javascript to use GraphBinary as default instead of GraphSONv3
 * Added the `asNumber()` step to perform number conversion.
+* Changed `repeat()` to make `repeatTraversal` global rather than a mix of local and global.
 * Renamed many types in the grammar for consistent use of terms "Literal", "Argument", and "Varargs".
 * Changed `gremlin-net` so that System.Text.Json is only listed as an explicit dependency when it is not available from the framework.
 * Fixed translation of numeric literals for Go losing type definitions.

docs/src/dev/developer/for-committers.asciidoc

Lines changed: 2 additions & 0 deletions
@@ -624,6 +624,8 @@ mid-`E()` step is not supported.
 mid-`V()` step is not supported.
 * `@GraphComputerVerificationOneBulk` - The scenario will not work because `withBulk(false)` is configured and that
 is not compatible with `GraphComputer`
+* `@GraphComputerVerificationOrderingNotSupported` - The scenario will not work with `GraphComputer` because ordering
+within `repeat()` is not supported.
 * `@GraphComputerVerificationReferenceOnly` - The scenario itself is not written to support `GraphComputer` because it
 tries to reference inaccessible properties that are on elements only available by "reference" (i.e `T.id` only).
 * `@GraphComputerVerificationStrategyNotSupported` - The scenario uses a traversal strategy that is not supported by

docs/src/dev/provider/gremlin-semantics.asciidoc

Lines changed: 4 additions & 1 deletion
@@ -1974,7 +1974,7 @@ link:https://tinkerpop.apache.org/docs/x.y.z/reference/#project-step[reference]
 [[repeat-step]]
 === repeat()
 
-*Description:* Iteratively applies a traversal (the "loop body") to each incoming traverser until a stopping
+*Description:* Iteratively applies a traversal (the "loop body") to all incoming traversers until a stopping
 condition is met. Optionally, it can emit traversers on each iteration according to an emit predicate. The
 repeat step supports loop naming and a loop counter via `loops()`.
 
@@ -2015,6 +2015,9 @@ predicates are evaluated before the first iteration (pre) or after each iteratio
 `do/while` semantics respectively:
 - Pre-check / pre-emit: when the modulator appears before `repeat(...)`.
 - Post-check / post-emit: when the modulator appears after `repeat(...)`.
+- Global traversal scope: The `repeatTraversal` is a global child. This means all traversers entering the repeat body
+are processed together as a unified stream with global semantics. `Barrier` (`order()`, `sample()`, etc.) steps within
+the repeat traversal operate across all traversers collectively rather than in isolation per traverser.
 - Loop counter semantics:
 - The loop counter for a given named or unnamed repeat is incremented once per completion of the loop body (i.e.,
 after the body finishes), not before. Therefore, `loops()` reflects the number of completed iterations.
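
Reviewer sketch for the "Global traversal scope" bullet added above: a plain-Java model of what it means for a barrier to run per traverser versus across the whole stream. It does not use TinkerPop. The class and method names are hypothetical, and the barrier is modelled as a simple sort over the neighbour names of marko, vadas and lop in the "modern" toy graph.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java model of barrier scope; hypothetical names, not TinkerPop code.
class BarrierScopeSketch {

    // Local scope: the barrier (here a sort) runs once per incoming traverser,
    // and the sorted groups are concatenated in arrival order.
    static List<String> localOrder(List<List<String>> perTraverser) {
        List<String> out = new ArrayList<>();
        for (List<String> group : perTraverser) {
            List<String> sorted = new ArrayList<>(group);
            sorted.sort(Comparator.naturalOrder());
            out.addAll(sorted);
        }
        return out;
    }

    // Global scope: everything reaches the barrier before it releases anything,
    // so a single sort covers the whole stream.
    static List<String> globalOrder(List<List<String>> perTraverser) {
        List<String> out = new ArrayList<>();
        for (List<String> group : perTraverser) {
            out.addAll(group);
        }
        out.sort(Comparator.naturalOrder());
        return out;
    }

    public static void main(String[] args) {
        // Neighbours of marko, vadas and lop, one group per incoming traverser.
        List<List<String>> neighbours = List.of(
                List.of("lop", "vadas", "josh"),
                List.of("marko"),
                List.of("marko", "josh", "peter"));
        System.out.println("local:  " + localOrder(neighbours));
        System.out.println("global: " + globalOrder(neighbours));
    }
}
```

In this model the local variant keeps each start's neighbours grouped together, while the global variant interleaves them, which is the distinction the new bullet draws for `Barrier` steps inside the repeat traversal.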

docs/src/upgrade/release-3.8.x.asciidoc

Lines changed: 154 additions & 0 deletions
@@ -230,6 +230,104 @@ gremlin> g.inject([Float.MAX_VALUE, Float.MAX_VALUE], [Double.MAX_VALUE, Double.
 
 See link:https://issues.apache.org/jira/browse/TINKERPOP-3115[TINKERPOP-3115]
 
+==== repeat() Step Global Children Semantics Change
+
+The `repeat()` step has been updated to treat the repeat traversal as a global child in all cases. Previously, the
+repeat traversal behaved as a hybrid between local and global semantics, which could lead to unexpected results in
+certain scenarios. The repeat traversal started off as a local child but as traversers were added back per iteration,
+it behaved more like a global child.
+
+With this change, the repeat traversal now consistently operates with global semantics, meaning that all traversers
+are processed together rather than being processed per traverser. This provides more predictable behavior and aligns
+with the semantics of other steps.
+
+[source,text]
+----
+// In 3.7.x and earlier, the order would be local to the first traverser.
+// Notice how the results are grouped by marko, then vadas, then lop
+gremlin> g.withoutStrategies(RepeatUnrollStrategy).V(1, 2, 3).
+......1> repeat(both().simplePath().order().by("name")).times(2).path().by("name")
+==>[marko,lop,josh]
+==>[marko,josh,lop]
+==>[marko,lop,peter]
+==>[marko,josh,ripple]
+==>[vadas,marko,josh]
+==>[vadas,marko,lop]
+==>[lop,marko,josh]
+==>[lop,josh,marko]
+==>[lop,josh,ripple]
+==>[lop,marko,vadas]
+
+// In 3.8.0, the repeat now consistently uses global semantics
+// Results are ordered by the final vertex across all traversers; ties keep the order from earlier iterations
+gremlin> g.withoutStrategies(RepeatUnrollStrategy).V(1, 2, 3).
+......1> repeat(both().simplePath().order().by("name")).times(2).path().by("name")
+==>[marko,lop,josh]
+==>[vadas,marko,josh]
+==>[lop,marko,josh]
+==>[marko,josh,lop]
+==>[vadas,marko,lop]
+==>[lop,josh,marko]
+==>[marko,lop,peter]
+==>[marko,josh,ripple]
+==>[lop,josh,ripple]
+==>[lop,marko,vadas]
+----
+
+This change may affect traversals that relied on the previous hybrid behavior, particularly those using side effects
+or barrier steps within `repeat()`. Review any traversals using `repeat()` with steps like `aggregate()`, `store()`,
+or other barrier steps to ensure they produce the expected results.
+
+If you would like `repeat()` to behave similarly to how it did in 3.7.x, then you should wrap the repeat inside a
+`local()`. The following example demonstrates this:
+
+[source,text]
+----
+// In 3.7.x
+gremlin> g.V().repeat(both().simplePath().order().by("name")).times(2).path().by("name")
+==>[marko,lop,josh]
+==>[marko,josh,lop]
+==>[marko,lop,peter]
+==>[marko,josh,ripple]
+==>[vadas,marko,josh]
+==>[vadas,marko,lop]
+==>[lop,marko,josh]
+==>[lop,josh,marko]
+==>[lop,josh,ripple]
+==>[lop,marko,vadas]
+==>[josh,marko,lop]
+==>[josh,lop,marko]
+==>[josh,lop,peter]
+==>[josh,marko,vadas]
+==>[ripple,josh,lop]
+==>[ripple,josh,marko]
+==>[peter,lop,josh]
+==>[peter,lop,marko]
+
+// In 3.8.0, placing the repeat inside a local will again cause the repeat traversal to apply per traverser (locally)
+gremlin> g.V().local(repeat(both().simplePath().order().by("name")).times(2)).path().by("name")
+==>[marko,lop,josh]
+==>[marko,josh,lop]
+==>[marko,lop,peter]
+==>[marko,josh,ripple]
+==>[vadas,marko,josh]
+==>[vadas,marko,lop]
+==>[lop,marko,josh]
+==>[lop,josh,marko]
+==>[lop,josh,ripple]
+==>[lop,marko,vadas]
+==>[josh,marko,lop]
+==>[josh,lop,marko]
+==>[josh,lop,peter]
+==>[josh,marko,vadas]
+==>[ripple,josh,lop]
+==>[ripple,josh,marko]
+==>[peter,lop,josh]
+==>[peter,lop,marko]
+----
+
+See: link:https://issues.apache.org/jira/browse/TINKERPOP-3200[TINKERPOP-3200]
+
 ==== Prefer OffsetDateTime
 
 The default implementation for date type in Gremlin is now changed from the `java.util.Date` to the more encompassing
@@ -1128,6 +1226,62 @@ The `ChooseStep` now provides a `ChooseSemantics` enum which helps indicate if t
 
 See: link:https://issues.apache.org/jira/browse/TINKERPOP-3178[TINKERPOP-3178]
 
+===== repeat() Step Global Children Semantics Change
+
+The `RepeatStep` has been updated to consistently treat the repeat traversal as a global child rather than using
+hybrid local/global semantics. This change affects how the repeat traversal processes traversers and interacts with
+the parent traversal.
+
+Previously, `RepeatStep` would start with local semantics for the first iteration and then switch to global semantics
+for the subsequent iterations, which created inconsistencies in how side effects and barriers behaved within the repeat
+traversal. The biggest change will be to `Barrier` steps in the repeat traversal as they will now have access to all
+the starting traversers.
+
+[source,text]
+----
+// In 3.7.x and earlier, the order would be local to the first traverser.
+// Notice how the results are grouped by marko, then vadas, then lop
+gremlin> g.withoutStrategies(RepeatUnrollStrategy).V(1, 2, 3).
+......1> repeat(both().simplePath().order().by("name")).times(2).path().by("name")
+==>[marko,lop,josh]
+==>[marko,josh,lop]
+==>[marko,lop,peter]
+==>[marko,josh,ripple]
+==>[vadas,marko,josh]
+==>[vadas,marko,lop]
+==>[lop,marko,josh]
+==>[lop,josh,marko]
+==>[lop,josh,ripple]
+==>[lop,marko,vadas]
+
+// In 3.8.0, the repeat now consistently uses global semantics
+// Results are ordered by the final vertex across all traversers; ties keep the order from
+// earlier iterations
+gremlin> g.withoutStrategies(RepeatUnrollStrategy).V(1, 2, 3).
+......1> repeat(both().simplePath().order().by("name")).times(2).path().by("name")
+==>[marko,lop,josh]
+==>[vadas,marko,josh]
+==>[lop,marko,josh]
+==>[marko,josh,lop]
+==>[vadas,marko,lop]
+==>[lop,josh,marko]
+==>[marko,lop,peter]
+==>[marko,josh,ripple]
+==>[lop,josh,ripple]
+==>[lop,marko,vadas]
+----
+
+Providers implementing custom optimizations or strategies around `RepeatStep` should verify that their
+implementations account for the repeat traversal being a global child. This particularly affects:
+
+- Strategies that analyze or transform repeat traversals
+- Optimizations that depend on the scope semantics of child traversals
+
+The last point about optimizations may be particularly important for providers that have memory constraints as this
+change may bring about higher memory usage due to more traversers needing to be held in memory.
+
+See: link:https://issues.apache.org/jira/browse/TINKERPOP-3200[TINKERPOP-3200]
+
 ===== Prefer OffsetDateTime
 
 The default implementation for date type in Gremlin is now changed from the deprecated `java.util.Date` to the more
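
Reviewer sketch for the `repeat()` upgrade notes in this file: a plain-Java model of `repeat(both().simplePath().order().by("name")).times(2)` over the "modern" toy graph, run once with all starts sharing the loop body (global) and once with a separate loop per start (local, as when wrapped in `local()`). It does not call TinkerPop. The adjacency map, class name and method names are assumptions made for illustration, and `order().by("name")` is modelled as a stable sort on the last vertex of each path.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Plain-Java model of the documented example; hypothetical names, not TinkerPop code.
class RepeatSemanticsSketch {

    // Undirected adjacency of the "modern" toy graph, keyed by vertex name.
    static final Map<String, List<String>> BOTH = Map.of(
            "marko", List.of("lop", "vadas", "josh"),
            "vadas", List.of("marko"),
            "lop", List.of("marko", "josh", "peter"),
            "josh", List.of("ripple", "lop", "marko"),
            "ripple", List.of("josh"),
            "peter", List.of("lop"));

    // One pass of the loop body over every path currently inside the loop.
    static List<List<String>> body(List<List<String>> paths) {
        List<List<String>> next = new ArrayList<>();
        for (List<String> path : paths) {
            for (String neighbour : BOTH.get(path.get(path.size() - 1))) {
                if (!path.contains(neighbour)) {      // models simplePath()
                    List<String> extended = new ArrayList<>(path);
                    extended.add(neighbour);
                    next.add(extended);
                }
            }
        }
        // Models order().by("name"): List.sort is stable, so ties keep arrival order.
        next.sort(Comparator.comparing(p -> p.get(p.size() - 1)));
        return next;
    }

    // Global child: all starts enter the loop body together, for two iterations.
    static List<List<String>> global(List<String> starts) {
        List<List<String>> paths = new ArrayList<>();
        for (String start : starts) {
            paths.add(List.of(start));
        }
        return body(body(paths));
    }

    // Local child: each start runs through its own two-iteration loop.
    static List<List<String>> local(List<String> starts) {
        List<List<String>> out = new ArrayList<>();
        for (String start : starts) {
            List<List<String>> single = new ArrayList<>();
            single.add(List.of(start));
            out.addAll(body(body(single)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> starts = List.of("marko", "vadas", "lop");
        global(starts).forEach(p -> System.out.println("global " + p));
        local(starts).forEach(p -> System.out.println("local  " + p));
    }
}
```

Under this model, `global` for the starts marko, vadas, lop yields the same path order as the 3.8.0 listing, and `local` yields the order shown for 3.7.x. It also hints at the memory remark in the provider notes: the global variant sorts every in-flight path at once, whereas the local variant only ever holds the paths of one start.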

gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/branch/RepeatStep.java

Lines changed: 31 additions & 12 deletions
@@ -21,9 +21,12 @@
 import org.apache.tinkerpop.gremlin.process.traversal.Step;
 import org.apache.tinkerpop.gremlin.process.traversal.Traversal;
 import org.apache.tinkerpop.gremlin.process.traversal.Traverser;
+import org.apache.tinkerpop.gremlin.process.traversal.step.Barrier;
 import org.apache.tinkerpop.gremlin.process.traversal.step.TraversalParent;
 import org.apache.tinkerpop.gremlin.process.traversal.step.util.ComputerAwareStep;
 import org.apache.tinkerpop.gremlin.process.traversal.traverser.TraverserRequirement;
+import org.apache.tinkerpop.gremlin.process.traversal.util.FastNoSuchElementException;
+import org.apache.tinkerpop.gremlin.process.traversal.util.TraversalHelper;
 import org.apache.tinkerpop.gremlin.process.traversal.util.TraversalUtil;
 import org.apache.tinkerpop.gremlin.structure.util.StringFactory;
 import org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils;
@@ -43,6 +46,7 @@ public final class RepeatStep<S> extends ComputerAwareStep<S, S> implements Trav
     private Traversal.Admin<S, S> repeatTraversal = null;
     private Traversal.Admin<S, ?> untilTraversal = null;
     private Traversal.Admin<S, ?> emitTraversal = null;
+    private boolean first = true;
     private String loopName = null;
     public boolean untilFirst = false;
     public boolean emitFirst = false;
@@ -206,20 +210,20 @@ protected Iterator<Traverser.Admin<S>> standardAlgorithm() throws NoSuchElementE
             throw new IllegalStateException("The repeat()-traversal was not defined: " + this);
 
         while (true) {
-            if (this.repeatTraversal.getEndStep().hasNext()) {
+            if (!first && this.repeatTraversal.getEndStep().hasNext()) {
                 return this.repeatTraversal.getEndStep();
             } else {
-                final Traverser.Admin<S> start = this.starts.next();
-                start.initialiseLoops(this.getId(), this.loopName);
-                if (doUntil(start, true)) {
-                    start.resetLoops();
-                    return IteratorUtils.of(start);
-                }
-                this.repeatTraversal.addStart(start);
-                if (doEmit(start, true)) {
-                    final Traverser.Admin<S> emitSplit = start.split();
-                    emitSplit.resetLoops();
-                    return IteratorUtils.of(emitSplit);
+                this.first = false;
+                if (TraversalHelper.hasStepOfAssignableClassRecursively(Barrier.class, repeatTraversal)) {
+                    // If the repeatTraversal has a Barrier then make sure that all starts are added to the
+                    // repeatTraversal before it is iterated so that RepeatStep always has "global" children.
+                    if (!this.starts.hasNext())
+                        throw FastNoSuchElementException.instance();
+                    while (this.starts.hasNext()) {
+                        processTraverser(this.starts.next());
+                    }
+                } else {
+                    return processTraverser(this.starts.next());
                 }
             }
         }
@@ -249,6 +253,21 @@ protected Iterator<Traverser.Admin<S>> computerAlgorithm() throws NoSuchElementE
         }
     }
 
+    private Iterator<Traverser.Admin<S>> processTraverser(final Traverser.Admin<S> start) {
+        start.initialiseLoops(this.getId(), this.loopName);
+        if (doUntil(start, true)) {
+            start.resetLoops();
+            return IteratorUtils.of(start);
+        }
+        this.repeatTraversal.addStart(start);
+        if (doEmit(start, true)) {
+            final Traverser.Admin<S> emitSplit = start.split();
+            emitSplit.resetLoops();
+            return IteratorUtils.of(emitSplit);
+        }
+        return Collections.emptyIterator();
+    }
+
     /////////////////////////
 
     public static <A, B, C extends Traversal<A, B>> C addRepeatToTraversal(final C traversal, final Traversal.Admin<B, B> repeatTraversal) {
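
Reviewer sketch of the start-feeding decision that the `standardAlgorithm()` hunk above introduces: when the loop body contains a barrier, every pending start is handed over before the body is iterated; otherwise one start is handed over at a time. This is a simplified stand-alone model. `StartFeedingSketch` and `feed` are hypothetical names, the `until`/`emit` handling done by `processTraverser` is left out, and the JDK's `NoSuchElementException` stands in for `FastNoSuchElementException`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Simplified model of the feeding decision; hypothetical names, not TinkerPop code.
class StartFeedingSketch {

    // Returns the starts that would be added to the loop body before it is next iterated.
    static <T> List<T> feed(Iterator<T> starts, boolean bodyHasBarrier) {
        if (!starts.hasNext()) {
            // Nothing left upstream: signal exhaustion to the caller.
            throw new NoSuchElementException("no starts left");
        }
        List<T> fed = new ArrayList<>();
        if (bodyHasBarrier) {
            // Drain everything so a barrier inside the body sees all starts at once.
            while (starts.hasNext()) {
                fed.add(starts.next());
            }
        } else {
            // No barrier: a single start is enough to keep the body moving.
            fed.add(starts.next());
        }
        return fed;
    }

    public static void main(String[] args) {
        System.out.println(feed(List.of("marko", "vadas", "lop").iterator(), true));
        System.out.println(feed(List.of("marko", "vadas", "lop").iterator(), false));
    }
}
```

Draining only when a barrier is present keeps the lazy, one-at-a-time behavior for barrier-free bodies, which is presumably why the commit checks `hasStepOfAssignableClassRecursively(Barrier.class, repeatTraversal)` rather than always draining.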