Skip to content

Conversation

@BenHenning
Copy link
Collaborator

Fixes #<issue_number_goes_here>

@BenHenning
Copy link
Collaborator Author

BenHenning commented Nov 20, 2025

Looking at this scroll test first: Insert scrolls new block into view. It's failing with an error that the controls_if block isn't in the viewport. Comparing my local Linux run of the test vs. the CI run makes it pretty clear:

Linux local:

block bounds: { bottom: 1016, left: 53.34375, right: 213.34375, top: 904 }
viewport: { bottom: 1265.5, left: -10, right: 561, top: 814.5 }

Mac on CI:

block bounds: {
  bottom: 1882,
  left: 130.41592407226562,
  right: 290.4159240722656,
  top: 1770
}
viewport: { bottom: 1166.5, left: -10, right: 571, top: 705.5 }

The block is obviously way below the viewport for some reason. Also it's interesting to see that the viewport sizes are different which suggests discrepancies in test behaviors. Lack of cross-platform or cross-environment determinism will make this more difficult.

Attempt to use BiDi and upgrade WIO so that viewport manipulation can be
used instead of window management for better compatibility.
@BenHenning
Copy link
Collaborator Author

BenHenning commented Nov 20, 2025

Oh hey. It actually passed when using viewport instead of window size. Cool--I didn't expect that. I was guessing there could be some differences but I wasn't expecting there to be enough of one to cause that much of a coordinate discrepancy. I'm surprised but I don't care about the why quite enough to dig on it.

@BenHenning
Copy link
Collaborator Author

For reference, here are the results running the changed test on each platform.

Linux (local):

block bounds: { top: 904, bottom: 1016, left: 53.34375, right: 213.34375 }
viewport: { top: 740, bottom: 1340, left: -10, right: 570 }

Mac (CI):

block bounds: {
  top: 904,
  bottom: 1016,
  left: 54.20796203613281,
  right: 214.2079620361328
}
viewport: { top: 740, bottom: 1340, left: -10, right: 571 }

They are basically identical now which is great. :D

Unfortunately other Mac failures are likely a different problem entirely. Will dig on those next.

This test suite seems to now pass consistently between Linux & Mac.
@BenHenning
Copy link
Collaborator Author

Aha--enabling BiDi caused a bunch of new failures! Interesting.

@BenHenning
Copy link
Collaborator Author

BenHenning commented Nov 21, 2025

Updating to latest Webdriver seems to fix a few more issues with BiDi and bring us back to basically where we were at before for failures (I think), plus a new failure affecting Linux (maybe only?).

Edit: Actually interestingly the previously fixed test is either failing again, or is failing when run in conjunction with other tests.

It's failing either again (after WIO update) or when run with the rest
of the test suite.
@BenHenning
Copy link
Collaborator Author

Either it's failing again after the upgrade or, maybe more likely, it was flaky and incidentally passed earlier. Latest dimensions with the failure:

block bounds: {
  top: 1770,
  bottom: 1882,
  left: 130.41592407226562,
  right: 290.4159240722656
}
viewport: { top: 636, bottom: 1236, left: -10, right: 571 }

This is more or less what we saw earlier. It seems that the BiDi change is probably not needed (and we could downgrade back if I can find fixes for everything without it). The flake seems to be that, for some reason, the block is being put in the wrong place.

@BenHenning
Copy link
Collaborator Author

Huh. This is surprisingly difficult to break. It seems much more consistent now, but it obviously can still fail per the two repros above.

@BenHenning
Copy link
Collaborator Author

BenHenning commented Nov 21, 2025

It passed 50 times in a row. Okay...this might be really tricky actually. My current working theory is that the extra browser waits to perform the debugging are actually adding a pause that allows scrolling to stabilize, thus fixing the test.

@BenHenning
Copy link
Collaborator Author

The extra pause seems to fix it. Removing all debug logs successfully re-failed the test (though we did see a failure earlier w/o needing to remove the later logs in the test, but oh well).

@BenHenning
Copy link
Collaborator Author

BenHenning commented Nov 21, 2025

Attempting a 50x run of all tests except for the move tests since those time out. I think this might take a while. Will follow up with a rough estimate for completion if none fail.

Edit: Approximately 65-70 seconds per run so I'm guessing about an hour to run through 50 times without failure.

Edit 2: Though that's based on WebdriverIO's reported times. The runner actually took 2.5 minutes to fail so I think there's a lot of overhead not being reported here. That likely means this will take more than 2 hours to run through. I think GitHub has a 6 hour limit on runners so we should be below that.

@BenHenning
Copy link
Collaborator Author

Interestingly running the whole suite actually caused the earlier test to fail again. Will try re-adding all of the logs and see if we can glean anything interesting.

@BenHenning
Copy link
Collaborator Author

I added support for uploading screenshots of failing tests at their point of failure.

The plot thickens because the metrics do not match the rendering here:

Insert scrolls new block into view

@BenHenning
Copy link
Collaborator Author

BenHenning commented Dec 1, 2025

Logs from a Linux run of the failing test:

just moved block bounds: { bottom: 936, left: 0, right: 160, top: 832 }
workspace scroll position before scroll: [ 10, -485 ]
workspace scroll position after scroll: [ 10, -65 ]
current focused node before insert: blockly-u
current focused node after insert: blockly-36
workspace scroll position after insert: [ 10, -809.5 ]
block bounds: { bottom: 1016, left: 53.34375, right: 213.34375, top: 904 }
block position: { x: 53.34375, y: 904 }
block's parent bounds: { bottom: 1040, left: 0, right: 160, top: 832 } id: blockly-u
viewport: { bottom: 1270.5, left: -10, right: 570, top: 809.5 }

Edit: failure from latest OS X run (without BiDi):

just moved block bounds: { bottom: 936, left: 0, right: 160, top: 832 }
workspace scroll position before scroll: [ 10, -485 ]
workspace scroll position after scroll: [ 10, -65 ]
current focused node before insert: blockly-u
current focused node after insert: blockly-36
workspace scroll position after insert: [ 10, -705.5 ]
block bounds: {
  bottom: 1882,
  left: 130.41592407226562,
  right: 290.4159240722656,
  top: 1770
}
block position: { x: 130.41592407226562, y: 1770 }
block's parent bounds: { bottom: 936, left: 0, right: 160, top: 832 } id: blockly-u
viewport: { bottom: 1166.5, left: -10, right: 571, top: 705.5 }

@BenHenning
Copy link
Collaborator Author

BenHenning commented Dec 1, 2025

Latest run (with relative position logs).

Edit: added additional details.

Linux:

just moved block bounds: { bottom: 936, left: 0, right: 160, top: 832 }
workspace scroll position before scroll: [ 10, -485 ]
workspace scroll position after scroll: [ 10, -65 ]
current focused node before insert: blockly-u
current focused node after insert: blockly-36
workspace scroll position after insert: [ 10, -809.5 ]
block bounds: { bottom: 1016, left: 53.34375, right: 213.34375, top: 904 }
block position: { x: 53.34375, y: 904 } relative: { x: 0, y: 48 } null null translate(0, 48) null
block's parent bounds: { bottom: 1040, left: 0, right: 160, top: 832 } id: blockly-u
viewport: { bottom: 1270.5, left: -10, right: 570, top: 809.5 }
matched blocks to controls_if: [ 'blockly-36' ]

OS X:

just moved block bounds: { bottom: 936, left: 0, right: 160, top: 832 }
workspace scroll position before scroll: [ 10, -485 ]
workspace scroll position after scroll: [ 10, -65 ]
current focused node before insert: blockly-u
current focused node after insert: blockly-36
workspace scroll position after insert: [ 10, -705.5 ]
block bounds: {
  bottom: 1882,
  left: 130.41592407226562,
  right: 290.4159240722656,
  top: 1770
}
block position: { x: 130.41592407226562, y: 1770 } relative: { x: 76.20796203613281, y: 914 } null null translate(76.20796203613281, 914) null
block's parent bounds: { bottom: 936, left: 0, right: 160, top: 832 } id: blockly-u
viewport: { bottom: 1166.5, left: -10, right: 571, top: 705.5 }
matched blocks to controls_if: [ 'blockly-36' ]

@BenHenning
Copy link
Collaborator Author

Still pretty stuck on this issue. It seems that the newly inserted block's position is very far down (its actual translation), but it's not being rendered like that per the screenshot. Part of me is wondering if more than one block is being matched with getBlocksByType.

@BenHenning
Copy link
Collaborator Author

Interesting. I added a save screenshot line before the logs to see if perhaps there's some weird delay happening with the block layout that's not actually being represented in the screenshot (i.e. it happens after the test finishes). Adding this line fixes the test, yet the 5 second delay doesn't. Something very odd is happening here.

@BenHenning
Copy link
Collaborator Author

Not sure if that actually had the desired effect, but I tried to force the user agent over to pretending to be JavaFX to try and force Blockly into immediate rendering mode.

Right now I am suspicious of requestAnimationFrame. My limited understanding of this function is that it's simply a request for the browser to perform some action at the next rendering frame. However, the browser may choose to not call this in certain cases. I'm wondering right now if headless mode is affecting how requestAnimationFrame behaves (oddly differently on Mac--not sure about that yet) which might explain why snapping a screenshot fixes the problem since it would force the browse to perform a full render frame (and thus resolve any pending animation frame requests, thus allowing Blockly to render).

Not expecting this to actually make a difference.

Also speed up tests by skipping non-WDIO tests.
@BenHenning
Copy link
Collaborator Author

I have found success! I was trying to use headless=new from some rough searching, but this seems to be the default in Chrome now (as far as I can tell) so it expectedly did nothing.

I wanted to test my theory about requestAnimationFrame in more detail. I was having some difficultly overriding core Blockly's render_management.ts behaviors in plugin tests, so I linked against a version of Blockly where I essentially disabled everything queueRender does. Voila--I get the exact same failure for this test when run locally. I think this essentially confirms the theory. It doesn't explain why it's happening specifically with this test and so consistently (to that I can only guess), but it may actually explain some of the other instability being seen (possibly also on Linux, though I am also suspecting Chrome's rendering engine may simply behave differently on OS X--though the fact that can actually affect DOM behavior isn't great but alas).

@BenHenning
Copy link
Collaborator Author

Okay, the plot thickens further. I don't have a deep understanding of Blockly rendering but I may need to develop it to make further progress here. Essentially from what I can tell the newly inserted block gets some bogus coordinates set if requestAnimationFrame isn't called as expected. This is especially odd because even forcing rendering with Blockly.getMainWorkspace().render() never corrects the position--the block remains forever displaced.

This can be very easily reproduced by updating core Blockly here:

https://github.com/RaspberryPiFoundation/blockly/blob/91ef951e92576a090aea7c675bafd37a93ac369a/core/render_management.ts#L48-L66

And replacing that with the following version:

export function queueRender(block: BlockSvg): Promise<void> {
  queueBlock(block);
  return Promise.resolve();
}

This simulates requestAnimationFrame not being called. This will break a few tests though that can be worked around by adding some force workspace rendering calls in select places in the plugin tests. Doing that still causes exactly the scroll test to continue to fail and exactly in the same way as in CI (this repros consistently on Linux with this setup).

My best guess until I can dig further is that rendering is ultimately non-deterministic and has the following effects:

  • It somehow impacts block layouting.
  • Block layouting likely feeds back into rendering or later logical decisions (probably because core Blockly sources block position from the SVG itself).
  • Further attempts to re-render never fix the situation because the block itself got into a logically bad place due to a missed frame.

This almost certainly indicates a real bug in core Blockly. It's valid for browsers to delay frames which could, theoretically, result in Blockly logic executing on not-yet-updated rendering properties (which is ultimately what's happening in this test and breaking core Blockly state).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant