Attempt to fix mac tests #770

BenHenning · 2025-11-20T21:38:27Z

Fixes #<issue_number_goes_here>

BenHenning · 2025-11-20T22:21:44Z

Looking at this scroll test first: Insert scrolls new block into view. It's failing with an error that the controls_if block isn't in the viewport. Comparing my local Linux run of the test vs. the CI run makes it pretty clear:

Linux local:

block bounds: { bottom: 1016, left: 53.34375, right: 213.34375, top: 904 }
viewport: { bottom: 1265.5, left: -10, right: 561, top: 814.5 }

Mac on CI:

block bounds: {
  bottom: 1882,
  left: 130.41592407226562,
  right: 290.4159240722656,
  top: 1770
}
viewport: { bottom: 1166.5, left: -10, right: 571, top: 705.5 }

The block is obviously way below the viewport for some reason. Also it's interesting to see that the viewport sizes are different which suggests discrepancies in test behaviors. Lack of cross-platform or cross-environment determinism will make this more difficult.

Attempt to use BiDi and upgrade WIO so that viewport manipulation can be used instead of window management for better compatibility.

BenHenning · 2025-11-20T23:30:55Z

Oh hey. It actually passed when using viewport instead of window size. Cool--I didn't expect that. I was guessing there could be some differences but I wasn't expecting there to be enough of one to cause that much of a coordinate discrepancy. I'm surprised but I don't care about the why quite enough to dig on it.

BenHenning · 2025-11-20T23:33:00Z

For reference, here are the results running the changed test on each platform.

Linux (local):

block bounds: { top: 904, bottom: 1016, left: 53.34375, right: 213.34375 }
viewport: { top: 740, bottom: 1340, left: -10, right: 570 }

Mac (CI):

block bounds: {
  top: 904,
  bottom: 1016,
  left: 54.20796203613281,
  right: 214.2079620361328
}
viewport: { top: 740, bottom: 1340, left: -10, right: 571 }

They are basically identical now which is great. :D

Unfortunately other Mac failures are likely a different problem entirely. Will dig on those next.

This test suite seems to now pass consistently between Linux & Mac.

BenHenning · 2025-11-20T23:37:59Z

Aha--enabling BiDi caused a bunch of new failures! Interesting.

BenHenning · 2025-11-21T00:01:19Z

Updating to latest Webdriver seems to fix a few more issues with BiDi and bring us back to basically where we were at before for failures (I think), plus a new failure affecting Linux (maybe only?).

Edit: Actually interestingly the previously fixed test is either failing again, or is failing when run in conjunction with other tests.

It's failing either again (after WIO update) or when run with the rest of the test suite.

BenHenning · 2025-11-21T00:08:46Z

Either it's failing again after the upgrade or, maybe more likely, it was flaky and incidentally passed earlier. Latest dimensions with the failure:

block bounds: {
  top: 1770,
  bottom: 1882,
  left: 130.41592407226562,
  right: 290.4159240722656
}
viewport: { top: 636, bottom: 1236, left: -10, right: 571 }

This is more or less what we saw earlier. It seems that the BiDi change is probably not needed (and we could downgrade back if I can find fixes for everything without it). The flake seems to be that, for some reason, the block is being put in the wrong place.

BenHenning · 2025-11-21T00:31:21Z

Huh. This is surprisingly difficult to break. It seems much more consistent now, but it obviously can still fail per the two repros above.

BenHenning · 2025-11-21T01:11:05Z

It passed 50 times in a row. Okay...this might be really tricky actually. My current working theory is that the extra browser waits to perform the debugging are actually adding a pause that allows scrolling to stabilize, thus fixing the test.

BenHenning · 2025-11-21T01:26:09Z

The extra pause seems to fix it. Removing all debug logs successfully re-failed the test (though we did see a failure earlier w/o needing to remove the later logs in the test, but oh well).

BenHenning · 2025-11-21T01:30:02Z

Attempting a 50x run of all tests except for the move tests since those time out. I think this might take a while. Will follow up with a rough estimate for completion if none fail.

Edit: Approximately 65-70 seconds per run so I'm guessing about an hour to run through 50 times without failure.

Edit 2: Though that's based on WebdriverIO's reported times. The runner actually took 2.5 minutes to fail so I think there's a lot of overhead not being reported here. That likely means this will take more than 2 hours to run through. I think GitHub has a 6 hour limit on runners so we should be below that.

BenHenning · 2025-11-21T01:35:09Z

Interestingly running the whole suite actually caused the earlier test to fail again. Will try re-adding all of the logs and see if we can glean anything interesting.

BenHenning · 2025-11-21T03:41:59Z

I added support for uploading screenshots of failing tests at their point of failure.

The plot thickens because the metrics do not match the rendering here:

BenHenning · 2025-12-01T22:37:13Z

Logs from a Linux run of the failing test:

just moved block bounds: { bottom: 936, left: 0, right: 160, top: 832 }
workspace scroll position before scroll: [ 10, -485 ]
workspace scroll position after scroll: [ 10, -65 ]
current focused node before insert: blockly-u
current focused node after insert: blockly-36
workspace scroll position after insert: [ 10, -809.5 ]
block bounds: { bottom: 1016, left: 53.34375, right: 213.34375, top: 904 }
block position: { x: 53.34375, y: 904 }
block's parent bounds: { bottom: 1040, left: 0, right: 160, top: 832 } id: blockly-u
viewport: { bottom: 1270.5, left: -10, right: 570, top: 809.5 }

Edit: failure from latest OS X run (without BiDi):

just moved block bounds: { bottom: 936, left: 0, right: 160, top: 832 }
workspace scroll position before scroll: [ 10, -485 ]
workspace scroll position after scroll: [ 10, -65 ]
current focused node before insert: blockly-u
current focused node after insert: blockly-36
workspace scroll position after insert: [ 10, -705.5 ]
block bounds: {
  bottom: 1882,
  left: 130.41592407226562,
  right: 290.4159240722656,
  top: 1770
}
block position: { x: 130.41592407226562, y: 1770 }
block's parent bounds: { bottom: 936, left: 0, right: 160, top: 832 } id: blockly-u
viewport: { bottom: 1166.5, left: -10, right: 571, top: 705.5 }

BenHenning · 2025-12-01T22:58:19Z

Latest run (with relative position logs).

Edit: added additional details.

Linux:

just moved block bounds: { bottom: 936, left: 0, right: 160, top: 832 }
workspace scroll position before scroll: [ 10, -485 ]
workspace scroll position after scroll: [ 10, -65 ]
current focused node before insert: blockly-u
current focused node after insert: blockly-36
workspace scroll position after insert: [ 10, -809.5 ]
block bounds: { bottom: 1016, left: 53.34375, right: 213.34375, top: 904 }
block position: { x: 53.34375, y: 904 } relative: { x: 0, y: 48 } null null translate(0, 48) null
block's parent bounds: { bottom: 1040, left: 0, right: 160, top: 832 } id: blockly-u
viewport: { bottom: 1270.5, left: -10, right: 570, top: 809.5 }
matched blocks to controls_if: [ 'blockly-36' ]

OS X:

just moved block bounds: { bottom: 936, left: 0, right: 160, top: 832 }
workspace scroll position before scroll: [ 10, -485 ]
workspace scroll position after scroll: [ 10, -65 ]
current focused node before insert: blockly-u
current focused node after insert: blockly-36
workspace scroll position after insert: [ 10, -705.5 ]
block bounds: {
  bottom: 1882,
  left: 130.41592407226562,
  right: 290.4159240722656,
  top: 1770
}
block position: { x: 130.41592407226562, y: 1770 } relative: { x: 76.20796203613281, y: 914 } null null translate(76.20796203613281, 914) null
block's parent bounds: { bottom: 936, left: 0, right: 160, top: 832 } id: blockly-u
viewport: { bottom: 1166.5, left: -10, right: 571, top: 705.5 }
matched blocks to controls_if: [ 'blockly-36' ]

BenHenning · 2025-12-01T23:23:05Z

Still pretty stuck on this issue. It seems that the newly inserted block's position is very far down (its actual translation), but it's not being rendered like that per the screenshot. Part of me is wondering if more than one block is being matched with getBlocksByType.

BenHenning · 2025-12-01T23:57:33Z

Interesting. I added a save screenshot line before the logs to see if perhaps there's some weird delay happening with the block layout that's not actually being represented in the screenshot (i.e. it happens after the test finishes). Adding this line fixes the test, yet the 5 second delay doesn't. Something very odd is happening here.

BenHenning · 2025-12-02T00:34:39Z

Not sure if that actually had the desired effect, but I tried to force the user agent over to pretending to be JavaFX to try and force Blockly into immediate rendering mode.

Right now I am suspicious of requestAnimationFrame. My limited understanding of this function is that it's simply a request for the browser to perform some action at the next rendering frame. However, the browser may choose to not call this in certain cases. I'm wondering right now if headless mode is affecting how requestAnimationFrame behaves (oddly differently on Mac--not sure about that yet) which might explain why snapping a screenshot fixes the problem since it would force the browse to perform a full render frame (and thus resolve any pending animation frame requests, thus allowing Blockly to render).

Not expecting this to actually make a difference. Also speed up tests by skipping non-WDIO tests.

BenHenning · 2025-12-02T23:58:24Z

I have found success! I was trying to use headless=new from some rough searching, but this seems to be the default in Chrome now (as far as I can tell) so it expectedly did nothing.

I wanted to test my theory about requestAnimationFrame in more detail. I was having some difficultly overriding core Blockly's render_management.ts behaviors in plugin tests, so I linked against a version of Blockly where I essentially disabled everything queueRender does. Voila--I get the exact same failure for this test when run locally. I think this essentially confirms the theory. It doesn't explain why it's happening specifically with this test and so consistently (to that I can only guess), but it may actually explain some of the other instability being seen (possibly also on Linux, though I am also suspecting Chrome's rendering engine may simply behave differently on OS X--though the fact that can actually affect DOM behavior isn't great but alas).

BenHenning · 2025-12-03T00:59:12Z

Okay, the plot thickens further. I don't have a deep understanding of Blockly rendering but I may need to develop it to make further progress here. Essentially from what I can tell the newly inserted block gets some bogus coordinates set if requestAnimationFrame isn't called as expected. This is especially odd because even forcing rendering with Blockly.getMainWorkspace().render() never corrects the position--the block remains forever displaced.

This can be very easily reproduced by updating core Blockly here:

https://github.com/RaspberryPiFoundation/blockly/blob/91ef951e92576a090aea7c675bafd37a93ac369a/core/render_management.ts#L48-L66

And replacing that with the following version:

export function queueRender(block: BlockSvg): Promise<void> {
  queueBlock(block);
  return Promise.resolve();
}

This simulates requestAnimationFrame not being called. This will break a few tests though that can be worked around by adding some force workspace rendering calls in select places in the plugin tests. Doing that still causes exactly the scroll test to continue to fail and exactly in the same way as in CI (this repros consistently on Linux with this setup).

My best guess until I can dig further is that rendering is ultimately non-deterministic and has the following effects:

It somehow impacts block layouting.
Block layouting likely feeds back into rendering or later logical decisions (probably because core Blockly sources block position from the SVG itself).
Further attempts to re-render never fix the situation because the block itself got into a logically bad place due to a missed frame.

This almost certainly indicates a real bug in core Blockly. It's valid for browsers to delay frames which could, theoretically, result in Blockly logic executing on not-yet-updated rendering properties (which is ultimately what's happening in this test and breaking core Blockly state).

BenHenning added 3 commits November 20, 2025 20:51

fix: Fix failing NPM build.

8101a16

chore: Disable workflows for investigation.

b85a4b1

chore: Debug & disable test for investigation.

8859f98

chore: More investigation work.

ae9ee30

Attempt to use BiDi and upgrade WIO so that viewport manipulation can be used instead of window management for better compatibility.

chore: Clean up test & re-enable all.

27785c6

This test suite seems to now pass consistently between Linux & Mac.

BenHenning added 3 commits November 20, 2025 23:47

chore: Try upgrading WebdriverIO.

d5497fe

Merge branch 'main' into update-npm-lock-file

3d7c478

Merge branch 'update-npm-lock-file' into attempt-to-fix-mac-tests

75d0c73

chore: Re-isolate scroll test.

8d56eba

It's failing either again (after WIO update) or when run with the rest of the test suite.

BenHenning added 2 commits November 21, 2025 00:12

chore: Add more debug logs.

af1dbaa

chore: Run tests up to 50 times.

2e210fe

BenHenning added 2 commits November 21, 2025 00:52

chore: Speed & reduce tests for CI investigating.

2187b2c

chore: Fix path changing.

dabaad5

BenHenning added 3 commits November 21, 2025 01:12

chore: Remove debug logs to test theory.

4b97659

chore: Try to remove more logs.

cefb286

chore: Try adding a pause.

28a32f4

chore: Add comment & enable most tests.

059092f

chore: Re-add removed comment.

531979e

chore: Re-add logs for investigation.

efecfd9

BenHenning added 7 commits November 21, 2025 02:32

chore: Try to 'jiggle' block.

4b4c842

chore: Try using pause time.

2cc7478

feat: Record & upload test failures in CI.

27b375f

chore: Remove pause timeout.

ffed2d1

chore: Hopefully fix screenshots in CI.

b0ef7f9

chore: Add missing 'cd'.

c757c64

chore: Fix artifact path.

cb6f91f

BenHenning added 3 commits December 1, 2025 22:19

chore: Add position log.

d44fec2

chore: Attempt to disable BiDi again.

a8845c8

chore: Fix syntax error.

e850385

BenHenning added 2 commits December 1, 2025 22:53

chore: More logs on relative position.

f8039b4

chore: Fix test.

1d3b366

BenHenning added 2 commits December 1, 2025 23:08

chore: Add more to log.

ed9d2db

chore: Try adding a long pause.

f504532

BenHenning added 4 commits December 1, 2025 23:24

chore: Add another log statement.

6ee17d7

chore: Fix log output.

7023e09

chore: Add additional screenshot.

bba3633

Merge branch 'main' into attempt-to-fix-mac-tests

d731e9e

BenHenning added 3 commits December 1, 2025 23:57

chore: Restore PAUSE_TIME.

5c505a1

chore: Attempt at synchronizing.

cc40d71

chore: Attempt to pretend to be JavaFX.

14a2a3c

chore: Attempt to use --headless=new.

c7a5087

Not expecting this to actually make a difference. Also speed up tests by skipping non-WDIO tests.

Attempt to fix mac tests #770

Are you sure you want to change the base?

Attempt to fix mac tests #770

Uh oh!

Conversation

BenHenning commented Nov 20, 2025

Uh oh!

BenHenning commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BenHenning commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BenHenning commented Nov 20, 2025

Uh oh!

BenHenning commented Nov 20, 2025

Uh oh!

BenHenning commented Nov 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BenHenning commented Nov 21, 2025

Uh oh!

BenHenning commented Nov 21, 2025

Uh oh!

BenHenning commented Nov 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BenHenning commented Nov 21, 2025

Uh oh!

BenHenning commented Nov 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BenHenning commented Nov 21, 2025

Uh oh!

BenHenning commented Nov 21, 2025

Uh oh!

BenHenning commented Dec 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BenHenning commented Dec 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BenHenning commented Dec 1, 2025

Uh oh!

BenHenning commented Dec 1, 2025

Uh oh!

BenHenning commented Dec 2, 2025

Uh oh!

BenHenning commented Dec 2, 2025

Uh oh!

BenHenning commented Dec 3, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

BenHenning commented Nov 20, 2025 •

edited

Loading

BenHenning commented Nov 20, 2025 •

edited

Loading

BenHenning commented Nov 21, 2025 •

edited

Loading

BenHenning commented Nov 21, 2025 •

edited

Loading

BenHenning commented Nov 21, 2025 •

edited

Loading

BenHenning commented Dec 1, 2025 •

edited

Loading

BenHenning commented Dec 1, 2025 •

edited

Loading