[BUG]: Page cache usage causes post-write sync to fail #1216

@P33M

Description

What happened?

This bug is specific to the Linux variant of Imager. When writing an image to a storage device that is also the target of a device-specific synchronisation operation, the sync op can time out, with effects ranging from hung-task warnings in dmesg to a device reset (see the reproducers below). A prerequisite is that the destination storage writes more slowly than the image can be read and decompressed, which is typical for most target SD cards.

Here are two reproducers:

  1. Pi 5 8GB running Pi OS booted from USB and writing an image to SD

Insert a blank class A1 SD card into the SD slot and use Imager in CLI mode to write an image to it.

While writing, the reported buffers/page cache usage climbs until it consumes the vast majority of the free RAM. At or near the 100% step in the progress bar, the sync op is issued and takes more than 2 minutes to complete, which causes a splat in dmesg:

[  726.451032] INFO: task kworker/1:0:2147 blocked for more than 120 seconds.
[  726.451043]       Not tainted 6.12.47-v8-16k+ #635
[  726.451046] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  726.451048] task:kworker/1:0     state:D stack:0     pid:2147  tgid:2147  ppid:2      flags:0x00000008
[  726.451056] Workqueue: events_freezable mmc_rescan
[  726.451067] Call trace:
[  726.451069]  __switch_to+0xf0/0x160
[  726.451076]  __schedule+0x330/0xb68
[  726.451080]  schedule+0x3c/0x148
[  726.451084]  __mmc_claim_host+0xbc/0x1f0
[  726.451088]  mmc_get_card+0x3c/0x58
[  726.451093]  mmc_sd_detect+0x28/0xa0
[  726.451097]  mmc_rescan+0x94/0x330
[  726.451101]  process_one_work+0x15c/0x3c0
[  726.451107]  worker_thread+0x2e4/0x3f0
[  726.451111]  kthread+0x120/0x130
[  726.451115]  ret_from_fork+0x10/0x20
[  811.206493]  mmcblk0: p1 p2
[  811.282818]  mmcblk0: p1 p2

The process eventually succeeds (the final write to the partition table causes a re-enumeration).
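A rough back-of-envelope check shows how a final sync can outlast the 120-second hung-task threshold seen in the log. The figures below are assumptions chosen for illustration, not measurements from this report: 2 GiB of data still waiting to be written back, and a sustained card write speed of 10 MB/s. `flush_seconds` is a hypothetical helper name.

```python
HUNG_TASK_TIMEOUT_S = 120  # threshold quoted in the dmesg splat above


def flush_seconds(pending_bytes: int, write_bytes_per_s: float) -> float:
    """Time needed to drain pending write-back at a steady device write speed."""
    if write_bytes_per_s <= 0:
        raise ValueError("write speed must be positive")
    return pending_bytes / write_bytes_per_s


# Assumed, illustrative figures (not taken from the report).
pending = 2 * 1024**3   # 2 GiB still to be written back
speed = 10 * 1000**2    # 10 MB/s sustained write speed

estimate = flush_seconds(pending, speed)
print(f"estimated flush time: {estimate:.0f} s")
print(f"exceeds hung-task timeout: {estimate > HUNG_TASK_TIMEOUT_S}")
```

Under these assumptions the device stays busy for well over two minutes once the sync is issued, which is consistent with another task blocking on the MMC host for longer than the timeout.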

  2. Writing to a Pi exposing storage via USB mass-storage gadget

A variation of this is seen with a Pi 4/5 4GB running Pi OS and writing to a second Pi 4/5 that exposes its SD card via the mass-storage gadget. The dirty page count on the gadget Pi climbs to a significant fraction of total RAM.
In this case, the sync op on the mass-storage interface times out and causes Linux on the host to reset the device, which the gadget handles badly.
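One way to watch the dirty page build-up while reproducing this is to sample the `Dirty` and `Writeback` fields of `/proc/meminfo` on the Pi in question. The following is a minimal sketch; the function names are hypothetical and it is not part of Imager.

```python
import re


def dirty_stats_kib(meminfo_text: str) -> dict:
    """Extract MemTotal, Dirty and Writeback (in kB) from /proc/meminfo text."""
    wanted = {"MemTotal", "Dirty", "Writeback"}
    stats = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"^(\w+):\s+(\d+)\s*kB", line)
        if match and match.group(1) in wanted:
            stats[match.group(1)] = int(match.group(2))
    return stats


def dirty_fraction(stats: dict) -> float:
    """Fraction of total RAM that is dirty or currently under write-back."""
    return (stats["Dirty"] + stats["Writeback"]) / stats["MemTotal"]


def read_dirty_fraction(path: str = "/proc/meminfo") -> float:
    """Sample the live value on a Linux system."""
    with open(path) as f:
        return dirty_fraction(dirty_stats_kib(f.read()))
```

Calling `read_dirty_fraction()` once a second during a write should show whether the fraction keeps climbing, as described above.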

In both cases, Imager should write synchronously to the underlying block device, or checkpoint its writes at intervals. This would avoid the buffer bloat, which also makes the progress bar quite inaccurate.
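The checkpointing idea can be sketched as follows. This is an illustration only, not Imager's implementation: the function name, the 1 MiB chunk size and the 64 MiB checkpoint interval are all assumptions. The sketch calls `fsync` after every `sync_every` bytes so that the amount of unwritten data stays bounded, and only reports progress for data that has actually been flushed.

```python
import os


def write_with_checkpoints(src, dst_fd: int, chunk_size: int = 1 << 20,
                           sync_every: int = 64 << 20, on_progress=None) -> int:
    """Copy the binary file object `src` to `dst_fd`, fsync-ing at intervals.

    Returns the total number of bytes written.
    """
    total = 0
    since_sync = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        view = memoryview(chunk)
        while len(view) > 0:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(dst_fd, view)
            view = view[written:]
        total += len(chunk)
        since_sync += len(chunk)
        if since_sync >= sync_every:
            os.fsync(dst_fd)  # checkpoint: wait until this data is on the device
            since_sync = 0
            if on_progress:
                on_progress(total)
    os.fsync(dst_fd)  # final flush covers the remainder
    if on_progress:
        on_progress(total)
    return total
```

Because each checkpoint blocks until the device has caught up, the final sync only has to flush at most one interval's worth of data. Other approaches, such as opening the device with `O_DIRECT` or `O_SYNC`, would also bound the amount of dirty data; which trade-off suits Imager is left open here.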

Version

1.9.6 (Default)

What host operating system were you using?

Debian and derivatives (eg Ubuntu)

Host OS Version

Raspberry Pi OS bookworm

Selected OS

Raspberry Pi OS bookworm

Which Raspberry Pi Device are you using?

Raspberry Pi 5, 500, and Compute Modules 5

What kind of storage device are you using?

Other

OS Customisation

  • Yes, I was using OS Customisation when the bug occurred.

Relevant log output

Metadata

Labels

bug (Something isn't working)