Linux XFS filesystem development
* xfs_scrub_all process execution results in a dead lock condition
@ 2026-04-30 13:00 Patrick Fischer
  2026-04-30 15:50 ` Darrick J. Wong
From: Patrick Fischer @ 2026-04-30 13:00 UTC
  To: linux-xfs

Hello,
I've encountered a bug in the xfsprogs-dev utilities, specifically in the Python script xfs_scrub_all.
I checked the master branch and saw that the subprocess call there is the same as on the system running kernel version 6.13.0-2, where I stumbled across this issue.

Overview:
The xfs_scrub_all.service systemd unit (or a manual run of xfs_scrub_all) hangs because the stdout pipe of the lsblk subprocess fills up.

Steps to reproduce:
Create a bunch of fake block devices to enlarge the output of lsblk to more than 65520 bytes:
> modprobe scsi_debug max_lunx=3 num_tgts=7 add_hosts=100


Run the command from xfs_scrub_all.service manually:
> /usr/sbin/xfs_scrub_all --auto-media-scan-interval 1mo


xfs_scrub_all sits in wait4() on the lsblk subprocess:
> wait4(2148527,


The lsblk subprocess is in the middle of a write to fd 1 (stdout):
> write(1, "                     {\n         "..., 4096


Affected Code in /usr/sbin/xfs_scrub_all[1]:
> cmd=['lsblk', '-o', 'NAME,KNAME,TYPE,FSTYPE,MOUNTPOINT', '-J']
> result = subprocess.Popen(cmd, stdout=subprocess.PIPE)
> result.wait()
> if result.returncode != 0:
>     return fs


Actual Results:
The command above launches lsblk as a subprocess. When lsblk writes more than 65520 bytes to stdout, the pipe fills, lsblk blocks in write(), and xfs_scrub_all waits forever for it to exit.

Expected Results:
The unit / process should not deadlock.

[1] https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/tree/scrub/xfs_scrub_all.py.in

Regards,
Patrick Fischer

This e-mail was scanned by IKARUS mail.security.


* Re: xfs_scrub_all process execution results in a dead lock condition
  2026-04-30 13:00 xfs_scrub_all process execution results in a dead lock condition Patrick Fischer
@ 2026-04-30 15:50 ` Darrick J. Wong
From: Darrick J. Wong @ 2026-04-30 15:50 UTC
  To: Patrick Fischer; +Cc: linux-xfs

On Thu, Apr 30, 2026 at 03:00:01PM +0200, Patrick Fischer wrote:
> Hello,
> I've encountered a bug in the xfsprogs-dev utilities, specifically in
> the Python script xfs_scrub_all.  I checked the master branch and saw
> that the subprocess call there is the same as on the system running
> kernel version 6.13.0-2, where I stumbled across this issue.

"6.13.0-2" ... Debian Trixie?

> Overview:
> The xfs_scrub_all.service systemd unit (or a manual run of
> xfs_scrub_all) hangs because the stdout pipe of the lsblk subprocess
> fills up.
> 
> Steps to reproduce:
> Create a bunch of fake block devices to enlarge the output of lsblk to
> more than 65520 bytes:
> > modprobe scsi_debug max_lunx=3 num_tgts=7 add_hosts=100

/me notes that this wasn't enough to generate more than 60k of lsblk
output.  Creating a fake 130k json file and changing cmd to
['cat', '/tmp/garbage'] was sufficient to reproduce the problem,
however.
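
For anyone who wants to see the stall in isolation, here is a minimal
sketch of the same Popen-then-wait pattern. It assumes a small Python
child stands in for lsblk (or for the 'cat /tmp/garbage' trick above);
the function name wait_without_reading, its parameters, and the timeout
are illustrative and not part of xfs_scrub_all. The timeout is only there
so that the demo reports the stall instead of hanging:

```python
import subprocess
import sys


def wait_without_reading(nbytes=130000, timeout=1.0):
    """Return (stalled, bytes_read) for a child that writes nbytes to stdout.

    Illustrative helper only; the child is a stand-in for a large lsblk listing.
    """
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write('x' * %d)" % nbytes]
    proc = subprocess.Popen(child, stdout=subprocess.PIPE)
    try:
        # Same pattern as the reported code: wait before reading the pipe.
        proc.wait(timeout=timeout)
        stalled = False
    except subprocess.TimeoutExpired:
        # The child is still alive, blocked writing into a full pipe.
        stalled = True
    # communicate() reads stdout until EOF, which lets the child finish.
    out, _ = proc.communicate()
    return stalled, len(out)


if __name__ == "__main__":
    print(wait_without_reading())
```

With output larger than the pipe capacity, wait() should time out here,
and the child only exits once communicate() starts draining the pipe.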

> Run the command from xfs_scrub_all.service manually:
> > /usr/sbin/xfs_scrub_all --auto-media-scan-interval 1mo
> 
> 
> xfs_scrub_all sits in wait4() on the lsblk subprocess:
> > wait4(2148527,
> 
> 
> The lsblk subprocess is in the middle of a write to fd 1 (stdout):
> > write(1, "                     {\n         "..., 4096
> 
> 
> Affected Code in /usr/sbin/xfs_scrub_all[1]:
> > cmd=['lsblk', '-o', 'NAME,KNAME,TYPE,FSTYPE,MOUNTPOINT', '-J']
> > result = subprocess.Popen(cmd, stdout=subprocess.PIPE)
> > result.wait()
> > if result.returncode != 0:
> >     return fs
> 
> 
> Actual Results:
> The command above launches lsblk as a subprocess. When lsblk writes
> more than 65520 bytes to stdout, the pipe fills, lsblk blocks in
> write(), and xfs_scrub_all waits forever for it to exit.

Yep, that's a bug.

Now that we can assume (demand?) python >= 3.5, I think we can replace
all the string iteration and collection mess in that function with a
simpler call to subprocess.run:

	cmd=['lsblk', '-o', 'NAME,KNAME,TYPE,FSTYPE,MOUNTPOINT', '-J']
	try:
		proc = subprocess.run(cmd, capture_output = True, text = True, check = True)
	except Exception as e:
		print(e)
		return fs
	if proc.returncode != 0:
		return fs

	# The lsblk output had better be in disks-then-partitions order
	bdevdata = json.loads(proc.stdout)
	for bdev in bdevdata['blockdevices']:
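
As a self-contained illustration of that idea (a sketch only, not the
eventual patch): the helper name list_block_devices and its empty-list
fallback are made up for this example. Note that subprocess.run itself
dates from Python 3.5, but the capture_output and text arguments were
added in 3.7, so this form assumes 3.7 or newer:

```python
import json
import subprocess
import sys

LSBLK_CMD = ['lsblk', '-o', 'NAME,KNAME,TYPE,FSTYPE,MOUNTPOINT', '-J']


def list_block_devices(cmd=LSBLK_CMD):
    """Hypothetical helper: return the 'blockdevices' list from cmd, or []."""
    try:
        # run() drains stdout while the child is running, so a large
        # listing cannot fill the pipe and stall the parent.  check=True
        # raises CalledProcessError on a nonzero exit status, so no
        # separate returncode test is needed afterwards.
        proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return json.loads(proc.stdout)['blockdevices']
    except (OSError, subprocess.CalledProcessError, ValueError, KeyError) as e:
        # OSError: command missing; ValueError: malformed JSON output.
        print(e, file=sys.stderr)
        return []


if __name__ == "__main__":
    for bdev in list_block_devices():
        print(bdev.get('name'), bdev.get('type'))
```

Catching the specific exceptions instead of a bare Exception is a choice
made for this sketch; either way the caller just sees "no devices".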

> Expected Results:
> The unit / process should not deadlock.
> 
> [1] https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/tree/scrub/xfs_scrub_all.py.in

Thanks for the report, sorry there was a bug.  I'll post a fixpatch
soon.

--D

> 
> Regards,
> Patrick Fischer
> 
> This e-mail was scanned by IKARUS mail.security.
> 
