From: duchangbin <changbin.du@huawei.com>
To: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: duchangbin <changbin.du@huawei.com>,
Jonathan Corbet <corbet@lwn.net>,
Mauro Carvalho Chehab <mchehab@kernel.org>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] tools: jobserver: Add validation for jobserver tokens to ensure valid '+' characters
Date: Thu, 8 Jan 2026 02:58:05 +0000 [thread overview]
Message-ID: <9ec672bde2cc4b14905175ca22cbb737@huawei.com> (raw)
In-Reply-To: <aV4zoXfoKJE0Id4e@foz.lan>
On Wed, Jan 07, 2026 at 11:42:38AM +0100, Mauro Carvalho Chehab wrote:
> >
> > It would be nice if you could provide more details about how to reproduce it.
> > Are you doing anything special? What distro are you using? what python version?
> >
> > > When this problem occurs, the current implementation deadlocks because for regular files,
> > > os.read() returns empty bytes after reaching EOF, creating an infinite loop. My workaround
> > > is to ignore this error condition to prevent deadlock, although this means the jobserver
> > > protocol will no longer be honored.
> >
> > testing if slot is empty makes sense, but why testing if it is "+"?
> >
> > >
> > > As you suggested above, We can output an error message to stderr to inform users, but
> > > must not use stdout, as it would corrupt the tool's normal output stream.
> >
>
> After thinking a little bit more about this, IMHO the best is to have
> two separate patches (assuming that there is a good reason why ensuring that the
> slot's character is "+"):
>
> > You could do something like (untested):
> >
> > while True:
> > try:
> > slot = os.read(self.reader, 8)
> > + if not slot:
> > + # Stop at the end of the jobserver queue.
> > + break
>
> This would be patch 1, to overcome some issue (probably due to Python
> version) that reading past EOF won't rise an exception. I would very much
> want to see what python version you're using and see if some other
> exception arose (like EOFError), properly described at the patch description.
>
My Python is 3.12.3, and GNU Make is 4.3. But I don't think the issue is caused
by the Python interpreter. Instead, my shell opened /etc/passwd for some reason
without closing it, and as a result, all child processes inherited this fd3 file
descriptor.
$ ls -l /proc/self/fd
total 0
lrwx------ 1 changbin changbin 64 Jan 8 10:40 0 -> /dev/pts/0
lrwx------ 1 changbin changbin 64 Jan 8 10:40 1 -> /dev/pts/0
lrwx------ 1 changbin changbin 64 Jan 8 10:40 2 -> /dev/pts/0
lr-x------ 1 changbin changbin 64 Jan 8 10:40 3 -> /etc/passwd
lr-x------ 1 changbin changbin 64 Jan 8 10:40 4 -> /proc/2468162/fd
In this case, make should open a new file descriptor for jobserver control, but
clearly, it did not do so and instead still passed fd 3. Once I get a chance,
I'll look into how Make 4.3 actually works.
> > + # Why do we need this?
> > + if any(c != b'+'[0] for c in slot):
> > + print("Warning: invalid jobserver slots", file=sys.stderr)
> > + break
>
> This seems to be a separate issue. Why do we need to enforce that the slot data
> is "+"? If it doesn't, why this would be a problem?
>
> Btw, reading:
>
> https://www.gnu.org/software/make/manual/html_node/POSIX-Jobserver.html
>
> We have:
>
> "In both implementations of the jobserver, the pipe will be pre-loaded with
> one single-character token for each available job. To obtain an extra slot
> you must read a single character from the jobserver; to release a slot you
> must write a single character back into the jobserver.
>
> It’s important that when you release the job slot, you write back the same
> character you read. Don’t assume that all tokens are the same character;
> different characters may have different meanings to GNU make. The order is
> not important, since make has no idea in what order jobs will complete anyway."
>
> So, a 100% compliant POSIX jobserver code shall not test for "+", but, instead,
> preserve whatever character is there.
>
> Yet, checking for "+" is really needed, please add a rationale at the patch
> description justifying why. On such case, we should still:
>
> - release the slot(s) we don't want by writing the character via
> os.write();
> - print a warning message about why we rejected the slot(s).
>
Thank you for the information. I previously misunderstood that reading from the
jobserver would only return a '+' symbol, but now it's obviously not the case.
At this point, we seem unable to verify whether it's a valid jobserver file
descriptor, so we have to read the entire file contents util EOF.
> > self.jobs += slot
> > except (OSError, IOError) as e:
> > if e.errno == errno.EWOULDBLOCK:
> > # Stop at the end of the jobserver queue.
> > break
> > # If something went wrong, give back the jobs.
> > if self.jobs:
> > os.write(self.writer, self.jobs)
> > raise e
> >
> > Yet, if os.read() fails or reaches EOF, I would expect that the "except" block
> > would pick it. It sounds to me that it could be some issue with the python
> > version you're using.
> >
> > > For
> > > example, in scripts/Makefile.vmlinux_o we have:
> > >
> > > quiet_cmd_gen_initcalls_lds = GEN $@
> > > cmd_gen_initcalls_lds = \
> > > $(PYTHON3) $(srctree)/scripts/jobserver-exec \
> > > $(PERL) $(real-prereqs) > $@
> > >
> > >
> > > > > self.jobs += slot
> > > > > except (OSError, IOError) as e:
> > > > > if e.errno == errno.EWOULDBLOCK:
> > > >
> > > > Thanks,
> > > >
> > > > jon
> > >
> >
> >
> >
> > Thanks,
> > Mauro
>
> --
> Thanks,
> Mauro
>
--
Cheers,
Changbin Du
next prev parent reply other threads:[~2026-01-08 2:58 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-25 6:26 [PATCH] tools: jobserver: Add validation for jobserver tokens to ensure valid '+' characters Changbin Du
2026-01-05 8:22 ` duchangbin
2026-01-05 15:35 ` Jonathan Corbet
2026-01-06 21:52 ` Jonathan Corbet
2026-01-07 8:11 ` duchangbin
2026-01-07 9:29 ` Mauro Carvalho Chehab
2026-01-07 10:42 ` Mauro Carvalho Chehab
2026-01-07 10:54 ` Mauro Carvalho Chehab
2026-01-08 2:58 ` duchangbin [this message]
2026-01-08 8:24 ` Mauro Carvalho Chehab
2026-01-08 10:01 ` duchangbin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9ec672bde2cc4b14905175ca22cbb737@huawei.com \
--to=changbin.du@huawei.com \
--cc=corbet@lwn.net \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mchehab+huawei@kernel.org \
--cc=mchehab@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.