public inbox for linux-doc@vger.kernel.org
 help / color / mirror / Atom feed
From: duchangbin <changbin.du@huawei.com>
To: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: duchangbin <changbin.du@huawei.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] tools: jobserver: Add validation for jobserver tokens to ensure valid '+' characters
Date: Thu, 8 Jan 2026 10:01:26 +0000	[thread overview]
Message-ID: <107868000b0d4fe09ac8671b4057f012@huawei.com> (raw)
In-Reply-To: <20260108092417.20e97f91@foz.lan>

On Thu, Jan 08, 2026 at 09:24:17AM +0100, Mauro Carvalho Chehab wrote:
> Em Thu, 8 Jan 2026 02:58:05 +0000
> duchangbin <changbin.du@huawei.com> escreveu:
> 
> > On Wed, Jan 07, 2026 at 11:42:38AM +0100, Mauro Carvalho Chehab wrote:
> > > > 
> > > > It would be nice if you could provide more details about how to reproduce it. 
> > > > Are you doing anything special? What distro are you using? what python version?
> > > >   
> > > > > When this problem occurs, the current implementation deadlocks because for regular files,
> > > > > os.read() returns empty bytes after reaching EOF, creating an infinite loop. My workaround
> > > > > is to ignore this error condition to prevent deadlock, although this means the jobserver
> > > > > protocol will no longer be honored.  
> > > > 
> > > > testing if slot is empty makes sense, but why testing if it is "+"?
> > > >   
> > > > > 
> > > > > As you suggested above, We can output an error message to stderr to inform users, but
> > > > > must not use stdout, as it would corrupt the tool's normal output stream.  
> > > >   
> > > 
> > > After thinking a little bit more about this, IMHO the best is to have
> > > two separate patches (assuming that there is a good reason why ensuring that the
> > > slot's character is "+"):
> > >  
> > > > You could do something like (untested):
> > > > 
> > > >              while True:
> > > >                  try:
> > > >                      slot = os.read(self.reader, 8)
> > > > +                    if not slot:
> > > > +                        # Stop at the end of the jobserver queue.
> > > > +                        break  
> > > 
> > > This would be patch 1, to overcome some issue (probably due to Python
> > > version) that reading past EOF won't rise an exception. I would very much
> > > want to see what python version you're using and see if some other
> > > exception arose (like EOFError), properly described at the patch description.
> > >  
> > 
> > My Python is 3.12.3, and GNU Make is 4.3. But I don't think the issue is caused
> > by the Python interpreter. Instead, my shell opened /etc/passwd for some reason
> > without closing it, and as a result, all child processes inherited this fd3 file
> > descriptor.
> 
> Maybe there's something weird with your PAM settings and/or /etc/nsswitch.conf. I
> saw some issues in the past related to kerberos/ldap/radius/sso timeouts.
> 
> > 
> > $ ls -l /proc/self/fd
> > total 0
> > lrwx------ 1 changbin changbin 64 Jan  8 10:40 0 -> /dev/pts/0
> > lrwx------ 1 changbin changbin 64 Jan  8 10:40 1 -> /dev/pts/0
> > lrwx------ 1 changbin changbin 64 Jan  8 10:40 2 -> /dev/pts/0
> > lr-x------ 1 changbin changbin 64 Jan  8 10:40 3 -> /etc/passwd
> > lr-x------ 1 changbin changbin 64 Jan  8 10:40 4 -> /proc/2468162/fd
> > 
> > In this case, make should open a new file descriptor for jobserver control, but
> > clearly, it did not do so and instead still passed fd 3. Once I get a chance,
> > I'll look into how Make 4.3 actually works.
> > 
> > 
> > > > +                    # Why do we need this?
> > > > +                    if any(c != b'+'[0] for c in slot):
> > > > +                        print("Warning: invalid jobserver slots", file=sys.stderr)
> > > > +                        break  
> > > 
> > > This seems to be a separate issue. Why do we need to enforce that the slot data
> > > is "+"? If it doesn't, why this would be a problem?
> > > 
> > > Btw, reading:
> > > 
> > >     https://www.gnu.org/software/make/manual/html_node/POSIX-Jobserver.html
> > > 
> > > We have:
> > > 
> > >     "In both implementations of the jobserver, the pipe will be pre-loaded with 
> > >      one single-character token for each available job. To obtain an extra slot
> > >      you must read a single character from the jobserver; to release a slot you
> > >      must write a single character back into the jobserver.
> > > 
> > >      It’s important that when you release the job slot, you write back the same
> > >      character you read. Don’t assume that all tokens are the same character;
> > >      different characters may have different meanings to GNU make. The order is
> > >      not important, since make has no idea in what order jobs will complete anyway."
> > > 
> > > So, a 100% compliant POSIX jobserver code shall not test for "+", but, instead,
> > > preserve whatever character is there.
> > > 
> > > Yet, checking for "+" is really needed, please add a rationale at the patch
> > > description justifying why. On such case, we should still:
> > > 
> > >     - release the slot(s) we don't want by writing the character via
> > >       os.write();
> > >     - print a warning message about why we rejected the slot(s).
> > >   
> > Thank you for the information. I previously misunderstood that reading from the
> > jobserver would only return a '+' symbol, but now it's obviously not the case.
> > At this point, we seem unable to verify whether it's a valid jobserver file
> > descriptor, so we have to read the entire file contents util EOF.
> 
> Agreed. I guess that just checking "if not slot" should be enough for such
> purpose.
> 
> If you can still reproduce the original issue, I would try that and
> see if it works, e.g. this should likely be enough:
> 
>                     slot = os.read(self.reader, 8)
>                     if not slot:
>                         # Stop at the end of the jobserver queue.
>                         break  
> 	
> 

I have tested with below changes and it works. It also prevent us from probably
writing incorrect file.

@@ -91,6 +91,10 @@ class JobserverExec:
             while True:
                 try:
                     slot = os.read(self.reader, 8)
+                    if not slot:
+                        # Clear self.jobs to prevent us from probably writing incorrect file.
+                        self.jobs = []
+                        raise ValueError("unexpected empty token from jobserver fd, invalid '--jobserver-auth=' setting?")
                     self.jobs += slot
                 except (OSError, IOError) as e:
                     if e.errno == errno.EWOULDBLOCK:
@@ -105,7 +109,8 @@ class JobserverExec:
             # to sit here blocked on our child.
             self.claim = len(self.jobs) + 1

-        except (KeyError, IndexError, ValueError, OSError, IOError):
+        except (KeyError, IndexError, ValueError, OSError, IOError) as e:
+            print(f"Warning: {e}", file=sys.stderr)
             # Any missing environment strings or bad fds should result in just

-- 
Cheers,
Changbin Du

      reply	other threads:[~2026-01-08 10:01 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-25  6:26 [PATCH] tools: jobserver: Add validation for jobserver tokens to ensure valid '+' characters Changbin Du
2026-01-05  8:22 ` duchangbin
2026-01-05 15:35   ` Jonathan Corbet
2026-01-06 21:52 ` Jonathan Corbet
2026-01-07  8:11   ` duchangbin
2026-01-07  9:29     ` Mauro Carvalho Chehab
2026-01-07 10:42       ` Mauro Carvalho Chehab
2026-01-07 10:54         ` Mauro Carvalho Chehab
2026-01-08  2:58         ` duchangbin
2026-01-08  8:24           ` Mauro Carvalho Chehab
2026-01-08 10:01             ` duchangbin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=107868000b0d4fe09ac8671b4057f012@huawei.com \
    --to=changbin.du@huawei.com \
    --cc=corbet@lwn.net \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mchehab+huawei@kernel.org \
    --cc=mchehab@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox