From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yann Droneaud
Subject: Re: [PATCH] fs: clear close-on-exec flag as part of put_unused_fd()
Date: Thu, 12 Dec 2013 12:36:30 +0100
Message-ID: <1386848190.9959.12.camel@localhost.localdomain>
References: <1386796107-4197-1-git-send-email-ydroneaud@opteya.com> <20131211223634.GA13828@mguzik.redhat.com> <20131211233011.GA10323@ZenIV.linux.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Mateusz Guzik, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Yann Droneaud
To: Al Viro
Return-path:
In-Reply-To: <20131211233011.GA10323@ZenIV.linux.org.uk>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

Hi,

On Wednesday, 11 December 2013 at 23:30 +0000, Al Viro wrote:
> On Wed, Dec 11, 2013 at 11:36:35PM +0100, Mateusz Guzik wrote:
>
> > From my reading this will break at least the following:
> > fd = open(..., .. | O_CLOEXEC);
> > dup2(whatever, fd);
> >
> > now fd has O_CLOEXEC even though it should not
>
> Moreover, consider fork() done by a thread that shares descriptor
> table with somebody else.  Suppose it happens in the middle of
> open() with O_CLOEXEC being done by another thread.  We copy descriptor
> table after descriptor had been reserved (and marked close-on-exec),
> but before a reference to struct file has actually been inserted there.
> This code
>         for (i = open_files; i != 0; i--) {
>                 struct file *f = *old_fds++;
>                 if (f) {
>                         get_file(f);
>                 } else {
>                         /*
>                          * The fd may be claimed in the fd bitmap but not yet
>                          * instantiated in the files array if a sibling thread
>                          * is partway through open().  So make sure that this
>                          * fd is available to the new process.
>                          */
>                         __clear_open_fd(open_files - i, new_fdt);
>                 }
>                 rcu_assign_pointer(*new_fds++, f);
>         }
>         spin_unlock(&oldf->file_lock);
> in dup_fd() will clear the corresponding bit in open_fds, leaving close_on_exec
> alone.  Currently that's fine (we will override whatever had been in
> close_on_exec when we reserve that descriptor again), but AFAICS with this
> patch it will break.
>

That's a terribly subtle case. It will indeed break with the patch.

> Sure, it can be fixed up (ditto with dup2(), etc.), but what's the point?

It was only an attempt at making close-on-exec handling "simpler".

> Result will require more subtle reasoning to prove correctness and will
> be more prone to breakage.  Does that really yield visible performance
> improvements that would be worth the extra complexity?  After all, you
> trade some writes to close_on_exec on descriptor reservation for unconditional
> write on descriptor freeing; if anything, I would expect that you'll get
> minor _loss_ from that change, assuming they'll be measurable in the first
> place...

Since it's not so straightforward to get it correct, and the only
advantage I was trying to address is aesthetic, I will discard it.

Thanks a lot for the review and the comments.

Regards.

-- 
Yann Droneaud
OPTEYA
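
For reference, the dup2() case Mateusz describes can be checked from user space. The following is only a minimal sketch of the expected semantics (POSIX requires dup2() to clear FD_CLOEXEC on the target descriptor), not kernel code. The helper names has_cloexec() and dup2_clears_cloexec() are hypothetical, and /dev/null is an arbitrary file chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if FD_CLOEXEC is set on fd, 0 otherwise. */
static int has_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	assert(flags != -1);
	return (flags & FD_CLOEXEC) != 0;
}

/*
 * Hypothetical check: open a descriptor with O_CLOEXEC, then dup2() a
 * plain descriptor onto it.  Returns 1 if the target had close-on-exec
 * set before dup2() and no longer has it afterwards, 0 if not, and -1
 * if one of the system calls failed.
 */
int dup2_clears_cloexec(void)
{
	int fd = open("/dev/null", O_RDONLY | O_CLOEXEC);
	int src = open("/dev/null", O_RDONLY);
	int before, after, ret = -1;

	if (fd < 0 || src < 0)
		goto out;

	before = has_cloexec(fd);	/* set by O_CLOEXEC at open() time */
	if (dup2(src, fd) < 0)
		goto out;
	after = has_cloexec(fd);	/* dup2() must have cleared it */

	ret = before && !after;
out:
	if (src >= 0)
		close(src);
	if (fd >= 0)
		close(fd);
	return ret;
}
```

Called from a small main(), dup2_clears_cloexec() should return 1 on a kernel that behaves correctly. A change that lets the close-on-exec bit survive descriptor reuse through dup2() would make it return 0.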