From mboxrd@z Thu Jan 1 00:00:00 1970 From: NeilBrown Subject: Re: [PATCH 0/9] Support follow_link in RCU-walk. Date: Fri, 6 Mar 2015 08:08:15 +1100 Message-ID: <20150306080815.63093875@notabene.brown> References: <20150305051530.23906.65097.stgit@notabene.brown> <20150305060520.GY29656@ZenIV.linux.org.uk> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; boundary="Sig_/FWdf_8VQjf/fw52=L0.M0jU"; protocol="application/pgp-signature" Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org To: Al Viro Return-path: In-Reply-To: <20150305060520.GY29656@ZenIV.linux.org.uk> Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org --Sig_/FWdf_8VQjf/fw52=L0.M0jU Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable On Thu, 5 Mar 2015 06:05:20 +0000 Al Viro wrote: > On Thu, Mar 05, 2015 at 04:21:21PM +1100, NeilBrown wrote: > > Hi Al (and others), > >=20 > > I wonder if you could look over this patchset. > > It allows RCU-walk to follow symlinks in many common cases, > > thus removing a surprising performance hit caused by using symlinks. > >=20 > > The last could of patches make changes to XFS and NFS to support > > this but I haven't forwarded to the relevant lists yet. > > If/when the early code meets with approval I'll do that. > >=20 > > The first patch almost certainly needs to be changed. I originally > > wrote this code when filesystems could see inside nameidata. > > It is now opaque so the simplest solution was to provide an > > accessor function. > > Maybe I should as a 'flags' arg to ->follow_link?? Or have > > ->follow_link and ->follow_link_rcu ?? > > What do you suggest? >=20 > Umm... Some observations: > * now ->follow_link() can be called in RCU mode, which means > that it can race with fs shutdown; not a problem, except that now it > joins ->lookup() et.al. in "if some data structure is needed in RCU > case of that, make sure it's not destroyed without an RCU delay somewhere > between the entry into ->kill_sb() and destruction. > * highmem pages in symlinks: that BS shouldn't be allowed at > all. Just make sure that at least for those filesystems symlink inodes > get mapping_set_gfp_mask(&inode->i_data, GFP_KERNEL) and be done with tha= t. > * are you sure that security_inode_follow_link() is OK to call in > RCU mode? > * what warranties are you giving for the lifetime of strings > passed to nd_set_link()? Right now it's "should not be freed until the > matching ->put_link()"; what happens for RCU mode? > * really nasty one: creat(2) on a dangling symlink. What's to > preserve the last component if you get into that symlink in RCU mode? >=20 > TBH, I'm less than fond of passing nameidata to ->follow_link() at all, > flags or no flags. We could kill current->link_count and > current->total_link_count, replacing them with one void * current->nameid= ata > and taking counters into struct nameidata itself. Have places like e.g. > kern_path_locked() do > struct nameidata nd, *saved =3D set_nameidata(&nd); > ... > set_nameidata(saved); > with set_nameidata(p) doing this: > old =3D current->nameidata; > current->nameidata =3D p; > if (p) { > if (!old) { > p->link_count =3D 0; > p->total_link_count =3D 0; > } else { > p->link_count =3D old->link_count; > p->total_link_count =3D old->total_link_count; > } > } > return old; >=20 > Then nd_set_link() et.al. would use current->nameidata instead of an > explicitly passed pointer and ->follow_link() instances wouldn't need > that opaque pointer passed to them at all. Wow, thanks for the quick response!!! I'll look into all those issues and get back to you when I have something coherent to say. NeilBrown --Sig_/FWdf_8VQjf/fw52=L0.M0jU Content-Type: application/pgp-signature Content-Description: OpenPGP digital signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIVAwUBVPjFvznsnt1WYoG5AQJaxw/8CXmA1Yr4v0Z8RKKeOreAWDGkBOr2ZOPG Zqy4IR4ZjFZzrgseqZ3V1ZvOf3eQ0COqgZ/rwoqZu0Sn/qjn0eRFqVvdcTDynL1P eJyS9LAMDznFQEeUxgaUJCL6D5Y7oIJix87qevltfCEZ9Q59QGN7CqCK4KlIf7Ed WVuauArS6F/AUsiGtfcA5fIIBfL3lYOT34hrasJt/QFPjCuUZ0I24OxwExBxm419 xWGFXrKUP+2nSi23C5SIzZ4kjdRjrPMEotdL1GRSv3I9/6zpIBSogO/l+tyqVLLI 5KvCvHYU43YQsPkZiX3GQc4XMPa8orqpxVVqI0k3uDCM50cbP+S+0C/k1sd+PngE 7VPueQNgoqFRPeCthzESE9EGmteW0r6RGeCWt78JZXwB3kwEP2ZbcWQMEIeZWrjv AA3u2T+HeN7WJa44hOw1U0KgYjPFTBEnB9v9X+8F0Mq5auBmgtjiIIEi1zTXHd5u Uxf9q8eOaxBL6i4RyzKsW1SuPp4Vxb+BlqAJ0igDnGHuDnELNaZtTA7gQSVE+wYo +0ly9JpoC04R6Aeueyksb9mYtAGLeXohjHcn/4HSQ6KuyO2x9M7++0Gfq2NqkPCc TlL6IxR/a3ywdxa+5MONVkabXpSoWF/HFsO1yTnKK2CF9tE7FRZe7tTD01PT8wY8 5aUJgotbeTI= =UeSf -----END PGP SIGNATURE----- --Sig_/FWdf_8VQjf/fw52=L0.M0jU--