From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeff Moyer Subject: Re: [patch] fs: aio fix rcu lookup Date: Tue, 18 Jan 2011 18:00:14 -0500 Message-ID: References: <20110118190114.GA5070@quack.suse.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Jan Kara , Andrew Morton , linux-fsdevel , linux-kernel@vger.kernel.org To: Nick Piggin Return-path: In-Reply-To: (Nick Piggin's message of "Wed, 19 Jan 2011 09:17:23 +1100") Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org Nick Piggin writes: > On Wed, Jan 19, 2011 at 6:01 AM, Jan Kara wrote: >> =C2=A0Hi, >> >> On Tue 18-01-11 10:24:24, Nick Piggin wrote: >>> On Tue, Jan 18, 2011 at 6:07 AM, Jeff Moyer wro= te: >>> > Nick Piggin writes: >>> >> Do you agree with the theoretical problem? I didn't try to >>> >> write a racer to break it yet. Inserting a delay before the >>> >> get_ioctx might do the trick. >>> > >>> > I'm not convinced, no. =C2=A0The last reference to the kioctx is = always the >>> > process, released in the exit_aio path, or via sys_io_destroy. =C2= =A0In both >>> > cases, we cancel all aios, then wait for them all to complete bef= ore >>> > dropping the final reference to the context. >>> >>> That wouldn't appear to prevent a concurrent thread from doing an >>> io operation that requires ioctx lookup, and taking the last refere= nce >>> after the io_cancel thread drops the ref. >>> >>> > So, while I agree that what you wrote is better, I remain unconvi= nced of >>> > it solving a real-world problem. =C2=A0Feel free to push it in as= a cleanup, >>> > though. >>> >>> Well I think it has to be technically correct first. If there is in= deed a >>> guaranteed ref somehow, it just needs a comment. >> =C2=A0Hmm, the code in io_destroy() indeed looks fishy. We delete th= e ioctx >> from the hash table and set ioctx->dead which is supposed to stop >> lookup_ioctx() from finding it (see the !ctx->dead check in >> lookup_ioctx()). There's even a comment in io_destroy() saying: >> =C2=A0 =C2=A0 =C2=A0 =C2=A0/* >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 * Wake up any waiters. =C2=A0The setting= of ctx->dead must be seen >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 * by other CPUs at this point. =C2=A0Rig= ht now, we rely on the >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 * locking done by the above calls to ens= ure this consistency. >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 */ >> But since lookup_ioctx() is called without any lock or barrier nothi= ng >> really seems to prevent the list traversal and ioctx->dead test to h= appen >> before io_destroy() and get_ioctx() after io_destroy(). >> >> But wouldn't the right fix be to call synchronize_rcu() in io_destro= y()? >> Because with your fix we could still return 'dead' ioctx and I don't= think >> we are supposed to do that... > > With my fix we won't oops, I was a bit concerned about ->dead, > yes but I don't know what semantics it is attempted to have there. > > synchronize_rcu() in io_destroy() does not prevent it from returning > as soon as lookup_ioctx drops the rcu_read_lock(). > > The dead=3D1 in io_destroy indeed doesn't guarantee a whole lot. > Anyone know? See the comment above io_destroy for starters. Note that rcu was bolted on later, and I believe that ->dead has nothing to do with the rcu-ification. Cheers, Jeff