From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id ; Sat, 22 Sep 2001 19:30:21 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id ; Sat, 22 Sep 2001 19:30:09 -0400 Received: from lsmls01.we.mediaone.net ([24.130.1.20]:38311 "EHLO lsmls01.we.mediaone.net") by vger.kernel.org with ESMTP id ; Sat, 22 Sep 2001 19:29:55 -0400 Message-ID: <3BAD1F04.8EFA46F4@kegel.com> Date: Sat, 22 Sep 2001 16:30:12 -0700 From: Dan Kegel X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.7-6 i686) X-Accept-Language: en MIME-Version: 1.0 To: Vitaly Luban , "linux-kernel@vger.kernel.org" Subject: Re: [PATCH][RFC] Signal-per-fd for RT signals; write_lock_bh(file_lock)? In-Reply-To: <3BA2AFFF.C7B8C4DF@kegel.com> <3BA2E144.FB0E5D55@luban.org> <3BA2E99A.1134E382@kegel.com> <3BA350A7.7D39FC23@kegel.com> <3BA3C61A.DED5A27A@luban.org> <3BA3D10B.FE3C6C79@kegel.com> <3BA40FEC.A6E0557E@luban.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Vitaly Luban wrote: > Could you please try attached one? It's mostly untested, but my home site > will be down next week. I just had time to read your new patch; have not yet run it. I think there may be some problems in send_signal: 1) might still have null pointer dereference. If files is NULL, following section + if( newsignal && q ) + filep->f_infoptr = q; + write_unlock( &files->file_lock); may crash due to NULL filep and unlocking an already unlocked lock. BTW, you might be able to use a read lock there. 2) You're using a kind of spinlock that compiles to a no-op on single processor, but the crash (see the old oops traceback in http://marc.theaimsgroup.com/?l=linux-kernel&m=100055931808169&w=2 ) happens on a single processor, so your locking shouldn't help. The conflict is between a bottom half and normal. Thus write_lock, besides not helping on uniprocessor, might cause a deadlock on smp. Might have to convert the write_lock in get_unused_fd() and friends to be a write_lock_bh() instead to prevent the bottom half from running until the normal part is done with the file lock! This bottom-half-vs-normal issue worries me greatly. It may take a lot of thought to fix this. Then again, maybe I'm just confused; I've never dealt with kernel spinlocks before, and anyone who reads my posts regularly knows that I'm wrong most of the time. Anyone who actually *understands* this stuff, please speak up with corrections... 3) you still save info in the file even if you don't end up queuing a signal; your changes in send_signal should be in the default case of the switch, just to keep things in a good state. 4) collect_signal references the file table, so it needs to lock it; probably read_lock (but maybe write_lock, I dunno), and probably _bh, too, to avoid deadlock. > I'm looking forward to see a test case, all I could come up with happily > runs on the old version. I'll try to put it together today. - Dan