From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755352AbdKBIxk (ORCPT ); Thu, 2 Nov 2017 04:53:40 -0400 Received: from mail-pg0-f68.google.com ([74.125.83.68]:54748 "EHLO mail-pg0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752614AbdKBIxS (ORCPT ); Thu, 2 Nov 2017 04:53:18 -0400 X-Google-Smtp-Source: ABhQp+Qli0UbwjEu/kIsOyThrTFqEkBN9Vkte3+0bgi0Ptwx+5/46iDOl8bsGRjFt3SrzRhIbZ93Wg== Date: Thu, 2 Nov 2017 17:53:13 +0900 From: Sergey Senozhatsky To: Steven Rostedt Cc: Tetsuo Handa , akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Cong Wang , Dave Hansen , Johannes Weiner , Mel Gorman , Michal Hocko , Petr Mladek , Sergey Senozhatsky , Vlastimil Babka , "yuwang.yuwang" Subject: Re: [PATCH] mm: don't warn about allocations which stall for too long Message-ID: <20171102085313.GD655@jagdpanzerIV> References: <1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp> <20171031153225.218234b4@gandalf.local.home> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20171031153225.218234b4@gandalf.local.home> User-Agent: Mutt/1.9.1 (2017-09-22) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On (10/31/17 15:32), Steven Rostedt wrote: [..] > (new globals) > static DEFINE_SPIN_LOCK(console_owner_lock); > static struct task_struct console_owner; > static bool waiter; > > console_unlock() { > > [ Assumes this part can not preempt ] > > spin_lock(console_owner_lock); > console_owner = current; > spin_unlock(console_owner_lock); + disables IRQs? > for each message > write message out to console > > if (READ_ONCE(waiter)) > break; > > spin_lock(console_owner_lock); > console_owner = NULL; > spin_unlock(console_owner_lock); > > [ preemption possible ] otherwise printk() if (console_trylock()) console_unlock() preempt_disable() spin_lock(console_owner_lock); console_owner = current; spin_unlock(console_owner_lock); ....... spin_lock(console_owner_lock); IRQ printk() console_trylock() // fails so we go to busy-loop part spin_lock(console_owner_lock); << deadlock even if we would replace spin_lock(console_owner_lock) with IRQ spin_lock, we still would need to protect against IRQs on the very same CPU. right? IOW, we need to store smp_processor_id() of a CPU currently doing console_unlock() and check it in vprintk_emit()? and we need to protect the entire console_unlock() function. not just the printing loop, otherwise the IRQ CPU will spin forever waiting for itself to up() the console_sem. this somehow reminds me of "static unsigned int logbuf_cpu", which we used to have in vprintk_emit() and were happy to remove it... the whole "console_unlock() is non-preemptible" can bite, I'm afraid. it's not always printk()->console_unlock(), sometimes it's console_lock()->console_unlock() that has to flush the logbuf. CPU0 CPU1 ~ CPU99 console_lock(); printk(); ... printk(); console_unlock() preempt_disable(); for (;;) call_console_drivers(); <> this pattern is not so unusual. _especially_ in the existing scheme of things. not to mention the problem of "the last printk()", which will take over and do the flush. CPU0 CPU1 ~ CPU99 console_lock(); printk(); ... printk(); console_unlock(); IRQ on CPU2 printk() // take over console_sem console_unlock() and so on. seems that there will be lots of if-s. -ss