Date: Tue, 29 Aug 2017 19:51:06 +0200
From: Peter Zijlstra
To: Oleg Nesterov
Cc: Byungchul Park, mingo@kernel.org, linux-kernel@vger.kernel.org,
    kernel-team@lge.com, Arnaldo Carvalho de Melo, Dave Chinner,
    Tejun Heo, johannes@sipsolutions.net
Subject: Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING
Message-ID: <20170829175106.GU32112@worktop.programming.kicks-ass.net>
In-Reply-To: <20170829155205.GA17290@redhat.com>

On Tue, Aug 29, 2017 at 05:52:05PM +0200, Oleg Nesterov wrote:
> The problem is that start_flush_work() does not do acquire/release
> unconditionally, it does this only if it is going to wait, and I am not
> sure this is right...

Right, I think you're right, we can move it earlier, once we have the
pwq.
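[Editorial note: why doing the acquire/release unconditionally matters can be
illustrated with a toy userspace model. This is a sketch only; `ToyLockdep`
and its dependency-edge bookkeeping are invented for illustration and are not
the kernel's lockdep implementation.]

```python
# Toy model of lockdep dependency tracking: a userspace sketch, NOT the
# kernel implementation. Class and method names are illustrative only.

class ToyLockdep:
    def __init__(self):
        self.edges = set()   # recorded (held, acquired) dependency pairs
        self.held = []       # pseudo-locks held by the current context

    def acquire(self, name):
        # record a dependency from every currently-held lock to `name`
        for h in self.held:
            self.edges.add((h, name))
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)

    def inversion(self, a, b):
        # both a->b and b->a recorded: lockdep would report a deadlock
        return (a, b) in self.edges and (b, a) in self.edges


ld = ToyLockdep()

# process_one_work(): the worker holds wq->lockdep_map around work->func(),
# and the work happens to take LOCK -- records the edge wq -> LOCK.
ld.acquire("wq"); ld.acquire("LOCK"); ld.release("LOCK"); ld.release("wq")

# flush_work() called under LOCK, annotating wq->lockdep_map
# unconditionally -- records LOCK -> wq even though no wait happened.
ld.acquire("LOCK"); ld.acquire("wq"); ld.release("wq"); ld.release("LOCK")

print(ld.inversion("LOCK", "wq"))  # True: the inversion is visible

# If flush_work() annotated only when it actually waits, a run where the
# work already finished records nothing, and the inversion goes unseen:
ld2 = ToyLockdep()
ld2.acquire("wq"); ld2.acquire("LOCK"); ld2.release("LOCK"); ld2.release("wq")
ld2.acquire("LOCK"); ld2.release("LOCK")   # no wait -> no annotation
print(ld2.inversion("LOCK", "wq"))  # False: the dependency was missed
```

The point of the sketch: the annotation records the potential dependency on
every run, not just the unlucky run that actually blocks.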
> Plus process_one_work() does lock_map_acquire_read(), I don't really
> understand this too.

Right, so aside from recursive-reads being broken, I think that was a
mistake.

> > And the analogous:
> >
> >     flush_workqueue(wq)             process_one_work(wq, work)
> >       A(wq)                           A(wq)
> >       R(wq)                           work->func(work);
> >                                       R(wq)
> >
> >
> > The thing I puzzled over was flush_work() (really start_flush_work())
> > doing:
> >
> >     if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)
> >             lock_map_acquire(&pwq->wq->lockdep_map);
> >     else
> >             lock_map_acquire_read(&pwq->wq->lockdep_map);
> >     lock_map_release(&pwq->wq->lockdep_map);
> >
> > Why does flush_work() care about the wq->lockdep_map?
> >
> > The answer is because, for single-threaded workqueues, doing
> > flush_work() from a work is a potential deadlock:
>
> Yes, but the simple answer is that flush_work() doesn't really differ
> from flush_workqueue() in this respect?

Right, and I think that the new code (aside from maybe placing it
earlier) does that. If single-threaded we use wq->lockdep_map, otherwise
we (also) use work->lockdep_map.

> If nothing else, if some WORK is the last queued work on WQ, then
> flush_work(WORK) is the same thing as flush_workqueue(WQ), more or
> less. Again, I am talking about single-threaded workqueues.

Right, so single-threaded workqueues are special and are what we need
this extra bit for, but only for single-threaded.

> >     workqueue-thread:
> >
> >       work-n:
> >         flush_work(work-n+1);
> >
> >       work-n+1:
> >
> >
> > Will not be going anywhere fast..
>
> Or another example,
>
>       lock(LOCK);
>       flush_work(WORK);
>       unlock(LOCK);
>
>       workqueue-thread:
>               another_pending_work:
>                       LOCK(LOCK);
>                       UNLOCK(LOCK);
>               WORK:
>
> In this case we do not care about WORK->lockdep_map, but taking the
> wq->lockdep_map from flush_work() (if single-threaded) allows us to
> report the deadlock.

Right. And the new code does so.
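[Editorial note: the single-threaded flush-from-work case above can be
modeled in a few lines of userspace Python. This is a sketch under stated
assumptions, not how lockdep is implemented: re-acquiring a pseudo-lock
already held by the same context stands in for lockdep's recursive-acquire
report, and the map name is illustrative.]

```python
# Userspace sketch of why flush_work() on a single-threaded workqueue
# takes wq->lockdep_map: the worker already "holds" that map while running
# a work item, so acquiring it again from within the work is exactly the
# self-deadlock lockdep would report.

held = []

def acquire(map_name):
    if map_name in held:
        raise RuntimeError(f"possible deadlock: {map_name} already held")
    held.append(map_name)

def release(map_name):
    held.remove(map_name)

# process_one_work() on the single worker thread enters work-n:
acquire("wq->lockdep_map")
try:
    # work-n calls flush_work(work-n+1); start_flush_work() annotates the
    # *workqueue's* map because max_active == 1 (or a rescuer exists):
    acquire("wq->lockdep_map")
except RuntimeError as e:
    report = str(e)

release("wq->lockdep_map")
print(report)
```

On a multi-threaded workqueue the same sequence is fine (another worker can
run work-n+1), which is why the annotation is conditional on the
single-threaded case.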