From mboxrd@z Thu Jan 1 00:00:00 1970
From: Linus Torvalds
Subject: Re: [PATCH] fs-writeback: drop wb->list_lock during blk_finish_plug()
Date: Thu, 17 Sep 2015 18:50:29 -0700
Message-ID:
References: <20150916195806.GD29530@quack.suse.cz> <20150916200012.GB8624@ret.masoncoding.com> <20150916220704.GM3902@dastard> <20150917003738.GN3902@dastard> <20150917021453.GO3902@dastard> <20150917224230.GF8624@ret.masoncoding.com> <20150917235647.GG8624@ret.masoncoding.com> <20150918003735.GR3902@dastard>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary=001a113fec6ea7e88a051ffbc13d
Cc: Chris Mason , Jan Kara , Josef Bacik , LKML , linux-fsdevel , Neil Brown , Christoph Hellwig , Tejun Heo
To: Dave Chinner
Return-path:
In-Reply-To: <20150918003735.GR3902@dastard>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

--001a113fec6ea7e88a051ffbc13d
Content-Type: text/plain; charset=UTF-8

On Thu, Sep 17, 2015 at 5:37 PM, Dave Chinner wrote:
>> >
>> > I'm not seeing why that should be an issue. Sure, there's some CPU
>> > overhead to context switching, but I don't see that it should be that
>> > big of a deal.
>
> It may well change the dispatch order of enough IOs for it to be
> significant on an IO bound device.

Hmm. Maybe. We obviously try to order the IO's a bit by inode, and I
could see the use of a workqueue maybe changing that sufficiently. But
it still sounds a bit unlikely.

And in fact, I think I have a better explanation.

> In outright performance on my test machine, the difference in
> files/s is noise. However, the consistency looks to be substantially
> improved and the context switch rate is now running at under
> 3,000/sec.

Hmm. I don't recall seeing you mention how many context switches per
second you had before. What is it down from?

However, I think I may have found something more interesting here.

The fact is that a *normal* schedule will trigger that whole
blk_schedule_flush_plug(), but a cond_resched() or a
cond_resched_lock() doesn't actually do a normal schedule at all.
Those trigger a *preemption* schedule.

And a preemption schedule does not trigger that unplugging at all.
Why? A kernel "preemption" very much tries to avoid touching thread
state, because the whole point is that normally we may be preempting
threads in random places, so we don't run things like
sched_submit_work(), because the thread may be in the middle of
*creating* that work, and we don't want to introduce races. The
preemption scheduling can also be done with "task->state" set to
sleeping, and it won't actually sleep.

Now, for the explicit schedules like "cond_resched()" and
"cond_resched_lock()", those races obviously don't exist, but they
happen to share the same preemption scheduling logic.

So it turns out that as far as I can see, the whole "cond_resched()"
will not start any IO at all, and it will just be left on the thread
plug until we schedule back to the thread.

So I don't think this has anything to do with kblockd_workqueue. I
don't think it even gets to that point.

I may be missing something, but just to humor me, can you test the
attached patch *without* Chris's patch to do explicit plugging? This
should make cond_resched() and cond_resched_lock() run the unplugging.

It may be entirely broken, I haven't thought this entirely through.

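To make the difference between the two scheduling paths concrete, here
is a toy userspace model of the behaviour described above. It is a
sketch only, not kernel code: schedule(), sched_submit_work() and
preempt_schedule_common() borrow the kernel's names, but their bodies
are simplified stand-ins, and the real sched_submit_work() may have
additional conditions that this model leaves out. Everything else
(struct task, plugged_ios, dispatched_ios, context_switch_away(),
cond_resched_unpatched(), cond_resched_patched(), run_model()) is
hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task state that matters here. */
struct task {
	int plugged_ios;	/* IO still sitting on the task's plug */
	int dispatched_ios;	/* IO that has been handed to the device */
};

/* Simplified stand-in: flush whatever is on the task's plug. */
static void sched_submit_work(struct task *tsk)
{
	tsk->dispatched_ios += tsk->plugged_ios;
	tsk->plugged_ios = 0;
}

/* Hypothetical: the switch itself never touches the plug. */
static void context_switch_away(struct task *tsk)
{
	(void)tsk;
}

/* Normal schedule: submit pending work first, then switch. */
static void schedule(struct task *tsk)
{
	sched_submit_work(tsk);
	context_switch_away(tsk);
}

/* Preemption schedule: switch only, leave thread state alone. */
static void preempt_schedule_common(struct task *tsk)
{
	context_switch_away(tsk);
}

/* Models cond_resched() sharing the preemption path: no flush. */
static int cond_resched_unpatched(struct task *tsk, bool need_resched)
{
	if (!need_resched)
		return 0;
	preempt_schedule_common(tsk);
	return 1;
}

/* Models the proposed change: flush explicitly before preempting. */
static int cond_resched_patched(struct task *tsk, bool need_resched)
{
	if (!need_resched)
		return 0;
	sched_submit_work(tsk);
	preempt_schedule_common(tsk);
	return 1;
}

/* Walks through all three paths; returns the total IO dispatched. */
int run_model(void)
{
	struct task t = { .plugged_ios = 3, .dispatched_ios = 0 };

	cond_resched_unpatched(&t, true);
	assert(t.plugged_ios == 3);	/* IO is left on the plug */

	cond_resched_patched(&t, true);
	assert(t.plugged_ios == 0);	/* IO has been started */

	t.plugged_ios = 2;
	schedule(&t);			/* a normal schedule also flushes */
	assert(t.plugged_ios == 0);

	return t.dispatched_ios;
}
```

In this model the only difference between the two cond_resched
variants is the explicit sched_submit_work() call, which is the shape
of the change being proposed for testing.
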
Linus --001a113fec6ea7e88a051ffbc13d Content-Type: text/plain; charset=US-ASCII; name="patch.diff" Content-Disposition: attachment; filename="patch.diff" Content-Transfer-Encoding: base64 X-Attachment-Id: f_ieozojkr0 ZGlmZiAtLWdpdCBhL2tlcm5lbC9zY2hlZC9jb3JlLmMgYi9rZXJuZWwvc2NoZWQvY29yZS5jCmlu ZGV4IDk3ZDI3NmZmMWVkYi4uMzg4ZWE5ZTdhYjhhIDEwMDY0NAotLS0gYS9rZXJuZWwvc2NoZWQv Y29yZS5jCisrKyBiL2tlcm5lbC9zY2hlZC9jb3JlLmMKQEAgLTQ1NDgsNiArNDU0OCw3IEBAIFNZ U0NBTExfREVGSU5FMChzY2hlZF95aWVsZCkKIGludCBfX3NjaGVkIF9jb25kX3Jlc2NoZWQodm9p ZCkKIHsKIAlpZiAoc2hvdWxkX3Jlc2NoZWQoMCkpIHsKKwkJc2NoZWRfc3VibWl0X3dvcmsoY3Vy cmVudCk7CiAJCXByZWVtcHRfc2NoZWR1bGVfY29tbW9uKCk7CiAJCXJldHVybiAxOwogCX0KQEAg LTQ1NzIsOSArNDU3MywxMCBAQCBpbnQgX19jb25kX3Jlc2NoZWRfbG9jayhzcGlubG9ja190ICps b2NrKQogCiAJaWYgKHNwaW5fbmVlZGJyZWFrKGxvY2spIHx8IHJlc2NoZWQpIHsKIAkJc3Bpbl91 bmxvY2sobG9jayk7Ci0JCWlmIChyZXNjaGVkKQorCQlpZiAocmVzY2hlZCkgeworCQkJc2NoZWRf c3VibWl0X3dvcmsoY3VycmVudCk7CiAJCQlwcmVlbXB0X3NjaGVkdWxlX2NvbW1vbigpOwotCQll bHNlCisJCX0gZWxzZQogCQkJY3B1X3JlbGF4KCk7CiAJCXJldCA9IDE7CiAJCXNwaW5fbG9jayhs b2NrKTsK --001a113fec6ea7e88a051ffbc13d--