From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <stable-owner@vger.kernel.org>
Received: from mail.linuxfoundation.org ([140.211.169.12]:41278 "EHLO
        mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1754309AbeCRQAJ (ORCPT
        <rfc822;stable@vger.kernel.org>); Sun, 18 Mar 2018 12:00:09 -0400
Subject: Patch "blk-throttle: make sure expire time isn't too big" has been added to the 4.9-stable tree
To: shli@fb.com, alexander.levin@microsoft.com, axboe@fb.com,
        gregkh@linuxfoundation.org
Cc: <stable@vger.kernel.org>, <stable-commits@vger.kernel.org>
From: <gregkh@linuxfoundation.org>
Date: Sun, 18 Mar 2018 16:59:34 +0100
Message-ID: <152138877437107@kroah.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ANSI_X3.4-1968
Content-Transfer-Encoding: 8bit
Sender: stable-owner@vger.kernel.org
List-ID: <stable.vger.kernel.org>


This is a note to let you know that I've just added the patch titled

    blk-throttle: make sure expire time isn't too big

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     blk-throttle-make-sure-expire-time-isn-t-too-big.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


>>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Shaohua Li <shli@fb.com>
Date: Mon, 27 Mar 2017 10:51:36 -0700
Subject: blk-throttle: make sure expire time isn't too big

From: Shaohua Li <shli@fb.com>


[ Upstream commit 06cceedcca67a93ac7f7aa93bbd9980c7496d14e ]

cgroup could be throttled to a limit but when all cgroups cross high
limit, queue enters a higher state and so the group should be throttled
to a higher limit. It's possible the cgroup is sleeping because of
throttle and other cgroups don't dispatch IO any more. In this case,
nobody can trigger current downgrade/upgrade logic. To fix this issue,
we could either set up a timer to wakeup the cgroup if other cgroups are
idle or make sure this cgroup doesn't sleep too long. Setting up a timer
means we must change the timer very frequently. This patch chooses the
latter. Making cgroup sleep time not too big wouldn't change cgroup
bps/iops, but could make it wakeup more frequently, which isn't a big
issue because throtl_slice * 8 is already quite big.

Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 block/blk-throttle.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -499,6 +499,17 @@ static void throtl_dequeue_tg(struct thr
 static void throtl_schedule_pending_timer(struct throtl_service_queue *sq,
 					  unsigned long expires)
 {
+	unsigned long max_expire = jiffies + 8 * throtl_slice;
+
+	/*
+	 * Since we are adjusting the throttle limit dynamically, the sleep
+	 * time calculated according to previous limit might be invalid. It's
+	 * possible the cgroup sleep time is very long and no other cgroups
+	 * have IO running so notify the limit changes. Make sure the cgroup
+	 * doesn't sleep too long to avoid the missed notification.
+	 */
+	if (time_after(expires, max_expire))
+		expires = max_expire;
 	mod_timer(&sq->pending_timer, expires);
 	throtl_log(sq, "schedule timer. delay=%lu jiffies=%lu",
 		   expires - jiffies, jiffies);


Patches currently in stable-queue which might be from shli@fb.com are

queue-4.9/md.c-didn-t-unlock-the-mddev-before-return-einval-in-array_size_store.patch
queue-4.9/md-raid6-fix-anomily-when-recovering-a-single-device-in-raid6.patch
queue-4.9/blk-throttle-make-sure-expire-time-isn-t-too-big.patch