[PATCH AUTOSEL 4.14 051/170] sunvdc: Do not spin in an infinite loop when vio_ldc

linux-block.vger.kernel.org archive mirror

* [PATCH AUTOSEL 4.14 051/170] sunvdc: Do not spin in an infinite loop when vio_ldc_send() returns EAGAIN
From: Sasha Levin @ 2019-01-28 16:10 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Young Xiao, Jens Axboe, Sasha Levin, linux-block

From: Young Xiao <YangX92@hotmail.com>

[ Upstream commit a11f6ca9aef989b56cd31ff4ee2af4fb31a172ec ]

__vdc_tx_trigger should only loop on EAGAIN a finite
number of times.

See commit adddc32d6fde ("sunvnet: Do not spin in an
infinite loop when vio_ldc_send() returns EAGAIN") for detail.

Signed-off-by: Young Xiao <YangX92@hotmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/block/sunvdc.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/block/sunvdc.c b/drivers/block/sunvdc.c
index ad9749463d4f..ed4d6276e94f 100644
--- a/drivers/block/sunvdc.c
+++ b/drivers/block/sunvdc.c
@@ -41,6 +41,8 @@ MODULE_VERSION(DRV_MODULE_VERSION);
 #define WAITING_FOR_GEN_CMD	0x04
 #define WAITING_FOR_ANY		-1
 
+#define	VDC_MAX_RETRIES	10
+
 static struct workqueue_struct *sunvdc_wq;
 
 struct vdc_req_entry {
@@ -427,6 +429,7 @@ static int __vdc_tx_trigger(struct vdc_port *port)
 		.end_idx		= dr->prod,
 	};
 	int err, delay;
+	int retries = 0;
 
 	hdr.seq = dr->snd_nxt;
 	delay = 1;
@@ -439,6 +442,8 @@ static int __vdc_tx_trigger(struct vdc_port *port)
 		udelay(delay);
 		if ((delay <<= 1) > 128)
 			delay = 128;
+		if (retries++ > VDC_MAX_RETRIES)
+			break;
 	} while (err == -EAGAIN);
 
 	if (err == -ENOTCONN)
-- 
2.19.1
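
The bounded retry in the patch above can be sketched in user space as follows. This is a minimal illustration, not the driver code: `fake_channel`, `fake_send()` and `send_bounded()` are hypothetical names, the delay is only tracked rather than slept, and `MAX_RETRIES` mirrors `VDC_MAX_RETRIES`. Note that with the post-increment comparison copied from the patch, a channel that stays busy is tried 12 times before the loop gives up, and the caller then sees -EAGAIN.

```c
#include <errno.h>

#define MAX_RETRIES	10	/* mirrors VDC_MAX_RETRIES from the patch */
#define MAX_DELAY_US	128	/* backoff cap used by the driver loop */

/* Hypothetical channel: reports -EAGAIN for the first `busy_calls` sends. */
struct fake_channel {
	int busy_calls;
	int attempts;
};

static int fake_send(struct fake_channel *ch)
{
	ch->attempts++;
	if (ch->busy_calls > 0) {
		ch->busy_calls--;
		return -EAGAIN;
	}
	return 0;
}

/* Retry on -EAGAIN with capped exponential backoff, but only a bounded
 * number of times. */
static int send_bounded(struct fake_channel *ch)
{
	int err, delay = 1, retries = 0;

	do {
		err = fake_send(ch);
		if (err != -EAGAIN)
			break;
		/* A kernel driver would call udelay(delay) here; this
		 * sketch only tracks the value. */
		delay <<= 1;
		if (delay > MAX_DELAY_US)
			delay = MAX_DELAY_US;
		if (retries++ > MAX_RETRIES)
			break;
	} while (err == -EAGAIN);

	return err;
}
```

A caller has to treat a returned -EAGAIN as a real failure, since the loop no longer guarantees that the send eventually succeeds.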



* [PATCH AUTOSEL 4.14 123/170] drbd: narrow rcu_read_lock in drbd_sync_handshake
From: Sasha Levin @ 2019-01-28 16:11 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Roland Kammerer, Jens Axboe, Sasha Levin, drbd-dev, linux-block

From: Roland Kammerer <roland.kammerer@linbit.com>

[ Upstream commit d29e89e34952a9ad02c77109c71a80043544296e ]

So far there was the possibility that we called
genlmsg_new(GFP_NOIO)/mutex_lock() while holding an rcu_read_lock().

This included cases like:

drbd_sync_handshake (acquire the RCU lock)
  drbd_asb_recover_1p
    drbd_khelper
      drbd_bcast_event
        genlmsg_new(GFP_NOIO) --> may sleep

drbd_sync_handshake (acquire the RCU lock)
  drbd_asb_recover_1p
    drbd_khelper
      notify_helper
        genlmsg_new(GFP_NOIO) --> may sleep

drbd_sync_handshake (acquire the RCU lock)
  drbd_asb_recover_1p
    drbd_khelper
      notify_helper
        mutex_lock --> may sleep

While using GFP_ATOMIC would have been possible in the first two cases,
the real fix is to narrow the rcu_read_lock.

Reported-by: Jia-Ju Bai <baijiaju1990@163.com>
Reviewed-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/block/drbd/drbd_receiver.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 796eaf347dc0..143c5a666e25 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -3361,7 +3361,7 @@ static enum drbd_conns drbd_sync_handshake(struct drbd_peer_device *peer_device,
 	enum drbd_conns rv = C_MASK;
 	enum drbd_disk_state mydisk;
 	struct net_conf *nc;
-	int hg, rule_nr, rr_conflict, tentative;
+	int hg, rule_nr, rr_conflict, tentative, always_asbp;
 
 	mydisk = device->state.disk;
 	if (mydisk == D_NEGOTIATING)
@@ -3412,8 +3412,12 @@ static enum drbd_conns drbd_sync_handshake(struct drbd_peer_device *peer_device,
 
 	rcu_read_lock();
 	nc = rcu_dereference(peer_device->connection->net_conf);
+	always_asbp = nc->always_asbp;
+	rr_conflict = nc->rr_conflict;
+	tentative = nc->tentative;
+	rcu_read_unlock();
 
-	if (hg == 100 || (hg == -100 && nc->always_asbp)) {
+	if (hg == 100 || (hg == -100 && always_asbp)) {
 		int pcount = (device->state.role == R_PRIMARY)
 			   + (peer_role == R_PRIMARY);
 		int forced = (hg == -100);
@@ -3452,9 +3456,6 @@ static enum drbd_conns drbd_sync_handshake(struct drbd_peer_device *peer_device,
 			     "Sync from %s node\n",
 			     (hg < 0) ? "peer" : "this");
 	}
-	rr_conflict = nc->rr_conflict;
-	tentative = nc->tentative;
-	rcu_read_unlock();
 
 	if (hg == -100) {
 		/* FIXME this log message is not correct if we end up here
-- 
2.19.1
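
The pattern in the patch above (copy the needed fields inside the read-side section, drop the lock, then call code that may sleep) can be sketched in user space. Everything here is a stand-in chosen for illustration: `fake_read_lock()`/`fake_read_unlock()` only count nesting and are not RCU, `may_sleep()` represents the drbd_khelper() path, and the struct names are hypothetical.

```c
#include <stdbool.h>

static int read_depth;		/* stand-in for read-side nesting depth */
static bool slept_in_section;	/* set if we "sleep" while depth > 0 */

static void fake_read_lock(void)   { read_depth++; }
static void fake_read_unlock(void) { read_depth--; }

/* Stand-in for a call that may sleep, e.g. an allocation or mutex_lock. */
static void may_sleep(void)
{
	if (read_depth > 0)
		slept_in_section = true;
}

struct net_conf_sketch {
	int always_asbp;
	int rr_conflict;
	int tentative;
};

/* Returns 1 if the helper path was taken, 0 otherwise. */
static int handshake_sketch(const struct net_conf_sketch *shared, int hg)
{
	struct net_conf_sketch snap;

	/* Narrow section: only copy the fields that are needed later. */
	fake_read_lock();
	snap = *shared;
	fake_read_unlock();

	/* From here on only the private copy is used, so sleeping is fine. */
	if (hg == 100 || (hg == -100 && snap.always_asbp)) {
		may_sleep();
		return 1;
	}
	return 0;
}
```

The trade-off is that the copied values may be slightly stale by the time they are used, which the patch accepts for these three configuration fields.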



* [PATCH AUTOSEL 4.14 124/170] drbd: disconnect, if the wrong UUIDs are attached on a connected peer
From: Sasha Levin @ 2019-01-28 16:11 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Lars Ellenberg, Jens Axboe, Sasha Levin, drbd-dev, linux-block

From: Lars Ellenberg <lars.ellenberg@linbit.com>

[ Upstream commit b17b59602b6dcf8f97a7dc7bc489a48388d7063a ]

With "on-no-data-accessible suspend-io", DRBD requires the next attach
or connect to be to the very same data generation uuid tag it lost last.

If we first lost connection to the peer,
then later lost connection to our own disk,
we would usually refuse to re-connect to the peer,
because it presents the wrong data set.

However, if the peer first connects without a disk,
and then attached its disk, we accepted that same wrong data set,
which would be "unexpected" by any user of that DRBD
and cause "undefined results" (read: very likely data corruption).

The fix is to forcefully disconnect as soon as we notice that the peer
attached to the "wrong" dataset.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/block/drbd/drbd_receiver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 143c5a666e25..1aad373da50e 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -4139,7 +4139,7 @@ static int receive_uuids(struct drbd_connection *connection, struct packet_info
 	kfree(device->p_uuid);
 	device->p_uuid = p_uuid;
 
-	if (device->state.conn < C_CONNECTED &&
+	if ((device->state.conn < C_CONNECTED || device->state.pdsk == D_DISKLESS) &&
 	    device->state.disk < D_INCONSISTENT &&
 	    device->state.role == R_PRIMARY &&
 	    (device->ed_uuid & ~((u64)1)) != (p_uuid[UI_CURRENT] & ~((u64)1))) {
-- 
2.19.1
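
The widened condition from the patch above can be read as a predicate. The sketch below is an assumption-laden illustration: the enums are simplified, hypothetical stand-ins whose only relevant property is their ordering, and `dev_sketch`, `uuid_mismatch()` and `must_disconnect()` are made-up names. The comparison ignores the lowest UUID bit, as the condition in the patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified state values; only the ordering matters. */
enum conn_sketch { CS_STANDALONE, CS_CONNECTING, CS_CONNECTED };
enum disk_sketch { DS_DISKLESS, DS_ATTACHING, DS_INCONSISTENT, DS_UP_TO_DATE };
enum role_sketch { RS_SECONDARY, RS_PRIMARY };

struct dev_sketch {
	enum conn_sketch conn;	/* connection state */
	enum disk_sketch disk;	/* local disk state */
	enum disk_sketch pdsk;	/* peer disk state */
	enum role_sketch role;
	uint64_t ed_uuid;	/* data generation we last had access to */
};

/* Compare two UUIDs while ignoring bit 0. */
static bool uuid_mismatch(uint64_t a, uint64_t b)
{
	return (a & ~UINT64_C(1)) != (b & ~UINT64_C(1));
}

/* True when the peer presents a data set other than the expected one. */
static bool must_disconnect(const struct dev_sketch *d, uint64_t peer_current)
{
	return (d->conn < CS_CONNECTED || d->pdsk == DS_DISKLESS) &&
	       d->disk < DS_INCONSISTENT &&
	       d->role == RS_PRIMARY &&
	       uuid_mismatch(d->ed_uuid, peer_current);
}
```

The added `pdsk == DS_DISKLESS` term covers the case from the commit message: the peer connected without a disk, so the check still has to fire when its UUIDs arrive after a later attach.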



* [PATCH AUTOSEL 4.14 125/170] drbd: skip spurious timeout (ping-timeo) when failing promote
From: Sasha Levin @ 2019-01-28 16:11 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Lars Ellenberg, Jens Axboe, Sasha Levin, drbd-dev, linux-block

From: Lars Ellenberg <lars.ellenberg@linbit.com>

[ Upstream commit 9848b6ddd8c92305252f94592c5e278574e7a6ac ]

If you try to promote a Secondary while connected to a Primary
and allow-two-primaries is NOT set, we will wait for "ping-timeout"
to give this node a chance to detect a dead primary,
in case the cluster manager noticed faster than we did.

But if we then are *still* connected to a Primary,
we fail (after an additional timeout of ping-timeout).

This change skips the spurious second timeout.

Most people won't notice really,
since "ping-timeout" by default is half a second.

But in some installations, ping-timeout may be 10 or 20 seconds or more,
and spuriously delaying the error return becomes annoying.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/block/drbd/drbd_nl.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index a12f77e6891e..ad13ec66c8e4 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -668,14 +668,15 @@ drbd_set_role(struct drbd_device *const device, enum drbd_role new_role, int for
 		if (rv == SS_TWO_PRIMARIES) {
 			/* Maybe the peer is detected as dead very soon...
 			   retry at most once more in this case. */
-			int timeo;
-			rcu_read_lock();
-			nc = rcu_dereference(connection->net_conf);
-			timeo = nc ? (nc->ping_timeo + 1) * HZ / 10 : 1;
-			rcu_read_unlock();
-			schedule_timeout_interruptible(timeo);
-			if (try < max_tries)
+			if (try < max_tries) {
+				int timeo;
 				try = max_tries - 1;
+				rcu_read_lock();
+				nc = rcu_dereference(connection->net_conf);
+				timeo = nc ? (nc->ping_timeo + 1) * HZ / 10 : 1;
+				rcu_read_unlock();
+				schedule_timeout_interruptible(timeo);
+			}
 			continue;
 		}
 		if (rv < SS_SUCCESS) {
-- 
2.19.1
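
The effect of the patch above can be checked with a small simulation that counts how often the retry loop waits when every attempt reports two primaries. This is a sketch under stated assumptions: the loop shape `while (attempt++ < max_tries)` is assumed for the surrounding code that the hunk does not show, `HZ_SKETCH` is an arbitrary tick rate, and `count_waits()` and `ping_wait_jiffies()` are hypothetical helpers.

```c
#include <stdbool.h>

#define HZ_SKETCH 250	/* assumed tick rate, for illustration only */

/* Wait length from the patch: ping_timeo is in tenths of a second. */
static long ping_wait_jiffies(int ping_timeo)
{
	return (long)(ping_timeo + 1) * HZ_SKETCH / 10;
}

/* Count the waits performed when every attempt fails with "two primaries". */
static int count_waits(int max_tries, bool wait_only_before_retry)
{
	int attempt = 0, waits = 0;

	while (attempt++ < max_tries) {
		if (wait_only_before_retry) {
			/* Patched order: wait only if a retry follows. */
			if (attempt < max_tries) {
				attempt = max_tries - 1;
				waits++;
			}
		} else {
			/* Old order: wait first, then decide about retrying. */
			waits++;
			if (attempt < max_tries)
				attempt = max_tries - 1;
		}
	}
	return waits;
}
```

Under these assumptions the old ordering waits twice before failing and the patched ordering waits once, which matches the "spurious second timeout" described in the commit message.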



* [PATCH AUTOSEL 4.14 159/170] block/swim3: Fix -EBUSY error when re-opening device after unmount
From: Sasha Levin @ 2019-01-28 16:11 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Finn Thain, linuxppc-dev, Jens Axboe, Sasha Levin, linux-block

From: Finn Thain <fthain@telegraphics.com.au>

[ Upstream commit 296dcc40f2f2e402facf7cd26cf3f2c8f4b17d47 ]

When the block device is opened with FMODE_EXCL, ref_count is set to -1.
This value doesn't get reset when the device is closed which means the
device cannot be opened again. Fix this by checking for refcount <= 0
in the release method.

Reported-and-tested-by: Stan Johnson <userm57@yahoo.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/block/swim3.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/block/swim3.c b/drivers/block/swim3.c
index 0d7527c6825a..2f7acdb830c3 100644
--- a/drivers/block/swim3.c
+++ b/drivers/block/swim3.c
@@ -1027,7 +1027,11 @@ static void floppy_release(struct gendisk *disk, fmode_t mode)
 	struct swim3 __iomem *sw = fs->swim3;
 
 	mutex_lock(&swim3_mutex);
-	if (fs->ref_count > 0 && --fs->ref_count == 0) {
+	if (fs->ref_count > 0)
+		--fs->ref_count;
+	else if (fs->ref_count == -1)
+		fs->ref_count = 0;
+	if (fs->ref_count == 0) {
 		swim3_action(fs, MOTOR_OFF);
 		out_8(&sw->control_bic, 0xff);
 		swim3_select(fs, RELAX);
-- 
2.19.1
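
The reference-count handling in the patch above can be isolated into two small functions for comparison. This is an illustrative sketch only: `release_old()` and `release_new()` are hypothetical names that return the resulting count, and the hardware actions (motor off, RELAX) are left out. A count of -1 stands for an exclusive open, as described in the commit message.

```c
/* Old release logic: a count of -1 is never reset. */
static int release_old(int ref_count)
{
	if (ref_count > 0)
		--ref_count;
	return ref_count;
}

/* Patched release logic: -1 (exclusive open) drops back to 0. */
static int release_new(int ref_count)
{
	if (ref_count > 0)
		--ref_count;
	else if (ref_count == -1)
		ref_count = 0;
	return ref_count;
}
```

In the driver, the motor is switched off whenever the resulting count is 0, so with the patched logic the last close of an exclusive open also reaches that branch.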

