From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Philipp Reisner , Lars Ellenberg , Jens Axboe Subject: [PATCH 3.18 092/150] drbd: Fix state change in case of connection timeout Date: Tue, 13 Jan 2015 23:22:43 -0800 Message-Id: <20150114072100.039410406@linuxfoundation.org> In-Reply-To: <20150114072055.842408181@linuxfoundation.org> References: <20150114072055.842408181@linuxfoundation.org> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-15 Sender: linux-kernel-owner@vger.kernel.org List-ID: 3.18-stable review patch. If anyone has any objections, please let me know. ------------------ From: Philipp Reisner commit 9581f97a687724ea41cf2e145dda4751161198c1 upstream. A connection timeout affects all volumes of a resource! Under the following conditions: A resource with multiple volumes AND ko-count >=1 AND a write request triggers the timeout (ko-count * timeout) DRBD's internal state gets confused. That in turn may lead to very miss leading follow up failures. E.g. "BUG: scheduling while atomic" Signed-off-by: Philipp Reisner Signed-off-by: Lars Ellenberg Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman --- drivers/block/drbd/drbd_req.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/block/drbd/drbd_req.c +++ b/drivers/block/drbd/drbd_req.c @@ -1629,7 +1629,7 @@ void request_timer_fn(unsigned long data time_after(now, req_peer->pre_send_jif + ent) && !time_in_range(now, connection->last_reconnect_jif, connection->last_reconnect_jif + ent)) { drbd_warn(device, "Remote failed to finish a request within ko-count * timeout\n"); - _drbd_set_state(_NS(device, conn, C_TIMEOUT), CS_VERBOSE | CS_HARD, NULL); + _conn_request_state(connection, NS(conn, C_TIMEOUT), CS_VERBOSE | CS_HARD); } if (dt && oldest_submit_jif != now && time_after(now, oldest_submit_jif + dt) &&