From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:50480) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TfmyG-0007aj-V0 for qemu-devel@nongnu.org; Tue, 04 Dec 2012 02:31:21 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1TfmyB-0000Bt-AO for qemu-devel@nongnu.org; Tue, 04 Dec 2012 02:31:12 -0500 Received: from ssl.dlhnet.de ([91.198.192.8]:36287 helo=ssl.dlh.net) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TfmyB-0000Bo-0h for qemu-devel@nongnu.org; Tue, 04 Dec 2012 02:31:07 -0500 Message-ID: <50BDA6C7.3010905@dlhnet.de> Date: Tue, 04 Dec 2012 08:31:19 +0100 From: Peter Lieven MIME-Version: 1.0 References: <50BCFEB8.2070703@dlhnet.de> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH][RESEND] iscsi: add support for iSCSI NOPs List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: ronnie sahlberg Cc: kwolf@redhat.com, Paolo Bonzini , "qemu-devel@nongnu.org" On 04.12.2012 06:03, ronnie sahlberg wrote: > Acked-By: ronniesahlberg@gmail.com (Ronnie Sahlberg) > > > This verified that the service is actually operational and is much > more reliable than TCP-KEEPALIVES. > This is the proper way to monitor that the iscsi target is alive. Yes, especially because (at least under linux) keepalives are only send if the output buffer is empty. I have added tcp user timeout option earlier, but its timeout is not as reliable as the NOPs. > > We should as a later patch add the ability to configure this via the > qemu config file instead of using hardcoded values. Of course, but I can report from several hundred connections that I did not have any false positives during the last weeks with that values. But this might depend on the target that is used. I have one concern where I would appreciate feedback: Is the qemu timer fired in the same thread as libiscsi is running? Otherwise the iscsi_nop_timed_event() cound interfere with an ongoing reconnect that has been inititated by libiscsi (if it detects the broken connection earlier). Peter > > > regards > ronnie sahlberg > > > On Mon, Dec 3, 2012 at 11:34 AM, Peter Lieven wrote: >> This patch will send NOP-Out PDUs every 5 seconds to the iSCSI target. >> If a consecutive number of NOP-In replies fail a reconnect is initiated. >> iSCSI NOPs help to ensure that the connection to the target is still operational. >> This should not, but in reality may be the case even if the TCP connection is still >> alive if there are bugs in either the target or the initiator implementation. >> >> Reported-by: Ronnie Sahlberg >> Signed-off-by: Peter Lieven >> --- >> block/iscsi.c | 43 +++++++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 43 insertions(+) >> >> diff --git a/block/iscsi.c b/block/iscsi.c >> index d0b1a10..fab4c8b 100644 >> --- a/block/iscsi.c >> +++ b/block/iscsi.c >> @@ -47,6 +47,9 @@ typedef struct IscsiLun { >> int block_size; >> uint64_t num_blocks; >> int events; >> + >> + QEMUTimer *nop_timer; >> + int nops_in_flight; >> } IscsiLun; >> >> typedef struct IscsiAIOCB { >> @@ -72,6 +75,9 @@ struct IscsiTask { >> int complete; >> }; >> >> +#define NOP_INTERVAL 5000 >> +#define MAX_NOP_FAILURES 3 >> + >> static void >> iscsi_bh_cb(void *p) >> { >> @@ -925,6 +931,35 @@ static char *parse_initiator_name(const char *target) >> } >> } >> >> +static void iscsi_nop_cb(struct iscsi_context *iscsi, int status, void *command_data, void *private_data) >> +{ >> + IscsiLun *iscsilun = private_data; >> + >> + if (iscsilun) { >> + iscsilun->nops_in_flight = 0; >> + } >> +} >> + >> +static void iscsi_nop_timed_event(void *opaque) >> +{ >> + IscsiLun *iscsilun = opaque; >> + >> + if (iscsilun->nops_in_flight > MAX_NOP_FAILURES) { >> + error_report("iSCSI: NOP timeout. Reconnecting..."); >> + iscsi_reconnect(iscsilun->iscsi); >> + iscsilun->nops_in_flight = 0; >> + } >> + >> + if (iscsi_nop_out_async(iscsilun->iscsi, iscsi_nop_cb, NULL, 0, iscsilun) != 0) { >> + error_report("iSCSI: failed to sent NOP-Out. Disabling NOP messages."); >> + return; >> + } >> + >> + qemu_mod_timer(iscsilun->nop_timer, qemu_get_clock_ms(rt_clock) + NOP_INTERVAL); >> + iscsi_set_events(iscsilun); >> + iscsilun->nops_in_flight++; >> +} >> + >> /* >> * We support iscsi url's on the form >> * iscsi://[%@][:]// >> @@ -1036,6 +1071,10 @@ static int iscsi_open(BlockDriverState *bs, const char *filename, int flags) >> >> ret = 0; >> >> + /* Set up a timer for sending out iSCSI NOPs */ >> + iscsilun->nop_timer = qemu_new_timer_ms(rt_clock, iscsi_nop_timed_event, iscsilun); >> + qemu_mod_timer(iscsilun->nop_timer, qemu_get_clock_ms(rt_clock) + NOP_INTERVAL); >> + >> out: >> if (initiator_name != NULL) { >> g_free(initiator_name); >> @@ -1058,6 +1097,10 @@ static void iscsi_close(BlockDriverState *bs) >> IscsiLun *iscsilun = bs->opaque; >> struct iscsi_context *iscsi = iscsilun->iscsi; >> >> + if (iscsilun->nop_timer) { >> + qemu_del_timer(iscsilun->nop_timer); >> + qemu_free_timer(iscsilun->nop_timer); >> + } >> qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), NULL, NULL, NULL, NULL); >> iscsi_destroy_context(iscsi); >> memset(iscsilun, 0, sizeof(IscsiLun)); >> -- >> 1.7.9.5 >> >>