linux-raid.vger.kernel.org archive mirror
From: Neil Brown <neilb@suse.de>
To: "Kwolek, Adam" <adam.kwolek@intel.com>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
	"Williams, Dan J" <dan.j.williams@intel.com>,
	"Ciechanowski, Ed" <ed.ciechanowski@intel.com>
Subject: Re: [PATCH 14/53] FIX: Cannot exit monitor after takeover
Date: Wed, 1 Dec 2010 09:06:50 +1100
Message-ID: <20101201090650.46481f95@notabene.brown>
In-Reply-To: <905EDD02F158D948B186911EB64DB3D174C8AB57@irsmsx503.ger.corp.intel.com>

On Tue, 30 Nov 2010 16:03:16 +0000 "Kwolek, Adam" <adam.kwolek@intel.com>
wrote:

> The problem is that when a raid0 array is about to be unfrozen and it is the single/last array in the container,
> a ping to this container causes mdmon not to exit.
> In this condition managemon receives the message and, in handle_message() for the ping case, calls wakeup_monitor()
> and then goes into a loop waiting for a monitor_loop_cnt update.
> 1. This occurs after a timeout.
> 2. When this happens, managemon stops on pselect() and, as there is nothing to monitor, it never wakes up.
> 3. monitor waits to be allowed to exit on open handlers.
> 
> How can this be resolved:
> 1. Do not ping for the last raid0 array during unfreezing (I've reworked the patch to meet this condition).
> 2. Guard the wait for a monitor_loop_cnt change in handle_message() with:
> 	if (container->arrays)
> 
> 3. Change the condition in manage member from:
> 	if (sigterm)
> 		wakeup_monitor();
> 
> to:
> 	if (sigterm || (container->arrays == NULL))
> 		wakeup_monitor();
> 
> This causes an additional monitor wakeup.
> 
> Any of these methods causes mdmon to exit as expected.
> In cases 2 and 3 it takes a while (we are waiting on communication timeouts).
> Method 1 is fast, and we do not block mdmon exit on communication.

Thanks for the explanation!
I definitely want to fix the managemon/monitor interaction so that it doesn't
hang as you describe.  I might end up with something a lot more heavy-weight
than the changes you suggest.

It might still be OK to include your option '1' as well - I'll decide when you
post the patch.

thanks,
NeilBrown
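
For illustration only, option 2 from the quoted message (skip the wait for
monitor_loop_cnt when the container has no arrays) might look roughly like
the sketch below. This is not the mdmon source: the struct layouts, the
ping_monitor() helper, the monitor_running test hook, and the bounded poll
count that stands in for the communication timeout are all assumptions made
so the example is self-contained. Only the names wakeup_monitor(),
monitor_loop_cnt and container->arrays come from the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for mdmon state; NOT the real mdmon structures. */
struct active_array { struct active_array *next; };
struct supertype { struct active_array *arrays; };

/* Counter that the monitor is assumed to advance on every loop pass. */
static int monitor_loop_cnt;

/* Hypothetical test hook: true while a monitor loop is able to run. */
static bool monitor_running;

/* Stand-in for the real wakeup: a running monitor completes one pass. */
static void wakeup_monitor(void)
{
	if (monitor_running)
		monitor_loop_cnt++;
}

/*
 * Ping handling with the guard from option 2.
 * Returns true if the monitor was seen to complete a pass, false if the
 * wait was skipped (no arrays) or gave up after max_polls checks.
 */
static bool ping_monitor(struct supertype *container, int max_polls)
{
	int cnt = monitor_loop_cnt;

	wakeup_monitor();

	/* Guard: with no arrays there is nothing to monitor, so do not wait. */
	if (container->arrays == NULL)
		return false;

	for (int i = 0; i < max_polls; i++) {
		if (monitor_loop_cnt != cnt)
			return true;
		/* Real code would sleep here between checks. */
	}
	return false;	/* bounded polling stands in for the timeout */
}
```

With an empty container the function returns immediately instead of polling,
which is the behaviour the guard is meant to provide; with at least one array
and a running monitor it returns as soon as the counter changes.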

