All of lore.kernel.org
 help / color / mirror / Atom feed
From: Neil Brown <neilb@suse.de>
To: Adam Kwolek <adam.kwolek@intel.com>
Cc: linux-raid@vger.kernel.org, dan.j.williams@intel.com,
	ed.ciechanowski@intel.com, wojciech.neubauer@intel.com
Subject: Re: [PATCH 02/29] imsm: Prepare reshape_update in mdadm
Date: Tue, 14 Dec 2010 11:07:51 +1100	[thread overview]
Message-ID: <20101214110751.5691de92@notabene.brown> (raw)
In-Reply-To: <20101209151855.26876.17602.stgit@gklab-170-024.igk.intel.com>

On Thu, 09 Dec 2010 16:18:56 +0100 Adam Kwolek <adam.kwolek@intel.com> wrote:

> During Online Capacity Expansion metadata has to be updated to show
> array changes and allow for future assembly of array.
> To do this mdadm prepares and sends reshape_update metadata update to mdmon.
> Update is sent for one array in container. It contains updated device
> and spares that have to be turned in to array members.
> For spares we have 2 cases:
> 1. For first array in container:
>    reshape_delta_disks: shows how many disks will be added to array
>    Spares are sent in update so variable spares_in_update in metadata update tells that mdmon has to turn spares in to array
>    (IMSM's array meaning) members.
> 2. For 2nd array in container:
>    reshape_delta_disks: shows how many disks will be added to array -exactly as in first case
>    Spares were turned in to array members (they are not a spares) so we have for this volume
>    reuse those disks only.
> 
> This update will change active array state to reshape_is_starting state.
> This works in the following way:
> 1. reshape_super() prepares metadata update and send it to mdmon
> 2. managemon in prepare_update() allocates required memory for bigger device object
> 3. monitor in process_update() updates (replaces) device object with information
>    passed from mdadm (memory was allocated by managemon)
> 4. process_update() function performs:
>    - sets reshape_delta_disks variable to reshape_delta_disks value from update
>    - sets array in to reshape_is_starting state.
> 5. This signals managemon to add devices to md and start reshape for this array
>    and put array in to reshape_in_progress.
>    Managemon can request reshape_cancel_request state in error case.
> 
> Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
> Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>

I have applied this.
I haven't reviewed the super-intel.c bits very closely.

I have improved find_array_by_subdev, renamed it to mdstat_by_subdev and
moved it to mdstat.c where it belongs.
I have also totally re-written sysfs_get_unused_spares which was:
  - undocumented
  - overly complex
  - wrong in a few places. e.g.


> +
> +struct mdinfo *sysfs_get_unused_spares(int container_fd, int fd)
> +{
> +	char fname[PATH_MAX];
> +	char buf[PATH_MAX];
> +	char *base;
> +	char *dbase;
> +	struct mdinfo *ret_val;
> +	struct mdinfo *dev;
> +	DIR *dir = NULL;
> +	struct dirent *de;
> +	int is_in;
> +	char *to_check;
> +
> +	ret_val = malloc(sizeof(*ret_val));
> +	if (ret_val == NULL)
> +		goto abort;
> +	memset(ret_val, 0, sizeof(*ret_val));
> +	sysfs_init(ret_val, container_fd, -1);
> +	if (ret_val->sys_name[0] == 0)
> +		goto abort;
> +
> +	sprintf(fname, "/sys/block/%s/md/", ret_val->sys_name);
> +	base = fname + strlen(fname);
> +
> +	strcpy(base, "raid_disks");
> +	if (load_sys(fname, buf))
> +		goto abort;
> +	ret_val->array.raid_disks = strtoul(buf, NULL, 0);
> +
> +	/* Get all the devices as well */
> +	*base = 0;
> +	dir = opendir(fname);
> +	if (!dir)
> +		goto abort;
> +	ret_val->array.spare_disks = 0;
> +	while ((de = readdir(dir)) != NULL) {
> +		char *ep;
> +		if (de->d_ino == 0 ||
> +		    strncmp(de->d_name, "dev-", 4) != 0)
> +			continue;
> +		strcpy(base, de->d_name);
> +		dbase = base + strlen(base);
> +		*dbase = '\0';
> +
> +		to_check = strstr(fname, "/md/");
> +		is_in = sysfs_is_spare_device_belongs_to(fd, to_check);
> +		if (is_in == -1) {
> +			char *p;
> +			struct stat stb;
> +			char stb_name[PATH_MAX];
> +
> +			dev = malloc(sizeof(*dev));
> +			if (!dev)
> +				goto abort;
> +			strncpy(dev->text_version, fname, 50);
> +
> +			*dbase++ = '/';
> +
> +			dev->disk.raid_disk = strtoul(buf, &ep, 10);
> +			dev->disk.raid_disk = -1;
> +
> +			strcpy(dbase, "block/dev");
> +			if (load_sys(fname, buf)) {
> +				free(dev);
> +				continue;
> +			}
> +			/* check first if we are working on block device
> +			 * if not, we cannot check it
> +			 */
> +			p = strchr(dev->text_version, '-');
> +			if (p)
> +				p++;
> +			sprintf(stb_name, "/dev/%s", p);
> +			if (stat(stb_name, &stb) < 0) {

Taking a string out of /sys and looking it up in /dev is wrong.  Names
in /dev tend to follow /sys by convention, but there is no guarantee.


> +				dprintf(Name ": stat failed for %s: %s.\n",
> +					stb_name, strerror(errno));
> +				free(dev);
> +				continue;
> +			}
> +			if (!S_ISBLK(stb.st_mode)) {

And everything that is a member of an array will be a block device, so
checking that it is a block device is pointless.


> +				dprintf(Name\
> +					": %s is not a block device."\
> +					" Skip checking.\n",
> +					stb_name);
> +				goto skip;
> +			}
> +			dprintf(Name": %s seams to a be block device\n",
> +				stb_name);
> +			sscanf(buf, "%d:%d",
> +			       &dev->disk.major,
> +			       &dev->disk.minor);
> +			strcpy(dbase, "block/device/state");
> +			if (load_sys(fname, buf) != 0) {
> +				free(dev);
> +				continue;
> +			}
> +			if (strncmp(buf, "offline", 7) == 0) {
> +				free(dev);
> +				continue;
> +			}
> +			if (strncmp(buf, "failed", 6) == 0) {
> +				free(dev);
> +				continue;
> +			}

The string you want to check here is "faulty" not "failed".

NeilBrown



  reply	other threads:[~2010-12-14  0:07 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-09 15:18 [PATCH 00/29] OLCE, migrations and raid10 takeover Adam Kwolek
2010-12-09 15:18 ` [PATCH 01/29] Add state_of_reshape for external metadata Adam Kwolek
2010-12-09 15:18 ` [PATCH 02/29] imsm: Prepare reshape_update in mdadm Adam Kwolek
2010-12-14  0:07   ` Neil Brown [this message]
2010-12-14  7:54     ` Kwolek, Adam
2010-12-09 15:19 ` [PATCH 03/29] imsm: Process reshape_update in mdmon Adam Kwolek
2010-12-09 15:19 ` [PATCH 04/29] imsm: Block array state change during reshape Adam Kwolek
2010-12-09 15:19 ` [PATCH 05/29] Process reshape initialization by managemon Adam Kwolek
2010-12-09 15:19 ` [PATCH 06/29] imsm: Verify slots in meta against slot numbers set by md Adam Kwolek
2010-12-09 15:19 ` [PATCH 07/29] imsm: Cancel metadata changes on reshape start failure Adam Kwolek
2010-12-09 15:19 ` [PATCH 08/29] imsm: Do not accept messages sent by mdadm Adam Kwolek
2010-12-09 15:19 ` [PATCH 09/29] imsm: Do not indicate resync during reshape Adam Kwolek
2010-12-09 15:20 ` [PATCH 10/29] imsm: Fill delta_disks field in getinfo_super() Adam Kwolek
2010-12-09 15:20 ` [PATCH 11/29] Control reshape in mdadm Adam Kwolek
2010-12-09 15:20 ` [PATCH 12/29] Finalize reshape after adding disks to array Adam Kwolek
2010-12-09 15:20 ` [PATCH 13/29] Add reshape progress updating Adam Kwolek
2010-12-09 15:20 ` [PATCH 14/29] WORKAROUND: md reports idle state during reshape start Adam Kwolek
2010-12-09 15:20 ` [PATCH 15/29] FIX: core during getting map Adam Kwolek
2010-12-09 15:20 ` [PATCH 16/29] Enable reshape for subarrays Adam Kwolek
2010-12-09 15:21 ` [PATCH 17/29] Change manage_reshape() placement Adam Kwolek
2010-12-09 15:21 ` [PATCH 18/29] Migration: raid5->raid0 Adam Kwolek
2010-12-09 15:21 ` [PATCH 19/29] Detect level change Adam Kwolek
2010-12-09 15:21 ` [PATCH 20/29] Migration raid0->raid5 Adam Kwolek
2010-12-09 15:21 ` [PATCH 21/29] Read chunk size and layout from mdstat Adam Kwolek
2010-12-09 15:21 ` [PATCH 22/29] FIX: mdstat doesn't read chunk size correctly Adam Kwolek
2010-12-09 15:21 ` [PATCH 23/29] Migration: Chunk size migration Adam Kwolek
2010-12-09 15:21 ` [PATCH 24/29] Add takeover support for external meta Adam Kwolek
2010-12-09 15:22 ` [PATCH 25/29] Takeover raid10 -> raid0 for external metadata Adam Kwolek
2010-12-09 15:22 ` [PATCH 26/29] Takeover raid0 -> raid10 " Adam Kwolek
2010-12-09 15:22 ` [PATCH 27/29] FIX: Problem with removing array after takeover Adam Kwolek
2010-12-09 15:22 ` [PATCH 28/29] IMSM compatibility for raid0 -> raid10 takeover Adam Kwolek
2010-12-09 15:22 ` [PATCH 29/29] Add spares to raid0 in mdadm Adam Kwolek
2010-12-16 11:20 ` [PATCH 00/29] OLCE, migrations and raid10 takeover Neil Brown
2010-12-20  8:27   ` Wojcik, Krzysztof
2010-12-20 21:59     ` Neil Brown
2010-12-21 12:37     ` Neil Brown
2010-12-27 14:56       ` Kwolek, Adam
2010-12-28  2:24         ` Neil Brown
2010-12-28  8:49           ` Kwolek, Adam
2010-12-29 10:34             ` Neil Brown
2010-12-29 12:29               ` Kwolek, Adam

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101214110751.5691de92@notabene.brown \
    --to=neilb@suse.de \
    --cc=adam.kwolek@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=ed.ciechanowski@intel.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=wojciech.neubauer@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.