Linux RAID subsystem development

Linux RAID subsystem development
 help / color / mirror / Atom feed

* Re: Superblock of raid5 log can't be updated when array stoped
From: Zhengyuan Liu @ 2016-10-18  5:51 UTC (permalink / raw)
  To: Shaohua Li; +Cc: shli, Song Liu, linux-raid
In-Reply-To: <20161018002814.GC106864@kernel.org>

If that's the problem, I think there is another problem about next_checkpoint initialization.
No initial operation was done to this field when we loaded/recovery  the log , it got
assignment  only when IO to raid disk was finished.  So r5l_quiesce may use wrong 
next_checkpoint to reclaim log space, that would confuse log recovery.  Bellow patch
may help the expression.

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 1b1ab4a..998ea00 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -1096,6 +1096,8 @@ static int r5l_recovery_log(struct r5l_log *log)
                log->seq = ctx.seq + 11;
                log->log_start = r5l_ring_add(log, ctx.pos, BLOCK_SECTORS);
                r5l_write_super(log, ctx.pos);
+               log->last_checkpoint = ctx.pos;
+               log->next_checkpoint = ctx.pos;
        } else {
                log->log_start = ctx.pos;
                log->seq = ctx.seq;
@@ -1168,6 +1170,7 @@ create:
        if (log->max_free_space > RECLAIM_MAX_FREE_SPACE)
                log->max_free_space = RECLAIM_MAX_FREE_SPACE;
        log->last_checkpoint = cp;
+       log->next_checkpoint = cp;
 
        __free_page(page);

Thanks,
--Zhengyuan

------------------ Original ------------------
From:  "Shaohua Li"<shli@kernel.org>;
Date:  Tue, Oct 18, 2016 08:28 AM
To:  "Zhengyuan Liu"<liuzhengyuan@kylinos.cn>;
Cc:  "shli"<shli@fb.com>; "Song Liu"<songliubraving@fb.com>; "linux-raid"<linux-raid@vger.kernel.org>;
Subject:  Re: Superblock of raid5 log can't be updated when array stoped
 
On Sat, Oct 15, 2016 at 10:19:36AM +0800, liuzhengyuan wrote:
> Hi, Shaohua.
> 
> when we stop raid5 array with "mdadm -S" or reboot the system,  md module will 
> call raid5_quiesce and r5l_quiesce to do some clean work. Some code of r5l_quiesce
> was pasted bellow.
> 
>                 /* make sure r5l_write_super_and_discard_space exits */
>                 mddev = log->rdev->mddev;
>                 wake_up(&mddev->sb_wait);
>                 r5l_wake_reclaim(log, -1L);
>                 md_unregister_thread(&log->reclaim_thread);
>                 r5l_do_reclaim(log);
> +                md_update_sb(mddev, 1);
> 
> It will reclaim all used space of log and call r5l_write_super to reset rdev->journal_tail
> to log->next_checkpoint . However, new rdev->journal_tail would not be written to 
> journal device for persistent because journal device may not support discard operation
> or due to mddev_trylock fail (this trylock should always get failed since raid5_quiesce 
> was called with  reconfig_mutex hold, isn't it?). As a result, it will take a long time to
>  recovery the log when the arrary was restarted. Should r5l_quiesce call md_update_sb
>  directly to guarantee  superblock  update?

Yep, that's problem here. Unfortunately we can't call md_update_sb here,
because we might not hold the mddev lock. I think we should call it at do_md_stop.

Thanks,
Shaohua

^ permalink raw reply related

* [PATCH] dm: do not assign error to md->kworker_task
From: Tahsin Erdogan @ 2016-10-18  1:16 UTC (permalink / raw)
  To: Alasdair Kergon, Mike Snitzer, dm-devel, Shaohua Li
  Cc: linux-raid, linux-kernel, Tahsin Erdogan

cleanup_mapped_device() calls kthread_stop() if kworker_task is
non-NULL. Currently the assigned value could be a valid task struct or
an error code. Do not assign in case of error.

Example failure when kthread_run() returns -ENOMEM:

[   22.255939] BUG: unable to handle kernel NULL pointer dereference at 000000000000000c
[   22.258847] IP: [<ffffffff802973a4>] kthread_stop+0x34/0x260
[   22.260130] PGD 78a23067 PUD 78b56067 PMD 0
[   22.260130] Oops: 0002 [#1] SMP
[   22.260130] CPU: 1 PID: 1849 Comm: dmsetup Tainted: G        W 4.8.0+ #3
[   22.260130] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[   22.260130] task: ffff880078966400 task.stack: ffffc90001898000
[   22.260130] RIP: 0010:[<ffffffff802973a4>]  [<ffffffff802973a4>] kthread_stop+0x34/0x260
[   22.260130] RSP: 0018:ffffc9000189bc40  EFLAGS: 00010202
[   22.260130] RAX: 0000000000000001 RBX: fffffffffffffff4 RCX: 0000000000000003
[   22.260130] RDX: ffff88007fd18600 RSI: 0000000000000001 RDI: ffffffff81037080
[   22.260130] RBP: ffffc9000189bc50 R08: 0000000000000000 R09: 0000000000000000
[   22.260130] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[   22.260130] R13: 0000000000000001 R14: ffff880077f539d8 R15: 0000000000000004
[   22.260130] FS:  00007fc9ef2e2840(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[   22.260130] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   22.260130] CR2: 000000000000000c CR3: 0000000077fa2000 CR4: 00000000000006e0
[   22.260130] Stack:
[   22.260130]  ffff880077f53800 0000000000000000 ffffc9000189bc68 ffffffff808b26fa
[   22.260130]  ffff880077f53800 ffffc9000189bcb0 ffffffff808b3c58 0000000000000000
[   22.260130]  00000000808b534b ffffc9000189bd20 ffff880077f53800 0000000000000000
[   22.260130] Call Trace:
[   22.260130]  [<ffffffff808b26fa>] cleanup_mapped_device+0x2a/0xe0
[   22.260130]  [<ffffffff808b3c58>] __dm_destroy+0x1a8/0x2b0
[   22.260130]  [<ffffffff808b4b6e>] dm_destroy+0xe/0x10
[   22.260130]  [<ffffffff808b9f49>] dev_remove+0xd9/0x120
[   22.260130]  [<ffffffff808b9e70>] ? dev_suspend+0x210/0x210
[   22.260130]  [<ffffffff808ba576>] ctl_ioctl+0x206/0x500
[   22.260130]  [<ffffffff808ba87e>] dm_ctl_ioctl+0xe/0x20
[   22.260130]  [<ffffffff803bca40>] do_vfs_ioctl+0x90/0x6b0
[   22.260130]  [<ffffffff80b11fd7>] ?  entry_SYSCALL_64_fastpath+0x5/0xad
[   22.260130]  [<ffffffff802bd974>] ?  trace_hardirqs_on_caller+0xf4/0x1c0
[   22.260130]  [<ffffffff803bd0d4>] SyS_ioctl+0x74/0x80
[   22.260130]  [<ffffffff80b11fea>] entry_SYSCALL_64_fastpath+0x18/0xad
[   22.260130] Code: e5 41 54 85 c0 53 48 89 fb 0f 8f bb 01 00 00 65 8b
05 a1 2d d7 7f 89 c0 48 0f a3 05 9f 94 e8 00 0f 92 c0 84 c0 0f 85 a3 00
00 00 <f0> ff 43 18 48 89 df e8 10 f8 ff ff 48 85 c0 49 89 c4 74 29 f0
[   22.260130] RIP  [<ffffffff802973a4>] kthread_stop+0x34/0x260
[   22.260130]  RSP <ffffc9000189bc40>
[   22.260130] CR2: 000000000000000c
[   22.301062] ---[ end trace 22b4f4f62c04f3cf ]---

Signed-off-by: Tahsin Erdogan <tahsin@google.com>
---
 drivers/md/dm-rq.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 5eacce1ef88b..6e5197414a57 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -841,6 +841,8 @@ static void dm_old_request_fn(struct request_queue *q)
  */
 int dm_old_init_request_queue(struct mapped_device *md)
 {
+	struct task_struct *task;
+
 	/* Fully initialize the queue */
 	if (!blk_init_allocated_queue(md->queue, dm_old_request_fn, NULL))
 		return -EINVAL;
@@ -854,11 +856,12 @@ int dm_old_init_request_queue(struct mapped_device *md)
 
 	/* Initialize the request-based DM worker thread */
 	init_kthread_worker(&md->kworker);
-	md->kworker_task = kthread_run(kthread_worker_fn, &md->kworker,
-				       "kdmwork-%s", dm_device_name(md));
-	if (IS_ERR(md->kworker_task))
-		return PTR_ERR(md->kworker_task);
+	task = kthread_run(kthread_worker_fn, &md->kworker, "kdmwork-%s",
+			   dm_device_name(md));
+	if (IS_ERR(task))
+		return PTR_ERR(task);
 
+	md->kworker_task = task;
 	elv_register_queue(md->queue);
 
 	return 0;
-- 
2.8.0.rc3.226.g39d4020

^ permalink raw reply related

* Re: Superblock of raid5 log can't be updated when array stoped
From: Shaohua Li @ 2016-10-18  0:28 UTC (permalink / raw)
  To: liuzhengyuan; +Cc: shli, Song Liu, linux-raid
In-Reply-To: <tencent_382FBBB34581DF7E1F5EC9A2@qq.com>

On Sat, Oct 15, 2016 at 10:19:36AM +0800, liuzhengyuan wrote:
> Hi, Shaohua.
> 
> when we stop raid5 array with "mdadm -S" or reboot the system,  md module will 
> call raid5_quiesce and r5l_quiesce to do some clean work. Some code of r5l_quiesce
> was pasted bellow.
> 
>                 /* make sure r5l_write_super_and_discard_space exits */
>                 mddev = log->rdev->mddev;
>                 wake_up(&mddev->sb_wait);
>                 r5l_wake_reclaim(log, -1L);
>                 md_unregister_thread(&log->reclaim_thread);
>                 r5l_do_reclaim(log);
> +                md_update_sb(mddev, 1);
> 
> It will reclaim all used space of log and call r5l_write_super to reset rdev->journal_tail
> to log->next_checkpoint . However, new rdev->journal_tail would not be written to 
> journal device for persistent because journal device may not support discard operation
> or due to mddev_trylock fail (this trylock should always get failed since raid5_quiesce 
> was called with  reconfig_mutex hold, isn't it?). As a result, it will take a long time to
>  recovery the log when the arrary was restarted. Should r5l_quiesce call md_update_sb
>  directly to guarantee  superblock  update?

Yep, that's problem here. Unfortunately we can't call md_update_sb here,
because we might not hold the mddev lock. I think we should call it at do_md_stop.

Thanks,
Shaohua

^ permalink raw reply

* Re: [PATCH 2/2] md: unblock array if bad blocks have been acknowledged
From: Shaohua Li @ 2016-10-18  0:24 UTC (permalink / raw)
  To: Tomasz Majchrzak
  Cc: linux-raid, aleksey.obitotskiy, pawel.baldysiak,
	artur.paszkiewicz, maksymilian.kunt, mariusz.dabrowski
In-Reply-To: <1476714580-15167-1-git-send-email-tomasz.majchrzak@intel.com>

On Mon, Oct 17, 2016 at 04:29:40PM +0200, Tomasz Majchrzak wrote:
> Once external metadata handler acknowledges all bad blocks (by writing
> to rdev 'bad_blocks' sysfs file), it requests to unblock the array.
> Check if all bad blocks are actually acknowledged as there might be a
> race if new bad blocks are notified at the same time. If all bad blocks
> are acknowledged, just unblock the array and continue. If not, ignore
> the request to unblock (do not fail an array). External metadata handler
> is expected to either process remaining bad blocks and try to unblock
> again or remove bad block support for a disk (which will cause disk to
> fail as in no-support case).
> 
> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
> Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
> ---
>  drivers/md/md.c | 43 ++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 36 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index f375d1b..81b4c33 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -2629,19 +2629,48 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
>  		set_bit(Blocked, &rdev->flags);
>  		err = 0;
>  	} else if (cmd_match(buf, "-blocked")) {
> -		if (!test_bit(Faulty, &rdev->flags) &&
> +		int unblock = 1;
> +
> +		if ((test_bit(ExternalBbl, &rdev->flags) &&
> +			 rdev->badblocks.changed)) {
> +			struct badblocks *bb = &rdev->badblocks;
> +			int ack = 1;
> +
> +			write_seqlock_irq(&bb->lock);
> +			if (bb->unacked_exist) {
> +				u64 *p = bb->page;
> +				int i;
> +
> +				for (i = 0; i < bb->count ; i++) {
> +					if (!BB_ACK(p[i])) {
> +						ack = 0;
> +						break;
> +					}
> +				}
> +				if (ack) {
> +					bb->unacked_exist = 0;
> +					bb->changed = 0;
> +				}
> +			}
> +			write_sequnlock_irq(&bb->lock);
> +		}

shouldn't this part be moved to badblocks.c?

Thanks,
Shaohua

^ permalink raw reply

* Re: [PATCH 1/2] md: add bad block support for external metadata
From: Shaohua Li @ 2016-10-18  0:23 UTC (permalink / raw)
  To: Tomasz Majchrzak
  Cc: linux-raid, aleksey.obitotskiy, pawel.baldysiak,
	artur.paszkiewicz, maksymilian.kunt, mariusz.dabrowski
In-Reply-To: <1476714549-15057-1-git-send-email-tomasz.majchrzak@intel.com>

On Mon, Oct 17, 2016 at 04:29:09PM +0200, Tomasz Majchrzak wrote:
> Add new rdev flag which external metadata handler can use to switch
> on/off bad block support. If new bad block is encountered, notify it via
> rdev 'unacknowledged_bad_blocks' sysfs file. If bad block has been
> cleared, notify update to rdev 'bad_blocks' sysfs file.
> 
> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
> Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
> ---
>  drivers/md/md.c | 22 ++++++++++++++++++++--
>  drivers/md/md.h |  3 +++
>  2 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index eac84d8..f375d1b 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -2568,6 +2568,10 @@ state_show(struct md_rdev *rdev, char *page)
>  		len += sprintf(page+len, "%sreplacement", sep);
>  		sep = ",";
>  	}
> +	if (test_bit(ExternalBbl, &flags)) {
> +		len += sprintf(page+len, "%sexternal_bbl", sep);
> +		sep = ",";
> +	}

that sep assignment isn't needed, please delete it and the previous occurences.
 
>  	return len+sprintf(page+len, "\n");
>  }
> @@ -2708,6 +2712,14 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
>  			}
>  		} else
>  			err = -EBUSY;
> +	} else if (cmd_match(buf, "external_bbl") && (rdev->mddev->external)) {
> +		set_bit(ExternalBbl, &rdev->flags);
> +		rdev->badblocks.shift = 0;
> +		err = 0;
> +	} else if (cmd_match(buf, "-external_bbl") && (rdev->mddev->external)) {
> +		clear_bit(ExternalBbl, &rdev->flags);
> +		rdev->badblocks.shift = -1;
> +		err = 0;

How is this safe? If md is clearing badblocks and in the meaningtime we set
badblocks.shift = -1 here, we have a trouble.

Thanks,
Shaohua

^ permalink raw reply

* [PATCH v2] Fix bus error when accessing MBR partition records
From: James Clarke @ 2016-10-17 20:16 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: James Clarke, linux-raid
In-Reply-To: <wrfjshs8dwd7.fsf@redhat.com>

Since the MBR layout only has partition records as 2-byte aligned, the 32-bit
fields in them are not aligned. Thus, they cannot be accessed on some
architectures (such as SPARC) by using a "struct MBR_part_record *" pointer,
as the compiler can assume that the pointer is properly aligned. Instead, the
records must be accessed by going through the MBR struct itself every time.

Signed-off-by: James Clarke <jrtc27@jrtc27.com>
---
Changes from v1:

 * Added packed attribute to MBR_part_record

 * Reformatted changes to fit the 80-character line limit (assuming a tab stop
   of 4; if it's 8 I can try and cut the line lengths down, though with the
   extra verbosity that's going to be more awkward...)

 part.h      |  2 +-
 super-mbr.c |  6 ++++++
 util.c      | 15 ++++++++-------
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/part.h b/part.h
index 862a14c..e697fb4 100644
--- a/part.h
+++ b/part.h
@@ -40,7 +40,7 @@ struct MBR_part_record {
   __u8 last_cyl;
   __u32 first_sect_lba;
   __u32 blocks_num;
-};
+} __attribute__((packed));
 
 struct MBR {
 	__u8 pad[446];
diff --git a/super-mbr.c b/super-mbr.c
index 62b3f03..303dde4 100644
--- a/super-mbr.c
+++ b/super-mbr.c
@@ -57,6 +57,9 @@ static void examine_mbr(struct supertype *st, char *homehost)
 
 	printf("   MBR Magic : %04x\n", sb->magic);
 	for (i = 0; i < MBR_PARTITIONS; i++)
+		/* Have to make every access through sb rather than using a pointer to
+		 * the partition table (or an entry), since the entries are not
+		 * properly aligned. */
 		if (sb->parts[i].blocks_num)
 			printf("Partition[%d] : %12lu sectors at %12lu (type %02x)\n",
 			       i,
@@ -151,6 +154,9 @@ static void getinfo_mbr(struct supertype *st, struct mdinfo *info, char *map)
 	info->component_size = 0;
 
 	for (i = 0; i < MBR_PARTITIONS ; i++)
+		/* Have to make every access through sb rather than using a pointer to
+		 * the partition table (or an entry), since the entries are not
+		 * properly aligned. */
 		if (sb->parts[i].blocks_num) {
 			unsigned long last =
 				(unsigned long)__le32_to_cpu(sb->parts[i].blocks_num)
diff --git a/util.c b/util.c
index a238a21..4a8767b 100644
--- a/util.c
+++ b/util.c
@@ -1412,7 +1412,6 @@ static int get_gpt_last_partition_end(int fd, unsigned long long *endofpart)
 static int get_last_partition_end(int fd, unsigned long long *endofpart)
 {
 	struct MBR boot_sect;
-	struct MBR_part_record *part;
 	unsigned long long curr_part_end;
 	unsigned part_nr;
 	int retval = 0;
@@ -1429,21 +1428,23 @@ static int get_last_partition_end(int fd, unsigned long long *endofpart)
 	if (boot_sect.magic == MBR_SIGNATURE_MAGIC) {
 		retval = 1;
 		/* found the correct signature */
-		part = boot_sect.parts;
 
 		for (part_nr = 0; part_nr < MBR_PARTITIONS; part_nr++) {
+			/* Have to make every access through boot_sect rather than using a
+			 * pointer to the partition table (or an entry), since the entries
+			 * are not properly aligned. */
+
 			/* check for GPT type */
-			if (part->part_type == MBR_GPT_PARTITION_TYPE) {
+			if (boot_sect.parts[part_nr].part_type == MBR_GPT_PARTITION_TYPE) {
 				retval = get_gpt_last_partition_end(fd, endofpart);
 				break;
 			}
 			/* check the last used lba for the current partition  */
-			curr_part_end = __le32_to_cpu(part->first_sect_lba) +
-				__le32_to_cpu(part->blocks_num);
+			curr_part_end =
+				__le32_to_cpu(boot_sect.parts[part_nr].first_sect_lba) +
+				__le32_to_cpu(boot_sect.parts[part_nr].blocks_num);
 			if (curr_part_end > *endofpart)
 				*endofpart = curr_part_end;
-
-			part++;
 		}
 	} else {
 		/* Unknown partition table */
-- 
2.10.1


^ permalink raw reply related

* Re: MD-RAID: Use seq_putc() in three status functions?
From: Hannes Reinecke @ 2016-10-17 17:18 UTC (permalink / raw)
  To: SF Markus Elfring, linux-raid
  Cc: Bernd Petrovitsch, Christoph Hellwig, Guoqing Jiang, Jens Axboe,
	Joe Perches, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors, kbuild-all, ltp
In-Reply-To: <bf6c54f1-7af4-747c-a58b-be13ab74563e@users.sourceforge.net>

On 10/17/2016 06:08 PM, SF Markus Elfring wrote:
>>> * Would you really like to know under which circumstances data processing
>>>   will be faster for a single character instead of using a string pointer
>>>   and corresponding two characters?
>>>
>> It's not a problem of the interface, it's a problem of the resulting code
>> (ie assembler output).
>
> How do you think about to discuss concrete generated code any further?
>
Sure. Show me the generated code and point out where the benefits are.

>> We can discuss all we like, if the compiler decides to throw in
>> an optimisation none of the arguments even apply.
>
> Would it make sense to clarify assembler output with optimisation switched off?
>
> Do you eventually care for code from non-optimising compilers?
>
No. This is the linux kernel. There is a very, _very_ limited benefit of 
trying to use a non-standard compiler.

>
>>> * Will it occasionally be useful to avoid the storage for another string literal?
>>>
>> Occasionally: yes.
>> In this particular case: hardly.
>
> I am curious when such a software design aspect can become more relevant.
> Would it be nice to get rid of three questionable string terminators (null bytes)
> for example?
>
Again, all this does it trying to out-guess what the compiler might be 
doing during compilation. For which the easiest method is checking.
So back to the original task for you: Show me in the generated output 
where the benefits are.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: SF Markus Elfring @ 2016-10-17 16:08 UTC (permalink / raw)
  To: Hannes Reinecke, linux-raid
  Cc: Bernd Petrovitsch, Christoph Hellwig, Guoqing Jiang, Jens Axboe,
	Joe Perches, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors, kbuild-all, ltp
In-Reply-To: <05d0cade-7922-9d8a-a974-34b2cc9150fb@suse.de>

>> * Would you really like to know under which circumstances data processing
>>   will be faster for a single character instead of using a string pointer
>>   and corresponding two characters?
>>
> It's not a problem of the interface, it's a problem of the resulting code
> (ie assembler output).

How do you think about to discuss concrete generated code any further?


> We can discuss all we like, if the compiler decides to throw in
> an optimisation none of the arguments even apply.

Would it make sense to clarify assembler output with optimisation switched off?

Do you eventually care for code from non-optimising compilers?


>> * Will it occasionally be useful to avoid the storage for another string literal?
>>
> Occasionally: yes.
> In this particular case: hardly.

I am curious when such a software design aspect can become more relevant.
Would it be nice to get rid of three questionable string terminators (null bytes)
for example?

Regards,
Markus

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: Hannes Reinecke @ 2016-10-17 15:33 UTC (permalink / raw)
  To: SF Markus Elfring, linux-raid
  Cc: Bernd Petrovitsch, Christoph Hellwig, Guoqing Jiang, Jens Axboe,
	Joe Perches, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors, kbuild-all, ltp
In-Reply-To: <cb9beadc-3ea2-c09c-f621-a54cd73e3647@users.sourceforge.net>

On 10/17/2016 04:30 PM, SF Markus Elfring wrote:
>>> Am I the only software developer so far who would dare to reconsider
>>> implementation details from three status functions?
>>>
>> No.
>
> Thanks for this kind of promising feedback.
>
>
>> But we're waiting for you showing is that it is an improvement.
>
> Can this aspect also be clarified to some degree from a logical point of view?
>
I sincerely doubt that.
We've discussed the logical implications already, and failed to come to 
a consensus. So we need some proof (as in: on this architecture I'm 
seeing this and that performance improvements).
Which you have to deliver.

> * Would you really like to know under which circumstances data processing
>   will be faster for a single character instead of using a string pointer
>   and corresponding two characters?
>
It's not a problem of the interface, it's a problem of the resulting 
code (ie assembler output). We can discuss all we like, if the compiler 
decides to throw in an optimisation none of the arguments even apply.

> * Do you care for a changed memory allocation characteristic?
>
> * Will it occasionally be useful to avoid the storage for another string literal?
>
Occasionally: yes.
In this particular case: hardly.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: SF Markus Elfring @ 2016-10-17 14:30 UTC (permalink / raw)
  To: Hannes Reinecke, linux-raid
  Cc: Bernd Petrovitsch, Christoph Hellwig, Guoqing Jiang, Jens Axboe,
	Joe Perches, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors, kbuild-all, ltp
In-Reply-To: <57299b72-8e6b-0b92-4374-1b7a0907e810@suse.de>

>> Am I the only software developer so far who would dare to reconsider
>> implementation details from three status functions?
>>
> No.

Thanks for this kind of promising feedback.

> But we're waiting for you showing is that it is an improvement.

Can this aspect also be clarified to some degree from a logical point of view?

* Would you really like to know under which circumstances data processing
  will be faster for a single character instead of using a string pointer
  and corresponding two characters?

* Do you care for a changed memory allocation characteristic?

* Will it occasionally be useful to avoid the storage for another string literal?

Regards,
Markus

^ permalink raw reply

* [PATCH 2/2] md: unblock array if bad blocks have been acknowledged
From: Tomasz Majchrzak @ 2016-10-17 14:29 UTC (permalink / raw)
  To: linux-raid
  Cc: shli, aleksey.obitotskiy, pawel.baldysiak, artur.paszkiewicz,
	maksymilian.kunt, mariusz.dabrowski, Tomasz Majchrzak

Once external metadata handler acknowledges all bad blocks (by writing
to rdev 'bad_blocks' sysfs file), it requests to unblock the array.
Check if all bad blocks are actually acknowledged as there might be a
race if new bad blocks are notified at the same time. If all bad blocks
are acknowledged, just unblock the array and continue. If not, ignore
the request to unblock (do not fail an array). External metadata handler
is expected to either process remaining bad blocks and try to unblock
again or remove bad block support for a disk (which will cause disk to
fail as in no-support case).

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
---
 drivers/md/md.c | 43 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 36 insertions(+), 7 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index f375d1b..81b4c33 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2629,19 +2629,48 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
 		set_bit(Blocked, &rdev->flags);
 		err = 0;
 	} else if (cmd_match(buf, "-blocked")) {
-		if (!test_bit(Faulty, &rdev->flags) &&
+		int unblock = 1;
+
+		if ((test_bit(ExternalBbl, &rdev->flags) &&
+			 rdev->badblocks.changed)) {
+			struct badblocks *bb = &rdev->badblocks;
+			int ack = 1;
+
+			write_seqlock_irq(&bb->lock);
+			if (bb->unacked_exist) {
+				u64 *p = bb->page;
+				int i;
+
+				for (i = 0; i < bb->count ; i++) {
+					if (!BB_ACK(p[i])) {
+						ack = 0;
+						break;
+					}
+				}
+				if (ack) {
+					bb->unacked_exist = 0;
+					bb->changed = 0;
+				}
+			}
+			write_sequnlock_irq(&bb->lock);
+		}
+		if ((test_bit(ExternalBbl, &rdev->flags) &&
+			rdev->badblocks.unacked_exist)) {
+			unblock = 0;
+		} else if (!test_bit(Faulty, &rdev->flags) &&
 		    rdev->badblocks.unacked_exist) {
 			/* metadata handler doesn't understand badblocks,
 			 * so we need to fail the device
 			 */
 			md_error(rdev->mddev, rdev);
 		}
-		clear_bit(Blocked, &rdev->flags);
-		clear_bit(BlockedBadBlocks, &rdev->flags);
-		wake_up(&rdev->blocked_wait);
-		set_bit(MD_RECOVERY_NEEDED, &rdev->mddev->recovery);
-		md_wakeup_thread(rdev->mddev->thread);
-
+		if (unblock) {
+			clear_bit(Blocked, &rdev->flags);
+			clear_bit(BlockedBadBlocks, &rdev->flags);
+			wake_up(&rdev->blocked_wait);
+			set_bit(MD_RECOVERY_NEEDED, &rdev->mddev->recovery);
+			md_wakeup_thread(rdev->mddev->thread);
+		}
 		err = 0;
 	} else if (cmd_match(buf, "insync") && rdev->raid_disk == -1) {
 		set_bit(In_sync, &rdev->flags);
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH 1/2] md: add bad block support for external metadata
From: Tomasz Majchrzak @ 2016-10-17 14:29 UTC (permalink / raw)
  To: linux-raid
  Cc: shli, aleksey.obitotskiy, pawel.baldysiak, artur.paszkiewicz,
	maksymilian.kunt, mariusz.dabrowski, Tomasz Majchrzak

Add new rdev flag which external metadata handler can use to switch
on/off bad block support. If new bad block is encountered, notify it via
rdev 'unacknowledged_bad_blocks' sysfs file. If bad block has been
cleared, notify update to rdev 'bad_blocks' sysfs file.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
---
 drivers/md/md.c | 22 ++++++++++++++++++++--
 drivers/md/md.h |  3 +++
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index eac84d8..f375d1b 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2568,6 +2568,10 @@ state_show(struct md_rdev *rdev, char *page)
 		len += sprintf(page+len, "%sreplacement", sep);
 		sep = ",";
 	}
+	if (test_bit(ExternalBbl, &flags)) {
+		len += sprintf(page+len, "%sexternal_bbl", sep);
+		sep = ",";
+	}
 
 	return len+sprintf(page+len, "\n");
 }
@@ -2708,6 +2712,14 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
 			}
 		} else
 			err = -EBUSY;
+	} else if (cmd_match(buf, "external_bbl") && (rdev->mddev->external)) {
+		set_bit(ExternalBbl, &rdev->flags);
+		rdev->badblocks.shift = 0;
+		err = 0;
+	} else if (cmd_match(buf, "-external_bbl") && (rdev->mddev->external)) {
+		clear_bit(ExternalBbl, &rdev->flags);
+		rdev->badblocks.shift = -1;
+		err = 0;
 	}
 	if (!err)
 		sysfs_notify_dirent_safe(rdev->sysfs_state);
@@ -8614,6 +8626,9 @@ int rdev_set_badblocks(struct md_rdev *rdev, sector_t s, int sectors,
 	rv = badblocks_set(&rdev->badblocks, s, sectors, 0);
 	if (rv == 0) {
 		/* Make sure they get written out promptly */
+		if (test_bit(ExternalBbl, &rdev->flags))
+			sysfs_notify(&rdev->kobj, NULL,
+				     "unacknowledged_bad_blocks");
 		sysfs_notify_dirent_safe(rdev->sysfs_state);
 		set_mask_bits(&mddev->flags, 0,
 			      BIT(MD_CHANGE_CLEAN) | BIT(MD_CHANGE_PENDING));
@@ -8627,12 +8642,15 @@ EXPORT_SYMBOL_GPL(rdev_set_badblocks);
 int rdev_clear_badblocks(struct md_rdev *rdev, sector_t s, int sectors,
 			 int is_new)
 {
+	int rv;
 	if (is_new)
 		s += rdev->new_data_offset;
 	else
 		s += rdev->data_offset;
-	return badblocks_clear(&rdev->badblocks,
-				  s, sectors);
+	rv = badblocks_clear(&rdev->badblocks, s, sectors);
+	if ((rv == 0) && test_bit(ExternalBbl, &rdev->flags))
+		sysfs_notify(&rdev->kobj, NULL, "bad_blocks");
+	return rv;
 }
 EXPORT_SYMBOL_GPL(rdev_clear_badblocks);
 
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 2b20417..21bd94f 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -168,6 +168,9 @@ enum flag_bits {
 				 * so it is safe to remove without
 				 * another synchronize_rcu() call.
 				 */
+	ExternalBbl,            /* External metadata provides bad
+				 * block management for a disk
+				 */
 };
 
 static inline int is_badblock(struct md_rdev *rdev, sector_t s, int sectors,
-- 
1.8.3.1


^ permalink raw reply related

* Re: MD-RAID: Use seq_putc() in three status functions?
From: Hannes Reinecke @ 2016-10-17 14:01 UTC (permalink / raw)
  To: SF Markus Elfring, Bernd Petrovitsch, linux-raid
  Cc: Christoph Hellwig, Guoqing Jiang, Jens Axboe, Joe Perches,
	Mike Christie, Neil Brown, Shaohua Li, Tomasz Majchrzak, LKML,
	kernel-janitors, kbuild-all, ltp
In-Reply-To: <97f31b70-e3ff-f194-c753-54da1fe3e664@users.sourceforge.net>

On 10/17/2016 01:43 PM, SF Markus Elfring wrote:
>>>> See above. At the moment _any_ test result from your side would do.
>>>
>>> I imagine that another single result might not be representative.
>>
>> Publish not only results but also everything (complete!) so that anyone
>> can *easily* follow it to check and reproduce the results - especially
>> if you want people with knowledge of other architectures to comment
>> (otherwise they probably won't bother).
> 
> Am I the only software developer so far who would dare to reconsider
> implementation details from three status functions?
> 
No.
But we're waiting for you showing is that it is an improvement.
Which at the moment we don't see.
Hence we're waiting from a proof or validation from your side.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: Hannes Reinecke @ 2016-10-17 14:00 UTC (permalink / raw)
  To: SF Markus Elfring, linux-raid
  Cc: Christoph Hellwig, Guoqing Jiang, Jens Axboe, Joe Perches,
	Mike Christie, Neil Brown, Shaohua Li, Tomasz Majchrzak, LKML,
	kernel-janitors, kbuild-all, ltp
In-Reply-To: <a0f96e19-fa24-8507-2b77-344a342113aa@users.sourceforge.net>

On 10/17/2016 01:10 PM, SF Markus Elfring wrote:
>>> * Is a string pointer often longer than a byte?
>>>
>> Always.
> 
> I have got doubts for this specific information.
> 
> 
>> (Which up to now I thought was basic programming knowledge...)
> 
> By the way:
> Run time environments still exist where the size of a pointer can be also
> just one byte, don't they?
> 
Really? Name one.
You can only fit a point in one byte if you are on an 8-bit system.
Which I don't think linux is running on.

>>> How many results would we like to clarify from various hardware
>>> and software combinations?
>>>
>> See above. At the moment _any_ test result from your side would do.
> 
> I imagine that another single result might not be representative.
> How many lessons from test statistics will usually be also relevant here?
> 
> 
As said above, _any_ statistic will do at this point.

>>> How important are the mentioned functions for you within the Linux
>>> programming interface so far?
>>>
>> Not very. The interface is only used in a slow path, and the execution
>> time doesn't affect I/O performance in any way.
> 
> Thanks for another interesting information.
> 
> 
>>>> Case in point: with your patch the x86_64 compiler generates nearly
>>>> identical code for driver/md/raid1.c, but with one instruction _more_
>>>> after your patch has been applied.
>>>
>>> Which software versions and command parameters did you try out
>>> for this information (from an unspecified run time environment)?
>>>
>> # gcc --version
>> gcc (SUSE Linux) 4.8.5
> 
> Thanks for this detail.
> 
> * Did you choose any special optimisation settings for your quick check?
> 
> * Will any compilation results matter if "optimisation" would be
>   switched off there?
> 
> 
These were the results when calling 'make' in the kernel source tree.
With all applicable options.

>> I'm still waiting from results from your side.
> 
> Would any other software developers or testers dare to add related information?
> 
No. It's your patch, _you_ have to do it.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: SF Markus Elfring @ 2016-10-17 11:43 UTC (permalink / raw)
  To: Bernd Petrovitsch, linux-raid
  Cc: Hannes Reinecke, Christoph Hellwig, Guoqing Jiang, Jens Axboe,
	Joe Perches, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors, kbuild-all, ltp
In-Reply-To: <1476703920.2520.105.camel@petrovitsch.priv.at>

>>> See above. At the moment _any_ test result from your side would do.
>>
>> I imagine that another single result might not be representative.
> 
> Publish not only results but also everything (complete!) so that anyone
> can *easily* follow it to check and reproduce the results - especially
> if you want people with knowledge of other architectures to comment
> (otherwise they probably won't bother).

Am I the only software developer so far who would dare to reconsider
implementation details from three status functions?

Regards,
Markus

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: Bernd Petrovitsch @ 2016-10-17 11:32 UTC (permalink / raw)
  To: SF Markus Elfring, Hannes Reinecke, linux-raid
  Cc: Christoph Hellwig, Guoqing Jiang, Jens Axboe, Joe Perches,
	Mike Christie, Neil Brown, Shaohua Li, Tomasz Majchrzak, LKML,
	kernel-janitors, kbuild-all, ltp
In-Reply-To: <a0f96e19-fa24-8507-2b77-344a342113aa@users.sourceforge.net>

On Mon, 2016-10-17 at 13:10 +0200, SF Markus Elfring wrote:
[...]
> > (Which up to now I thought was basic programming knowledge...)
> 
> By the way:
> Run time environments still exist where the size of a pointer can
> be also just one byte, don't they?

In the context of the Linux kernel: No.

[ Side note: there might be some DSP out there with a running Linux
kernel which cannot really address a "byte" (meaning 8bits) but only in
register sized quantities (and also aligned for that). But no one cares
here really deeply as that is a so fundamental difference that the C-
compiler must cope with that anyways in the first place. ]

[...]
> > See above. At the moment _any_ test result from your side would do.
> 
> I imagine that another single result might not be representative.

Publish not only results but also everything (complete!) so that anyone
can *easily* follow it to check and reproduce the results - especially
if you want people with knowledge of other architectures to comment
(otherwise they probably won't bother).

Kind regards,
	Bernd
-- 
Bernd Petrovitsch                  Email : bernd@petrovitsch.priv.at
                     LUGA : http://www.luga.at

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: SF Markus Elfring @ 2016-10-17 11:10 UTC (permalink / raw)
  To: Hannes Reinecke, linux-raid
  Cc: Christoph Hellwig, Guoqing Jiang, Jens Axboe, Joe Perches,
	Mike Christie, Neil Brown, Shaohua Li, Tomasz Majchrzak, LKML,
	kernel-janitors, kbuild-all, ltp
In-Reply-To: <c2500336-5848-66c3-ecf0-1ade04decf75@suse.de>

>> * Is a string pointer often longer than a byte?
>>
> Always.

I have got doubts for this specific information.

> (Which up to now I thought was basic programming knowledge...)

By the way:
Run time environments still exist where the size of a pointer can be also
just one byte, don't they?

>> How many results would we like to clarify from various hardware
>> and software combinations?
>>
> See above. At the moment _any_ test result from your side would do.

I imagine that another single result might not be representative.
How many lessons from test statistics will usually be also relevant here?

>> How important are the mentioned functions for you within the Linux
>> programming interface so far?
>>
> Not very. The interface is only used in a slow path, and the execution
> time doesn't affect I/O performance in any way.

Thanks for another interesting information.

>>> Case in point: with your patch the x86_64 compiler generates nearly
>>> identical code for driver/md/raid1.c, but with one instruction _more_
>>> after your patch has been applied.
>>
>> Which software versions and command parameters did you try out
>> for this information (from an unspecified run time environment)?
>>
> # gcc --version
> gcc (SUSE Linux) 4.8.5

Thanks for this detail.

* Did you choose any special optimisation settings for your quick check?

* Will any compilation results matter if "optimisation" would be
  switched off there?

> I'm still waiting from results from your side.

Would any other software developers or testers dare to add related information?

Regards,
Markus

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: Hannes Reinecke @ 2016-10-17  9:50 UTC (permalink / raw)
  To: SF Markus Elfring
  Cc: linux-raid, Christoph Hellwig, Guoqing Jiang, Jens Axboe,
	Joe Perches, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors, kbuild-all, ltp
In-Reply-To: <653e60ee-f862-8828-3e4f-498c7cc34bdc@users.sourceforge.net>

On 10/17/2016 11:00 AM, SF Markus Elfring wrote:
>>> Calling the function "seq_putc" will be more efficient than "seq_printf"
>>> in this case because of the following reasons.
>>>
>>> 1. How does the distribution look like for supported processor architectures
>>>    where the data transfer for bytes (as a function call parameter)
>>>    is faster than for (string) pointers?
>>>
>> How would I know?
> 
> How many processor architecture characteristics do you know already?
> 
x86, s390x, ppc/ppc64.

> * Is a string pointer often longer than a byte?
> 
Always.

(Which up to now I thought was basic programming knowledge...)

> * I imagine that it can become also interesting to check byte level data access
>   under constraints of machine word sizes and alignment.
> 
So, another test for you to do.

> 
>> I would assume that _you_ did some measurements here;
> 
> How much would you trust in any concrete numbers I could present
> for this use case?
> 
> Do you give more trust to a reference testing platform?
> 
At the moment _any_ test would do.
With every response from your side you just keep on asking further
questions. But so far you haven't delivered any answers nor measurements.

> 
>> I could easily claim that seq_printf() is more efficient than
>> seq_putc(), and won't apply your patch.
> 
> This is also possible in principle.
> 
No, this is what's going to happen if you don't show any measurements.

> 
>> So _you_ have to prove that your patch is more efficient.
> 
> How many results would we like to clarify from various hardware
> and software combinations?
> 
See above. At the moment _any_ test result from your side would do.

> 
>>> 2. Did anybody measure already how many the execution times can vary
>>>    for these functions?
>>>
>> Probably not.
> 
> Thanks for this information.
> 
> How important are the mentioned functions for you within the Linux
> programming interface so far?
> 
Not very. The interface is only used in a slow path, and the execution
time doesn't affect I/O performance in any way.

> 
>> Unless _you_ prove that _your_ patch is more efficient it won't get applied.
> 
> Which data would you accept as a "prove" in this case?
> 
Again: You want something from us. We don't have to prove anything, you
need to convince us. And it is really hard to convince anyone by asking
questions.

> 
>>>    Where do you get doubts about its efficiency for the data processing
>>>    of a single character?
>>>
>> Because it's being called at the end of a function calling seq_printf() already.
> 
> Interesting view …
> 
> 
>> So exchanging a single call is probably not helping anything,
>> as the compiler will optimize it anyway.
> 
> How common is the discussed software transformation between implementations
> for optimising compilers?
> 
> 
>> Case in point: with your patch the x86_64 compiler generates nearly
>> identical code for driver/md/raid1.c, but with one instruction _more_
>> after your patch has been applied.
> 
> Which software versions and command parameters did you try out
> for this information (from an unspecified run time environment)?
> 
> 
# gcc --version
gcc (SUSE Linux) 4.8.5
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is
NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE.

git tree from git.kernel.org/mkp/u/4.10/scsi-queue

_I_ did some measurements.
I'm still waiting from results from your side.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: SF Markus Elfring @ 2016-10-17  9:00 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: linux-raid, Christoph Hellwig, Guoqing Jiang, Jens Axboe,
	Joe Perches, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors, kbuild-all, ltp
In-Reply-To: <2cc42b2f-1f1a-e95c-91fa-54e1dd3b6d49@suse.de>

>> Calling the function "seq_putc" will be more efficient than "seq_printf"
>> in this case because of the following reasons.
>>
>> 1. How does the distribution look like for supported processor architectures
>>    where the data transfer for bytes (as a function call parameter)
>>    is faster than for (string) pointers?
>>
> How would I know?

How many processor architecture characteristics do you know already?

* Is a string pointer often longer than a byte?

* I imagine that it can become also interesting to check byte level data access
  under constraints of machine word sizes and alignment.

> I would assume that _you_ did some measurements here;

How much would you trust in any concrete numbers I could present
for this use case?

Do you give more trust to a reference testing platform?

> I could easily claim that seq_printf() is more efficient than
> seq_putc(), and won't apply your patch.

This is also possible in principle.

> So _you_ have to prove that your patch is more efficient.

How many results would we like to clarify from various hardware
and software combinations?

>> 2. Did anybody measure already how many the execution times can vary
>>    for these functions?
>>
> Probably not.

Thanks for this information.

How important are the mentioned functions for you within the Linux
programming interface so far?

> Unless _you_ prove that _your_ patch is more efficient it won't get applied.

Which data would you accept as a "prove" in this case?

>>    Where do you get doubts about its efficiency for the data processing
>>    of a single character?
>>
> Because it's being called at the end of a function calling seq_printf() already.

Interesting view …

> So exchanging a single call is probably not helping anything,
> as the compiler will optimize it anyway.

How common is the discussed software transformation between implementations
for optimising compilers?

> Case in point: with your patch the x86_64 compiler generates nearly
> identical code for driver/md/raid1.c, but with one instruction _more_
> after your patch has been applied.

Which software versions and command parameters did you try out
for this information (from an unspecified run time environment)?

> So it's not immediately obvious that your patch is an improvement.

I agree that there are system properties and constraints which can be
considered further.

Regards,
Markus

^ permalink raw reply

* mdadm: Intel RST option ROM not found when Video option ROM is disabled
From: Tomasz Rostanski @ 2016-10-17  8:31 UTC (permalink / raw)
  To: linux-raid

The mdadm assumes the Video ROM is present in the system (0xc0000 ~ 
0xc7fff), so the it scans the other option ROMs (including Intel RST 
option ROM) starting from address 0xc8000 to 0xfffff.

If Video ROM is absent, the other option ROMs will start from 0xc0000. 
In my case, the Intel RST option ROM is located at Video ROM area, and 
mdadm is not finding it. The find_imsm_hba_orom() returns NULL.

Due to the changes introduced in mdadm 3.3.4 (commits 7eee461 “Assemble: 
don’t assemble IMSM array without OROM.” and 8360760 “Assemble: really 
don’t assemble IMSM array without OROM.”) the mdadm is reporting the 
Intel RAID is not supported on my platform.

Regards,

Tomasz

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: Hannes Reinecke @ 2016-10-17  8:09 UTC (permalink / raw)
  To: SF Markus Elfring
  Cc: linux-raid, Christoph Hellwig, Guoqing Jiang, Jens Axboe,
	Joe Perches, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors
In-Reply-To: <83e720c6-9037-a3c1-6e83-27505805f37f@users.sourceforge.net>

On 10/17/2016 09:39 AM, SF Markus Elfring wrote:
>>>> Does it improve code? Does it improve anything?
>>>
>>> Yes. - I got such an impression.
>>>
>>> * Is it more efficient to call the function "seq_printf" for the desired data processing
>>>   for a single character than to pass it to the function "" in a string?
>>>
>>> * Will the required data transfer shrink a bit for the affected functions because of
>>>   such a change?
>>>
>> Which are questions _you_ should be able to answer.
> 
> I wonder that the answers are not obvious for you already.
> 
> Calling the function "seq_putc" will be more efficient than "seq_printf"
> in this case because of the following reasons.
> 
> 1. How does the distribution look like for supported processor architectures
>    where the data transfer for bytes (as a function call parameter)
>    is faster than for (string) pointers?
> 
How would I know? I would assume that _you_ did some measurements here;
after all, _you_ are trying to push this patch.
I could easily claim that seq_printf() is more efficient than
seq_putc(), and won't apply your patch.
So _you_ have to prove that your patch is more efficient.

> 2. Did anybody measure already how many the execution times can vary
>    for these functions?
> 
Probably not.
But referring to the previous topic:
Unless _you_ prove that _your_ patch is more efficient it won't get
applied. _You_ want us to apply your patch, so the burden is on _you_ to
provide the required data.


>    Where do you get doubts about its efficiency for the data processing
>    of a single character?
> 
Because it's being called at the end of a function calling seq_printf()
already. So exchanging a single call is probably not helping anything,
as the compiler will optimize it anyway.
Case in point: with your patch the x86_64 compiler generates nearly
identical code for driver/md/raid1.c, but with one instruction _more_
after your patch has been applied.

So it's not immediately obvious that your patch is an improvement.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: SF Markus Elfring @ 2016-10-17  7:39 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: linux-raid, Christoph Hellwig, Guoqing Jiang, Jens Axboe,
	Joe Perches, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors
In-Reply-To: <1e151094-e228-5307-ae2f-b376b31f5628@suse.de>

>>> Does it improve code? Does it improve anything?
>>
>> Yes. - I got such an impression.
>>
>> * Is it more efficient to call the function "seq_printf" for the desired data processing
>>   for a single character than to pass it to the function "" in a string?
>>
>> * Will the required data transfer shrink a bit for the affected functions because of
>>   such a change?
>>
> Which are questions _you_ should be able to answer.

I wonder that the answers are not obvious for you already.

Calling the function "seq_putc" will be more efficient than "seq_printf"
in this case because of the following reasons.

1. How does the distribution look like for supported processor architectures
   where the data transfer for bytes (as a function call parameter)
   is faster than for (string) pointers?

2. Did anybody measure already how many the execution times can vary
   for these functions?

3. seq_printf() provides more functionality as this kind of programming
   interface was designed for a bigger purpose.
   How much do you care for consequences when such general functions
   are called with input data they were not designed for mainly?

4. The seq_putc() implementation is so simple.
   http://lxr.free-electrons.com/source/fs/seq_file.c?v=4.8#L657

   Where do you get doubts about its efficiency for the data processing
   of a single character?

> It's your patch, after all.

Yes. - I published a special update suggestion once again.

> Once you do (and prove that the answer is 'yes' to the above two
> questions) the patch will be applied.

How do you think about to share a bit more from your software development
and testing experience?
Which call frequencies do you observe for the affected functions?

1. raid1_status
2. raid10_status
3. raid5_status

Regards,
Markus

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: Hannes Reinecke @ 2016-10-17  5:58 UTC (permalink / raw)
  To: SF Markus Elfring
  Cc: linux-raid, Christoph Hellwig, Guoqing Jiang, Jens Axboe,
	Joe Perches, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors
In-Reply-To: <59d71170-c48d-a084-c748-b6ab74a2bee4@users.sourceforge.net>

On 10/16/2016 07:10 PM, SF Markus Elfring wrote:
>> Does it improve code? Does it improve anything?
> 
> Yes. - I got such an impression.
> 
> * Is it more efficient to call the function "seq_printf" for the desired data processing
>   for a single character than to pass it to the function "" in a string?
> 
> * Will the required data transfer shrink a bit for the affected functions because of
>   such a change?
> 
Which are questions _you_ should be able to answer.
It's your patch, after all.
Once you do (and prove that the answer is 'yes' to the above two
questions) the patch will be applied.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: SF Markus Elfring @ 2016-10-16 17:18 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: linux-raid, Christoph Hellwig, Guoqing Jiang, Jens Axboe,
	Joe Perches, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors
In-Reply-To: <59d71170-c48d-a084-c748-b6ab74a2bee4@users.sourceforge.net>

> Yes. - I got such an impression.

Correction:

* Is it more efficient to call the function "seq_putc" for the desired data processing
  for a single character than to pass it to the function "seq_printf" in a string?

* Will the required data transfer shrink a bit for the affected functions because of
  such a change?
  How do you think about to reduce the transmission for such tiny strings
  containing delimiters?

Regards,
Markus

^ permalink raw reply

* Re: MD-RAID: Use seq_putc() in three status functions?
From: SF Markus Elfring @ 2016-10-16 17:10 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: linux-raid, Christoph Hellwig, Guoqing Jiang, Jens Axboe,
	Joe Perches, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors
In-Reply-To: <688764a4-072d-2faf-37ba-a222b190a5d9@suse.de>

> Does it improve code? Does it improve anything?

Yes. - I got such an impression.

* Is it more efficient to call the function "seq_printf" for the desired data processing
  for a single character than to pass it to the function "" in a string?

* Will the required data transfer shrink a bit for the affected functions because of
  such a change?

Regards,
Markus

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox