* [KJ] Correct imprecise >= comparisons
@ 2006-02-08 18:53 Andreas Mohr
2006-02-09 0:06 ` Eric Sesterhenn
` (9 more replies)
0 siblings, 10 replies; 11+ messages in thread
From: Andreas Mohr @ 2006-02-08 18:53 UTC (permalink / raw)
To: kernel-janitors
[-- Attachment #1: Type: text/plain, Size: 14773 bytes --]
Hello all,
in the kernel source tree, there are several places (less important ones
and high-profile ones such as TCP/IP) which do "incorrect" checks such as:
if (val >= max)
val = max;
which can instead be expressed as
if (val > max)
val = max;
in order to cut down on both executed cycles and cache write invalidation.
One question would be whether not executing the branch would lead to
pipeline stalls which could perhaps turn out to be more costly than always
writing the variable. Anyone?
I took care to not modify code which has side effects within its context
(such as a consecutive "else") or did make sense logically the way it was.
Any comments?
I'll do the same thing for <= checks if there are no objections.
Patch against 2.6.16-rc2-mm1.
Signed-off-by: Andreas Mohr <andi@lisas.de>
diff -urN linux-2.6.16-rc2-mm1.orig/arch/i386/kernel/setup.c linux-2.6.16-rc2-mm1/arch/i386/kernel/setup.c
--- linux-2.6.16-rc2-mm1.orig/arch/i386/kernel/setup.c 2006-02-08 18:29:39.000000000 +0100
+++ linux-2.6.16-rc2-mm1/arch/i386/kernel/setup.c 2006-02-08 19:10:12.000000000 +0100
@@ -1051,7 +1051,7 @@
/* check max_low_pfn */
if (start >= ((max_low_pfn + 1) << PAGE_SHIFT))
return 0;
- if (end >= ((max_low_pfn + 1) << PAGE_SHIFT))
+ if (end > ((max_low_pfn + 1) << PAGE_SHIFT))
end = (max_low_pfn + 1) << PAGE_SHIFT;
if (start < end)
free_bootmem(start, end - start);
diff -urN linux-2.6.16-rc2-mm1.orig/arch/x86_64/kernel/e820.c linux-2.6.16-rc2-mm1/arch/x86_64/kernel/e820.c
--- linux-2.6.16-rc2-mm1.orig/arch/x86_64/kernel/e820.c 2006-02-03 07:03:08.000000000 +0100
+++ linux-2.6.16-rc2-mm1/arch/x86_64/kernel/e820.c 2006-02-08 19:12:50.000000000 +0100
@@ -141,7 +141,7 @@
addr = start;
last = round_down(ei->addr + ei->size, PAGE_SIZE);
- if (last >= end)
+ if (last > end)
last = end;
if (last > addr && last-addr >= PAGE_SIZE)
@@ -211,7 +211,7 @@
addr = start;
last = round_down(ei->addr + ei->size, PAGE_SIZE);
- if (last >= end)
+ if (last > end)
last = end;
if (last > addr)
diff -urN linux-2.6.16-rc2-mm1.orig/arch/x86_64/mm/numa.c linux-2.6.16-rc2-mm1/arch/x86_64/mm/numa.c
--- linux-2.6.16-rc2-mm1.orig/arch/x86_64/mm/numa.c 2006-02-08 18:29:40.000000000 +0100
+++ linux-2.6.16-rc2-mm1/arch/x86_64/mm/numa.c 2006-02-08 19:13:28.000000000 +0100
@@ -319,7 +319,7 @@
#ifdef CONFIG_NUMA_EMU
if(!strncmp(opt, "fake=", 5)) {
numa_fake = simple_strtoul(opt+5,NULL,0); ;
- if (numa_fake >= MAX_NUMNODES)
+ if (numa_fake > MAX_NUMNODES)
numa_fake = MAX_NUMNODES;
}
#endif
diff -urN linux-2.6.16-rc2-mm1.orig/drivers/isdn/hisax/config.c linux-2.6.16-rc2-mm1/drivers/isdn/hisax/config.c
--- linux-2.6.16-rc2-mm1.orig/drivers/isdn/hisax/config.c 2006-02-08 18:29:41.000000000 +0100
+++ linux-2.6.16-rc2-mm1/drivers/isdn/hisax/config.c 2006-02-08 19:02:49.000000000 +0100
@@ -630,7 +630,7 @@
len, HISAX_STATUS_BUFSIZE);
}
count = cs->status_end - cs->status_read + 1;
- if (count >= len)
+ if (count > len)
count = len;
copy_to_user(p, cs->status_read, count);
cs->status_read += count;
@@ -716,7 +716,7 @@
}
count = len;
i = cs->status_end - cs->status_write + 1;
- if (i >= len)
+ if (i > len)
i = len;
len -= i;
memcpy(cs->status_write, p, i);
diff -urN linux-2.6.16-rc2-mm1.orig/drivers/pci/proc.c linux-2.6.16-rc2-mm1/drivers/pci/proc.c
--- linux-2.6.16-rc2-mm1.orig/drivers/pci/proc.c 2006-02-03 07:03:08.000000000 +0100
+++ linux-2.6.16-rc2-mm1/drivers/pci/proc.c 2006-02-08 18:32:26.000000000 +0100
@@ -69,7 +69,7 @@
if (pos >= size)
return 0;
- if (nbytes >= size)
+ if (nbytes > size)
nbytes = size;
if (pos + nbytes > size)
nbytes = size - pos;
@@ -139,7 +139,7 @@
if (pos >= size)
return 0;
- if (nbytes >= size)
+ if (nbytes > size)
nbytes = size;
if (pos + nbytes > size)
nbytes = size - pos;
diff -urN linux-2.6.16-rc2-mm1.orig/drivers/pnp/isapnp/proc.c linux-2.6.16-rc2-mm1/drivers/pnp/isapnp/proc.c
--- linux-2.6.16-rc2-mm1.orig/drivers/pnp/isapnp/proc.c 2006-02-03 07:03:08.000000000 +0100
+++ linux-2.6.16-rc2-mm1/drivers/pnp/isapnp/proc.c 2006-02-08 19:08:17.000000000 +0100
@@ -65,7 +65,7 @@
if (pos >= size)
return 0;
- if (nbytes >= size)
+ if (nbytes > size)
nbytes = size;
if (pos + nbytes > size)
nbytes = size - pos;
diff -urN linux-2.6.16-rc2-mm1.orig/drivers/scsi/BusLogic.c linux-2.6.16-rc2-mm1/drivers/scsi/BusLogic.c
--- linux-2.6.16-rc2-mm1.orig/drivers/scsi/BusLogic.c 2006-02-08 18:29:43.000000000 +0100
+++ linux-2.6.16-rc2-mm1/drivers/scsi/BusLogic.c 2006-02-08 19:07:19.000000000 +0100
@@ -3266,7 +3266,7 @@
BusLogic_Error("Message Buffer length %d exceeds size %d\n", HostAdapter, Length, BusLogic_MessageBufferSize);
if ((Length -= Offset) <= 0)
return 0;
- if (Length >= BytesAvailable)
+ if (Length > BytesAvailable)
Length = BytesAvailable;
memcpy(ProcBuffer, HostAdapter->MessageBuffer + Offset, Length);
*StartPointer = ProcBuffer;
diff -urN linux-2.6.16-rc2-mm1.orig/drivers/scsi/osst.c linux-2.6.16-rc2-mm1/drivers/scsi/osst.c
--- linux-2.6.16-rc2-mm1.orig/drivers/scsi/osst.c 2006-02-08 18:29:44.000000000 +0100
+++ linux-2.6.16-rc2-mm1/drivers/scsi/osst.c 2006-02-08 19:05:21.000000000 +0100
@@ -4043,7 +4043,7 @@
for (i=0; i<arg; i++)
ioctl_result |= osst_write_filemark(STp, &SRpnt);
if (fileno >= 0) fileno += arg;
- if (blkno >= 0) blkno = 0;
+ if (blkno > 0) blkno = 0;
goto os_bypass;
case MTWSM:
diff -urN linux-2.6.16-rc2-mm1.orig/drivers/scsi/sg.c linux-2.6.16-rc2-mm1/drivers/scsi/sg.c
--- linux-2.6.16-rc2-mm1.orig/drivers/scsi/sg.c 2006-02-08 18:29:44.000000000 +0100
+++ linux-2.6.16-rc2-mm1/drivers/scsi/sg.c 2006-02-08 19:03:44.000000000 +0100
@@ -826,7 +826,7 @@
return result;
if (val < 0)
return -EIO;
- if (val >= MULDIV (INT_MAX, USER_HZ, HZ))
+ if (val > MULDIV (INT_MAX, USER_HZ, HZ))
val = MULDIV (INT_MAX, USER_HZ, HZ);
sfp->timeout_user = val;
sfp->timeout = MULDIV (val, HZ, USER_HZ);
diff -urN linux-2.6.16-rc2-mm1.orig/drivers/usb/mon/mon_text.c linux-2.6.16-rc2-mm1/drivers/usb/mon/mon_text.c
--- linux-2.6.16-rc2-mm1.orig/drivers/usb/mon/mon_text.c 2006-02-08 18:29:44.000000000 +0100
+++ linux-2.6.16-rc2-mm1/drivers/usb/mon/mon_text.c 2006-02-08 19:01:41.000000000 +0100
@@ -95,7 +95,7 @@
if (len <= 0)
return 'L';
- if (len >= DATA_MAX)
+ if (len > DATA_MAX)
len = DATA_MAX;
if (usb_pipein(pipe)) {
@@ -338,7 +338,7 @@
if ((data_len = ep->length) > 0) {
if (ep->data_flag == 0) {
cnt += snprintf(pbuf + cnt, limit - cnt, " =");
- if (data_len >= DATA_MAX)
+ if (data_len > DATA_MAX)
data_len = DATA_MAX;
for (i = 0; i < data_len; i++) {
if (i % 4 == 0) {
diff -urN linux-2.6.16-rc2-mm1.orig/drivers/video/epson1355fb.c linux-2.6.16-rc2-mm1/drivers/video/epson1355fb.c
--- linux-2.6.16-rc2-mm1.orig/drivers/video/epson1355fb.c 2006-02-08 18:29:44.000000000 +0100
+++ linux-2.6.16-rc2-mm1/drivers/video/epson1355fb.c 2006-02-08 18:35:41.000000000 +0100
@@ -417,7 +417,7 @@
if (p >= info->fix.smem_len)
return 0;
- if (count >= info->fix.smem_len)
+ if (count > info->fix.smem_len)
count = info->fix.smem_len;
if (count + p > info->fix.smem_len)
count = info->fix.smem_len - p;
@@ -451,7 +451,7 @@
/* from fbmem.c except for our own copy_*_user */
if (p > info->fix.smem_len)
return -ENOSPC;
- if (count >= info->fix.smem_len)
+ if (count > info->fix.smem_len)
count = info->fix.smem_len;
err = 0;
if (count + p > info->fix.smem_len) {
diff -urN linux-2.6.16-rc2-mm1.orig/drivers/video/fbmem.c linux-2.6.16-rc2-mm1/drivers/video/fbmem.c
--- linux-2.6.16-rc2-mm1.orig/drivers/video/fbmem.c 2006-02-08 18:29:44.000000000 +0100
+++ linux-2.6.16-rc2-mm1/drivers/video/fbmem.c 2006-02-08 18:33:35.000000000 +0100
@@ -596,7 +596,7 @@
if (p >= total_size)
return 0;
- if (count >= total_size)
+ if (count > total_size)
count = total_size;
if (count + p > total_size)
@@ -671,7 +671,7 @@
if (p > total_size)
return 0;
- if (count >= total_size)
+ if (count > total_size)
count = total_size;
if (count + p > total_size)
diff -urN linux-2.6.16-rc2-mm1.orig/drivers/video/stifb.c linux-2.6.16-rc2-mm1/drivers/video/stifb.c
--- linux-2.6.16-rc2-mm1.orig/drivers/video/stifb.c 2006-02-03 07:03:08.000000000 +0100
+++ linux-2.6.16-rc2-mm1/drivers/video/stifb.c 2006-02-08 18:36:12.000000000 +0100
@@ -924,7 +924,7 @@
if (p >= info->fix.smem_len)
return 0;
- if (count >= info->fix.smem_len)
+ if (count > info->fix.smem_len)
count = info->fix.smem_len;
if (count + p > info->fix.smem_len)
count = info->fix.smem_len - p;
@@ -959,7 +959,7 @@
if (p > info->fix.smem_len)
return -ENOSPC;
- if (count >= info->fix.smem_len)
+ if (count > info->fix.smem_len)
count = info->fix.smem_len;
err = 0;
if (count + p > info->fix.smem_len) {
diff -urN linux-2.6.16-rc2-mm1.orig/drivers/zorro/proc.c linux-2.6.16-rc2-mm1/drivers/zorro/proc.c
--- linux-2.6.16-rc2-mm1.orig/drivers/zorro/proc.c 2006-02-03 07:03:08.000000000 +0100
+++ linux-2.6.16-rc2-mm1/drivers/zorro/proc.c 2006-02-08 19:09:22.000000000 +0100
@@ -55,7 +55,7 @@
if (pos >= sizeof(struct ConfigDev))
return 0;
- if (nbytes >= sizeof(struct ConfigDev))
+ if (nbytes > sizeof(struct ConfigDev))
nbytes = sizeof(struct ConfigDev);
if (pos + nbytes > sizeof(struct ConfigDev))
nbytes = sizeof(struct ConfigDev) - pos;
diff -urN linux-2.6.16-rc2-mm1.orig/fs/nfs/inode.c linux-2.6.16-rc2-mm1/fs/nfs/inode.c
--- linux-2.6.16-rc2-mm1.orig/fs/nfs/inode.c 2006-02-08 18:29:45.000000000 +0100
+++ linux-2.6.16-rc2-mm1/fs/nfs/inode.c 2006-02-08 19:15:52.000000000 +0100
@@ -224,7 +224,7 @@
{
if (bsize < NFS_MIN_FILE_IO_SIZE)
bsize = NFS_DEF_FILE_IO_SIZE;
- else if (bsize >= NFS_MAX_FILE_IO_SIZE)
+ else if (bsize > NFS_MAX_FILE_IO_SIZE)
bsize = NFS_MAX_FILE_IO_SIZE;
return nfs_block_bits(bsize, nrbitsp);
diff -urN linux-2.6.16-rc2-mm1.orig/fs/nfs/nfs3xdr.c linux-2.6.16-rc2-mm1/fs/nfs/nfs3xdr.c
--- linux-2.6.16-rc2-mm1.orig/fs/nfs/nfs3xdr.c 2006-02-08 18:29:45.000000000 +0100
+++ linux-2.6.16-rc2-mm1/fs/nfs/nfs3xdr.c 2006-02-08 19:19:07.000000000 +0100
@@ -148,7 +148,7 @@
int fmode;
type = ntohl(*p++);
- if (type >= NF3BAD)
+ if (type > NF3BAD)
type = NF3BAD;
fmode = nfs_type2fmt[type].mode;
fattr->type = nfs_type2fmt[type].nfs2type;
diff -urN linux-2.6.16-rc2-mm1.orig/fs/sysfs/file.c linux-2.6.16-rc2-mm1/fs/sysfs/file.c
--- linux-2.6.16-rc2-mm1.orig/fs/sysfs/file.c 2006-02-08 18:29:46.000000000 +0100
+++ linux-2.6.16-rc2-mm1/fs/sysfs/file.c 2006-02-08 19:20:41.000000000 +0100
@@ -191,7 +191,7 @@
if (!buffer->page)
return -ENOMEM;
- if (count >= PAGE_SIZE)
+ if (count > PAGE_SIZE)
count = PAGE_SIZE;
error = copy_from_user(buffer->page,buf,count);
buffer->needs_read_fill = 1;
diff -urN linux-2.6.16-rc2-mm1.orig/fs/ufs/balloc.c linux-2.6.16-rc2-mm1/fs/ufs/balloc.c
--- linux-2.6.16-rc2-mm1.orig/fs/ufs/balloc.c 2006-02-03 07:03:08.000000000 +0100
+++ linux-2.6.16-rc2-mm1/fs/ufs/balloc.c 2006-02-08 19:19:53.000000000 +0100
@@ -743,7 +743,7 @@
*/
start = blkno + 1;
end = start + uspi->s_contigsumsize;
- if ( end >= ucpi->c_nclusterblks)
+ if ( end > ucpi->c_nclusterblks)
end = ucpi->c_nclusterblks;
i = ubh_find_next_zero_bit (UCPI_UBH, ucpi->c_clusteroff, end, start);
if (i > end)
diff -urN linux-2.6.16-rc2-mm1.orig/ipc/sem.c linux-2.6.16-rc2-mm1/ipc/sem.c
--- linux-2.6.16-rc2-mm1.orig/ipc/sem.c 2006-02-08 18:29:46.000000000 +0100
+++ linux-2.6.16-rc2-mm1/ipc/sem.c 2006-02-08 19:31:04.000000000 +0100
@@ -1090,7 +1090,7 @@
}
max = 0;
for (sop = sops; sop < sops + nsops; sop++) {
- if (sop->sem_num >= max)
+ if (sop->sem_num > max)
max = sop->sem_num;
if (sop->sem_flg & SEM_UNDO)
undos = 1;
diff -urN linux-2.6.16-rc2-mm1.orig/net/ipv4/ip_output.c linux-2.6.16-rc2-mm1/net/ipv4/ip_output.c
--- linux-2.6.16-rc2-mm1.orig/net/ipv4/ip_output.c 2006-02-03 07:03:08.000000000 +0100
+++ linux-2.6.16-rc2-mm1/net/ipv4/ip_output.c 2006-02-08 19:22:18.000000000 +0100
@@ -985,7 +985,7 @@
unsigned int left;
if (page && (left = PAGE_SIZE - off) > 0) {
- if (copy >= left)
+ if (copy > left)
copy = left;
if (page != frag->page) {
if (i == MAX_SKB_FRAGS) {
diff -urN linux-2.6.16-rc2-mm1.orig/net/ipv6/ip6_output.c linux-2.6.16-rc2-mm1/net/ipv6/ip6_output.c
--- linux-2.6.16-rc2-mm1.orig/net/ipv6/ip6_output.c 2006-02-03 07:03:08.000000000 +0100
+++ linux-2.6.16-rc2-mm1/net/ipv6/ip6_output.c 2006-02-08 19:25:17.000000000 +0100
@@ -1081,7 +1081,7 @@
unsigned int left;
if (page && (left = PAGE_SIZE - off) > 0) {
- if (copy >= left)
+ if (copy > left)
copy = left;
if (page != frag->page) {
if (i == MAX_SKB_FRAGS) {
diff -urN linux-2.6.16-rc2-mm1.orig/net/sunrpc/xprt.c linux-2.6.16-rc2-mm1/net/sunrpc/xprt.c
--- linux-2.6.16-rc2-mm1.orig/net/sunrpc/xprt.c 2006-02-03 07:03:08.000000000 +0100
+++ linux-2.6.16-rc2-mm1/net/sunrpc/xprt.c 2006-02-08 19:23:21.000000000 +0100
@@ -456,7 +456,7 @@
req->rq_timeout <<= 1;
else
req->rq_timeout += to->to_increment;
- if (to->to_maxval && req->rq_timeout >= to->to_maxval)
+ if (to->to_maxval && req->rq_timeout > to->to_maxval)
req->rq_timeout = to->to_maxval;
req->rq_retries++;
pprintk("RPC: %lu retrans\n", jiffies);
diff -urN linux-2.6.16-rc2-mm1.orig/sound/oss/emu10k1/cardwo.c linux-2.6.16-rc2-mm1/sound/oss/emu10k1/cardwo.c
--- linux-2.6.16-rc2-mm1.orig/sound/oss/emu10k1/cardwo.c 2006-02-03 07:03:08.000000000 +0100
+++ linux-2.6.16-rc2-mm1/sound/oss/emu10k1/cardwo.c 2006-02-08 19:26:00.000000000 +0100
@@ -59,7 +59,7 @@
else
woinst->num_voices = wave_fmt->channels;
- if (wave_fmt->samplingrate >= 0x2ee00)
+ if (wave_fmt->samplingrate > 0x2ee00)
wave_fmt->samplingrate = 0x2ee00;
wave_fmt->passthrough = 0;
diff -urN linux-2.6.16-rc2-mm1.orig/sound/oss/sequencer.c linux-2.6.16-rc2-mm1/sound/oss/sequencer.c
--- linux-2.6.16-rc2-mm1.orig/sound/oss/sequencer.c 2006-02-03 07:03:08.000000000 +0100
+++ linux-2.6.16-rc2-mm1/sound/oss/sequencer.c 2006-02-08 19:28:09.000000000 +0100
@@ -1602,7 +1602,7 @@
if (!base_freq)
return base_freq;
- if (range >= 8192)
+ if (range > 8192)
range = 8192;
bend = bend * range / 8192; /* Convert to cents */
diff -urN linux-2.6.16-rc2-mm1.orig/sound/synth/emux/soundfont.c linux-2.6.16-rc2-mm1/sound/synth/emux/soundfont.c
--- linux-2.6.16-rc2-mm1.orig/sound/synth/emux/soundfont.c 2006-02-08 18:29:47.000000000 +0100
+++ linux-2.6.16-rc2-mm1/sound/synth/emux/soundfont.c 2006-02-08 19:29:41.000000000 +0100
@@ -889,7 +889,7 @@
{
int val = (0x7f * 92 - msec) / 92;
if (val < 1) val = 1;
- if (val >= 126) val = 126;
+ if (val > 126) val = 126;
return val;
}
[-- Attachment #2: Type: text/plain, Size: 168 bytes --]
_______________________________________________
Kernel-janitors mailing list
Kernel-janitors@lists.osdl.org
https://lists.osdl.org/mailman/listinfo/kernel-janitors
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [KJ] Correct imprecise >= comparisons
2006-02-08 18:53 [KJ] Correct imprecise >= comparisons Andreas Mohr
@ 2006-02-09 0:06 ` Eric Sesterhenn
2006-02-09 12:20 ` Andreas Mohr
` (8 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Eric Sesterhenn @ 2006-02-09 0:06 UTC (permalink / raw)
To: kernel-janitors
[-- Attachment #1: Type: text/plain, Size: 877 bytes --]
hi,
> in the kernel source tree, there are several places (less important ones
> and high-profile ones such as TCP/IP) which do "incorrect" checks such as:
>
> if (val >= max)
> val = max;
>
> which can instead be expressed as
>
> if (val > max)
> val = max;
>
> in order to cut down on both executed cycles and cache write invalidation.
I am wondering if it is actually worth the effort, since it seems to
me as if this would normally be an exception that val >= max. And
instead of returning with an error or aborting you normalize the value.
Therefore the val=max statement is just executed pretty rarely, when an
user supplies a bad value for example. Even if there would be something
like
foo(int x){
if (x>=2000)
x=2000;
// more code
}
bar (void){
for (int i=0; i<4000; i++)
foo(i);
}
you just win in one of the many cases...
Just my 2 Cents :)
Eric
[-- Attachment #2: Type: text/plain, Size: 168 bytes --]
_______________________________________________
Kernel-janitors mailing list
Kernel-janitors@lists.osdl.org
https://lists.osdl.org/mailman/listinfo/kernel-janitors
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [KJ] Correct imprecise >= comparisons
2006-02-08 18:53 [KJ] Correct imprecise >= comparisons Andreas Mohr
2006-02-09 0:06 ` Eric Sesterhenn
@ 2006-02-09 12:20 ` Andreas Mohr
2006-02-09 12:44 ` Matthew Wilcox
` (7 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Andreas Mohr @ 2006-02-09 12:20 UTC (permalink / raw)
To: kernel-janitors
[-- Attachment #1: Type: text/plain, Size: 916 bytes --]
Hi,
On Thu, Feb 09, 2006 at 01:06:29AM +0100, Eric Sesterhenn wrote:
> hi,
>
> > in the kernel source tree, there are several places (less important ones
> > and high-profile ones such as TCP/IP) which do "incorrect" checks such as:
> >
> > if (val >= max)
> > val = max;
> >
> > which can instead be expressed as
> >
> > if (val > max)
> > val = max;
> >
> > in order to cut down on both executed cycles and cache write invalidation.
>
> I am wondering if it is actually worth the effort, since it seems to
> me as if this would normally be an exception that val >= max. And
> instead of returning with an error or aborting you normalize the value.
It probably is unlikely in several cases, but that makes it all the more a
janitorial effort ;)
It's just "unclean" to set a variable's limit if it didn't even actually
step beyond the limit, which makes this more or less a janitorial task.
Andreas Mohr
[-- Attachment #2: Type: text/plain, Size: 168 bytes --]
_______________________________________________
Kernel-janitors mailing list
Kernel-janitors@lists.osdl.org
https://lists.osdl.org/mailman/listinfo/kernel-janitors
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [KJ] Correct imprecise >= comparisons
2006-02-08 18:53 [KJ] Correct imprecise >= comparisons Andreas Mohr
2006-02-09 0:06 ` Eric Sesterhenn
2006-02-09 12:20 ` Andreas Mohr
@ 2006-02-09 12:44 ` Matthew Wilcox
2006-02-09 12:52 ` Eric Sesterhenn
` (6 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Matthew Wilcox @ 2006-02-09 12:44 UTC (permalink / raw)
To: kernel-janitors
[-- Attachment #1: Type: text/plain, Size: 1017 bytes --]
On Thu, Feb 09, 2006 at 01:20:07PM +0100, Andreas Mohr wrote:
> > > if (val >= max)
> > > val = max;
> > >
> > > which can instead be expressed as
> > >
> > > if (val > max)
> > > val = max;
> > >
> > > in order to cut down on both executed cycles and cache write invalidation.
> >
> > I am wondering if it is actually worth the effort, since it seems to
> > me as if this would normally be an exception that val >= max. And
> > instead of returning with an error or aborting you normalize the value.
>
> It probably is unlikely in several cases, but that makes it all the more a
> janitorial effort ;)
> It's just "unclean" to set a variable's limit if it didn't even actually
> step beyond the limit, which makes this more or less a janitorial task.
I wonder if we want a macro similar to min()/max(). Maybe we could
write it as:
limit(val, max);
which could be defined as:
#define limit(val, max) { \
typeof(val) _val = (val); typeof(max) _max = (max); \
if (_val > _max) _val = _max; }
Thoughts?
[-- Attachment #2: Type: text/plain, Size: 168 bytes --]
_______________________________________________
Kernel-janitors mailing list
Kernel-janitors@lists.osdl.org
https://lists.osdl.org/mailman/listinfo/kernel-janitors
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [KJ] Correct imprecise >= comparisons
2006-02-08 18:53 [KJ] Correct imprecise >= comparisons Andreas Mohr
` (2 preceding siblings ...)
2006-02-09 12:44 ` Matthew Wilcox
@ 2006-02-09 12:52 ` Eric Sesterhenn
2006-02-09 13:45 ` Andreas Mohr
` (5 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Eric Sesterhenn @ 2006-02-09 12:52 UTC (permalink / raw)
To: kernel-janitors
[-- Attachment #1: Type: text/plain, Size: 1888 bytes --]
hi,
> It probably is unlikely in several cases, but that makes it all the more a
> janitorial effort ;)
> It's just "unclean" to set a variable's limit if it didn't even actually
> step beyond the limit, which makes this more or less a janitorial task.
sound more like a madochistic task to me ... :)
Anyways, i wanted to check the cache trashing stuff...
snakebyte@alice:~$ cat test.c
#define rdtscl(low) \
__asm__ __volatile__ ("rdtsc" : "=a" (low) : : "edx")
int main(int argc, char **argv ) {
int a,b,i;
i = 3;
rdtscl(a);
if (i>=3)
i = 3;
rdtscl(b);
printf("i>=3 : %d\n", b-a);
rdtscl(a);
if(i>3)
i = 3;
rdtscl(b);
printf("i>3 : %d\n", b-a);
}
snakebyte@alice:~$ gcc test.c
snakebyte@alice:~$ ./a.out
i>=3 : 11
i>3 : 28
seems the >= actually performs faster on my athlon xp...
on a non xp athlon i get this:
i>=3 : 11
i>3 : 31
since the asm output seems pretty good, i it seems the jump wastes
more cycles :(
if (i>=3) case:
0x080483b7 <main+35>: rdtsc
0x080483b9 <main+37>: mov %eax,0xfffffffc(%ebp)
0x080483bc <main+40>: cmpl $0x2,0xfffffff4(%ebp)
0x080483c0 <main+44>: jle 0x80483c9 <main+53>
0x080483c2 <main+46>: movl $0x3,0xfffffff4(%ebp)
0x080483c9 <main+53>: rdtsc
if (i>3) case:
0x080483e6 <main+82>: rdtsc
0x080483e8 <main+84>: mov %eax,0xfffffffc(%ebp)
0x080483eb <main+87>: cmpl $0x3,0xfffffff4(%ebp)
0x080483ef <main+91>: jle 0x80483f8 <main+100>
0x080483f1 <main+93>: movl $0x3,0xfffffff4(%ebp)
0x080483f8 <main+100>: rdtsc
good i also checked an intel box :) my centrino gives the following:
i>=3 : 66
i>3 : 47
and a friends amd64
i >= 3 : 8
i>3 : 22
here the i>3 is faster... seems we need a macro for this stuff, which we
can change for each processor type ;-)
Greetings, Eric
[-- Attachment #2: Type: text/plain, Size: 168 bytes --]
_______________________________________________
Kernel-janitors mailing list
Kernel-janitors@lists.osdl.org
https://lists.osdl.org/mailman/listinfo/kernel-janitors
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [KJ] Correct imprecise >= comparisons
2006-02-08 18:53 [KJ] Correct imprecise >= comparisons Andreas Mohr
` (3 preceding siblings ...)
2006-02-09 12:52 ` Eric Sesterhenn
@ 2006-02-09 13:45 ` Andreas Mohr
2006-02-09 14:06 ` Eric Sesterhenn
` (4 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Andreas Mohr @ 2006-02-09 13:45 UTC (permalink / raw)
To: kernel-janitors
[-- Attachment #1: Type: text/plain, Size: 1882 bytes --]
Hi,
On Thu, Feb 09, 2006 at 01:52:01PM +0100, Eric Sesterhenn wrote:
> hi,
>
> > It probably is unlikely in several cases, but that makes it all the more a
> > janitorial effort ;)
> > It's just "unclean" to set a variable's limit if it didn't even actually
> > step beyond the limit, which makes this more or less a janitorial task.
>
> sound more like a madochistic task to me ... :)
> Anyways, i wanted to check the cache trashing stuff...
>
>
> snakebyte@alice:~$ cat test.c
> #define rdtscl(low) \
> __asm__ __volatile__ ("rdtsc" : "=a" (low) : : "edx")
> int main(int argc, char **argv ) {
> int a,b,i;
> i = 3;
>
> rdtscl(a);
> if (i>=3)
> i = 3;
>
> rdtscl(b);
> printf("i>=3 : %d\n", b-a);
>
> rdtscl(a);
> if(i>3)
> i = 3;
> rdtscl(b);
>
> printf("i>3 : %d\n", b-a);
>
> }
Hmm, 11/28 ops, 11/31, 66/47, 8/22.
Sure sounds like one hell of a fluctuation. Would this code wait for
the memory write to hit the hardware? In that case the fluctuation
would be justified. Oh, and this means that the two operations are
inter-dependent! (the second most likely needs to wait for the first one
to get read back!). Better use "j" for a second test...
Plus, are you sure that there was no compiler optimization going on?
The
i = 3;
if (i >= 3)
i = 3;
case is very dangerous, I'd say. Better to fetch the value 3 from calling
another function that returns 3. Otherwise the compiler might be very tempted
to optimize the whole thing away.
> here the i>3 is faster... seems we need a macro for this stuff, which we
> can change for each processor type ;-)
Problem is that then one could easily object that this should have been
properly decided by the compiler itself already...
(and that objection would be very valid, I think)
Andreas Mohr
[-- Attachment #2: Type: text/plain, Size: 168 bytes --]
_______________________________________________
Kernel-janitors mailing list
Kernel-janitors@lists.osdl.org
https://lists.osdl.org/mailman/listinfo/kernel-janitors
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [KJ] Correct imprecise >= comparisons
2006-02-08 18:53 [KJ] Correct imprecise >= comparisons Andreas Mohr
` (4 preceding siblings ...)
2006-02-09 13:45 ` Andreas Mohr
@ 2006-02-09 14:06 ` Eric Sesterhenn
2006-02-09 15:35 ` Håkon Løvdal
` (3 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Eric Sesterhenn @ 2006-02-09 14:06 UTC (permalink / raw)
To: kernel-janitors
[-- Attachment #1: Type: text/plain, Size: 2401 bytes --]
hi,
> Hmm, 11/28 ops, 11/31, 66/47, 8/22.
> Sure sounds like one hell of a fluctuation.
These were on different cpus, so this is expected. When i run it several
times on one cpu it stays constant
> Would this code wait for
> the memory write to hit the hardware?
i guess writing an int into the memory is pretty atomic
> Plus, are you sure that there was no compiler optimization going on?
> The
> i = 3;
> if (i >= 3)
> i = 3;
there was compiler optimization ( kind of, but it doesnt seem to
matter ) The if (i>=3) case gets optimized to if(i>2)
0x080483bc <main+40>: cmpl $0x2,0xfffffff4(%ebp)
> case is very dangerous, I'd say. Better to fetch the value 3 from calling
> another function that returns 3. Otherwise the compiler might be very tempted
> to optimize the whole thing away.
I changed it a bit, so gcc -O3 doesnt compile anything away, because i
wanted to check if it can optimize this case. using gcc version 3.4.5
the assembler output stays unoptimized
#define rdtscl(low) \
__asm__ __volatile__ ("rdtsc" : "=a" (low) : : "edx")
int main(int argc, char **argv ) {
int a,b,i;
i = atoi(argv[1]);
rdtscl(a);
if (i>=3)
i = 3;
rdtscl(b);
printf("i>=3 : %d %d\n", b-a, i);
i = atoi(argv[1]);
rdtscl(a);
if(i>3)
i = 3;
rdtscl(b);
printf("i>3 : %d %d\n", b-a, i);
i = atoi(argv[1]);
rdtscl(a);
if (i>=3)
i = 3;
rdtscl(b);
printf("i>=3 : %d %d\n", b-a, i);
}
It is still
0x080483fe <main+30>: rdtsc
0x08048400 <main+32>: cmp $0x2,%ecx
0x08048403 <main+35>: mov %eax,%ebx
0x08048405 <main+37>: jle 0x804840c <main+44>
0x08048407 <main+39>: mov $0x3,%ecx
0x0804840c <main+44>: rdtsc
vs
0x08048431 <main+81>: rdtsc
0x08048433 <main+83>: cmp $0x3,%ecx
0x08048436 <main+86>: mov %eax,%ebx
0x08048438 <main+88>: jle 0x804843f <main+95>
0x0804843a <main+90>: mov $0x3,%ecx
0x0804843f <main+95>: rdtsc
> > here the i>3 is faster... seems we need a macro for this stuff, which we
> > can change for each processor type ;-)
>
> Problem is that then one could easily object that this should have been
> properly decided by the compiler itself already...
> (and that objection would be very valid, I think)
sorry i forgot the <irony> tags :)
I created a bugzilla entry on gcc for this:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26196
Greetings, Eric
[-- Attachment #2: Type: text/plain, Size: 168 bytes --]
_______________________________________________
Kernel-janitors mailing list
Kernel-janitors@lists.osdl.org
https://lists.osdl.org/mailman/listinfo/kernel-janitors
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [KJ] Correct imprecise >= comparisons
2006-02-08 18:53 [KJ] Correct imprecise >= comparisons Andreas Mohr
` (5 preceding siblings ...)
2006-02-09 14:06 ` Eric Sesterhenn
@ 2006-02-09 15:35 ` Håkon Løvdal
2006-02-09 22:16 ` Darren Jenkins
` (2 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Håkon Løvdal @ 2006-02-09 15:35 UTC (permalink / raw)
To: kernel-janitors
On 2/9/06, Matthew Wilcox <matthew@wil.cx> wrote:
> I wonder if we want a macro similar to min()/max(). Maybe we could
> write it as:
>
> limit(val, max);
>
> which could be defined as:
>
> #define limit(val, max) { \
> typeof(val) _val = (val); typeof(max) _max = (max); \
> if (_val > _max) _val = _max; }
>
> Thoughts?
I also though of making a macro based on the min/max macros, but I did
not come up with a name other than set_max and I did not think that was
good enough. Using "limit" instead of "set" on the other hand I think
will be good.
In order to just use existing macros and not introducing any new a
test like
...
if (var > max)
var = max;
...
could be written as
...
var = min(var, max);
...
however this approach would have two drawbacks,
1: it always executes an assignment which is negative for performance
2: it is a bit mentally confusing because you are using the /min/
macro to set a /max/ value.
so this is not an option.
I support creating two "limit_max(val, max)" and "limit_min(val, min)"
macros with bodies like above.
BR Håkon Løvdal
_______________________________________________
Kernel-janitors mailing list
Kernel-janitors@lists.osdl.org
https://lists.osdl.org/mailman/listinfo/kernel-janitors
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [KJ] Correct imprecise >= comparisons
2006-02-08 18:53 [KJ] Correct imprecise >= comparisons Andreas Mohr
` (6 preceding siblings ...)
2006-02-09 15:35 ` Håkon Løvdal
@ 2006-02-09 22:16 ` Darren Jenkins
2006-02-10 0:30 ` Adrian Bunk
2006-02-10 1:05 ` Darren Jenkins
9 siblings, 0 replies; 11+ messages in thread
From: Darren Jenkins @ 2006-02-09 22:16 UTC (permalink / raw)
To: kernel-janitors
On 2/9/06, Eric Sesterhenn <snakebyte@gmx.de> wrote:
> snakebyte@alice:~$ gcc test.c
> snakebyte@alice:~$ ./a.out
> i>=3 : 11
> i>3 : 28
>
> seems the >= actually performs faster on my athlon xp...
>
> on a non xp athlon i get this:
> i>=3 : 11
> i>3 : 31
>
> good i also checked an intel box :) my centrino gives the following:
>
> i>=3 : 66
> i>3 : 47
>
> and a friends amd64
>
> i >= 3 : 8
> i>3 : 22
>
> here the i>3 is faster... seems we need a macro for this stuff, which we
> can change for each processor type ;-)
>
> Greetings, Eric
Interesting
I guess on alot of arch's doing
if (value >= max+1)
value = max;
might be the more efficient.
hmm will have to investigate further.
Darren J.
_______________________________________________
Kernel-janitors mailing list
Kernel-janitors@lists.osdl.org
https://lists.osdl.org/mailman/listinfo/kernel-janitors
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [KJ] Correct imprecise >= comparisons
2006-02-08 18:53 [KJ] Correct imprecise >= comparisons Andreas Mohr
` (7 preceding siblings ...)
2006-02-09 22:16 ` Darren Jenkins
@ 2006-02-10 0:30 ` Adrian Bunk
2006-02-10 1:05 ` Darren Jenkins
9 siblings, 0 replies; 11+ messages in thread
From: Adrian Bunk @ 2006-02-10 0:30 UTC (permalink / raw)
To: kernel-janitors
[-- Attachment #1: Type: text/plain, Size: 1526 bytes --]
On Fri, Feb 10, 2006 at 09:16:36AM +1100, Darren Jenkins wrote:
> On 2/9/06, Eric Sesterhenn <snakebyte@gmx.de> wrote:
>
> > snakebyte@alice:~$ gcc test.c
> > snakebyte@alice:~$ ./a.out
> > i>=3 : 11
> > i>3 : 28
> >
> > seems the >= actually performs faster on my athlon xp...
> >
> > on a non xp athlon i get this:
> > i>=3 : 11
> > i>3 : 31
> >
> > good i also checked an intel box :) my centrino gives the following:
> >
> > i>=3 : 66
> > i>3 : 47
> >
> > and a friends amd64
> >
> > i >= 3 : 8
> > i>3 : 22
> >
> > here the i>3 is faster... seems we need a macro for this stuff, which we
> > can change for each processor type ;-)
> >
> > Greetings, Eric
>
> Interesting
> I guess on alot of arch's doing
>
> if (value >= max+1)
> value = max;
>
> might be the more efficient.
>
> hmm will have to investigate further.
What is the purpose of all this?
With modern compilers, the general rule is to write readable C code and
let the compiler do the rest.
Exceptions could be in real hot paths, but even in such cases there's
the problem that tricky code performing better with today's compilers
might perform worse with future compilers (similar to the goto tricks
people used in the past to circumvent some compiler limitations).
> Darren J.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
[-- Attachment #2: Type: text/plain, Size: 168 bytes --]
_______________________________________________
Kernel-janitors mailing list
Kernel-janitors@lists.osdl.org
https://lists.osdl.org/mailman/listinfo/kernel-janitors
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [KJ] Correct imprecise >= comparisons
2006-02-08 18:53 [KJ] Correct imprecise >= comparisons Andreas Mohr
` (8 preceding siblings ...)
2006-02-10 0:30 ` Adrian Bunk
@ 2006-02-10 1:05 ` Darren Jenkins
9 siblings, 0 replies; 11+ messages in thread
From: Darren Jenkins @ 2006-02-10 1:05 UTC (permalink / raw)
To: kernel-janitors
On 2/10/06, Adrian Bunk <bunk@stusta.de> wrote:
> What is the purpose of all this?
>
Good question really,
Me I work with very small Uc's so I habitually optimize everything. I
guess It's not such a good idea when working with the linux kernel.
I think however understanding what is going on is fairly important,
especially for the less expert/experienced people on the list. (Like
me). So while a conversation like this will probably not benefit the
kernel, it may benefit the people involved.
> With modern compilers, the general rule is to write readable C code and
> let the compiler do the rest.
>
> Exceptions could be in real hot paths, but even in such cases there's
> the problem that tricky code performing better with today's compilers
> might perform worse with future compilers (similar to the goto tricks
> people used in the past to circumvent some compiler limitations).
So we should concentrate our efforts on readability/correctness not
optimisations
Sounds good to me.
Darren J.
_______________________________________________
Kernel-janitors mailing list
Kernel-janitors@lists.osdl.org
https://lists.osdl.org/mailman/listinfo/kernel-janitors
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2006-02-10 1:05 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-02-08 18:53 [KJ] Correct imprecise >= comparisons Andreas Mohr
2006-02-09 0:06 ` Eric Sesterhenn
2006-02-09 12:20 ` Andreas Mohr
2006-02-09 12:44 ` Matthew Wilcox
2006-02-09 12:52 ` Eric Sesterhenn
2006-02-09 13:45 ` Andreas Mohr
2006-02-09 14:06 ` Eric Sesterhenn
2006-02-09 15:35 ` Håkon Løvdal
2006-02-09 22:16 ` Darren Jenkins
2006-02-10 0:30 ` Adrian Bunk
2006-02-10 1:05 ` Darren Jenkins
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.