public inbox for linux-kernel@vger.kernel.org
* OOPS: 2.6.16-rc6 cpufreq_conservative
@ 2006-03-18 20:25 Parag Warudkar
  2006-03-18 21:40 ` Linus Torvalds
  0 siblings, 1 reply; 31+ messages in thread
From: Parag Warudkar @ 2006-03-18 20:25 UTC (permalink / raw)
  To: linux-kernel, alex-kernel, jun.nakajima, Linus Torvalds

I can reproduce the OOPS below by doing:
$ modprobe cpufreq_conservative
$ echo conservative > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
$ echo conservative > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor

Which brings up a question - do we really support different scaling governors 
for different cpu cores?

This is a Centrino Duo laptop.

Unable to handle kernel NULL pointer dereference at virtual address 0000001c
 printing eip:
f834e6d0
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in: cpufreq_conservative oprofile ntfs snd_pcm_oss 
snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device eth1394 
ohci1394 i2c_i801 i2c_core hw_random ipw3945 ieee80211 ieee80211_crypt 
firmware_class snd_hda_intel snd_hda_codec snd_pcm snd_timer snd 
snd_page_alloc i915 drm cpufreq_ondemand speedstep_centrino b44
CPU:    1
EIP:    0060:[<f834e6d0>]    Not tainted VLI
EFLAGS: 00010206   (2.6.16-rc6 #3)
EIP is at dbs_check_cpu+0x220/0x400 [cpufreq_conservative]
eax: 00000000   ebx: 00000001   ecx: 00000001   edx: 7840d1bc
esi: 78445730   edi: 00000002   ebp: 78445730   esp: 7a1c1ef0
ds: 007b   es: 007b   ss: 0068
Process events/1 (pid: 9, threadinfo=7a1c0000 task=7a33fa90)
Stack: <0>7840d1bc 00000002 00000001 7a2d3a80 00000000 f834f704 7a1aa1c0 00000000
       f834e938 00000000 00000202 7a1c0000 f834f700 78132869 00000000 00000000
       18a60e00 7a1aa1d0 7a1aa1e8 f834e8b0 00000202 7a1c0000 7a1aa1d8 7a1aa1d0
Call Trace:
 [<f834e938>] do_dbs_timer+0x88/0xc0 [cpufreq_conservative]
 [<78132869>] run_workqueue+0x79/0xf0
 [<f834e8b0>] do_dbs_timer+0x0/0xc0 [cpufreq_conservative]
 [<78132a38>] worker_thread+0x158/0x180
 [<7811b0d0>] default_wake_function+0x0/0x20
 [<7811b0d0>] default_wake_function+0x0/0x20
 [<781328e0>] worker_thread+0x0/0x180
 [<7813664c>] kthread+0xbc/0x100
 [<78136590>] kthread+0x0/0x100
 [<78101285>] kernel_thread_helper+0x5/0x10
Code: 0f bc c0 83 f8 03 bb 02 00 00 00 0f 4c d8 83 fb 01 77 49 89 ee 8d b6 00 
00 00 00 8b 04 9d 04 80 44 78 bf 02 00 00 00 01 f0 8b 00 <8b> 40 1c 89 7c 24 
04 c7 04 24 bc d1 40 78 89 04 9d 48 fa 34 f8

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-18 20:25 OOPS: 2.6.16-rc6 cpufreq_conservative Parag Warudkar
@ 2006-03-18 21:40 ` Linus Torvalds
  2006-03-18 22:09   ` Parag Warudkar
                     ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Linus Torvalds @ 2006-03-18 21:40 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Linux Kernel Mailing List, alex-kernel, jun.nakajima, Dave Jones


Gaah. Looking at the assembly code to try to figure out where the oops 
happened, one thing is clear: "for_each_cpu_mask()" generates some truly 
horrible code. C code that looks simple ends up being tons of complex 
assembly language due to inline functions etc.

We've made it way too easy for people to write code that just absolutely 
_sucks_.

It's extra sad, because disassembling Parag's oops, it looks like he has 
limited NR_CPUS to 2 (appropriate for his setup), yet we literally do 
things like:

	<code+0>:    bsf    %eax,%eax
	<code+3>:    cmp    $0x3,%eax
	<code+6>:    mov    $0x2,%ebx
	<code+11>:   cmovl  %eax,%ebx
	<code+14>:   cmp    $0x1,%ebx
	<code+17>:   ja     0x80495fc
	<code+19>:   mov    %ebp,%esi
	<code+21>:   lea    0x0(%esi),%esi
	<code+27>:   mov    0x78448004(,%ebx,4),%eax
	<code+34>:   mov    $0x2,%edi
	<code+39>:   add    %esi,%eax
**	<code+41>:   mov    (%eax),%eax		** oops here **
	<code+43>:   mov    0x1c(%eax),%eax
	<code+46>:   mov    %edi,0x4(%esp)
	<code+50>:   movl   $0x7840d1bc,(%esp)
	<code+57>:   mov    %eax,0x???(,%ebx,4)

where 90% of that code seems to be just cruft from __next_cpu(): the 
"bsf", the comparisons against 3 (n+1), the $0x2 setup for calling 
find_next_bit()..

Anyway, I _think_ it's this one:

                for_each_online_cpu(j) {
                        dbs_info = &per_cpu(cpu_dbs_info, j);
                        requested_freq[j] = dbs_info->cur_policy->cur;
                }

where dbs_info->cur_policy seems to be NULL. Maybe. Exactly because 90% of 
the instructions come from that generic stuff around it, it's hard to be 
sure, and I don't have the same compiler version Parag does.

		Linus

On Sat, 18 Mar 2006, Parag Warudkar wrote:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 0000001c
>  printing eip:
> f834e6d0
> *pde = 00000000
> Oops: 0000 [#1]
> PREEMPT SMP
> Modules linked in: cpufreq_conservative oprofile ntfs snd_pcm_oss 
> snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device eth1394 
> ohci1394 i2c_i801 i2c_core hw_random ipw3945 ieee80211 ieee80211_crypt 
> firmware_class snd_hda_intel snd_hda_codec snd_pcm snd_timer snd 
> snd_page_alloc i915 drm cpufreq_ondemand speedstep_centrino b44
> CPU:    1
> EIP:    0060:[<f834e6d0>]    Not tainted VLI
> EFLAGS: 00010206   (2.6.16-rc6 #3)
> EIP is at dbs_check_cpu+0x220/0x400 [cpufreq_conservative]
> eax: 00000000   ebx: 00000001   ecx: 00000001   edx: 7840d1bc
> esi: 78445730   edi: 00000002   ebp: 78445730   esp: 7a1c1ef0
> ds: 007b   es: 007b   ss: 0068
> Process events/1 (pid: 9, threadinfo=7a1c0000 task=7a33fa90)
> Stack: <0>7840d1bc 00000002 00000001 7a2d3a80 00000000 f834f704 7a1aa1c0 00000000
>        f834e938 00000000 00000202 7a1c0000 f834f700 78132869 00000000 00000000
>        18a60e00 7a1aa1d0 7a1aa1e8 f834e8b0 00000202 7a1c0000 7a1aa1d8 7a1aa1d0
> Call Trace:
>  [<f834e938>] do_dbs_timer+0x88/0xc0 [cpufreq_conservative]
>  [<78132869>] run_workqueue+0x79/0xf0
>  [<f834e8b0>] do_dbs_timer+0x0/0xc0 [cpufreq_conservative]
>  [<78132a38>] worker_thread+0x158/0x180
>  [<7811b0d0>] default_wake_function+0x0/0x20
>  [<7811b0d0>] default_wake_function+0x0/0x20
>  [<781328e0>] worker_thread+0x0/0x180
>  [<7813664c>] kthread+0xbc/0x100
>  [<78136590>] kthread+0x0/0x100
>  [<78101285>] kernel_thread_helper+0x5/0x10
> Code: 0f bc c0 83 f8 03 bb 02 00 00 00 0f 4c d8 83 fb 01 77 49 89 ee 8d b6 00 
> 00 00 00 8b 04 9d 04 80 44 78 bf 02 00 00 00 01 f0 8b 00 <8b> 40 1c 89 7c 24 
> 04 c7 04 24 bc d1 40 78 89 04 9d 48 fa 34 f8
> 


* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-18 21:40 ` Linus Torvalds
@ 2006-03-18 22:09   ` Parag Warudkar
  2006-03-18 23:12     ` Linus Torvalds
  2006-03-18 22:26   ` Parag Warudkar
  2006-03-19  0:53   ` Andrew Morton
  2 siblings, 1 reply; 31+ messages in thread
From: Parag Warudkar @ 2006-03-18 22:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, alex-kernel, jun.nakajima, Dave Jones

On Saturday 18 March 2006 16:40, Linus Torvalds wrote:
> Anyway, I _think_ it's this one:
>
>                 for_each_online_cpu(j) {
>                         dbs_info = &per_cpu(cpu_dbs_info, j);
>                         requested_freq[j] = dbs_info->cur_policy->cur;
>                 }
Here is the disassembly of cpufreq_conservative.o:dbs_check_cpu() from my 
setup, if it is of any help.

000004b0 <dbs_check_cpu>:
 4b0:   55                      push   %ebp
 4b1:   bd 00 00 00 00          mov    $0x0,%ebp
 4b6:   89 e8                   mov    %ebp,%eax
 4b8:   57                      push   %edi
 4b9:   56                      push   %esi
 4ba:   53                      push   %ebx
 4bb:   83 ec 10                sub    $0x10,%esp
 4be:   8b 54 24 24             mov    0x24(%esp),%edx
 4c2:   8b 0c 95 00 00 00 00    mov    0x0(,%edx,4),%ecx
 4c9:   01 c8                   add    %ecx,%eax
 4cb:   8b 50 0c                mov    0xc(%eax),%edx
 4ce:   85 d2                   test   %edx,%edx
 4d0:   0f 84 ba 01 00 00       je     690 <dbs_check_cpu+0x1e0>
 4d6:   66 83 3d 10 00 00 00    cmpw   $0x0,0x10
 4dd:   00
 4de:   8b 00                   mov    (%eax),%eax
 4e0:   89 44 24 0c             mov    %eax,0xc(%esp)
 4e4:   0f 84 ae 01 00 00       je     698 <dbs_check_cpu+0x1e8>
 4ea:   8b 4c 24 0c             mov    0xc(%esp),%ecx
 4ee:   bf ff ff ff ff          mov    $0xffffffff,%edi
 4f3:   8b 01                   mov    (%ecx),%eax
 4f5:   85 c0                   test   %eax,%eax
 4f7:   0f 85 23 02 00 00       jne    720 <dbs_check_cpu+0x270>
 4fd:   b8 20 00 00 00          mov    $0x20,%eax
 502:   eb 52                   jmp    556 <dbs_check_cpu+0xa6>
 504:   8b 14 9d 00 00 00 00    mov    0x0(,%ebx,4),%edx
 50b:   89 e8                   mov    %ebp,%eax
 50d:   8b 0d 60 00 00 00       mov    0x60,%ecx
 513:   8d 34 02                lea    (%edx,%eax,1),%esi
 516:   b8 00 00 00 00          mov    $0x0,%eax
 51b:   8d 04 02                lea    (%edx,%eax,1),%eax
 51e:   8b 50 30                mov    0x30(%eax),%edx
 521:   03 50 28                add    0x28(%eax),%edx
 524:   85 c9                   test   %ecx,%ecx
 526:   74 03                   je     52b <dbs_check_cpu+0x7b>
 528:   03 50 08                add    0x8(%eax),%edx
 52b:   8b 4e 04                mov    0x4(%esi),%ecx
 52e:   89 d0                   mov    %edx,%eax
 530:   89 56 04                mov    %edx,0x4(%esi)
 533:   29 c8                   sub    %ecx,%eax
 535:   39 f8                   cmp    %edi,%eax
 537:   0f 42 f8                cmovb  %eax,%edi
 53a:   8d 43 01                lea    0x1(%ebx),%eax
 53d:   89 44 24 08             mov    %eax,0x8(%esp)
 541:   b8 02 00 00 00          mov    $0x2,%eax
 546:   89 44 24 04             mov    %eax,0x4(%esp)
 54a:   8b 44 24 0c             mov    0xc(%esp),%eax
 54e:   89 04 24                mov    %eax,(%esp)
 551:   e8 fc ff ff ff          call   552 <dbs_check_cpu+0xa2>
 556:   83 f8 03                cmp    $0x3,%eax
 559:   bb 02 00 00 00          mov    $0x2,%ebx
 55e:   0f 4c d8                cmovl  %eax,%ebx
 561:   83 fb 01                cmp    $0x1,%ebx
 564:   76 9e                   jbe    504 <dbs_check_cpu+0x54>
 566:   8d 04 bf                lea    (%edi,%edi,4),%eax
 569:   8b 35 50 00 00 00       mov    0x50,%esi
 56f:   b9 64 00 00 00          mov    $0x64,%ecx
 574:   8d 04 80                lea    (%eax,%eax,4),%eax
 577:   ba fe ff ff 7f          mov    $0x7ffffffe,%edx
 57c:   8d 1c 85 00 00 00 00    lea    0x0(,%eax,4),%ebx
 583:   a1 58 00 00 00          mov    0x58,%eax
 588:   29 c1                   sub    %eax,%ecx
 58a:   81 fe c0 e0 ff ff       cmp    $0xffffe0c0,%esi
 590:   77 10                   ja     5a2 <dbs_check_cpu+0xf2>
 592:   8d 96 9f 0f 00 00       lea    0xf9f(%esi),%edx
 598:   b8 d3 4d 62 10          mov    $0x10624dd3,%eax
 59d:   f7 e2                   mul    %edx
 59f:   c1 ea 08                shr    $0x8,%edx
 5a2:   0f af ca                imul   %edx,%ecx
 5a5:   39 cb                   cmp    %ecx,%ebx
 5a7:   0f 83 7b 01 00 00       jae    728 <dbs_check_cpu+0x278>
 5ad:   8b 54 24 24             mov    0x24(%esp),%edx
 5b1:   31 c0                   xor    %eax,%eax
 5b3:   8b 4c 24 0c             mov    0xc(%esp),%ecx
 5b7:   89 04 95 00 00 00 00    mov    %eax,0x0(,%edx,4)
 5be:   8b 01                   mov    (%ecx),%eax
 5c0:   85 c0                   test   %eax,%eax
 5c2:   0f 84 ca 02 00 00       je     892 <dbs_check_cpu+0x3e2>
 5c8:   0f bc c0                bsf    %eax,%eax
 5cb:   83 f8 03                cmp    $0x3,%eax
 5ce:   bb 02 00 00 00          mov    $0x2,%ebx
 5d3:   0f 4c d8                cmovl  %eax,%ebx
 5d6:   83 fb 01                cmp    $0x1,%ebx
 5d9:   77 40                   ja     61b <dbs_check_cpu+0x16b>
 5db:   89 ee                   mov    %ebp,%esi
 5dd:   8d 76 00                lea    0x0(%esi),%esi
 5e0:   8b 14 9d 00 00 00 00    mov    0x0(,%ebx,4),%edx
 5e7:   01 f2                   add    %esi,%edx
 5e9:   8b 42 04                mov    0x4(%edx),%eax
 5ec:   89 42 08                mov    %eax,0x8(%edx)
 5ef:   8d 43 01                lea    0x1(%ebx),%eax
 5f2:   bb 02 00 00 00          mov    $0x2,%ebx
 5f7:   89 44 24 08             mov    %eax,0x8(%esp)
 5fb:   b8 02 00 00 00          mov    $0x2,%eax
 600:   89 44 24 04             mov    %eax,0x4(%esp)
 604:   8b 44 24 0c             mov    0xc(%esp),%eax
 608:   89 04 24                mov    %eax,(%esp)
 60b:   e8 fc ff ff ff          call   60c <dbs_check_cpu+0x15c>
 610:   83 f8 03                cmp    $0x3,%eax
 613:   0f 4c d8                cmovl  %eax,%ebx
 616:   83 fb 01                cmp    $0x1,%ebx
 619:   76 c5                   jbe    5e0 <dbs_check_cpu+0x130>
 61b:   8b 54 24 24             mov    0x24(%esp),%edx
 61f:   8b 0c 95 08 00 00 00    mov    0x8(,%edx,4),%ecx
 626:   8b 54 24 0c             mov    0xc(%esp),%edx
 62a:   8b 42 18                mov    0x18(%edx),%eax
 62d:   39 c1                   cmp    %eax,%ecx
 62f:   74 5f                   je     690 <dbs_check_cpu+0x1e0>
 631:   8b 15 64 00 00 00       mov    0x64,%edx
 637:   0f af d0                imul   %eax,%edx
 63a:   b8 1f 85 eb 51          mov    $0x51eb851f,%eax
 63f:   f7 e2                   mul    %edx
 641:   b8 05 00 00 00          mov    $0x5,%eax
 646:   c1 ea 05                shr    $0x5,%edx
 649:   0f 44 d0                cmove  %eax,%edx
 64c:   8d 04 11                lea    (%ecx,%edx,1),%eax
 64f:   8b 4c 24 24             mov    0x24(%esp),%ecx
 653:   89 04 8d 08 00 00 00    mov    %eax,0x8(,%ecx,4)
 65a:   8b 4c 24 0c             mov    0xc(%esp),%ecx
 65e:   8b 51 18                mov    0x18(%ecx),%edx
 661:   39 d0                   cmp    %edx,%eax
 663:   76 0d                   jbe    672 <dbs_check_cpu+0x1c2>
 665:   89 d0                   mov    %edx,%eax
 667:   8b 54 24 24             mov    0x24(%esp),%edx
 66b:   89 04 95 08 00 00 00    mov    %eax,0x8(,%edx,4)
 672:   89 44 24 04             mov    %eax,0x4(%esp)
 676:   8b 4c 24 0c             mov    0xc(%esp),%ecx
 67a:   ba 01 00 00 00          mov    $0x1,%edx
 67f:   89 54 24 08             mov    %edx,0x8(%esp)
 683:   89 0c 24                mov    %ecx,(%esp)
 686:   e8 fc ff ff ff          call   687 <dbs_check_cpu+0x1d7>
 68b:   90                      nop
 68c:   8d 74 26 00             lea    0x0(%esi),%esi
 690:   83 c4 10                add    $0x10,%esp
 693:   5b                      pop    %ebx
 694:   5e                      pop    %esi
 695:   5f                      pop    %edi
 696:   5d                      pop    %ebp
 697:   c3                      ret
 698:   a1 00 00 00 00          mov    0x0,%eax
 69d:   85 c0                   test   %eax,%eax
 69f:   0f 84 e3 01 00 00       je     888 <dbs_check_cpu+0x3d8>
 6a5:   0f bc c0                bsf    %eax,%eax
 6a8:   83 f8 03                cmp    $0x3,%eax
 6ab:   bb 02 00 00 00          mov    $0x2,%ebx
 6b0:   0f 4c d8                cmovl  %eax,%ebx
 6b3:   83 fb 01                cmp    $0x1,%ebx
 6b6:   77 49                   ja     701 <dbs_check_cpu+0x251>
 6b8:   89 ee                   mov    %ebp,%esi
 6ba:   8d b6 00 00 00 00       lea    0x0(%esi),%esi
 6c0:   8b 04 9d 00 00 00 00    mov    0x0(,%ebx,4),%eax
 6c7:   bf 02 00 00 00          mov    $0x2,%edi
 6cc:   01 f0                   add    %esi,%eax
 6ce:   8b 00                   mov    (%eax),%eax
 6d0:   8b 40 1c                mov    0x1c(%eax),%eax 
 6d3:   89 7c 24 04             mov    %edi,0x4(%esp)
 6d7:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
 6de:   89 04 9d 08 00 00 00    mov    %eax,0x8(,%ebx,4)
 6e5:   8d 43 01                lea    0x1(%ebx),%eax
 6e8:   bb 02 00 00 00          mov    $0x2,%ebx
 6ed:   89 44 24 08             mov    %eax,0x8(%esp)
 6f1:   e8 fc ff ff ff          call   6f2 <dbs_check_cpu+0x242>
 6f6:   83 f8 03                cmp    $0x3,%eax
 6f9:   0f 4c d8                cmovl  %eax,%ebx
 6fc:   83 fb 01                cmp    $0x1,%ebx
 6ff:   76 bf                   jbe    6c0 <dbs_check_cpu+0x210>
 701:   8b 4c 24 0c             mov    0xc(%esp),%ecx
 705:   bb 01 00 00 00          mov    $0x1,%ebx
 70a:   bf ff ff ff ff          mov    $0xffffffff,%edi
 70f:   66 89 1d 10 00 00 00    mov    %bx,0x10
 716:   8b 01                   mov    (%ecx),%eax
 718:   85 c0                   test   %eax,%eax
 71a:   0f 84 dd fd ff ff       je     4fd <dbs_check_cpu+0x4d>
 720:   0f bc c0                bsf    %eax,%eax
 723:   e9 2e fe ff ff          jmp    556 <dbs_check_cpu+0xa6>
 728:   8b 54 24 24             mov    0x24(%esp),%edx
 72c:   8b 04 95 00 00 00 00    mov    0x0(,%edx,4),%eax
 733:   40                      inc    %eax
 734:   89 04 95 00 00 00 00    mov    %eax,0x0(,%edx,4)
 73b:   8b 15 54 00 00 00       mov    0x54,%edx
 741:   39 d0                   cmp    %edx,%eax
 743:   0f 82 47 ff ff ff       jb     690 <dbs_check_cpu+0x1e0>
 749:   8b 4c 24 0c             mov    0xc(%esp),%ecx
 74d:   bf ff ff ff ff          mov    $0xffffffff,%edi
 752:   8b 01                   mov    (%ecx),%eax
 754:   85 c0                   test   %eax,%eax
 756:   0f 85 40 01 00 00       jne    89c <dbs_check_cpu+0x3ec>
 75c:   b8 20 00 00 00          mov    $0x20,%eax
 761:   83 f8 03                cmp    $0x3,%eax
 764:   bb 02 00 00 00          mov    $0x2,%ebx
 769:   0f 4c d8                cmovl  %eax,%ebx
 76c:   83 fb 01                cmp    $0x1,%ebx
 76f:   77 64                   ja     7d5 <dbs_check_cpu+0x325>
 771:   89 ee                   mov    %ebp,%esi
 773:   8d b6 00 00 00 00       lea    0x0(%esi),%esi
 779:   8d bc 27 00 00 00 00    lea    0x0(%edi),%edi
 780:   8b 04 9d 00 00 00 00    mov    0x0(,%ebx,4),%eax
 787:   01 f0                   add    %esi,%eax
 789:   8b 50 04                mov    0x4(%eax),%edx
 78c:   8b 68 08                mov    0x8(%eax),%ebp
 78f:   89 d1                   mov    %edx,%ecx
 791:   29 e9                   sub    %ebp,%ecx
 793:   89 50 08                mov    %edx,0x8(%eax)
 796:   8d 43 01                lea    0x1(%ebx),%eax
 799:   39 f9                   cmp    %edi,%ecx
 79b:   89 44 24 08             mov    %eax,0x8(%esp)
 79f:   8b 44 24 0c             mov    0xc(%esp),%eax
 7a3:   bd 02 00 00 00          mov    $0x2,%ebp
 7a8:   0f 42 f9                cmovb  %ecx,%edi
 7ab:   89 6c 24 04             mov    %ebp,0x4(%esp)
 7af:   89 04 24                mov    %eax,(%esp)
 7b2:   e8 fc ff ff ff          call   7b3 <dbs_check_cpu+0x303>
 7b7:   83 f8 03                cmp    $0x3,%eax
 7ba:   ba 02 00 00 00          mov    $0x2,%edx
 7bf:   0f 4d c2                cmovge %edx,%eax
 7c2:   83 f8 01                cmp    $0x1,%eax
 7c5:   89 c3                   mov    %eax,%ebx
 7c7:   76 b7                   jbe    780 <dbs_check_cpu+0x2d0>
 7c9:   8b 35 50 00 00 00       mov    0x50,%esi
 7cf:   8b 15 54 00 00 00       mov    0x54,%edx
 7d5:   8d 04 bf                lea    (%edi,%edi,4),%eax
 7d8:   8b 4c 24 24             mov    0x24(%esp),%ecx
 7dc:   8d 04 80                lea    (%eax,%eax,4),%eax
 7df:   8d 1c 85 00 00 00 00    lea    0x0(,%eax,4),%ebx
 7e6:   89 f0                   mov    %esi,%eax
 7e8:   0f af c2                imul   %edx,%eax
 7eb:   8b 35 5c 00 00 00       mov    0x5c,%esi
 7f1:   31 ff                   xor    %edi,%edi
 7f3:   ba fe ff ff 7f          mov    $0x7ffffffe,%edx
 7f8:   89 3c 8d 00 00 00 00    mov    %edi,0x0(,%ecx,4)
 7ff:   b9 64 00 00 00          mov    $0x64,%ecx
 804:   29 f1                   sub    %esi,%ecx
 806:   3d c0 e0 ff ff          cmp    $0xffffe0c0,%eax
 80b:   77 10                   ja     81d <dbs_check_cpu+0x36d>
 80d:   8d 90 9f 0f 00 00       lea    0xf9f(%eax),%edx
 813:   b8 d3 4d 62 10          mov    $0x10624dd3,%eax
 818:   f7 e2                   mul    %edx
 81a:   c1 ea 08                shr    $0x8,%edx
 81d:   0f af ca                imul   %edx,%ecx
 820:   39 cb                   cmp    %ecx,%ebx
 822:   0f 86 68 fe ff ff       jbe    690 <dbs_check_cpu+0x1e0>
 828:   8b 44 24 24             mov    0x24(%esp),%eax
 82c:   8b 54 24 0c             mov    0xc(%esp),%edx
 830:   8b 0c 85 08 00 00 00    mov    0x8(,%eax,4),%ecx
 837:   3b 4a 14                cmp    0x14(%edx),%ecx
 83a:   0f 84 50 fe ff ff       je     690 <dbs_check_cpu+0x1e0>
 840:   a1 64 00 00 00          mov    0x64,%eax
 845:   85 c0                   test   %eax,%eax
 847:   0f 84 43 fe ff ff       je     690 <dbs_check_cpu+0x1e0>
 84d:   8b 5a 18                mov    0x18(%edx),%ebx
 850:   ba 1f 85 eb 51          mov    $0x51eb851f,%edx
 855:   0f af c3                imul   %ebx,%eax
 858:   f7 e2                   mul    %edx
 85a:   b8 05 00 00 00          mov    $0x5,%eax
 85f:   c1 ea 05                shr    $0x5,%edx
 862:   0f 44 d0                cmove  %eax,%edx
 865:   89 c8                   mov    %ecx,%eax
 867:   8b 4c 24 24             mov    0x24(%esp),%ecx
 86b:   29 d0                   sub    %edx,%eax
 86d:   89 04 8d 08 00 00 00    mov    %eax,0x8(,%ecx,4)
 874:   8b 4c 24 0c             mov    0xc(%esp),%ecx
 878:   8b 51 14                mov    0x14(%ecx),%edx
 87b:   39 d0                   cmp    %edx,%eax
 87d:   0f 83 ef fd ff ff       jae    672 <dbs_check_cpu+0x1c2>
 883:   e9 dd fd ff ff          jmp    665 <dbs_check_cpu+0x1b5>
 888:   b8 20 00 00 00          mov    $0x20,%eax
 88d:   e9 16 fe ff ff          jmp    6a8 <dbs_check_cpu+0x1f8>
 892:   b8 20 00 00 00          mov    $0x20,%eax
 897:   e9 2f fd ff ff          jmp    5cb <dbs_check_cpu+0x11b>
 89c:   0f bc c0                bsf    %eax,%eax
 89f:   e9 bd fe ff ff          jmp    761 <dbs_check_cpu+0x2b1>
 8a4:   8d b6 00 00 00 00       lea    0x0(%esi),%esi
 8aa:   8d bf 00 00 00 00       lea    0x0(%edi),%edi


* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-18 21:40 ` Linus Torvalds
  2006-03-18 22:09   ` Parag Warudkar
@ 2006-03-18 22:26   ` Parag Warudkar
  2006-03-19  0:53   ` Andrew Morton
  2 siblings, 0 replies; 31+ messages in thread
From: Parag Warudkar @ 2006-03-18 22:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, alex-kernel, jun.nakajima, Dave Jones

On Saturday 18 March 2006 16:40, Linus Torvalds wrote:
> Anyway, I _think_ it's this one:
>
>                 for_each_online_cpu(j) {
>                         dbs_info = &per_cpu(cpu_dbs_info, j);
>                         requested_freq[j] = dbs_info->cur_policy->cur;
>                 }
>
> where dbs_info->cur_policy seems to be NULL. Maybe.

That was right on target.
I just put a check in the code which confirms that for some reason cur_policy 
for cpu0 is NULL.

Parag 


* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-18 22:09   ` Parag Warudkar
@ 2006-03-18 23:12     ` Linus Torvalds
  0 siblings, 0 replies; 31+ messages in thread
From: Linus Torvalds @ 2006-03-18 23:12 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Linux Kernel Mailing List, alex-kernel, jun.nakajima, Dave Jones



On Sat, 18 Mar 2006, Parag Warudkar wrote:
>
> Here is the disassembly of cpufreq_conservative.o:dbs_check_cpu() from my 
> setup, if it is of any help.

Yup. Just confirming that this seems to be what I was pointing to.

It's the first "for_each_online_cpu()" loop: the reason it's pretty far 
down in the function disassembly is that the early "init_flag == 0" test 
jumps off into the latter part of the thing, even though the code is 
early. That's just gcc re-ordering stuff.

> 000004b0 <dbs_check_cpu>:
>  4b0:   55                      push   %ebp
>  4b1:   bd 00 00 00 00          mov    $0x0,%ebp
>  4b6:   89 e8                   mov    %ebp,%eax
>  4b8:   57                      push   %edi
>  4b9:   56                      push   %esi
>  4ba:   53                      push   %ebx
>  4bb:   83 ec 10                sub    $0x10,%esp
>  4be:   8b 54 24 24             mov    0x24(%esp),%edx
>  4c2:   8b 0c 95 00 00 00 00    mov    0x0(,%edx,4),%ecx
>  4c9:   01 c8                   add    %ecx,%eax
>  4cb:   8b 50 0c                mov    0xc(%eax),%edx
>  4ce:   85 d2                   test   %edx,%edx
>  4d0:   0f 84 ba 01 00 00       je     690 <dbs_check_cpu+0x1e0>

This is the "init_flag == 0" test:

>  4d6:   66 83 3d 10 00 00 00    cmpw   $0x0,0x10
>  4dd:   00
>  4de:   8b 00                   mov    (%eax),%eax
>  4e0:   89 44 24 0c             mov    %eax,0xc(%esp)
>  4e4:   0f 84 ae 01 00 00       je     698 <dbs_check_cpu+0x1e8>
.......

This is

	for_each_online_cpu(j) {
>  698:   a1 00 00 00 00          mov    0x0,%eax
>  69d:   85 c0                   test   %eax,%eax
>  69f:   0f 84 e3 01 00 00       je     888 <dbs_check_cpu+0x3d8>
>  6a5:   0f bc c0                bsf    %eax,%eax
>  6a8:   83 f8 03                cmp    $0x3,%eax
>  6ab:   bb 02 00 00 00          mov    $0x2,%ebx
>  6b0:   0f 4c d8                cmovl  %eax,%ebx
>  6b3:   83 fb 01                cmp    $0x1,%ebx
>  6b6:   77 49                   ja     701 <dbs_check_cpu+0x251>
>  6b8:   89 ee                   mov    %ebp,%esi
>  6ba:   8d b6 00 00 00 00       lea    0x0(%esi),%esi


	dbs_info = &per_cpu(cpu_dbs_info, j);


>  6c0:   8b 04 9d 00 00 00 00    mov    0x0(,%ebx,4),%eax
>  6c7:   bf 02 00 00 00          mov    $0x2,%edi
>  6cc:   01 f0                   add    %esi,%eax
>  6ce:   8b 00                   mov    (%eax),%eax

And this is the oopsing instruction:

>  6d0:   8b 40 1c                mov    0x1c(%eax),%eax
>  6d3:   89 7c 24 04             mov    %edi,0x4(%esp)
>  6d7:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)

And this is the "requested_freq[j]" assignment:

>  6de:   89 04 9d 08 00 00 00    mov    %eax,0x8(,%ebx,4)
>  6e5:   8d 43 01                lea    0x1(%ebx),%eax
>  6e8:   bb 02 00 00 00          mov    $0x2,%ebx
>  6ed:   89 44 24 08             mov    %eax,0x8(%esp)

This is really a "call find_next_bit"

>  6f1:   e8 fc ff ff ff          call   6f2 <dbs_check_cpu+0x242>
>  6f6:   83 f8 03                cmp    $0x3,%eax
>  6f9:   0f 4c d8                cmovl  %eax,%ebx
>  6fc:   83 fb 01                cmp    $0x1,%ebx
>  6ff:   76 bf                   jbe    6c0 <dbs_check_cpu+0x210>

And that was the end of that loop:

>  701:   8b 4c 24 0c             mov    0xc(%esp),%ecx
>  705:   bb 01 00 00 00          mov    $0x1,%ebx
>  70a:   bf ff ff ff ff          mov    $0xffffffff,%edi
>  70f:   66 89 1d 10 00 00 00    mov    %bx,0x10

And that was setting "init_flag = 1"

Horrid, horrid assembly language from that simple "for_each_online_cpu(j)" 
loop. Oh, well.

		Linus


* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-18 21:40 ` Linus Torvalds
  2006-03-18 22:09   ` Parag Warudkar
  2006-03-18 22:26   ` Parag Warudkar
@ 2006-03-19  0:53   ` Andrew Morton
  2006-03-19  2:38     ` Linus Torvalds
  2006-03-19  6:34     ` Parag Warudkar
  2 siblings, 2 replies; 31+ messages in thread
From: Andrew Morton @ 2006-03-19  0:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: kernel-stuff, linux-kernel, alex-kernel, jun.nakajima, davej

Linus Torvalds <torvalds@osdl.org> wrote:
>
>  Gaah. Looking at the assembly code to try to figure out where the oops 
>  happened, one thing is clear: "for_each_cpu_mask()" generates some truly 
>  horrible code. C code that looks simple ends up being tons of complex 
>  assembly language due to inline functions etc.
> 
>  We've made it way too easy for people to write code that just absolutely 
>  _sucks_.

Yes, merely uninlining first_cpu() shrinks my usual vmlinux by 2.5k.  I'll
fix it up.

Meanwhile, I suppose we need to check that pointer for NULL as Parag
suggests.  I might stick a once-off WARN_ON() in there so someone gets in
and works out why we keep on having to graft mysterious null-pointer
avoidances into cpufreq.



* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19  0:53   ` Andrew Morton
@ 2006-03-19  2:38     ` Linus Torvalds
  2006-03-19  5:08       ` Paul Jackson
  2006-03-19 18:46       ` Linus Torvalds
  2006-03-19  6:34     ` Parag Warudkar
  1 sibling, 2 replies; 31+ messages in thread
From: Linus Torvalds @ 2006-03-19  2:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kernel-stuff, linux-kernel, alex-kernel, jun.nakajima, davej



On Sat, 18 Mar 2006, Andrew Morton wrote:
>
> Yes, uninlining merely first_cpu() shrinks my usual vmlinux by 2.5k.  I'll
> fix it up.

The thing is, we could do _so_ much better if we just fixed the "calling 
convention" for that loop.

In particular, if we instead of

	for_each_cpu_mask(cpu, mask) 
		.. do something with cpu ..

had

	for_each_cpu_mask(cpu, mask) {
		.. do something with cpu ..
	} end_for_each_cpu_mask;

we could do

	#if NR_CPUS <= BITS_IN_LONG
	#define for_each_cpu_mask(cpu, mask) 				\
		{ unsigned long __cpubits = (mask)->bits[0];		\
	  for (cpu = 0; __cpubits; cpu++, __cpubits >>= 1) {	\
			if (!(__cpubits & 1))				\
				continue;

	#define end_for_each_cpu_mask	} }

then the code we'd generate would be
 (a) tons smaller
 (b) lots faster
than what we do now. No uninlining necessary, because we'd simply not have 
any complex code to either inline or to call.

The reason the code sucks right now is that the interface is bad, and 
causes us to do insane things to make it work at all with that syntax.

(We already have this kind of thing for the "do_each_thread / 
while_each_thread" interface to make it work sanely. Also, anybody who has 
ever played with "sparse" has seen the wonderful and flexible 
FOR_EACH_PTR() .. END_FOR_EACH_PTR() interface that allows for efficient 
pointer list traversal).

			Linus


* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19  2:38     ` Linus Torvalds
@ 2006-03-19  5:08       ` Paul Jackson
  2006-03-19 17:43         ` Linus Torvalds
  2006-03-19 18:46       ` Linus Torvalds
  1 sibling, 1 reply; 31+ messages in thread
From: Paul Jackson @ 2006-03-19  5:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: akpm, kernel-stuff, linux-kernel, alex-kernel, jun.nakajima,
	davej

If find_first_bit and find_next_bit suck so badly, then why fix just
their use in for_each_cpu_mask()?  How about the other uses?

Such as the other 50 odd places in the kernel that call them directly,
such as:

	$ grep ' = find_[a-z]*_bit' lib/bitmap.c
	    rbot = cur = find_first_bit(maskp, nmaskbits);
		    cur = find_next_bit(maskp, nmaskbits, cur+1);
	    i = find_first_bit(buf, bits);
		    i = find_next_bit(buf, bits, i + 1);
		    for (i = find_first_bit(buf, bits);
			 i = find_next_bit(buf, bits, i + 1))
	    for (oldbit = find_first_bit(src, bits);
		 oldbit = find_next_bit(src, bits, oldbit + 1)) {

Perhaps the common interface to these find_*_bit routines should be out
of line, with just a couple of key calls from the scheduler using the
inline form.

And if we fix the cpu loop to the API Linus suggests, we should do the
same with the node loops, such as used in:

	$ grep for_each.*_node mm/slab.c
		    for_each_node(i) {
	    for_each_node(i)
	    for_each_online_node(i) {
		    for_each_online_node(node) {
	    for_each_online_node(node) {
			    for_each_online_node(node) {
	    for_each_online_node(node) {
	    for_each_online_node(i) {
	    for_each_online_node(i) {
	    for_each_online_node(node) {
	    for_each_online_node(node) {
	    for_each_online_node(node) {
	    for_each_online_node(node) {

And for those of us with too many CPUs, how about something like
(totally untested and probably totally bogus):

	#if NR_CPUS <= BITS_IN_LONG
	 ... as per Linus, shifting a mask right ...
	#else
	#define for_each_cpu_mask(cpu, mask)			\
		{ for (cpu = 0; cpu < NR_CPUS; cpu++) {		\
			if (!test_bit(cpu, mask))		\
				continue;
	#endif

	#define end_for_each_cpu_mask	} }	

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19  0:53   ` Andrew Morton
  2006-03-19  2:38     ` Linus Torvalds
@ 2006-03-19  6:34     ` Parag Warudkar
  2006-03-19 12:00       ` Alexander Clouter
  1 sibling, 1 reply; 31+ messages in thread
From: Parag Warudkar @ 2006-03-19  6:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, linux-kernel, alex-kernel, jun.nakajima, davej

On Saturday 18 March 2006 19:53, Andrew Morton wrote:
> I might stick a once-off WARN_ON() in there so someone gets in
> and works out why we keep on having to graft mysterious null-pointer
> avoidances into cpufreq.
cpufreq_conservative should be marked broken on SMP - I have used it on UP 
boxes without trouble but I can't even safely modprobe it on SMP - it nearly 
ate my filesystem.  

And there seem to be multiple different problems with it - I get different 
oopses depending upon whether I have loaded it before or after the 
ondemand module.  Weirdly enough, cpufreq_conservative shares much of its 
code with cpufreq_ondemand, which works without any problem.

Let me know if anyone has an objection to marking cpufreq_conservative 
as depends !SMP - I am planning to submit a patch soon.

Parag

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19  6:34     ` Parag Warudkar
@ 2006-03-19 12:00       ` Alexander Clouter
  2006-03-19 14:06         ` Parag Warudkar
  0 siblings, 1 reply; 31+ messages in thread
From: Alexander Clouter @ 2006-03-19 12:00 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Andrew Morton, Linus Torvalds, linux-kernel, jun.nakajima, davej

[-- Attachment #1: Type: text/plain, Size: 1719 bytes --]

Hi,

Parag Warudkar <kernel-stuff@comcast.net> [20060319 01:34:01 -0500]:
>
> On Saturday 18 March 2006 19:53, Andrew Morton wrote:
> > I might stick a once-off WARN_ON() in there so someone gets in
> > and works out why we keep on having to graft mysterious null-pointer
> > avoidances into cpufreq.
> cpufreq_conservative should be marked broken on SMP - I have used it on UP 
> boxes without trouble but I can't even safely modprobe it on SMP - it nearly 
> ate my filesystem.  
> 
> And there seem to be multiple different problems with it - I get different 
> oopses depending upon whether or not I have loaded it before or after the 
> ondemand module.  Weird enough - cpufreq_conservative shares much of it's 
> code with cpufreq_ondemand, which works without any problem. 
> 
Well, it's drifted a bit; however, I submitted a number of patches here about 
two weeks ago to bring it back into line and hopefully make it HOTPLUG safe.

The new set of patches pretty much makes conservative's codebase identical to 
ondemand's... as no one has posted back having used them or anything, what am 
I to do?!

> Let me know if anyone has objection to marking cpufreq_conservative 
> depends !SMP - I am planning to submit  a patch soon.
> 
Doesn't bother me; what does is that no one is trying the updates to 
conservative before deciding to declare it borked.

Hey ho....

Alex

> Parag

-- 
 _______________________________________ 
/ Misfortunes arrive on wings and leave \
\ on foot.                              /
 --------------------------------------- 
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 12:00       ` Alexander Clouter
@ 2006-03-19 14:06         ` Parag Warudkar
  2006-03-19 17:34           ` Alexander Clouter
  0 siblings, 1 reply; 31+ messages in thread
From: Parag Warudkar @ 2006-03-19 14:06 UTC (permalink / raw)
  To: Alexander Clouter
  Cc: Andrew Morton, Linus Torvalds, linux-kernel, jun.nakajima, davej

On Sunday 19 March 2006 07:00, Alexander Clouter wrote:
> Well its drifted a bit, however I submitted a number of patches here about
> two weeks ago to bring it back into line and hopefully make it HOTPLUG
> safe.
>
> The new set of patches pretty much make conservative's codebase identical
> to ondemands....as no one has posted back having used these or anything
> what am I to do?!

The codebase already seems identical to ondemand - are your patches in 
2.6.16-rc6 or -mm? If they are, let me know which. If you posted them but 
they haven't yet made it into either -mm or mainline, can you please post 
links to all your patches? I can test them.

Why do we even have conservative and ondemand as two separate modules given 
they share a huge amount of code - perhaps make conservative an optional 
behaviour of ondemand, or at least make a common library which both use?

Thanks
Parag

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 14:06         ` Parag Warudkar
@ 2006-03-19 17:34           ` Alexander Clouter
  0 siblings, 0 replies; 31+ messages in thread
From: Alexander Clouter @ 2006-03-19 17:34 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Andrew Morton, Linus Torvalds, linux-kernel, jun.nakajima, davej

[-- Attachment #1: Type: text/plain, Size: 2651 bytes --]

Hi,

Parag Warudkar <kernel-stuff@comcast.net> [20060319 09:06:25 -0500]:
>
> The codebase already seems identical to ondemand - Are your patches in 
> 2.6.16-rc6 or -mm? If they are - let me know which. If you posted them but 
> they haven't yet made it into either -mm or mainline can you please post 
> links to all your patches please? I can test them.
> 
Well I submitted them back on 2006-02-24:

http://marc.theaimsgroup.com/?l=linux-kernel&m=114079151404567&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=114079151425558&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=114079151417220&w=2

It has been pointed out that I forgot to CC the cpufreq folk, so that's 
completely my fault; I thought I had. :-/

> Why do we even have conservative and ondemand as two separate modules given 
> they share huge amount of code - perhaps make conservative an optional 
> behaviour of ondemand or alteast make a common lib which both use?
>
Originally the 'conservative' feature was just a sysfs flag that could be 
set; however, it was rejected for a number of reasons.  One of them, quite 
rightly: we have a modular system that can take stacks of cpufreq governors, 
so let's use that :)

Also, more importantly, bugs in my bits do not affect the original author.
It makes more sense to keep things this way, as the internal code then does
not need a bunch of if{}'s scattered around, and I also have a number of extra
sysfs files to tweak which would either do nothing in 'ondemand' mode or have
to be magically created and destroyed.

Either way, it's probably neater this way, and it's *my* duty to make sure
anything changing for ondemand is considered for conservative.  If you look
at a lot of the userland tools that have come out, it would be a pain to have
them handle the exceptional case of a combined ondemand/conservative.

Breaking the thing out into a library would probably be awkward, as all the
similar code is actually inline in functions; putting it in a separate file
would be pointless.  Hopefully when you apply my patches and then diff the
ondemand and conservative governors you will see what I mean.

Cheers

Alex

-- 
 ________________________________________ 
/ Fortune finishes the great quotations, \
| #3                                     |
|                                        |
| Birds of a feather flock to a newly    |
\ washed car.                            /
 ---------------------------------------- 
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19  5:08       ` Paul Jackson
@ 2006-03-19 17:43         ` Linus Torvalds
  0 siblings, 0 replies; 31+ messages in thread
From: Linus Torvalds @ 2006-03-19 17:43 UTC (permalink / raw)
  To: Paul Jackson
  Cc: akpm, kernel-stuff, linux-kernel, alex-kernel, jun.nakajima,
	davej



On Sat, 18 Mar 2006, Paul Jackson wrote:
>
> If find_first_bit and find_next_bit suck so bad, then why just fix
> their use in the for_each_cpu_mask().  How about the other uses?

find_first_bit/find_next_bit don't actually suck. They're perfectly 
efficient for large bitmaps, which is what they were designed for (ie they 
were written for doing the free-block/inode bitmaps for minixfs 
originally).

In other words, they are meant for bitmaps of _thousands_ of bits, and 
for potentially sparse bitmaps.

Now, the CPU masks _can_ be large too, but commonly they are not. In this 
example, the "bitmap" was exactly two bits: cpu 0 and cpu 1. Using a 
complex "find_next_bit" for something like that is just overkill.
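The distinction reads roughly as follows in a userspace sketch (illustrative code, not the kernel implementation): for a mask that fits in one word, a plain shift loop visits every set bit with nothing fancier than a shift and a test.

```c
#include <assert.h>

/* Minimal userspace sketch, not kernel code: walk the set bits of a
 * small single-word "CPU mask" by shifting, the cheap alternative to a
 * generic find_next_bit() built for huge, sparse, multi-word bitmaps. */
static int count_set_cpus(unsigned long mask)
{
	int count = 0;

	for (int cpu = 0; mask != 0; cpu++, mask >>= 1) {
		if (mask & 1)
			count++;	/* "visit" this cpu */
	}
	return count;
}
```

With a two-CPU mask like the one in the oops (bits 0 and 1 set), the loop body runs exactly twice and the loop ends as soon as the shifted mask hits zero.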

> And if we fix the cpu loop to the API Linus suggests, we should do the
> same with the node loops

Yes. Almost every "for_each_xyzzy" tends to be more easily written as a 
macro if we also have an ending condition.

> And for those of us with too many CPUs, how about something like
> (totally untested and probably totally bogus):
> 
> 	#if NR_CPUS <= BITS_PER_LONG
> 	 ... as per Linus, shifting a mask right ...
> 	#else
> 	#define for_each_cpu_mask(cpu, mask)
> 		{ for (cpu = 0; cpu < NR_CPUS; cpu++) {
> 			if (!test_bit(cpu, mask))
> 				continue;
> 	#endif
> 
> 	#define end_for_each_cpu_mask	} }	

Yes, except you'd probably be better off testing whole words at a 
time, since a large NR_CPUS can often be because somebody wanted to 
support 256-CPU machines, and then used the same binary on an 8-way setup.

That would most helpfully be done with a double loop (outer one iterating 
over the bitmask words, inner one over the bits), but then it's hard to 
make "break" do the right thing inside the loop.
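That double loop can be sketched in userspace C as a helper function (hypothetical names, not kernel API); doing it as a function sidesteps the "break" problem, since a break in a macro caller's body would only exit the inner loop.

```c
#include <assert.h>
#include <limits.h>

#define NR_IDS	 128
#define ID_WORDS (NR_IDS / (int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical sketch: the outer loop skips an all-zero bitmap word
 * with a single test; the inner loop shifts through non-empty words. */
static int count_set_ids(const unsigned long words[ID_WORDS])
{
	int count = 0;

	for (int w = 0; w < ID_WORDS; w++) {
		unsigned long word = words[w];

		for (; word != 0; word >>= 1) {
			if (word & 1)
				count++;
		}
	}
	return count;
}
```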

		Linus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19  2:38     ` Linus Torvalds
  2006-03-19  5:08       ` Paul Jackson
@ 2006-03-19 18:46       ` Linus Torvalds
  2006-03-19 19:02         ` Linus Torvalds
  1 sibling, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2006-03-19 18:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kernel-stuff, linux-kernel, alex-kernel, jun.nakajima, davej



On Sat, 18 Mar 2006, Linus Torvalds wrote:
> 
> The thing is, we could do _so_ much better if we just fixed the "calling 
> convention" for that loop.

Actually, looking at it a bit more, I think we can do better even 
_without_ fixing the "calling convention".

Namely like this.

This should make the "NR_CPUS <= 16" case have a _much_ simpler loop. 

I picked "16" at random. The loop is much faster and much simpler (it 
doesn't use complex instructions like "find first bit"), but the downside 
is that if the CPU mask is very sparse, it will take more of those (very 
simple) iterations to figure out that it's empty.

I suspect the "16" could be raised to 32 (at which point it would cover 
all the default values), but somebody should probably take a look at 
how much this shrinks the kernel.

I have to admit that it's a bit evil to leave a dangling "else" in a 
macro, but it beautifully handles the "must be followed by exactly one C 
statement" requirement of the old macro ;)

So instead of calling it "evil", let's just say that it's "creative 
macro-writing".

(Btw, we could make this even cleaner by making the non-SMP case define 
"cpu_isset()" to 1 - at that point the UP and the "low CPU count" #defines 
could be merged into one).

		Linus

--
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 60e56c6..a659f42 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -313,11 +313,17 @@ static inline void __cpus_remap(cpumask_
 	bitmap_remap(dstp->bits, srcp->bits, oldp->bits, newp->bits, nbits);
 }
 
-#if NR_CPUS > 1
+#if NR_CPUS > 16
 #define for_each_cpu_mask(cpu, mask)		\
 	for ((cpu) = first_cpu(mask);		\
 		(cpu) < NR_CPUS;		\
 		(cpu) = next_cpu((cpu), (mask)))
+#elif NR_CPUS > 1
+#define for_each_cpu_mask(cpu, mask)		\
+	for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) \
+		if (!cpu_isset(cpu, mask))	\
+			continue;		\
+		else
 #else /* NR_CPUS == 1 */
 #define for_each_cpu_mask(cpu, mask) for ((cpu) = 0; (cpu) < 1; (cpu)++)
 #endif /* NR_CPUS */
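The dangling-else trick can be tried outside the kernel; this userspace toy (illustrative macro and names, not the kernel's) shows that the macro really does accept exactly one following statement, just like a plain for-loop header:

```c
#include <assert.h>

#define NR_IDS	8

/* Userspace toy, not kernel API: same shape as the NR_CPUS > 1 branch
 * in the patch above.  The trailing "else" swallows the single
 * statement that follows the macro, hiding the skip/continue logic. */
#define for_each_set_id(id, mask)			\
	for ((id) = 0; (id) < NR_IDS; (id)++)		\
		if (!((mask) & (1u << (id))))		\
			continue;			\
		else

static int sum_set_ids(unsigned int mask)
{
	int id, sum = 0;

	for_each_set_id(id, mask)
		sum += id;	/* exactly one statement, as required */
	return sum;
}
```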

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 18:46       ` Linus Torvalds
@ 2006-03-19 19:02         ` Linus Torvalds
  2006-03-19 19:33           ` Linus Torvalds
  0 siblings, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2006-03-19 19:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kernel-stuff, linux-kernel, alex-kernel, jun.nakajima, davej



On Sun, 19 Mar 2006, Linus Torvalds wrote:
> 
> Actually, looking at it a bit more, I think we can do better even 
> _without_ fixing the "calling convention".

Here's the "before and after" example of nr_context_switches() from 
kernel/sched.c with this patch on x86-64 (selected because the whole 
function is just basically one simple "loop over all possible CPU's and 
return the sum of context switches").

I'll do the "after" first, because it's actually readable:

	nr_context_switches:
	        pushq   %rbp
	        xorl    %esi, %esi
	        xorl    %ecx, %ecx
	        movq    %rsp, %rbp
	.L210:
	#APP
	        btl %ecx,cpu_possible_map(%rip)
	        sbbl %eax,%eax
	#NO_APP
	        testl   %eax, %eax
	        je      .L211
	        movq    _cpu_pda(,%rcx,8), %rdx
	        movq    $per_cpu__runqueues, %rax
	        addq    8(%rdx), %rax
	        addq    40(%rax), %rsi
	.L211:
	        incq    %rcx
	        cmpq    $8, %rcx
	        jne     .L210
	        leave
	        movq    %rsi, %rax
	        ret

ie the loop starts at .L210, and basically iterates %rcx from 0..7, and 
you can read the assembly and it actually makes sense.

(Whether it _works_ is another matter: I haven't actually booted the 
patched kernel ;)

Compare that to the "before" state:

	nr_context_switches:
	        pushq   %rbp
	        movl    $8, %eax
	        movq    cpu_possible_map(%rip), %rdi
	#APP
	        bsfq %rdi,%rdx ; cmovz %rax,%rdx
	#NO_APP
	        cmpl    $9, %edx
	        movq    %rdx, %rax
	        movl    $8, %edx
	        cmovge  %edx, %eax
	        movq    %rsp, %rbp
	        pushq   %rbx
	        movslq  %eax,%rcx
	        xorl    %r8d, %r8d
	        jmp     .L261
	.L262:
	        movq    _cpu_pda(,%rcx,8), %rdx
	        movl    $8, %esi
	        movq    $per_cpu__runqueues, %rax
	        movq    %rdi, %rbx
	        movq    8(%rdx), %rdx
	        addq    40(%rax,%rdx), %r8
	        movl    %ecx, %edx
	        movl    %esi, %eax
	        leal    1(%rdx), %ecx
	        subl    %edx, %eax
	        decl    %eax
	        shrq    %cl, %rbx
	        cltq
	        movq    %rbx, %rcx
	#APP
	        bsfq %rcx,%rbx ; cmovz %rax,%rbx
	#NO_APP
	        leal    1(%rdx,%rbx), %edx
	        cmpl    $9, %edx
	        cmovge  %esi, %edx
	        movslq  %edx,%rcx
	.L261:
	        cmpq    $7, %rcx
	        jbe     .L262
	        popq    %rbx
	        leave
	        movq    %r8, %rax
	        ret

which is not only obviously bigger (40 instructions vs just 18), I also 
dare anybody to actually read and understand the assembly.

Now, the simple version will iterate 8 times over the loop, while the more 
complex version will iterate just as many times as there are CPU's in the 
actual system. So there's a trade-off. The "load the CPU mask first and 
then shift it down by one every time" thing (that required different 
syntax) would have been able to exit early. This one isn't. So the syntax 
change would still help a lot (and would avoid the "btl").

Of course, if people were to set CONFIG_NR_CPUS appropriately for their 
system, the shorter version gets increasingly better (since it then won't 
do any unnecessary work).

			Linus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 19:02         ` Linus Torvalds
@ 2006-03-19 19:33           ` Linus Torvalds
  2006-03-19 19:40             ` Al Viro
  2006-03-20  6:12             ` Willy Tarreau
  0 siblings, 2 replies; 31+ messages in thread
From: Linus Torvalds @ 2006-03-19 19:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kernel-stuff, linux-kernel, alex-kernel, jun.nakajima, davej



On Sun, 19 Mar 2006, Linus Torvalds wrote:
> 
> Now, the simple version will iterate 8 times over the loop, while the more 
> complex version will iterate just as many times as there are CPU's in the 
> actual system. So there's a trade-off. The "load the CPU mask first and 
> then shift it down by one every time" thing (that required different 
> syntax) would have been able to exit early. This one isn't. So the syntax 
> change would still help a lot (and would avoid the "btl").

Hah. My "abuse every gcc feature and do really ugly macros" dark side has 
been vetted, and it came up with the appended work of art.

Doing a "break" inside a conditional by using the gcc statement 
expressions is sublime. 

And it works. It's a couple of instructions longer than the shortest 
version (but still about half the size of the horror we have now), but the 
instructions are simpler (just a shift rather than a "btl"), and it now 
knows to break out early if there are no CPU's left to check.

It has the added advantage that because it uses simpler core instructions, 
gcc can actually optimize it better - in "nr_context_switches()" gcc 
happily hoisted the "mask->bits[0]" load to outside the loop, for example.

With NR_CPUS==2, gcc even apparently unrolls the loop, to the point where 
it isn't even a loop at all. Good boy (at least for that case - I sure 
hope it won't do that for some of the larger loops ;).

Of course, I shouldn't say "works", since it is still totally untested. It 
_looks_ good, and that statement expression usage is just _so_ ugly it's 
cute.

		Linus

---
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 60e56c6..17a965c 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -313,11 +313,20 @@ static inline void __cpus_remap(cpumask_
 	bitmap_remap(dstp->bits, srcp->bits, oldp->bits, newp->bits, nbits);
 }
 
-#if NR_CPUS > 1
+#define check_for_each_cpu(cpu, mask) \
+	({ unsigned long __bits = (mask).bits[0] >> (cpu); if (!__bits) break; __bits & 1; })
+
+#if NR_CPUS > 32
 #define for_each_cpu_mask(cpu, mask)		\
 	for ((cpu) = first_cpu(mask);		\
 		(cpu) < NR_CPUS;		\
 		(cpu) = next_cpu((cpu), (mask)))
+#elif NR_CPUS > 1
+#define for_each_cpu_mask(cpu, mask)		\
+	for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) \
+		if (!check_for_each_cpu(cpu, mask))	\
+			continue;		\
+		else
 #else /* NR_CPUS == 1 */
 #define for_each_cpu_mask(cpu, mask) for ((cpu) = 0; (cpu) < 1; (cpu)++)
 #endif /* NR_CPUS */
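The same trick can be exercised in a userspace toy (illustrative names; requires GCC, since statement expressions are a GNU extension). The "break" inside the statement expression terminates the enclosing for-loop the moment no set bits remain at or above the current position:

```c
#include <assert.h>

#define NR_IDS	32

/* Userspace toy mirroring check_for_each_cpu() above: if no bits are
 * set at or above "id", the break exits the caller's for-loop early. */
#define check_id(id, mask) \
	({ unsigned int __bits = (mask) >> (id); if (!__bits) break; __bits & 1; })

#define for_each_set_id(id, mask)			\
	for ((id) = 0; (id) < NR_IDS; (id)++)		\
		if (!check_id(id, mask))		\
			continue;			\
		else

static int sum_set_ids(unsigned int mask)
{
	int id, sum = 0;

	for_each_set_id(id, mask)
		sum += id;
	return sum;
}
```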

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 19:33           ` Linus Torvalds
@ 2006-03-19 19:40             ` Al Viro
  2006-03-19 20:01               ` Linus Torvalds
  2006-03-20  6:12             ` Willy Tarreau
  1 sibling, 1 reply; 31+ messages in thread
From: Al Viro @ 2006-03-19 19:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, kernel-stuff, linux-kernel, alex-kernel,
	jun.nakajima, davej

On Sun, Mar 19, 2006 at 11:33:17AM -0800, Linus Torvalds wrote:
> Doing a "break" inside a conditional by using the gcc statement 
> expressions is sublime. 
> 
> And it works.

In the version of gcc you've tested.  With options and phase of moon
being what they had been.  IOW, you are awfully optimistic - it's not
just using gcc extension, it's using undocumented (in the best case)
behaviour outside the intended use of that extension.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 19:40             ` Al Viro
@ 2006-03-19 20:01               ` Linus Torvalds
  2006-03-19 20:31                 ` Linus Torvalds
  0 siblings, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2006-03-19 20:01 UTC (permalink / raw)
  To: Al Viro
  Cc: Andrew Morton, kernel-stuff, linux-kernel, alex-kernel,
	jun.nakajima, davej



On Sun, 19 Mar 2006, Al Viro wrote:
>
> In the version of gcc you've tested.  With options and phase of moon
> being what they had been.  IOW, you are awfully optimistic - it's not
> just using gcc extension, it's using undocumented (in the best case)
> behaviour outside the intended use of that extension.

I admit that it's ugly, but it's not undocumented. It flows directly from 
"statements as expression". Once you do that, you have to do flow control 
with them.

The end result may be _surprising_, the same way Duff's device is 
surprising (and for the same reason). But a C compiler that doesn't 
support Duff's device is not a C compiler. And this is really no 
different: it may not bestandard C: but it _is_ standard and documented 
GNU C.

And btw, this is _not_ new behaviour for the kernel. We have used 
non-local control behaviour in statement expressions before, just do a

	git grep '({.*return' 

to see at least ten cases of that (in fact, check out NFA_PUT(), which 
does a goto for the failure case in networking). That grep misses all the 
multi-line cases, so I assume there are more of them.
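The pattern those greps find looks like this userspace sketch (hypothetical macro, same shape as the NFA_PUT()-style helpers, GNU C only): a statement expression that makes the *enclosing function* return on failure.

```c
#include <assert.h>

/* Hypothetical sketch (GNU C): a validation macro that does a
 * non-local return from whatever function it is expanded in
 * when the check fails. */
#define CHECK_POSITIVE(x) \
	({ if ((x) <= 0) return -1; (x); })

static int scaled_by_ten(int v)
{
	int checked = CHECK_POSITIVE(v);	/* returns -1 from scaled_by_ten() if v <= 0 */

	return checked * 10;
}
```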

So this definitely works, and is not new. 

			Linus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 20:01               ` Linus Torvalds
@ 2006-03-19 20:31                 ` Linus Torvalds
  2006-03-19 20:47                   ` Andrew Morton
  2006-03-19 20:57                   ` Parag Warudkar
  0 siblings, 2 replies; 31+ messages in thread
From: Linus Torvalds @ 2006-03-19 20:31 UTC (permalink / raw)
  To: Al Viro
  Cc: Andrew Morton, kernel-stuff, linux-kernel, alex-kernel,
	jun.nakajima, davej



On Sun, 19 Mar 2006, Linus Torvalds wrote:
> 
> So this definitely works, and is not new. 

Btw, don't get me wrong - I think it would be preferable to use the 
"for_each/end_for_each" syntax, since that would make the macros simpler, 
and would possibly allow for somewhat simpler generated code.

But that would require each of the 300+ "for_each.*_cpu()" uses in the 
current kernel to be modified and checked. In the meantime, people who are 
interested can test out how much difference the trivial patch makes for 
them..

For me, it made a 4970 byte difference in code size. Noticeable? Maybe 
not (it's about 0.15% of my kernel text-size), but it _is_ better code, 
and next time somebody gets an oops near there, at least the crap code 
won't be hiding the cause ;)

		Linus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 20:31                 ` Linus Torvalds
@ 2006-03-19 20:47                   ` Andrew Morton
  2006-03-19 22:18                     ` Linus Torvalds
  2006-03-19 20:57                   ` Parag Warudkar
  1 sibling, 1 reply; 31+ messages in thread
From: Andrew Morton @ 2006-03-19 20:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: viro, kernel-stuff, linux-kernel, alex-kernel, jun.nakajima,
	davej

Linus Torvalds <torvalds@osdl.org> wrote:
>
>  For me, it made a 4970 byte difference in code size.
>

That's about the same saving as uninlining first_cpu() and next_cpu()
provides.

Anything which iterates across multiple CPUs is cachemiss heaven - I doubt
if this is performance-critical code.  Or at least if it is, we have bigger
problems..


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 20:31                 ` Linus Torvalds
  2006-03-19 20:47                   ` Andrew Morton
@ 2006-03-19 20:57                   ` Parag Warudkar
  1 sibling, 0 replies; 31+ messages in thread
From: Parag Warudkar @ 2006-03-19 20:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Andrew Morton, linux-kernel, alex-kernel, jun.nakajima,
	davej

On Sunday 19 March 2006 15:31, Linus Torvalds wrote:
> In the meantime, people who are
> interested can test out how much difference the trivial patch makes for
> them..

4464 bytes saving in text size for me on x86 - boots and seems to work fine.

Parag

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 20:47                   ` Andrew Morton
@ 2006-03-19 22:18                     ` Linus Torvalds
  2006-03-19 22:35                       ` Andrew Morton
  2006-03-19 22:46                       ` Linus Torvalds
  0 siblings, 2 replies; 31+ messages in thread
From: Linus Torvalds @ 2006-03-19 22:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: viro, kernel-stuff, linux-kernel, alex-kernel, jun.nakajima,
	davej



On Sun, 19 Mar 2006, Andrew Morton wrote:
> 
> That's about the same saving as uninlining first_cpu() and next_cpu()
> provides.

Btw, we could do that better for UP too.

How does this patch look? It makes "for_each_cpu()" look the same 
regardless of whether it is UP/SMP, by simply making first_cpu() and 
next_cpu() work appropriately. 

No ugly macros this time ;)

		Linus

---
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 60e56c6..c8dbc24 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -212,17 +212,15 @@ static inline void __cpus_shift_left(cpu
 	bitmap_shift_left(dstp->bits, srcp->bits, n, nbits);
 }
 
-#define first_cpu(src) __first_cpu(&(src), NR_CPUS)
-static inline int __first_cpu(const cpumask_t *srcp, int nbits)
-{
-	return min_t(int, nbits, find_first_bit(srcp->bits, nbits));
-}
-
-#define next_cpu(n, src) __next_cpu((n), &(src), NR_CPUS)
-static inline int __next_cpu(int n, const cpumask_t *srcp, int nbits)
-{
-	return min_t(int, nbits, find_next_bit(srcp->bits, nbits, n+1));
-}
+#if NR_CPUS > 1
+extern int fastcall __first_cpu(const cpumask_t *srcp);
+extern int fastcall __next_cpu(int n, const cpumask_t *srcp);
+#define first_cpu(srcp) __first_cpu(&(srcp))
+#define next_cpu(n,srcp) __next_cpu((n),&(srcp))
+#else
+#define first_cpu(srcp)		(0)
+#define next_cpu(n, srcp)	((n)+1)
+#endif
 
 #define cpumask_of_cpu(cpu)						\
 ({									\
@@ -313,14 +311,10 @@ static inline void __cpus_remap(cpumask_
 	bitmap_remap(dstp->bits, srcp->bits, oldp->bits, newp->bits, nbits);
 }
 
-#if NR_CPUS > 1
 #define for_each_cpu_mask(cpu, mask)		\
 	for ((cpu) = first_cpu(mask);		\
 		(cpu) < NR_CPUS;		\
 		(cpu) = next_cpu((cpu), (mask)))
-#else /* NR_CPUS == 1 */
-#define for_each_cpu_mask(cpu, mask) for ((cpu) = 0; (cpu) < 1; (cpu)++)
-#endif /* NR_CPUS */
 
 /*
  * The following particular system cpumasks and operations manage
diff --git a/kernel/cpu.c b/kernel/cpu.c
index e882c6b..0a9cc97 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -10,6 +10,7 @@
 #include <linux/sched.h>
 #include <linux/unistd.h>
 #include <linux/cpu.h>
+#include <linux/cpumask.h>
 #include <linux/module.h>
 #include <linux/kthread.h>
 #include <linux/stop_machine.h>
@@ -20,6 +21,18 @@ static DECLARE_MUTEX(cpucontrol);
 
 static struct notifier_block *cpu_chain;
 
+int __first_cpu(const cpumask_t *srcp)
+{
+	return min_t(int, NR_CPUS, find_first_bit(srcp->bits, NR_CPUS));
+}
+EXPORT_SYMBOL(__first_cpu);
+
+int __next_cpu(int n, const cpumask_t *srcp)
+{
+	return min_t(int, NR_CPUS, find_next_bit(srcp->bits, NR_CPUS, n+1));
+}
+EXPORT_SYMBOL(__next_cpu);
+
 #ifdef CONFIG_HOTPLUG_CPU
 static struct task_struct *lock_cpu_hotplug_owner;
 static int lock_cpu_hotplug_depth;

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 22:18                     ` Linus Torvalds
@ 2006-03-19 22:35                       ` Andrew Morton
  2006-03-19 22:55                         ` Linus Torvalds
  2006-03-19 22:46                       ` Linus Torvalds
  1 sibling, 1 reply; 31+ messages in thread
From: Andrew Morton @ 2006-03-19 22:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: viro, kernel-stuff, linux-kernel, alex-kernel, jun.nakajima,
	davej

Linus Torvalds <torvalds@osdl.org> wrote:
>
>  How does this patch look?
>

Wonderful - coz it's just like mine ;)

(4 patches temp-joined below)

- I made SMP=n next_cpu() just a constant "1".

- It's odd that the code uses (NR_CPUS>1) rather than CONFIG_SMP.  On x86
  (at least), SMP=y,NR_CPUS==1 isn't allowed anyway.



From: Andrew Morton <akpm@osdl.org>

           text    data     bss     dec     hex filename
before: 3490577 1322408  360000 5172985  4eeef9 vmlinux
after:  3488027 1322496  360128 5170651  4ee5db vmlinux

Cc: Paul Jackson <pj@sgi.com>
DESC
cpumask: uninline next_cpu()
EDESC
From: Andrew Morton <akpm@osdl.org>

           text    data     bss     dec     hex filename
before: 3488027 1322496  360128 5170651  4ee5db vmlinux
after:  3485112 1322480  359968 5167560  4ed9c8 vmlinux

2931 bytes saved

Cc: Paul Jackson <pj@sgi.com>
DESC
cpumask: uninline highest_possible_processor_id()
EDESC
From: Andrew Morton <akpm@osdl.org>

Shrinks the only caller (net/bridge/netfilter/ebtables.c) by 174 bytes.

Also, optimise highest_possible_processor_id() out of existence on
CONFIG_SMP=n.

Cc: Paul Jackson <pj@sgi.com>
DESC
cpumask: uninline any_online_cpu()
EDESC
From: Andrew Morton <akpm@osdl.org>

           text    data     bss     dec     hex filename
before: 3605597 1363528  363328 5332453  515de5 vmlinux
after:  3605295 1363612  363200 5332107  515c8b vmlinux

218 bytes saved.

Also, optimise any_online_cpu() out of existence on CONFIG_SMP=n.

This function seems inefficient.  Can't we simply AND the two masks, then use
find_first_bit()?
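The AND-then-find suggestion would look roughly like this single-word userspace sketch (find_first_set() is an illustrative stand-in for the kernel's find_first_bit(), not the real API):

```c
#include <assert.h>

#define NR_IDS	32

/* Illustrative stand-in for find_first_bit() on a one-word mask:
 * returns the index of the lowest set bit, or NR_IDS if none. */
static int find_first_set(unsigned int mask)
{
	for (int i = 0; i < NR_IDS; i++)
		if (mask & (1u << i))
			return i;
	return NR_IDS;
}

/* any_online() as suggested: AND the candidate mask with the online
 * map, then take the first set bit of the intersection. */
static int any_online(unsigned int candidates, unsigned int online)
{
	return find_first_set(candidates & online);
}
```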

Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 include/linux/cpumask.h |   46 ++++++++++++++------------------------
 lib/Makefile            |    2 +
 lib/cpumask.c           |   45 +++++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 29 deletions(-)

diff -puN include/linux/cpumask.h~cpumask-uninline-first_cpu include/linux/cpumask.h
--- devel/include/linux/cpumask.h~cpumask-uninline-first_cpu	2006-03-19 14:26:35.000000000 -0800
+++ devel-akpm/include/linux/cpumask.h	2006-03-19 14:28:45.000000000 -0800
@@ -212,17 +212,15 @@ static inline void __cpus_shift_left(cpu
 	bitmap_shift_left(dstp->bits, srcp->bits, n, nbits);
 }
 
-#define first_cpu(src) __first_cpu(&(src), NR_CPUS)
-static inline int __first_cpu(const cpumask_t *srcp, int nbits)
-{
-	return min_t(int, nbits, find_first_bit(srcp->bits, nbits));
-}
-
-#define next_cpu(n, src) __next_cpu((n), &(src), NR_CPUS)
-static inline int __next_cpu(int n, const cpumask_t *srcp, int nbits)
-{
-	return min_t(int, nbits, find_next_bit(srcp->bits, nbits, n+1));
-}
+#ifdef CONFIG_SMP
+int __first_cpu(const cpumask_t *srcp);
+#define first_cpu(src) __first_cpu(&(src))
+int __next_cpu(int n, const cpumask_t *srcp);
+#define next_cpu(n, src) __next_cpu((n), &(src))
+#else
+#define first_cpu(src)		0
+#define next_cpu(n, src)	1
+#endif
 
 #define cpumask_of_cpu(cpu)						\
 ({									\
@@ -398,27 +396,17 @@ extern cpumask_t cpu_present_map;
 #define cpu_present(cpu)	((cpu) == 0)
 #endif
 
-#define any_online_cpu(mask)			\
-({						\
-	int cpu;				\
-	for_each_cpu_mask(cpu, (mask))		\
-		if (cpu_online(cpu))		\
-			break;			\
-	cpu;					\
-})
+#ifdef CONFIG_SMP
+int highest_possible_processor_id(void);
+#define any_online_cpu(mask) __any_online_cpu(&(mask))
+int __any_online_cpu(const cpumask_t *mask);
+#else
+#define highest_possible_processor_id()	0
+#define any_online_cpu(mask)		0
+#endif
 
 #define for_each_cpu(cpu)	  for_each_cpu_mask((cpu), cpu_possible_map)
 #define for_each_online_cpu(cpu)  for_each_cpu_mask((cpu), cpu_online_map)
 #define for_each_present_cpu(cpu) for_each_cpu_mask((cpu), cpu_present_map)
 
-/* Find the highest possible smp_processor_id() */
-#define highest_possible_processor_id() \
-({ \
-	unsigned int cpu, highest = 0; \
-	for_each_cpu_mask(cpu, cpu_possible_map) \
-		highest = cpu; \
-	highest; \
-})
-
-
 #endif /* __LINUX_CPUMASK_H */
diff -puN /dev/null lib/cpumask.c
--- /dev/null	2003-09-15 06:40:47.000000000 -0700
+++ devel-akpm/lib/cpumask.c	2006-03-19 14:28:45.000000000 -0800
@@ -0,0 +1,45 @@
+#include <linux/kernel.h>
+#include <linux/bitops.h>
+#include <linux/cpumask.h>
+#include <linux/module.h>
+
+int __first_cpu(const cpumask_t *srcp)
+{
+	return min_t(int, NR_CPUS, find_first_bit(srcp->bits, NR_CPUS));
+}
+EXPORT_SYMBOL(__first_cpu);
+
+int __next_cpu(int n, const cpumask_t *srcp)
+{
+	return min_t(int, NR_CPUS, find_next_bit(srcp->bits, NR_CPUS, n+1));
+}
+EXPORT_SYMBOL(__next_cpu);
+
+/*
+ * Find the highest possible smp_processor_id()
+ *
+ * Note: if we're prepared to assume that cpu_possible_map never changes
+ * (reasonable) then this function should cache its return value.
+ */
+int highest_possible_processor_id(void)
+{
+	unsigned int cpu;
+	unsigned highest = 0;
+
+	for_each_cpu_mask(cpu, cpu_possible_map)
+		highest = cpu;
+	return highest;
+}
+EXPORT_SYMBOL(highest_possible_processor_id);
+
+int __any_online_cpu(const cpumask_t *mask)
+{
+	int cpu;
+
+	for_each_cpu_mask(cpu, *mask) {
+		if (cpu_online(cpu))
+			break;
+	}
+	return cpu;
+}
+EXPORT_SYMBOL(__any_online_cpu);
diff -puN lib/Makefile~cpumask-uninline-first_cpu lib/Makefile
--- devel/lib/Makefile~cpumask-uninline-first_cpu	2006-03-19 14:26:35.000000000 -0800
+++ devel-akpm/lib/Makefile	2006-03-19 14:26:35.000000000 -0800
@@ -7,6 +7,8 @@ lib-y := errno.o ctype.o string.o vsprin
 	 idr.o div64.o int_sqrt.o bitmap.o extable.o prio_tree.o \
 	 sha1.o
 
+lib-$(CONFIG_SMP) += cpumask.o
+
 lib-y	+= kobject.o kref.o kobject_uevent.o klist.o
 
 obj-y += sort.o parser.o halfmd4.o iomap_copy.o
_


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 22:18                     ` Linus Torvalds
  2006-03-19 22:35                       ` Andrew Morton
@ 2006-03-19 22:46                       ` Linus Torvalds
  2006-03-19 23:04                         ` Andrew Morton
  1 sibling, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2006-03-19 22:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: viro, kernel-stuff, linux-kernel, alex-kernel, jun.nakajima,
	davej



On Sun, 19 Mar 2006, Linus Torvalds wrote:
> 
> How does this patch look? It makes "for_each_cpu()" look the same 
> regardless of whether it is UP/SMP, by simply making first_cpu() and 
> next_cpu() work appropriately. 

It's still about 200 bytes larger than the inlined "smarter code", but 
yeah, the uninlining is certainly the smaller change and doesn't depend 
on the compiler being as smart.

Here's another uninlining patch if you want it.

It doesn't make much of a difference on x86 (since the ffs/fls things are 
inlined to single instructions rather than making use of the generic 
routines, so the fact that we now link against the generic versions 
actually adds code that is never used), but if anybody is collecting 
patches that remove unnecessary inlining, it definitely looks like the 
right thing for v850/sh/sh64/arm etc. that use the generic versions.

		Linus

---
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 208650b..2d3ed35 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -1,74 +1,28 @@
 #ifndef _LINUX_BITOPS_H
 #define _LINUX_BITOPS_H
+
+#include <linux/linkage.h>
 #include <asm/types.h>
 
 /*
  * ffs: find first bit set. This is defined the same way as
  * the libc and compiler builtin ffs routines, therefore
  * differs in spirit from the above ffz (man ffs).
+ *
+ * fls: find last bit set.
  */
 
-static inline int generic_ffs(int x)
-{
-	int r = 1;
-
-	if (!x)
-		return 0;
-	if (!(x & 0xffff)) {
-		x >>= 16;
-		r += 16;
-	}
-	if (!(x & 0xff)) {
-		x >>= 8;
-		r += 8;
-	}
-	if (!(x & 0xf)) {
-		x >>= 4;
-		r += 4;
-	}
-	if (!(x & 3)) {
-		x >>= 2;
-		r += 2;
-	}
-	if (!(x & 1)) {
-		x >>= 1;
-		r += 1;
-	}
-	return r;
-}
+extern int fastcall generic_ffs(int x);
+extern int fastcall generic_fls(int x);
 
 /*
- * fls: find last bit set.
+ * hweightN: returns the hamming weight (i.e. the number
+ * of bits set) of a N-bit word
  */
-
-static __inline__ int generic_fls(int x)
-{
-	int r = 32;
-
-	if (!x)
-		return 0;
-	if (!(x & 0xffff0000u)) {
-		x <<= 16;
-		r -= 16;
-	}
-	if (!(x & 0xff000000u)) {
-		x <<= 8;
-		r -= 8;
-	}
-	if (!(x & 0xf0000000u)) {
-		x <<= 4;
-		r -= 4;
-	}
-	if (!(x & 0xc0000000u)) {
-		x <<= 2;
-		r -= 2;
-	}
-	if (!(x & 0x80000000u)) {
-		x <<= 1;
-		r -= 1;
-	}
-	return r;
-}
+unsigned int fastcall generic_hweight64(__u64 w);
+unsigned int fastcall generic_hweight32(unsigned int w);
+unsigned int fastcall generic_hweight16(unsigned int w);
+unsigned int fastcall generic_hweight8(unsigned int w);
 
 /*
  * Include this here because some architectures need generic_ffs/fls in
@@ -76,7 +30,6 @@ static __inline__ int generic_fls(int x)
  */
 #include <asm/bitops.h>
 
-
 static inline int generic_fls64(__u64 x)
 {
 	__u32 h = x >> 32;
@@ -103,51 +56,6 @@ static __inline__ int get_count_order(un
 	return order;
 }
 
-/*
- * hweightN: returns the hamming weight (i.e. the number
- * of bits set) of a N-bit word
- */
-
-static inline unsigned int generic_hweight32(unsigned int w)
-{
-        unsigned int res = (w & 0x55555555) + ((w >> 1) & 0x55555555);
-        res = (res & 0x33333333) + ((res >> 2) & 0x33333333);
-        res = (res & 0x0F0F0F0F) + ((res >> 4) & 0x0F0F0F0F);
-        res = (res & 0x00FF00FF) + ((res >> 8) & 0x00FF00FF);
-        return (res & 0x0000FFFF) + ((res >> 16) & 0x0000FFFF);
-}
-
-static inline unsigned int generic_hweight16(unsigned int w)
-{
-        unsigned int res = (w & 0x5555) + ((w >> 1) & 0x5555);
-        res = (res & 0x3333) + ((res >> 2) & 0x3333);
-        res = (res & 0x0F0F) + ((res >> 4) & 0x0F0F);
-        return (res & 0x00FF) + ((res >> 8) & 0x00FF);
-}
-
-static inline unsigned int generic_hweight8(unsigned int w)
-{
-        unsigned int res = (w & 0x55) + ((w >> 1) & 0x55);
-        res = (res & 0x33) + ((res >> 2) & 0x33);
-        return (res & 0x0F) + ((res >> 4) & 0x0F);
-}
-
-static inline unsigned long generic_hweight64(__u64 w)
-{
-#if BITS_PER_LONG < 64
-	return generic_hweight32((unsigned int)(w >> 32)) +
-				generic_hweight32((unsigned int)w);
-#else
-	u64 res;
-	res = (w & 0x5555555555555555ul) + ((w >> 1) & 0x5555555555555555ul);
-	res = (res & 0x3333333333333333ul) + ((res >> 2) & 0x3333333333333333ul);
-	res = (res & 0x0F0F0F0F0F0F0F0Ful) + ((res >> 4) & 0x0F0F0F0F0F0F0F0Ful);
-	res = (res & 0x00FF00FF00FF00FFul) + ((res >> 8) & 0x00FF00FF00FF00FFul);
-	res = (res & 0x0000FFFF0000FFFFul) + ((res >> 16) & 0x0000FFFF0000FFFFul);
-	return (res & 0x00000000FFFFFFFFul) + ((res >> 32) & 0x00000000FFFFFFFFul);
-#endif
-}
-
 static inline unsigned long hweight_long(unsigned long w)
 {
 	return sizeof(w) == 4 ? generic_hweight32(w) : generic_hweight64(w);
diff --git a/lib/Makefile b/lib/Makefile
index 648b2c1..bba36f1 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -5,7 +5,7 @@
 lib-y := errno.o ctype.o string.o vsprintf.o cmdline.o \
 	 bust_spinlocks.o rbtree.o radix-tree.o dump_stack.o \
 	 idr.o div64.o int_sqrt.o bitmap.o extable.o prio_tree.o \
-	 sha1.o
+	 sha1.o bitops.o
 
 lib-y	+= kobject.o kref.o kobject_uevent.o klist.o
 
diff --git a/lib/bitops.c b/lib/bitops.c
new file mode 100644
index 0000000..e5e4cab
--- /dev/null
+++ b/lib/bitops.c
@@ -0,0 +1,106 @@
+#include <linux/bitops.h>
+#include <linux/module.h>
+
+int generic_ffs(int x)
+{
+	int r = 1;
+
+	if (!x)
+		return 0;
+	if (!(x & 0xffff)) {
+		x >>= 16;
+		r += 16;
+	}
+	if (!(x & 0xff)) {
+		x >>= 8;
+		r += 8;
+	}
+	if (!(x & 0xf)) {
+		x >>= 4;
+		r += 4;
+	}
+	if (!(x & 3)) {
+		x >>= 2;
+		r += 2;
+	}
+	if (!(x & 1)) {
+		x >>= 1;
+		r += 1;
+	}
+	return r;
+}
+EXPORT_SYMBOL(generic_ffs);
+
+int generic_fls(int x)
+{
+	int r = 32;
+
+	if (!x)
+		return 0;
+	if (!(x & 0xffff0000u)) {
+		x <<= 16;
+		r -= 16;
+	}
+	if (!(x & 0xff000000u)) {
+		x <<= 8;
+		r -= 8;
+	}
+	if (!(x & 0xf0000000u)) {
+		x <<= 4;
+		r -= 4;
+	}
+	if (!(x & 0xc0000000u)) {
+		x <<= 2;
+		r -= 2;
+	}
+	if (!(x & 0x80000000u)) {
+		x <<= 1;
+		r -= 1;
+	}
+	return r;
+}
+EXPORT_SYMBOL(generic_fls);
+
+unsigned int generic_hweight32(unsigned int w)
+{
+        unsigned int res = (w & 0x55555555) + ((w >> 1) & 0x55555555);
+        res = (res & 0x33333333) + ((res >> 2) & 0x33333333);
+        res = (res & 0x0F0F0F0F) + ((res >> 4) & 0x0F0F0F0F);
+        res = (res & 0x00FF00FF) + ((res >> 8) & 0x00FF00FF);
+        return (res & 0x0000FFFF) + ((res >> 16) & 0x0000FFFF);
+}
+EXPORT_SYMBOL(generic_hweight32);
+
+unsigned int generic_hweight16(unsigned int w)
+{
+        unsigned int res = (w & 0x5555) + ((w >> 1) & 0x5555);
+        res = (res & 0x3333) + ((res >> 2) & 0x3333);
+        res = (res & 0x0F0F) + ((res >> 4) & 0x0F0F);
+        return (res & 0x00FF) + ((res >> 8) & 0x00FF);
+}
+EXPORT_SYMBOL(generic_hweight16);
+
+unsigned int generic_hweight8(unsigned int w)
+{
+        unsigned int res = (w & 0x55) + ((w >> 1) & 0x55);
+        res = (res & 0x33) + ((res >> 2) & 0x33);
+        return (res & 0x0F) + ((res >> 4) & 0x0F);
+}
+EXPORT_SYMBOL(generic_hweight8);
+
+unsigned int generic_hweight64(__u64 w)
+{
+#if BITS_PER_LONG < 64
+	return generic_hweight32((unsigned int)(w >> 32)) +
+				generic_hweight32((unsigned int)w);
+#else
+	u64 res;
+	res = (w & 0x5555555555555555ul) + ((w >> 1) & 0x5555555555555555ul);
+	res = (res & 0x3333333333333333ul) + ((res >> 2) & 0x3333333333333333ul);
+	res = (res & 0x0F0F0F0F0F0F0F0Ful) + ((res >> 4) & 0x0F0F0F0F0F0F0F0Ful);
+	res = (res & 0x00FF00FF00FF00FFul) + ((res >> 8) & 0x00FF00FF00FF00FFul);
+	res = (res & 0x0000FFFF0000FFFFul) + ((res >> 16) & 0x0000FFFF0000FFFFul);
+	return (res & 0x00000000FFFFFFFFul) + ((res >> 32) & 0x00000000FFFFFFFFul);
+#endif
+}
+EXPORT_SYMBOL(generic_hweight64);


* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 22:35                       ` Andrew Morton
@ 2006-03-19 22:55                         ` Linus Torvalds
  0 siblings, 0 replies; 31+ messages in thread
From: Linus Torvalds @ 2006-03-19 22:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: viro, kernel-stuff, linux-kernel, alex-kernel, jun.nakajima,
	davej



On Sun, 19 Mar 2006, Andrew Morton wrote:
> 
> Also, optimise any_online_cpu() out of existence on CONFIG_SMP=n.
> 
> This function seems inefficient.  Can't we simply AND the two masks, then use
> find_first_bit()?

Then you'd need to generate a temporary cpumask thing. Not a big deal as 
long as it fits in an "unsigned long", but since the online-cpu thing is 
likely dense in any relevant cpu-mask, I actually think "any_online_cpu()" 
as it stands now is likely to be simpler/more efficient than generating a 
temporary mask.

		Linus


* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 22:46                       ` Linus Torvalds
@ 2006-03-19 23:04                         ` Andrew Morton
  0 siblings, 0 replies; 31+ messages in thread
From: Andrew Morton @ 2006-03-19 23:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: viro, kernel-stuff, linux-kernel, alex-kernel, jun.nakajima,
	davej

Linus Torvalds <torvalds@osdl.org> wrote:
>
> Here's another uninlining patch if you want it.
>

Yes, the bitops need that.

We have a 50-patch series against the bitops code queued up.  It's from
Akinobu Mita.  It consolidates all the C-coded operations which
architectures are presently duplicating.  In toto:

 80 files changed, 1271 insertions(+), 4999 deletions(-)

It does include uninlining of the hweight functions, although I note that
include/asm-generic/bitops/__ffs.h remains inlined.  I don't know how many
architectures are using the generic code though.



* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-19 19:33           ` Linus Torvalds
  2006-03-19 19:40             ` Al Viro
@ 2006-03-20  6:12             ` Willy Tarreau
  2006-03-20  6:26               ` Linus Torvalds
  2006-03-20  8:22               ` Peter T. Breuer
  1 sibling, 2 replies; 31+ messages in thread
From: Willy Tarreau @ 2006-03-20  6:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, kernel-stuff, linux-kernel, alex-kernel,
	jun.nakajima, davej, viro

On Sun, Mar 19, 2006 at 11:33:17AM -0800, Linus Torvalds wrote:
 
> Of course, I shouldn't say "works", since it is still totally untested. It 
> _looks_ good, and that statement expression usage is just _so_ ugly it's 
> cute.
>
> 		Linus

At least, you could have moved the macro closer to where it's used.
It's very uncommon to break out of a loop from within an if condition,
and it's not expected that the macro you're calling does a break out
from under you. It took me several minutes to understand precisely how
this works. Now it seems trivial, but I guess that at 3am I would have
gone to bed instead.

A first enhancement would be to make it easier to understand
by putting it closer to its user :

#elif NR_CPUS > 1
#define check_for_each_cpu(cpu, mask) \
	({ unsigned long __bits = (mask).bits[0] >> (cpu); if (!__bits) break; __bits & 1; })

#define for_each_cpu_mask(cpu, mask)		\
	for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) \
		if (!check_for_each_cpu(cpu, mask))	\
			continue;		\
		else

Now, does removing the macro completely change the output code ?
I think that if something written like this produces the same
code, it would be easier to read :

#define for_each_cpu_mask(cpu, mask)			\
	for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) {	\
		unsigned long __bits = (mask).bits[0] >> (cpu); \
		if (!__bits)				\
			break;				\
		if (!__bits & 1)			\
			continue;			\
		else

Regards,
Willy



* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-20  6:12             ` Willy Tarreau
@ 2006-03-20  6:26               ` Linus Torvalds
  2006-03-20  7:18                 ` Willy Tarreau
  2006-03-20  8:22               ` Peter T. Breuer
  1 sibling, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2006-03-20  6:26 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Andrew Morton, kernel-stuff, linux-kernel, alex-kernel,
	jun.nakajima, davej, viro



On Mon, 20 Mar 2006, Willy Tarreau wrote:
> 
> Now, does removing the macro completely change the output code ?
> I think that if something written like this produces the same
> code, it would be easier to read :
> 
> #define for_each_cpu_mask(cpu, mask)			\
> 	for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) {	\
> 		unsigned long __bits = (mask).bits[0] >> (cpu); \
> 		if (!__bits)				\
> 			break;				\
> 		if (!__bits & 1)			\
> 			continue;			\
> 		else

Absolutely, but now it has a dangling "{" that didn't get closed. So the 
above would definitely be more readable, it just doesn't actually work.

Unless you'd do the "end_for_each_cpu" define (to close the statement), 
and update the 300+ places that use this. Which might well be worth it.

So the subtle "break from the middle of a statement expression" was just a 
rather hacky way to avoid having to change all the users of this macro.

			Linus


* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-20  6:26               ` Linus Torvalds
@ 2006-03-20  7:18                 ` Willy Tarreau
  2006-03-21  6:32                   ` Willy Tarreau
  0 siblings, 1 reply; 31+ messages in thread
From: Willy Tarreau @ 2006-03-20  7:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, kernel-stuff, linux-kernel, alex-kernel,
	jun.nakajima, davej, viro

On Sun, Mar 19, 2006 at 10:26:30PM -0800, Linus Torvalds wrote:
> 
> 
> On Mon, 20 Mar 2006, Willy Tarreau wrote:
> > 
> > Now, does removing the macro completely change the output code ?
> > I think that if something written like this produces the same
> > code, it would be easier to read :
> > 
> > #define for_each_cpu_mask(cpu, mask)			\
> > 	for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) {	\
> > 		unsigned long __bits = (mask).bits[0] >> (cpu); \
> > 		if (!__bits)				\
> > 			break;				\
> > 		if (!__bits & 1)			\
> > 			continue;			\
> > 		else
> 
> Absolutely, but now it has a dangling "{" that didn't get closed. So the 
> above would definitely be more readable, it just doesn't actually work.
> 
> Unless you'd do the "end_for_each_cpu" define (to close the statement), 
> and update the 300+ places that use this. Which might well be worth it.
> 
> So the subtle "break from the middle of a statement expression" was just a 
> rather hacky way to avoid having to change all the users of this macro.
>
> 			Linus

Oh, you're right, now I understand your motivation in doing this.
Then perhaps using your trick but applying it to the whole for loop
would make it easier to read ?

#define for_each_cpu_mask(cpu, mask)			\
	for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) 	\
	    ({	unsigned long __bits = (mask).bits[0] >> (cpu); \
		if (!__bits)				\
			break;				\
		if (!__bits & 1)			\
			continue;			\
		else					\
                ...					\
            })

Please note that I've not read the rest of the code, so there
may be some problems left. However, if the above works, I find
it easier to read. And in this case, yes, it's interesting to
be able to break from within an expression.

Cheers,
Willy



* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-20  6:12             ` Willy Tarreau
  2006-03-20  6:26               ` Linus Torvalds
@ 2006-03-20  8:22               ` Peter T. Breuer
  1 sibling, 0 replies; 31+ messages in thread
From: Peter T. Breuer @ 2006-03-20  8:22 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: linux-kernel

In article <20060320061212.GG21493@w.ods.org> you wrote:
> On Sun, Mar 19, 2006 at 11:33:17AM -0800, Linus Torvalds wrote:
>> Of course, I shouldn't say "works", since it is still totally untested. It 
>> _looks_ good, and that statement expression usage is just _so_ ugly it's 
>> cute.

> #define for_each_cpu_mask(cpu, mask)			\
> 	for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) {	\
> 		unsigned long __bits = (mask).bits[0] >> (cpu); \
> 		if (!__bits)				\
> 			break;				\
> 		if (!__bits & 1)			\
> 			continue;			\
> 		else

While that's slightly wrong (it needs a closing } supplied by the user),
it does inspire me to point out that one can put the skips in the
ordinary statement part of the for, and use the if/else idea to make
sure that the for needs just one statement following (i.e. no dangling
} supplied by the user):

#define for_each_cpu_mask(cpu, mask)                       \
        for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++)          \
            if (!((mask).bits[0] >> (cpu))) {              \
                break;                                     \
            } else if (!(((mask).bits[0] >> (cpu)) & 1)) { \
                continue;                                  \
            } else

I expect common subexpression optimization in the compiler to remove the
repetition here.  If not, somebody else can think about it!




Peter


* Re: OOPS: 2.6.16-rc6 cpufreq_conservative
  2006-03-20  7:18                 ` Willy Tarreau
@ 2006-03-21  6:32                   ` Willy Tarreau
  0 siblings, 0 replies; 31+ messages in thread
From: Willy Tarreau @ 2006-03-21  6:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, kernel-stuff, linux-kernel, alex-kernel,
	jun.nakajima, davej, viro

On Mon, Mar 20, 2006 at 08:18:46AM +0100, Willy Tarreau wrote:
> On Sun, Mar 19, 2006 at 10:26:30PM -0800, Linus Torvalds wrote:
> > 
> > 
> > On Mon, 20 Mar 2006, Willy Tarreau wrote:
> > > 
> > > Now, does removing the macro completely change the output code ?
> > > I think that if something written like this produces the same
> > > code, it would be easier to read :
> > > 
> > > #define for_each_cpu_mask(cpu, mask)			\
> > > 	for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) {	\
> > > 		unsigned long __bits = (mask).bits[0] >> (cpu); \
> > > 		if (!__bits)				\
> > > 			break;				\
> > > 		if (!__bits & 1)			\
> > > 			continue;			\
> > > 		else
> > 
> > Absolutely, but now it has a dangling "{" that didn't get closed. So the 
> > above would definitely be more readable, it just doesn't actually work.
> > 
> > Unless you'd do the "end_for_each_cpu" define (to close the statement), 
> > and update the 300+ places that use this. Which might well be worth it.
> > 
> > So the subtle "break from the middle of a statement expression" was just a 
> > rather hacky way to avoid having to change all the users of this macro.
> >
> > 			Linus
> 
> Oh, you're right, now I understand your motivation in doing this.
> Then perhaps using your trick but applying it to the whole for loop
> would make it easier to read ?
> 
> #define for_each_cpu_mask(cpu, mask)			\
> 	for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) 	\
> 	    ({	unsigned long __bits = (mask).bits[0] >> (cpu); \
> 		if (!__bits)				\
> 			break;				\
> 		if (!__bits & 1)			\
> 			continue;			\
> 		else					\
>                 ...					\
>             })

Well, forget it !
I did not realize that the 'else' was what called the inner loop.
So this construct is not possible either, for the same reason.

Sorry for not having read the code before posting.

Cheers,
Willy



end of thread, other threads:[~2006-03-21  6:36 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-03-18 20:25 OOPS: 2.6.16-rc6 cpufreq_conservative Parag Warudkar
2006-03-18 21:40 ` Linus Torvalds
2006-03-18 22:09   ` Parag Warudkar
2006-03-18 23:12     ` Linus Torvalds
2006-03-18 22:26   ` Parag Warudkar
2006-03-19  0:53   ` Andrew Morton
2006-03-19  2:38     ` Linus Torvalds
2006-03-19  5:08       ` Paul Jackson
2006-03-19 17:43         ` Linus Torvalds
2006-03-19 18:46       ` Linus Torvalds
2006-03-19 19:02         ` Linus Torvalds
2006-03-19 19:33           ` Linus Torvalds
2006-03-19 19:40             ` Al Viro
2006-03-19 20:01               ` Linus Torvalds
2006-03-19 20:31                 ` Linus Torvalds
2006-03-19 20:47                   ` Andrew Morton
2006-03-19 22:18                     ` Linus Torvalds
2006-03-19 22:35                       ` Andrew Morton
2006-03-19 22:55                         ` Linus Torvalds
2006-03-19 22:46                       ` Linus Torvalds
2006-03-19 23:04                         ` Andrew Morton
2006-03-19 20:57                   ` Parag Warudkar
2006-03-20  6:12             ` Willy Tarreau
2006-03-20  6:26               ` Linus Torvalds
2006-03-20  7:18                 ` Willy Tarreau
2006-03-21  6:32                   ` Willy Tarreau
2006-03-20  8:22               ` Peter T. Breuer
2006-03-19  6:34     ` Parag Warudkar
2006-03-19 12:00       ` Alexander Clouter
2006-03-19 14:06         ` Parag Warudkar
2006-03-19 17:34           ` Alexander Clouter
