* Re: [PATCH] sched/topology: Use Identity node only if required [not found] ` <20180808075840.GO2494@hirez.programming.kicks-ass.net> @ 2018-08-10 16:45 ` Srikar Dronamraju 2018-08-29 8:43 ` Peter Zijlstra 0 siblings, 1 reply; 7+ messages in thread From: Srikar Dronamraju @ 2018-08-10 16:45 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner, Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit, linuxppc-dev, Andre Wild * Peter Zijlstra <peterz@infradead.org> [2018-08-08 09:58:41]: > On Wed, Aug 08, 2018 at 12:39:31PM +0530, Srikar Dronamraju wrote: > > With Commit 051f3ca02e46 ("sched/topology: Introduce NUMA identity node > > sched domain") scheduler introduces an extra numa level. However that > > leads to > > > > - numa topology on 2 node systems no more marked as NUMA_DIRECT. After > > this commit, it gets reported as NUMA_BACKPLANE. This is because > > sched_domains_numa_level now equals 2 on 2 node systems. > > > > - Extra numa sched domain that gets added and degenerated on most > > machines. The Identity node is only needed on very few systems. > > Also all non-numa systems will end up populating > > sched_domains_numa_distance and sched_domains_numa_masks tables. > > > > - On shared lpars like powerpc, this extra sched domain creation can > > lead to repeated rcu stalls, sometimes even causing unresponsive > > systems on boot. On such stalls, it was noticed that > > init_sched_groups_capacity() (sg != sd->groups is always true). > > The idea was that if the topology level is redundant (as it often is); > then the degenerate code would take it out. > > Why is that not working (right) and can we fix that instead? > Here is my analysis on another box showing same issue. 
numactl o/p available: 4 nodes (0-3) node 0 cpus: 0 1 2 3 4 5 6 7 32 33 34 35 36 37 38 39 64 65 66 67 68 69 70 71 96 97 98 99 100 101 102 103 128 129 130 131 132 133 134 135 160 161 162 163 164 165 166 167 192 193 194 195 196 197 198 199 224 225 226 227 228 229 230 231 256 257 258 259 260 261 262 263 288 289 290 291 292 293 294 295 node 0 size: 536888 MB node 0 free: 533582 MB node 1 cpus: 24 25 26 27 28 29 30 31 56 57 58 59 60 61 62 63 88 89 90 91 92 93 94 95 120 121 122 123 124 125 126 127 152 153 154 155 156 157 158 159 184 185 186 187 188 189 190 191 216 217 218 219 220 221 222 223 248 249 250 251 252 253 254 255 280 281 282 283 284 285 286 287 node 1 size: 502286 MB node 1 free: 501283 MB node 2 cpus: 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55 80 81 82 83 84 85 86 87 112 113 114 115 116 117 118 119 144 145 146 147 148 149 150 151 176 177 178 179 180 181 182 183 208 209 210 211 212 213 214 215 240 241 242 243 244 245 246 247 272 273 274 275 276 277 278 279 node 2 size: 503054 MB node 2 free: 502854 MB node 3 cpus: 8 9 10 11 12 13 14 15 40 41 42 43 44 45 46 47 72 73 74 75 76 77 78 79 104 105 106 107 108 109 110 111 136 137 138 139 140 141 142 143 168 169 170 171 172 173 174 175 200 201 202 203 204 205 206 207 232 233 234 235 236 237 238 239 264 265 266 267 268 269 270 271 296 297 298 299 300 301 302 303 node 3 size: 503310 MB node 3 free: 498465 MB node distances: node 0 1 2 3 0: 10 40 40 40 1: 40 10 40 40 2: 40 40 10 40 3: 40 40 40 10 Extracting the contents of dmesg using sched_debug kernel parameter CPU0 attaching NULL sched-domain. CPU1 attaching NULL sched-domain. .... .... CPU302 attaching NULL sched-domain. CPU303 attaching NULL sched-domain. BUG: arch topology borken the DIE domain not a subset of the NODE domain BUG: arch topology borken the DIE domain not a subset of the NODE domain ..... ..... 
BUG: arch topology borken
the DIE domain not a subset of the NODE domain
BUG: arch topology borken
the DIE domain not a subset of the NODE domain
BUG: arch topology borken
the DIE domain not a subset of the NODE domain

CPU0 attaching sched-domain(s):
domain-2: sdA, span=0-303 level=NODE
groups: sg=sgL 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sgN 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgO 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
CPU1 attaching sched-domain(s):
domain-2: sdB, span=0-303 level=NODE
[ 367.739387] groups: sg=sgL 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sgN 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgO 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }

CPU8 attaching sched-domain(s):
domain-2: sdC, span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE
groups: sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }
domain-3: sdD, span=0-303 level=NUMA
groups: sgX 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sgY 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgZ 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
ERROR: groups don't span domain->span

CPU9 attaching sched-domain(s):
domain-2: sdE, span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE
groups: sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }
domain-3: sdF, span=0-303 level=NUMA
groups: sgP 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sgQ 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgR 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
ERROR: groups don't span domain->span

To summarize further:

+ NODE sched domain groups are initialised with build_sched_groups() (which tries to share groups).
+ NUMA sched domain groups are initialised with build_overlap_sched_groups().

Cpu 0: sdA->groups sgL ->next= sgM ->next= sgN ->next= sgO
Cpu 1: sdB->groups sgL ->next= sgM ->next= sgN ->next= sgO

However
Cpu 8: sdC->groups -> sgM ->next= sgM (NODE)
Cpu 8: sdD->groups sgX ->next= sgY ->next= sgZ (NUMA)
Cpu 9: sdE->groups -> sgM ->next= sgM (NODE)
Cpu 9: sdF->groups sgP ->next= sgQ ->next= sgR (NUMA)

In init_sched_groups_capacity(), we start with sdB->groups (sgL) and reach sgM, but sgM->next happens to be sgM itself. Since sdB->groups != sgM, the walk never gets back to the group it started from, so (sg != sd->groups) stays true and the loop never ends.

With non-identity NUMA sched domains, build_overlap_sched_groups() creates new groups per sched domain, so the problem is masked.

i.e. on a topology update, sched_domains_numa_masks aren't getting updated, causing very weird sched domains. The Identity node sched domain further complicates the problem.

One solution would be to expose sched_domains_numa_masks_set/clear so that the archs can help build correct/proper sched_domains.

^ permalink raw reply [flat|nested] 7+ messages in thread
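For readers following along, here is a minimal userspace sketch of the group walk described in the summary above. It is not kernel code: `struct group`, `ring_returns_to_start()` and the two scenario helpers are hypothetical names made up for illustration, and the step limit exists only so the demo terminates. The assumption is that the capacity init walks `sd->groups` with a `do { } while (sg != sd->groups)` style loop, as the stall report suggests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sched group: only the ring pointer matters here. */
struct group {
	const char *name;
	struct group *next;
};

/*
 * Walk the ring the way a do { } while (sg != first) loop would, but give
 * up after 'limit' steps. Returns true if the walk gets back to 'first',
 * false if it is stuck somewhere else in the ring.
 */
bool ring_returns_to_start(const struct group *first, size_t limit)
{
	const struct group *sg = first;
	size_t steps = 0;

	do {
		sg = sg->next;
		if (++steps > limit)
			return false;
	} while (sg != first);

	return true;
}

/* Cpu 0 / Cpu 1 view as intended: sgL -> sgM -> sgN -> sgO -> sgL. */
bool healthy_ring_terminates(void)
{
	struct group sgL = { "sgL", NULL }, sgM = { "sgM", NULL };
	struct group sgN = { "sgN", NULL }, sgO = { "sgO", NULL };

	sgL.next = &sgM;
	sgM.next = &sgN;
	sgN.next = &sgO;
	sgO.next = &sgL;

	return ring_returns_to_start(&sgL, 16);
}

/*
 * Shared sgM after Cpu 8's NODE domain was built with a single group:
 * sgM->next now points at sgM, so a walk that starts at sgL never gets back.
 */
bool stale_ring_terminates(void)
{
	struct group sgL = { "sgL", NULL }, sgM = { "sgM", NULL };

	sgL.next = &sgM;
	sgM.next = &sgM;

	return ring_returns_to_start(&sgL, 16);
}
```

In this model `healthy_ring_terminates()` comes back to its starting group after four steps, while `stale_ring_terminates()` hits the step limit; without such a limit the second walk would spin forever, which matches the stalls reported on boot.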
* Re: [PATCH] sched/topology: Use Identity node only if required 2018-08-10 16:45 ` [PATCH] sched/topology: Use Identity node only if required Srikar Dronamraju @ 2018-08-29 8:43 ` Peter Zijlstra 2018-08-29 8:57 ` Peter Zijlstra 2018-08-31 10:22 ` Srikar Dronamraju 0 siblings, 2 replies; 7+ messages in thread From: Peter Zijlstra @ 2018-08-29 8:43 UTC (permalink / raw) To: Srikar Dronamraju Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner, Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit, linuxppc-dev, Andre Wild On Fri, Aug 10, 2018 at 09:45:33AM -0700, Srikar Dronamraju wrote: > available: 4 nodes (0-3) > node 0 cpus: 0 1 2 3 4 5 6 7 32 33 34 35 36 37 38 39 64 65 66 67 68 69 70 71 96 97 98 99 100 101 102 103 128 129 130 131 132 133 134 135 160 161 162 163 164 165 166 167 192 193 194 195 196 197 198 199 224 225 226 227 228 229 230 231 256 257 258 259 260 261 262 263 288 289 290 291 292 293 294 295 > node 0 size: 536888 MB > node 0 free: 533582 MB > node 1 cpus: 24 25 26 27 28 29 30 31 56 57 58 59 60 61 62 63 88 89 90 91 92 93 94 95 120 121 122 123 124 125 126 127 152 153 154 155 156 157 158 159 184 185 186 187 188 189 190 191 216 217 218 219 220 221 222 223 248 249 250 251 252 253 254 255 280 281 282 283 284 285 286 287 > node 1 size: 502286 MB > node 1 free: 501283 MB > node 2 cpus: 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55 80 81 82 83 84 85 86 87 112 113 114 115 116 117 118 119 144 145 146 147 148 149 150 151 176 177 178 179 180 181 182 183 208 209 210 211 212 213 214 215 240 241 242 243 244 245 246 247 272 273 274 275 276 277 278 279 > node 2 size: 503054 MB > node 2 free: 502854 MB > node 3 cpus: 8 9 10 11 12 13 14 15 40 41 42 43 44 45 46 47 72 73 74 75 76 77 78 79 104 105 106 107 108 109 110 111 136 137 138 139 140 141 142 143 168 169 170 171 172 173 174 175 200 201 202 203 204 205 206 207 232 233 234 235 236 237 238 239 264 265 266 267 268 269 270 271 296 297 298 299 300 301 302 303 > node 3 size: 503310 MB > node 3 free: 498465 
MB > node distances: > node 0 1 2 3 > 0: 10 40 40 40 > 1: 40 10 40 40 > 2: 40 40 10 40 > 3: 40 40 40 10 > > Extracting the contents of dmesg using sched_debug kernel parameter > > CPU0 attaching NULL sched-domain. > CPU1 attaching NULL sched-domain. > .... > .... > CPU302 attaching NULL sched-domain. > CPU303 attaching NULL sched-domain. > BUG: arch topology borken > the DIE domain not a subset of the NODE domain ^^^^^ CLUE!! but nowhere did you show what it thinks the DIE mask is. > CPU0 attaching sched-domain(s): > domain-2: sdA, span=0-303 level=NODE > groups: sg=sgL 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sdN 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgO 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 } > CPU1 attaching sched-domain(s): > domain-2: sdB, span=0-303 level=NODE > [ 367.739387] groups: sg=sgL 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sdN 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgO 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 } You forgot to provide the rest of it... what's domain-[01] look like? DIE(j) should be: cpu_cpu_mask(j) := cpumask_of_node(cpu_to_node(j)) and NODE(j) should be: \Union_k cpumask_of_node(k) ; where node_distance(j,k) <= node_distance(0,0) which, _should_ reduce to: cpumask_of_node(j) and thus DIE and NODE _should_ be the same here. So what's going sideways? ^ permalink raw reply [flat|nested] 7+ messages in thread
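As a rough illustration of the NODE(j) definition in the message above, the following userspace sketch evaluates the union over the distance table posted earlier in the thread (10 local, 40 everywhere else). It works on a bitmask of nodes rather than on CPU masks, and `node_distance_tbl`, `nodes_within()` and `identity_node_mask()` are hypothetical names for this example, not kernel interfaces.

```c
#include <stdint.h>

#define NR_NODES 4

/* Distance table as reported by numactl earlier in the thread. */
static const int node_distance_tbl[NR_NODES][NR_NODES] = {
	{ 10, 40, 40, 40 },
	{ 40, 10, 40, 40 },
	{ 40, 40, 10, 40 },
	{ 40, 40, 40, 10 },
};

/* Bitmask of all nodes k with node_distance(j, k) <= max_dist. */
uint32_t nodes_within(int j, int max_dist)
{
	uint32_t mask = 0;

	for (int k = 0; k < NR_NODES; k++) {
		if (node_distance_tbl[j][k] <= max_dist)
			mask |= UINT32_C(1) << k;
	}
	return mask;
}

/* NODE(j): every node no further away than the local distance (0,0). */
uint32_t identity_node_mask(int j)
{
	return nodes_within(j, node_distance_tbl[0][0]);
}
```

With this table `identity_node_mask(j)` contains only node j for every j, which is the "should reduce to cpumask_of_node(j)" step; only at distance 40 does the mask grow to all four nodes. So on paper the NODE level should be identical to DIE on this box, and the spans in the dump must come from somewhere other than the distance table itself.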
* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-29 8:43 ` Peter Zijlstra
@ 2018-08-29 8:57 ` Peter Zijlstra
  2018-08-31 10:22 ` Srikar Dronamraju
  1 sibling, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2018-08-29 8:57 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner, Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit, linuxppc-dev, Andre Wild

On Wed, Aug 29, 2018 at 10:43:48AM +0200, Peter Zijlstra wrote:
> DIE(j) should be:
>
>   cpu_cpu_mask(j) := cpumask_of_node(cpu_to_node(j))

FWIW, I was expecting that to be topology_core_cpumask(), so I'm a
little confused myself just now.

> and NODE(j) should be:
>
>   \Union_k cpumask_of_node(k) ; where node_distance(j,k) <= node_distance(0,0)
>
> which, _should_ reduce to:
>
>   cpumask_of_node(j)
>
> and thus DIE and NODE _should_ be the same here.
>
> So what's going sideways?

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-29 8:43 ` Peter Zijlstra
  2018-08-29 8:57 ` Peter Zijlstra
@ 2018-08-31 10:22 ` Srikar Dronamraju
  2018-08-31 10:41 ` Peter Zijlstra
  1 sibling, 1 reply; 7+ messages in thread
From: Srikar Dronamraju @ 2018-08-31 10:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner, Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit, linuxppc-dev, Andre Wild, Benjamin Herrenschmidt

* Peter Zijlstra <peterz@infradead.org> [2018-08-29 10:43:48]:

> On Fri, Aug 10, 2018 at 09:45:33AM -0700, Srikar Dronamraju wrote:
>
> > ....
> > CPU302 attaching NULL sched-domain.
> > CPU303 attaching NULL sched-domain.
> > BUG: arch topology borken
> > the DIE domain not a subset of the NODE domain
>
> ^^^^^ CLUE!!
>
> but nowhere did you show what it thinks the DIE mask is.
>
> > CPU0 attaching sched-domain(s):
> > domain-2: sdA, span=0-303 level=NODE
> > groups: sg=sgL 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sdN 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgO 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
> > CPU1 attaching sched-domain(s):
> > domain-2: sdB, span=0-303 level=NODE
> > [ 367.739387] groups: sg=sgL 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sdN 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgO 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
>
> You forgot to provide the rest of it... what's domain-[01] look like?

At boot: Before topology update.

For CPU 0
domain-0: span=0-7 level=SMT
groups: 0:{ span=0 }, 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }
domain-1: span=0-303 level=DIE
groups: 0:{ span=0-7 cap=8192 }, 8:{ span=8-15 cap=8192 }, 16:{ span=16-23 cap=8192 }, 24:{ span=24-31 cap=8192 }, 32:{ span=32-39 cap=8192 }, 40:{ span=40-47 cap=8192 }, 48:{ span=48-55 cap=8192 }, 56:{ span=56-63 cap=8192 }, 64:{ span=64-71 cap=8192 }, 72:{ span=72-79 cap=8192 }, 80:{ span=80-87 cap=8192 }, 88:{ span=88-95 cap=8192 }, 96:{ span=96-103 cap=8192 }, 104:{ span=104-111 cap=8192 }, 112:{ span=112-119 cap=8192 }, 120:{ span=120-127 cap=8192 }, 128:{ span=128-135 cap=8192 }, 136:{ span=136-143 cap=8192 }, 144:{ span=144-151 cap=8192 }, 152:{ span=152-159 cap=8192 }, 160:{ span=160-167 cap=8192 }, 168:{ span=168-175 cap=8192 }, 176:{ span=176-183 cap=8192 }, 184:{ span=184-191 cap=8192 }, 192:{ span=192-199 cap=8192 }, 200:{ span=200-207 cap=8192 }, 208:{ span=208-215 cap=8192 }, 216:{ span=216-223 cap=8192 }, 224:{ span=224-231 cap=8192 }, 232:{ span=232-239 cap=8192 }, 240:{ span=240-247 cap=8192 }, 248:{ span=248-255 cap=8192 }, 256:{ span=256-263 cap=8192 }, 264:{ span=264-271 cap=8192 }, 272:{ span=272-279 cap=8192 }, 280:{ span=280-287 cap=8192 }, 288:{ span=288-295 cap=8192 }, 296:{ span=296-303 cap=8192 }

For CPU 1
domain-0: span=0-7 level=SMT
groups: 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }, 0:{ span=0 }
domain-1: span=0-303 level=DIE
groups: 0:{ span=0-7 cap=8192 }, 8:{ span=8-15 cap=8192 }, 16:{ span=16-23 cap=8192 }, 24:{ span=24-31 cap=8192 }, 32:{ span=32-39 cap=8192 }, 40:{ span=40-47 cap=8192 }, 48:{ span=48-55 cap=8192 }, 56:{ span=56-63 cap=8192 }, 64:{ span=64-71 cap=8192 }, 72:{ span=72-79 cap=8192 }, 80:{ span=80-87 cap=8192 }, 88:{ span=88-95 cap=8192 }, 96:{ span=96-103 cap=8192 }, 104:{ span=104-111 cap=8192 }, 112:{ span=112-119 cap=8192 }, 120:{ span=120-127 cap=8192 }, 128:{ span=128-135 cap=8192 }, 136:{ span=136-143 cap=8192 }, 144:{ span=144-151 cap=8192 }, 152:{ span=152-159 cap=8192 }, 160:{ span=160-167 cap=8192 }, 168:{ span=168-175 cap=8192 }, 176:{ span=176-183 cap=8192 }, 184:{ span=184-191 cap=8192 }, 192:{ span=192-199 cap=8192 }, 200:{ span=200-207 cap=8192 }, 208:{ span=208-215 cap=8192 }, 216:{ span=216-223 cap=8192 }, 224:{ span=224-231 cap=8192 }, 232:{ span=232-239 cap=8192 }, 240:{ span=240-247 cap=8192 }, 248:{ span=248-255 cap=8192 }, 256:{ span=256-263 cap=8192 }, 264:{ span=264-271 cap=8192 }, 272:{ span=272-279 cap=8192 }, 280:{ span=280-287 cap=8192 }, 288:{ span=288-295 cap=8192 }, 296:{ span=296-303 cap=8192 }

For CPU 8
domain-0: span=8-15 level=SMT
groups: 8:{ span=8 }, 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }
domain-1: span=0-303 level=DIE
groups: 8:{ span=8-15 cap=8192 }, 16:{ span=16-23 cap=8192 }, 24:{ span=24-31 cap=8192 }, 32:{ span=32-39 cap=8192 }, 40:{ span=40-47 cap=8192 }, 48:{ span=48-55 cap=8192 }, 56:{ span=56-63 cap=8192 }, 64:{ span=64-71 cap=8192 }, 72:{ span=72-79 cap=8192 }, 80:{ span=80-87 cap=8192 }, 88:{ span=88-95 cap=8192 }, 96:{ span=96-103 cap=8192 }, 104:{ span=104-111 cap=8192 }, 112:{ span=112-119 cap=8192 }, 120:{ span=120-127 cap=8192 }, 128:{ span=128-135 cap=8192 }, 136:{ span=136-143 cap=8192 }, 144:{ span=144-151 cap=8192 }, 152:{ span=152-159 cap=8192 }, 160:{ span=160-167 cap=8192 }, 168:{ span=168-175 cap=8192 }, 176:{ span=176-183 cap=8192 }, 184:{ span=184-191 cap=8192 }, 192:{ span=192-199 cap=8192 }, 200:{ span=200-207 cap=8192 }, 208:{ span=208-215 cap=8192 }, 216:{ span=216-223 cap=8192 }, 224:{ span=224-231 cap=8192 }, 232:{ span=232-239 cap=8192 }, 240:{ span=240-247 cap=8192 }, 248:{ span=248-255 cap=8192 }, 256:{ span=256-263 cap=8192 }, 264:{ span=264-271 cap=8192 }, 272:{ span=272-279 cap=8192 }, 280:{ span=280-287 cap=8192 }, 288:{ span=288-295 cap=8192 }, 296:{ span=296-303 cap=8192 }, 0:{ span=0-7 cap=8192 }

For CPU 9
domain-0: span=8-15 level=SMT
groups: 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }, 8:{ span=8 }
domain-1: span=0-303 level=DIE
groups: 8:{ span=8-15 cap=8192 }, 16:{ span=16-23 cap=8192 }, 24:{ span=24-31 cap=8192 }, 32:{ span=32-39 cap=8192 }, 40:{ span=40-47 cap=8192 }, 48:{ span=48-55 cap=8192 }, 56:{ span=56-63 cap=8192 }, 64:{ span=64-71 cap=8192 }, 72:{ span=72-79 cap=8192 }, 80:{ span=80-87 cap=8192 }, 88:{ span=88-95 cap=8192 }, 96:{ span=96-103 cap=8192 }, 104:{ span=104-111 cap=8192 }, 112:{ span=112-119 cap=8192 }, 120:{ span=120-127 cap=8192 }, 128:{ span=128-135 cap=8192 }, 136:{ span=136-143 cap=8192 }, 144:{ span=144-151 cap=8192 }, 152:{ span=152-159 cap=8192 }, 160:{ span=160-167 cap=8192 }, 168:{ span=168-175 cap=8192 }, 176:{ span=176-183 cap=8192 }, 184:{ span=184-191 cap=8192 }, 192:{ span=192-199 cap=8192 }, 200:{ span=200-207 cap=8192 }, 208:{ span=208-215 cap=8192 }, 216:{ span=216-223 cap=8192 }, 224:{ span=224-231 cap=8192 }, 232:{ span=232-239 cap=8192 }, 240:{ span=240-247 cap=8192 }, 248:{ span=248-255 cap=8192 }, 256:{ span=256-263 cap=8192 }, 264:{ span=264-271 cap=8192 }, 272:{ span=272-279 cap=8192 }, 280:{ span=280-287 cap=8192 }, 288:{ span=288-295 cap=8192 }, 296:{ span=296-303 cap=8192 }, 0:{ span=0-7 cap=8192 }

After topology update.

For CPU 0
domain-0: span=0-7 level=SMT
groups: 0:{ span=0 }, 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }
domain-1: span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 level=DIE
groups: 0:{ span=0-7 cap=8192 }, 32:{ span=32-39 cap=8192 }, 64:{ span=64-71 cap=8192 }, 96:{ span=96-103 cap=8192 }, 128:{ span=128-135 cap=8192 }, 160:{ span=160-167 cap=8192 }, 192:{ span=192-199 cap=8192 }, 224:{ span=224-231 cap=8192 }, 256:{ span=256-263 cap=8192 }, 288:{ span=288-295 cap=8192 }
domain-2: span=0-303 level=NODE
groups: 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }

For CPU 1
domain-0: span=0-7 level=SMT
groups: 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }, 0:{ span=0 }
domain-1: span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 level=DIE
groups: 0:{ span=0-7 cap=8192 }, 32:{ span=32-39 cap=8192 }, 64:{ span=64-71 cap=8192 }, 96:{ span=96-103 cap=8192 }, 128:{ span=128-135 cap=8192 }, 160:{ span=160-167 cap=8192 }, 192:{ span=192-199 cap=8192 }, 224:{ span=224-231 cap=8192 }, 256:{ span=256-263 cap=8192 }, 288:{ span=288-295 cap=8192 }
domain-2: span=0-303 level=NODE
groups: 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }

For CPU 8
domain-0: span=8-15 level=SMT
groups: 8:{ span=8 }, 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }
domain-1: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=DIE
groups: 8:{ span=8-15 cap=8192 }, 40:{ span=40-47 cap=8192 }, 72:{ span=72-79 cap=8192 }, 104:{ span=104-111 cap=8192 }, 136:{ span=136-143 cap=8192 }, 168:{ span=168-175 cap=8192 }, 200:{ span=200-207 cap=8192 }, 232:{ span=232-239 cap=8192 }, 264:{ span=264-271 cap=8192 }, 296:{ span=296-303 cap=8192 }
domain-2: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE
groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }
domain-3: span=0-303 level=NUMA
groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
ERROR: groups don't span domain->span

For CPU 9
domain-0: span=8-15 level=SMT
groups: 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }, 8:{ span=8 }
domain-1: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=DIE
groups: 8:{ span=8-15 cap=8192 }, 40:{ span=40-47 cap=8192 }, 72:{ span=72-79 cap=8192 }, 104:{ span=104-111 cap=8192 }, 136:{ span=136-143 cap=8192 }, 168:{ span=168-175 cap=8192 }, 200:{ span=200-207 cap=8192 }, 232:{ span=232-239 cap=8192 }, 264:{ span=264-271 cap=8192 }, 296:{ span=296-303 cap=8192 }
domain-2: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE
groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }
domain-3: span=0-303 level=NUMA
groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
ERROR: groups don't span domain->span

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] sched/topology: Use Identity node only if required 2018-08-31 10:22 ` Srikar Dronamraju @ 2018-08-31 10:41 ` Peter Zijlstra 2018-08-31 11:26 ` Srikar Dronamraju 0 siblings, 1 reply; 7+ messages in thread From: Peter Zijlstra @ 2018-08-31 10:41 UTC (permalink / raw) To: Srikar Dronamraju Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner, Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit, linuxppc-dev, Andre Wild, Benjamin Herrenschmidt On Fri, Aug 31, 2018 at 03:22:48AM -0700, Srikar Dronamraju wrote: > At boot: Before topology update. How does that work; you do SMP bringup _before_ you know the topology !? > After topology update. > > For CPU 0 > domain-0: span=0-7 level=SMT > groups: 0:{ span=0 }, 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 } > domain-1: span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 level=DIE > groups: 0:{ span=0-7 cap=8192 }, 32:{ span=32-39 cap=8192 }, 64:{ span=64-71 cap=8192 }, 96:{ span=96-103 cap=8192 }, 128:{ span=128-135 cap=8192 }, 160:{ span=160-167 cap=8192 }, 192:{ span=192-199 cap=8192 }, 224:{ span=224-231 cap=8192 }, 256:{ span=256-263 cap=8192 }, 288:{ span=288-295 cap=8192 } > domain-2: span=0-303 level=NODE > groups: 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 } > > For CPU 1 > domain-0: span=0-7 level=SMT > groups: 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }, 0:{ span=0 } > domain-1: span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 level=DIE > groups: 0:{ span=0-7 cap=8192 }, 32:{ span=32-39 cap=8192 }, 64:{ span=64-71 
cap=8192 }, 96:{ span=96-103 cap=8192 }, 128:{ span=128-135 cap=8192 }, 160:{ span=160-167 cap=8192 }, 192:{ span=192-199 cap=8192 }, 224:{ span=224-231 cap=8192 }, 256:{ span=256-263 cap=8192 }, 288:{ span=288-295 cap=8192 } > domain-2: span=0-303 level=NODE > groups: 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 } > > > For CPU 8 > domain-0: span=8-15 level=SMT > groups: 8:{ span=8 }, 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 } > domain-1: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=DIE > groups: 8:{ span=8-15 cap=8192 }, 40:{ span=40-47 cap=8192 }, 72:{ span=72-79 cap=8192 }, 104:{ span=104-111 cap=8192 }, 136:{ span=136-143 cap=8192 }, 168:{ span=168-175 cap=8192 }, 200:{ span=200-207 cap=8192 }, 232:{ span=232-239 cap=8192 }, 264:{ span=264-271 cap=8192 }, 296:{ span=296-303 cap=8192 } > domain-2: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE > groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 } > domain-3: span=0-303 level=NUMA > groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 } > ERROR: groups don't span domain->span > > For CPU 9 > domain-0: span=8-15 level=SMT > groups: 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }, 8:{ span=8 } > domain-1: 
span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=DIE > groups: 8:{ span=8-15 cap=8192 }, 40:{ span=40-47 cap=8192 }, 72:{ span=72-79 cap=8192 }, 104:{ span=104-111 cap=8192 }, 136:{ span=136-143 cap=8192 }, 168:{ span=168-175 cap=8192 }, 200:{ span=200-207 cap=8192 }, 232:{ span=232-239 cap=8192 }, 264:{ span=264-271 cap=8192 }, 296:{ span=296-303 cap=8192 } > domain-2: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE > groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 } > domain-3: span=0-303 level=NUMA > groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 } > ERROR: groups don't span domain->span This is all very confused... and does not include the error we saw earlier. CPU 0 has: SMT, DIE, NODE CPU 8 has: SMT, DIE, NODE, NUMA Something is completely buggered in your topology setup. ^ permalink raw reply [flat|nested] 7+ messages in thread
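To make the "ERROR: groups don't span domain->span" lines in the quoted dump easier to read, here is a small userspace sketch of that kind of check, under the assumption that the debug code compares the union of all group spans against the domain span. It uses one bit per node instead of real CPU masks (node 0 holds CPUs 0-7,..., node 3 holds 8-15,..., node 2 holds 16-23,..., node 1 holds 24-31,... per the numactl output), and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the union of the group spans equals the domain span exactly. */
bool groups_cover_span(uint32_t span, const uint32_t *groups, size_t n)
{
	uint32_t seen = 0;

	for (size_t i = 0; i < n; i++)
		seen |= groups[i];

	return seen == span;
}

/* CPU 0, domain-2 (NODE): span=0-303, groups for nodes 0, 3, 2 and 1. */
bool cpu0_node_domain_ok(void)
{
	const uint32_t groups[] = { 1u << 0, 1u << 3, 1u << 2, 1u << 1 };

	return groups_cover_span(0xFu, groups, 4);
}

/* CPU 8, domain-3 (NUMA): span=0-303, but only groups for nodes 3, 2 and 1. */
bool cpu8_numa_domain_ok(void)
{
	const uint32_t groups[] = { 1u << 3, 1u << 2, 1u << 1 };

	return groups_cover_span(0xFu, groups, 3);
}
```

In this model CPU 0's NODE domain passes, while CPU 8's NUMA domain fails because no group covers node 0 (CPUs 0-7, 32-39, ...), even though the domain claims span=0-303. That is consistent with the error appearing only under CPU 8 and CPU 9 in the dump.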
* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-31 10:41 ` Peter Zijlstra
@ 2018-08-31 11:26 ` Srikar Dronamraju
  2018-08-31 12:06 ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Srikar Dronamraju @ 2018-08-31 11:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner, Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit, linuxppc-dev, Andre Wild, Benjamin Herrenschmidt

* Peter Zijlstra <peterz@infradead.org> [2018-08-31 12:41:15]:

> On Fri, Aug 31, 2018 at 03:22:48AM -0700, Srikar Dronamraju wrote:
>
> > At boot: Before topology update.
>
> How does that work; you do SMP bringup _before_ you know the topology !?
>

If you look at the other mail that I sent, the system boots to its
regular state with a certain topology. The hypervisor might detect and
push topology updates after the system has been booted and initialized.
Such a topology update can happen long after boot: we boot with a
particular topology, and at a later point in time the topology update
event occurs.

> > After topology update.
> >
> > For CPU 0
> > domain-0: span=0-7 level=SMT
> >  groups: 0:{ span=0 }, 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }
> >  domain-1: span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 level=DIE
> >   groups: 0:{ span=0-7 cap=8192 }, 32:{ span=32-39 cap=8192 }, 64:{ span=64-71 cap=8192 }, 96:{ span=96-103 cap=8192 }, 128:{ span=128-135 cap=8192 }, 160:{ span=160-167 cap=8192 }, 192:{ span=192-199 cap=8192 }, 224:{ span=224-231 cap=8192 }, 256:{ span=256-263 cap=8192 }, 288:{ span=288-295 cap=8192 }
> >   domain-2: span=0-303 level=NODE
> >    groups: 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
> >
> > For CPU 9
> > domain-0: span=8-15 level=SMT
> >  groups: 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }, 8:{ span=8 }
> >  domain-1: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=DIE
> >   groups: 8:{ span=8-15 cap=8192 }, 40:{ span=40-47 cap=8192 }, 72:{ span=72-79 cap=8192 }, 104:{ span=104-111 cap=8192 }, 136:{ span=136-143 cap=8192 }, 168:{ span=168-175 cap=8192 }, 200:{ span=200-207 cap=8192 }, 232:{ span=232-239 cap=8192 }, 264:{ span=264-271 cap=8192 }, 296:{ span=296-303 cap=8192 }
> >   domain-2: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE
> >    groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }
> >    domain-3: span=0-303 level=NUMA
> >     groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
> > ERROR: groups don't span domain->span
>
> This is all very confused... and does not include the error we saw
> earlier.
>
> CPU 0 has: SMT, DIE, NODE
> CPU 8 has: SMT, DIE, NODE, NUMA
>

This was the same in my previous posting too. Before the topology update
happened, all the CPUs would be in SMT, DIE. The topology updates can be
disabled using the kernel parameter topology_updates=off. It's documented
under
https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html as

	topology_updates= [KNL, PPC, NUMA]
			Format: {off}
			Specify if the kernel should ignore (off)
			topology updates sent by the hypervisor to this
			LPAR.

and is not something new in powerpc.

> Something is completely buggered in your topology setup.
>
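The "ERROR: groups don't span domain->span" line in the quoted dump says that the CPUs covered by a domain's groups do not add up to the domain's own span. The sketch below reproduces that check in user space for CPU 9's domain-3 (level=NUMA), using the spans copied from the dump. It is only an illustration: the helper names `parse_cpulist` and `missing_from_groups` are made up for this example, and the kernel performs the equivalent comparison on cpumasks rather than on Python sets.

```python
def parse_cpulist(text):
    """Expand a kernel-style CPU list such as "0-7,32-39" into a set of ints."""
    cpus = set()
    for part in text.split(","):
        lo, _, hi = part.partition("-")
        # A single CPU ("9") has no "-"; treat it as the range 9-9.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def missing_from_groups(domain_span, group_spans):
    """Return the CPUs in domain_span that no group span covers."""
    covered = set()
    for span in group_spans:
        covered |= parse_cpulist(span)
    return parse_cpulist(domain_span) - covered


# CPU 9, domain-3 (level=NUMA), spans copied from the dump above.
numa_groups = [
    "8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303",
    "16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279",
    "24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287",
]
missing = missing_from_groups("0-303", numa_groups)
print(len(missing), min(missing), max(missing))  # 80 0 295
```

The 80 uncovered CPUs are exactly the span that CPU 0's NODE domain lists as group 0 (0-7,32-39,...,288-295), so the NUMA domain seen from CPU 9 has no group for that node.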
* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-31 11:26 ` Srikar Dronamraju
@ 2018-08-31 12:06 ` Peter Zijlstra
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2018-08-31 12:06 UTC
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
      Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
      linuxppc-dev, Andre Wild, Benjamin Herrenschmidt

On Fri, Aug 31, 2018 at 04:56:18PM +0530, Srikar Dronamraju wrote:
> This was the same in my previous posting too. Before the topology update
> happened, all the CPUs would be in SMT, DIE. The topology updates can be
> disabled using the kernel parameter topology_updates=off. It's documented
> under
> https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html as
>
> 	topology_updates= [KNL, PPC, NUMA]
> 			Format: {off}
> 			Specify if the kernel should ignore (off)
> 			topology updates sent by the hypervisor to this
> 			LPAR.
>
> and is not something new in powerpc.

Doesn't mean it isn't utterly broken.
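For anyone trying to reproduce the two states discussed here, it helps to know whether a given boot had `topology_updates=off` on its command line. A minimal sketch, assuming the command line is available as a string (on Linux it can be read from `/proc/cmdline`); the function name and the sample strings are hypothetical:

```python
def topology_updates_disabled(cmdline):
    """Return True if 'topology_updates=off' is one of the space-separated
    parameters in the given kernel command line string."""
    return "topology_updates=off" in cmdline.split()


# Hypothetical command lines, for illustration only.
print(topology_updates_disabled("root=/dev/sda2 topology_updates=off"))  # True
print(topology_updates_disabled("root=/dev/sda2 quiet"))                 # False
```

On a running system the input could come from `open("/proc/cmdline").read()`.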
end of thread, other threads:[~2018-08-31 12:38 UTC | newest]

Thread overview: 7+ messages
     [not found] <1533712172-11428-1-git-send-email-srikar@linux.vnet.ibm.com>
     [not found] ` <20180808075840.GO2494@hirez.programming.kicks-ass.net>
2018-08-10 16:45 ` [PATCH] sched/topology: Use Identity node only if required Srikar Dronamraju
2018-08-29  8:43 ` Peter Zijlstra
2018-08-29  8:57 ` Peter Zijlstra
2018-08-31 10:22 ` Srikar Dronamraju
2018-08-31 10:41 ` Peter Zijlstra
2018-08-31 11:26 ` Srikar Dronamraju
2018-08-31 12:06 ` Peter Zijlstra