* [PATCH 0/2] XSA-74 follow-ups
From: Jan Beulich @ 2013-11-27 8:07 UTC
To: xen-devel; +Cc: George Dunlap, Andrew Cooper, Keir Fraser, Tim Deegan

1: fix locking in offline_page()
2: use return value of domain_adjust_tot_pages() where feasible

Signed-off-by: Jan Beulich <jbeulich@suse.com>
* [PATCH 1/2] fix locking in offline_page()
From: Jan Beulich @ 2013-11-27 8:25 UTC
To: xen-devel; +Cc: George Dunlap, Andrew Cooper, Keir Fraser, Tim Deegan

Coverity ID 1055655

Apart from the Coverity-detected lock order reversal (a domain's
page_alloc_lock taken with the heap lock already held), calling
put_page() with heap_lock is a bad idea too (as a possible descendant
from put_page() is free_heap_pages(), which wants to take this very
lock).

From all I can tell the region over which heap_lock was held was far
too large: All we need to protect are the call to mark_page_offline()
and reserve_heap_page() (and I'd even put under question the need for
the former). Hence by slightly re-arranging the if/else-if chain we
can drop the lock much earlier, at once no longer covering the two
put_page() invocations.

Once at it, do a little bit of other cleanup: Put the "pod_replace"
code path inline rather than at its own label, and drop the effectively
unused variable "ret".

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -957,7 +957,6 @@ int offline_page(unsigned long mfn, int
 {
     unsigned long old_info = 0;
     struct domain *owner;
-    int ret = 0;
     struct page_info *pg;
 
     if ( !mfn_valid(mfn) )
@@ -1007,16 +1006,28 @@ int offline_page(unsigned long mfn, int
     if ( page_state_is(pg, offlined) )
     {
         reserve_heap_page(pg);
-        *status = PG_OFFLINE_OFFLINED;
+
+        spin_unlock(&heap_lock);
+
+        *status = broken ? PG_OFFLINE_OFFLINED | PG_OFFLINE_BROKEN
+                         : PG_OFFLINE_OFFLINED;
+        return 0;
     }
-    else if ( (owner = page_get_owner_and_reference(pg)) )
+
+    spin_unlock(&heap_lock);
+
+    if ( (owner = page_get_owner_and_reference(pg)) )
     {
         if ( p2m_pod_offline_or_broken_hit(pg) )
-            goto pod_replace;
+        {
+            put_page(pg);
+            p2m_pod_offline_or_broken_replace(pg);
+            *status = PG_OFFLINE_OFFLINED;
+        }
         else
         {
             *status = PG_OFFLINE_OWNED | PG_OFFLINE_PENDING |
-              (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
+                      (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
             /* Release the reference since it will not be allocated anymore */
             put_page(pg);
         }
@@ -1024,7 +1035,7 @@ int offline_page(unsigned long mfn, int
     else if ( old_info & PGC_xen_heap )
     {
         *status = PG_OFFLINE_XENPAGE | PG_OFFLINE_PENDING |
-          (DOMID_XEN << PG_OFFLINE_OWNER_SHIFT);
+                  (DOMID_XEN << PG_OFFLINE_OWNER_SHIFT);
     }
     else
     {
@@ -1043,21 +1054,7 @@ int offline_page(unsigned long mfn, int
     if ( broken )
         *status |= PG_OFFLINE_BROKEN;
 
-    spin_unlock(&heap_lock);
-
-    return ret;
-
-pod_replace:
-    put_page(pg);
-    spin_unlock(&heap_lock);
-
-    p2m_pod_offline_or_broken_replace(pg);
-    *status = PG_OFFLINE_OFFLINED;
-
-    if ( broken )
-        *status |= PG_OFFLINE_BROKEN;
-
-    return ret;
+    return 0;
 }
 
 /*
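A side note on the hazard described above: put_page() must not be called
with heap_lock held because dropping the last reference can descend into
free_heap_pages(), which takes that same lock. The standalone C sketch
below illustrates the shape of the problem and of the fix; the lock, page
and function names are simplified stand-ins, not the real Xen interfaces.

/* Minimal sketch, assuming a simplified spinlock and refcount model;
 * all names here are stand-ins, not Xen's real types or functions. */
#include <stdio.h>

struct spinlock { volatile int held; };
struct page { int refcount; };

static struct spinlock heap_lock;

static void spin_lock(struct spinlock *l)
{
    while ( __sync_lock_test_and_set(&l->held, 1) )
        ;   /* spin */
}

static void spin_unlock(struct spinlock *l)
{
    __sync_lock_release(&l->held);
}

/* Returning a page to the heap needs heap_lock. */
static void free_heap_page(struct page *pg)
{
    spin_lock(&heap_lock);   /* spins forever if the caller already holds it */
    /* ... put pg back on the free lists ... */
    spin_unlock(&heap_lock);
}

static void put_page(struct page *pg)
{
    if ( --pg->refcount == 0 )
        free_heap_page(pg);  /* the "possible descendant" mentioned above */
}

/* Shape of the fix: hold heap_lock only around what truly needs it,
 * and call put_page() after dropping it. */
static void offline_page_fixed(struct page *pg)
{
    spin_lock(&heap_lock);
    /* ... mark/reserve the page ... */
    spin_unlock(&heap_lock);

    put_page(pg);            /* safe: heap_lock no longer held */
}

int main(void)
{
    struct page pg = { 1 };
    offline_page_fixed(&pg); /* calling put_page() inside the locked region
                                would instead self-deadlock in free_heap_page() */
    printf("page offlined without deadlock\n");
    return 0;
}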
* Re: [PATCH 1/2] fix locking in offline_page()
From: Andrew Cooper @ 2013-11-27 10:01 UTC
To: Jan Beulich; +Cc: George Dunlap, xen-devel, Keir Fraser, Tim Deegan

On 27/11/13 08:25, Jan Beulich wrote:
[...]
>          else
>          {
>              *status = PG_OFFLINE_OWNED | PG_OFFLINE_PENDING |
> -              (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
> +                      (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);

domain_id will be promoted from a uint16_t to an int32_t, then shifted
left by 16 bits which is undefined if the top of domain_id bit was set.

Do we care about the likelihood of a domain_id with the top bit set? I
certainly can't see how one would get into that state.

~Andrew
* Re: [PATCH 1/2] fix locking in offline_page()
From: Jan Beulich @ 2013-11-27 10:05 UTC
To: Andrew Cooper; +Cc: George Dunlap, xen-devel, Keir Fraser, Tim Deegan

>>> On 27.11.13 at 11:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 27/11/13 08:25, Jan Beulich wrote:
>>          else
>>          {
>>              *status = PG_OFFLINE_OWNED | PG_OFFLINE_PENDING |
>> -              (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
>> +                      (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
>
> domain_id will be promoted from a uint16_t to an int32_t, then shifted
> left by 16 bits which is undefined if the top of domain_id bit was set.

That promotion is done by zero extension, so I don't figure what
you think is undefined here.

Furthermore I suppose you realize that all the patch does here is
adjust indentation.

Jan
* Re: [PATCH 1/2] fix locking in offline_page()
From: Tim Deegan @ 2013-11-27 10:34 UTC
To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, Keir Fraser, xen-devel

At 10:05 +0000 on 27 Nov (1385543154), Jan Beulich wrote:
> >>> On 27.11.13 at 11:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > On 27/11/13 08:25, Jan Beulich wrote:
> >>          else
> >>          {
> >>              *status = PG_OFFLINE_OWNED | PG_OFFLINE_PENDING |
> >> -              (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
> >> +                      (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
> >
> > domain_id will be promoted from a uint16_t to an int32_t, then shifted
> > left by 16 bits which is undefined if the top of domain_id bit was set.
>
> That promotion is done by zero extension, so I don't figure what
> you think is undefined here.

If the domid has bit 15 set, it will be shifted into the sign bit of
the promoted integer; it should be explicitly cast to an unsigned type
before the shift.

Tim.

> Furthermore I suppose you realize that all the patch does here is
> adjust indentation.
>
> Jan
* Re: [PATCH 1/2] fix locking in offline_page()
From: Jan Beulich @ 2013-11-27 10:43 UTC
To: Tim Deegan; +Cc: George Dunlap, Andrew Cooper, Keir Fraser, xen-devel

>>> On 27.11.13 at 11:34, Tim Deegan <tim@xen.org> wrote:
> At 10:05 +0000 on 27 Nov (1385543154), Jan Beulich wrote:
>> >>> On 27.11.13 at 11:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> > On 27/11/13 08:25, Jan Beulich wrote:
>> >>          else
>> >>          {
>> >>              *status = PG_OFFLINE_OWNED | PG_OFFLINE_PENDING |
>> >> -              (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
>> >> +                      (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
>> >
>> > domain_id will be promoted from a uint16_t to an int32_t, then shifted
>> > left by 16 bits which is undefined if the top of domain_id bit was set.
>>
>> That promotion is done by zero extension, so I don't figure what
>> you think is undefined here.
>
> If the domid has bit 15 set, it will be shifted into the sign bit of
> the promoted integer; it should be explicitly cast to an unsigned type
> before the shift.

Again - I don't see this. Promotion happens before the shift,
i.e. 0x8000 -> 0x00008000 -> 0x80000000.

Jan
* Re: [PATCH 1/2] fix locking in offline_page()
From: Tim Deegan @ 2013-11-27 10:48 UTC
To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, Keir Fraser, xen-devel

At 10:43 +0000 on 27 Nov (1385545399), Jan Beulich wrote:
> >>> On 27.11.13 at 11:34, Tim Deegan <tim@xen.org> wrote:
> > At 10:05 +0000 on 27 Nov (1385543154), Jan Beulich wrote:
> >> >>> On 27.11.13 at 11:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> >> > On 27/11/13 08:25, Jan Beulich wrote:
> >> >>          else
> >> >>          {
> >> >>              *status = PG_OFFLINE_OWNED | PG_OFFLINE_PENDING |
> >> >> -              (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
> >> >> +                      (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
> >> >
> >> > domain_id will be promoted from a uint16_t to an int32_t, then shifted
> >> > left by 16 bits which is undefined if the top of domain_id bit was set.
> >>
> >> That promotion is done by zero extension, so I don't figure what
> >> you think is undefined here.
> >
> > If the domid has bit 15 set, it will be shifted into the sign bit of
> > the promoted integer; it should be explicitly cast to an unsigned type
> > before the shift.
>
> Again - I don't see this. Promotion happens before the shift,
> i.e. 0x8000 -> 0x00008000 -> 0x80000000.

AIUI the default promotion is to a signed integer if the value will
fit, i.e.:
    (unsigned short) 0x8000
    promoted        (signed int) 0x00008000
    shifted left    (signed int) 0x80000000 (undefined behaviour)

(but I don't have my copy of the standard here to check).

Tim.
* Re: [PATCH 1/2] fix locking in offline_page()
From: Tim Deegan @ 2013-11-27 10:53 UTC
To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, Keir Fraser, xen-devel

At 11:48 +0100 on 27 Nov (1385549319), Tim Deegan wrote:
> At 10:43 +0000 on 27 Nov (1385545399), Jan Beulich wrote:
> > >>> On 27.11.13 at 11:34, Tim Deegan <tim@xen.org> wrote:
> > > At 10:05 +0000 on 27 Nov (1385543154), Jan Beulich wrote:
> > >> >>> On 27.11.13 at 11:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > >> > On 27/11/13 08:25, Jan Beulich wrote:
> > >> >>          else
> > >> >>          {
> > >> >>              *status = PG_OFFLINE_OWNED | PG_OFFLINE_PENDING |
> > >> >> -              (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
> > >> >> +                      (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
> > >> >
> > >> > domain_id will be promoted from a uint16_t to an int32_t, then shifted
> > >> > left by 16 bits which is undefined if the top of domain_id bit was set.
> > >>
> > >> That promotion is done by zero extension, so I don't figure what
> > >> you think is undefined here.
> > >
> > > If the domid has bit 15 set, it will be shifted into the sign bit of
> > > the promoted integer; it should be explicitly cast to an unsigned type
> > > before the shift.
> >
> > Again - I don't see this. Promotion happens before the shift,
> > i.e. 0x8000 -> 0x00008000 -> 0x80000000.
>
> AIUI the default promotion is to a signed integer if the value will
> fit, i.e.:
>     (unsigned short) 0x8000
>     promoted        (signed int) 0x00008000
>     shifted left    (signed int) 0x80000000 (undefined behaviour)
>
> (but I don't have my copy of the standard here to check).

(and AFAICT there's no way for a real domain's domain_id to have the
top bit set, so it's a moot point).

Tim.
* Re: [PATCH 1/2] fix locking in offline_page()
From: Jan Beulich @ 2013-11-27 10:55 UTC
To: Tim Deegan; +Cc: George Dunlap, Andrew Cooper, Keir Fraser, xen-devel

>>> On 27.11.13 at 11:48, Tim Deegan <tim@xen.org> wrote:
> At 10:43 +0000 on 27 Nov (1385545399), Jan Beulich wrote:
>> >>> On 27.11.13 at 11:34, Tim Deegan <tim@xen.org> wrote:
>> > At 10:05 +0000 on 27 Nov (1385543154), Jan Beulich wrote:
>> >> >>> On 27.11.13 at 11:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> >> > On 27/11/13 08:25, Jan Beulich wrote:
>> >> >>          else
>> >> >>          {
>> >> >>              *status = PG_OFFLINE_OWNED | PG_OFFLINE_PENDING |
>> >> >> -              (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
>> >> >> +                      (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
>> >> >
>> >> > domain_id will be promoted from a uint16_t to an int32_t, then shifted
>> >> > left by 16 bits which is undefined if the top of domain_id bit was set.
>> >>
>> >> That promotion is done by zero extension, so I don't figure what
>> >> you think is undefined here.
>> >
>> > If the domid has bit 15 set, it will be shifted into the sign bit of
>> > the promoted integer; it should be explicitly cast to an unsigned type
>> > before the shift.
>>
>> Again - I don't see this. Promotion happens before the shift,
>> i.e. 0x8000 -> 0x00008000 -> 0x80000000.
>
> AIUI the default promotion is to a signed integer if the value will
> fit, i.e.:
>     (unsigned short) 0x8000
>     promoted        (signed int) 0x00008000
>     shifted left    (signed int) 0x80000000 (undefined behaviour)

Right - but the promotion (as you also show) is done via zero
extension. Hence, plus because of left shifts being ignorant of
signed-ness, no need for a cast.

Jan
* Re: [PATCH 1/2] fix locking in offline_page()
From: Tim Deegan @ 2013-11-28 10:25 UTC
To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, Keir Fraser, xen-devel

At 10:55 +0000 on 27 Nov (1385546118), Jan Beulich wrote:
> >>> On 27.11.13 at 11:48, Tim Deegan <tim@xen.org> wrote:
> > At 10:43 +0000 on 27 Nov (1385545399), Jan Beulich wrote:
> >> >>> On 27.11.13 at 11:34, Tim Deegan <tim@xen.org> wrote:
> >> > At 10:05 +0000 on 27 Nov (1385543154), Jan Beulich wrote:
> >> >> >>> On 27.11.13 at 11:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> >> >> > On 27/11/13 08:25, Jan Beulich wrote:
> >> >> >>          else
> >> >> >>          {
> >> >> >>              *status = PG_OFFLINE_OWNED | PG_OFFLINE_PENDING |
> >> >> >> -              (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
> >> >> >> +                      (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
> >> >> >
> >> >> > domain_id will be promoted from a uint16_t to an int32_t, then shifted
> >> >> > left by 16 bits which is undefined if the top of domain_id bit was set.
> >> >>
> >> >> That promotion is done by zero extension, so I don't figure what
> >> >> you think is undefined here.
> >> >
> >> > If the domid has bit 15 set, it will be shifted into the sign bit of
> >> > the promoted integer; it should be explicitly cast to an unsigned type
> >> > before the shift.
> >>
> >> Again - I don't see this. Promotion happens before the shift,
> >> i.e. 0x8000 -> 0x00008000 -> 0x80000000.
> >
> > AIUI the default promotion is to a signed integer if the value will
> > fit, i.e.:
> >     (unsigned short) 0x8000
> >     promoted        (signed int) 0x00008000
> >     shifted left    (signed int) 0x80000000 (undefined behaviour)
>
> Right - but the promotion (as you also show) is done via zero
> extension. Hence, plus because of left shifts being ignorant of
> signed-ness, no need for a cast.

No: left-shifting that set bit into the sign bit of the promoted value
is undefined behaviour. I still don't have my standard reference to
hand, but e.g. http://blog.regehr.org/archives/738

Tim.
* Re: [PATCH 1/2] fix locking in offline_page()
From: Jan Beulich @ 2013-11-28 14:38 UTC
To: tim; +Cc: George.Dunlap, andrew.cooper3, keir, xen-devel

>>> Tim Deegan <tim@xen.org> 11/28/13 11:26 AM >>>
> At 10:55 +0000 on 27 Nov (1385546118), Jan Beulich wrote:
>> >>> On 27.11.13 at 11:48, Tim Deegan <tim@xen.org> wrote:
>> > AIUI the default promotion is to a signed integer if the value will
>> > fit, i.e.:
>> >     (unsigned short) 0x8000
>> >     promoted        (signed int) 0x00008000
>> >     shifted left    (signed int) 0x80000000 (undefined behaviour)
>>
>> Right - but the promotion (as you also show) is done via zero
>> extension. Hence, plus because of left shifts being ignorant of
>> signed-ness, no need for a cast.
>
> No: left-shifting that set bit into the sign bit of the promoted value
> is undefined behaviour. I still don't have my standard reference to
> hand, but e.g. http://blog.regehr.org/archives/738

Ah, indeed. I can certainly add a cast there, but as said before
- the value can't be negative as we only permit 2^^15 domains,
- the change to the line in question was only white space adjustment.

Jan
* Re: [PATCH 1/2] fix locking in offline_page()
From: Tim Deegan @ 2013-11-28 15:11 UTC
To: Jan Beulich; +Cc: George.Dunlap, andrew.cooper3, keir, xen-devel

At 14:38 +0000 on 28 Nov (1385645907), Jan Beulich wrote:
> >>> Tim Deegan <tim@xen.org> 11/28/13 11:26 AM >>>
> > At 10:55 +0000 on 27 Nov (1385546118), Jan Beulich wrote:
> >> >>> On 27.11.13 at 11:48, Tim Deegan <tim@xen.org> wrote:
> >> > AIUI the default promotion is to a signed integer if the value will
> >> > fit, i.e.:
> >> >     (unsigned short) 0x8000
> >> >     promoted        (signed int) 0x00008000
> >> >     shifted left    (signed int) 0x80000000 (undefined behaviour)
> >>
> >> Right - but the promotion (as you also show) is done via zero
> >> extension. Hence, plus because of left shifts being ignorant of
> >> signed-ness, no need for a cast.
> >
> > No: left-shifting that set bit into the sign bit of the promoted value
> > is undefined behaviour. I still don't have my standard reference to
> > hand, but e.g. http://blog.regehr.org/archives/738
>
> Ah, indeed. I can certainly add a cast there, but as said before
> - the value can't be negative as we only permit 2^^15 domains,

Absolutely. My reviewed-by stands on the original patch.

Tim.
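For reference, the promotion rules settled above can be checked with a
small standalone C example. This is not Xen code; the shift count of 16
is an assumption standing in for PG_OFFLINE_OWNER_SHIFT, as described
earlier in the thread.

#include <stdint.h>
#include <stdio.h>

#define OWNER_SHIFT 16          /* stand-in for PG_OFFLINE_OWNER_SHIFT */

int main(void)
{
    uint16_t domid = 0x7fff;    /* realistic domain ID: top bit clear */

    /*
     * The uint16_t is promoted to (signed) int before the shift.  With
     * the top bit clear this is well defined and yields 0x7fff0000.
     */
    unsigned int status = (unsigned int)(domid << OWNER_SHIFT);

    /*
     * If the top bit were set (0x8000), the shift would move a 1 into
     * the sign bit of the promoted int - undefined behaviour in C, as
     * Tim notes.  Casting to an unsigned type before the shift keeps
     * it defined:
     */
    uint16_t top_bit = 0x8000;
    unsigned int safe = (unsigned int)top_bit << OWNER_SHIFT;  /* 0x80000000 */

    printf("%#x %#x\n", status, safe);
    return 0;
}

As the thread concludes, a real domain_id never has its top bit set, so
the explicit cast is defensive rather than strictly required.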
* Re: [PATCH 1/2] fix locking in offline_page()
From: George Dunlap @ 2013-11-27 14:48 UTC
To: Jan Beulich, xen-devel; +Cc: Andrew Cooper, Keir Fraser, Tim Deegan

On 11/27/2013 08:25 AM, Jan Beulich wrote:
> Coverity ID 1055655
>
> Apart from the Coverity-detected lock order reversal (a domain's
> page_alloc_lock taken with the heap lock already held), calling
> put_page() with heap_lock is a bad idea too (as a possible descendant
> from put_page() is free_heap_pages(), which wants to take this very
> lock).

This sounds like a bug fix, so doesn't need a freeze exception at this
time.

 -George
* [PATCH 2/2] use return value of domain_adjust_tot_pages() where feasible
From: Jan Beulich @ 2013-11-27 8:26 UTC
To: xen-devel; +Cc: George Dunlap, Andrew Cooper, Keir Fraser, Tim Deegan

This is generally cheaper than re-reading ->tot_pages.

While doing so I also noticed an improper use (lacking error handling)
of get_domain() as well as lacks of ->is_dying checks in the memory
sharing code, which the patch fixes at once. In the course of doing
this I further noticed other error paths there pointlessly calling
put_page() et al with ->page_alloc_lock still held, which is also being
reversed.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -611,9 +611,16 @@ static int page_make_sharable(struct dom
                               struct page_info *page,
                               int expected_refcnt)
 {
-    int drop_dom_ref;
+    bool_t drop_dom_ref;
+
     spin_lock(&d->page_alloc_lock);
 
+    if ( d->is_dying )
+    {
+        spin_unlock(&d->page_alloc_lock);
+        return -EBUSY;
+    }
+
     /* Change page type and count atomically */
     if ( !get_page_and_type(page, d, PGT_shared_page) )
     {
@@ -624,8 +631,8 @@ static int page_make_sharable(struct dom
     /* Check it wasn't already sharable and undo if it was */
     if ( (page->u.inuse.type_info & PGT_count_mask) != 1 )
     {
-        put_page_and_type(page);
         spin_unlock(&d->page_alloc_lock);
+        put_page_and_type(page);
         return -EEXIST;
     }
 
@@ -633,15 +640,14 @@ static int page_make_sharable(struct dom
      * the second from get_page_and_type at the top of this function */
     if ( page->count_info != (PGC_allocated | (2 + expected_refcnt)) )
     {
+        spin_unlock(&d->page_alloc_lock);
         /* Return type count back to zero */
         put_page_and_type(page);
-        spin_unlock(&d->page_alloc_lock);
         return -E2BIG;
     }
 
     page_set_owner(page, dom_cow);
-    domain_adjust_tot_pages(d, -1);
-    drop_dom_ref = (d->tot_pages == 0);
+    drop_dom_ref = !domain_adjust_tot_pages(d, -1);
     page_list_del(page, &d->page_list);
     spin_unlock(&d->page_alloc_lock);
 
@@ -659,6 +665,13 @@ static int page_make_private(struct doma
 
     spin_lock(&d->page_alloc_lock);
 
+    if ( d->is_dying )
+    {
+        spin_unlock(&d->page_alloc_lock);
+        put_page(page);
+        return -EBUSY;
+    }
+
     /* We can only change the type if count is one */
     /* Because we are locking pages individually, we need to drop
      * the lock here, while the page is typed. We cannot risk the
@@ -666,8 +679,8 @@ static int page_make_private(struct doma
     expected_type = (PGT_shared_page | PGT_validated | PGT_locked | 2);
     if ( page->u.inuse.type_info != expected_type )
     {
-        put_page(page);
         spin_unlock(&d->page_alloc_lock);
+        put_page(page);
         return -EEXIST;
     }
 
@@ -682,7 +695,7 @@ static int page_make_private(struct doma
     page_set_owner(page, d);
 
     if ( domain_adjust_tot_pages(d, 1) == 1 )
-        get_domain(d);
+        get_knownalive_domain(d);
     page_list_add_tail(page, &d->page_list);
     spin_unlock(&d->page_alloc_lock);
 
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -475,8 +475,8 @@ static long memory_exchange(XEN_GUEST_HA
                              (j * (1UL << exch.out.extent_order)));
 
                 spin_lock(&d->page_alloc_lock);
-                domain_adjust_tot_pages(d, -dec_count);
-                drop_dom_ref = (dec_count && !d->tot_pages);
+                drop_dom_ref = (dec_count &&
+                                !domain_adjust_tot_pages(d, -dec_count));
                 spin_unlock(&d->page_alloc_lock);
 
                 if ( drop_dom_ref )
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1516,8 +1516,9 @@ struct page_info *alloc_domheap_pages(
 
 void free_domheap_pages(struct page_info *pg, unsigned int order)
 {
-    int i, drop_dom_ref;
     struct domain *d = page_get_owner(pg);
+    unsigned int i;
+    bool_t drop_dom_ref;
 
     ASSERT(!in_irq());
 
@@ -1545,8 +1546,7 @@ void free_domheap_pages(struct page_info
             page_list_del2(&pg[i], &d->page_list, &d->arch.relmem_list);
         }
 
-    domain_adjust_tot_pages(d, -(1 << order));
-    drop_dom_ref = (d->tot_pages == 0);
+    drop_dom_ref = !domain_adjust_tot_pages(d, -(1 << order));
 
     spin_unlock_recursive(&d->page_alloc_lock);
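A side note on the pattern the patch converges on: having the adjustment
helper return the new total lets callers test that value directly instead
of re-reading ->tot_pages under the lock. The sketch below shows the shape
of that pattern with simplified stand-in types; it is not the real Xen
implementation of domain_adjust_tot_pages().

/* Minimal sketch, assuming a single-threaded toy model; the struct,
 * helper and caller below are hypothetical stand-ins, not Xen code. */
#include <stdbool.h>

struct domain {
    long tot_pages;
    /* ... other bookkeeping ... */
};

/* Returns the new total, mirroring what the callers above now rely on. */
static long adjust_tot_pages(struct domain *d, long delta)
{
    d->tot_pages += delta;
    return d->tot_pages;
}

static bool release_pages(struct domain *d, long count)
{
    /* Before: adjust, then separately re-read d->tot_pages.
     * After:  use the value the helper already computed. */
    bool drop_dom_ref = !adjust_tot_pages(d, -count);
    return drop_dom_ref;   /* caller drops its domain reference if true */
}

int main(void)
{
    struct domain d = { .tot_pages = 1 };
    return release_pages(&d, 1) ? 0 : 1;   /* drops to zero -> drop ref */
}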
* Re: [PATCH 2/2] use return value of domain_adjust_tot_pages() where feasible
From: Andrew Cooper @ 2013-11-27 10:07 UTC
To: Jan Beulich; +Cc: George Dunlap, xen-devel, Keir Fraser, Tim Deegan

On 27/11/13 08:26, Jan Beulich wrote:
> This is generally cheaper than re-reading ->tot_pages.
>
> While doing so I also noticed an improper use (lacking error handling)
> of get_domain() as well as lacks of ->is_dying checks in the memory
> sharing code, which the patch fixes at once. In the course of doing
> this I further noticed other error paths there pointlessly calling
> put_page() et al with ->page_alloc_lock still held, which is also being
> reversed.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
* Re: [PATCH 2/2] use return value of domain_adjust_tot_pages() where feasible
From: George Dunlap @ 2013-11-27 14:44 UTC
To: Jan Beulich, xen-devel; +Cc: Andrew Cooper, Keir Fraser, Tim Deegan

On 11/27/2013 08:26 AM, Jan Beulich wrote:
> This is generally cheaper than re-reading ->tot_pages.
>
> While doing so I also noticed an improper use (lacking error handling)
> of get_domain() as well as lacks of ->is_dying checks in the memory
> sharing code, which the patch fixes at once. In the course of doing
> this I further noticed other error paths there pointlessly calling
> put_page() et al with ->page_alloc_lock still held, which is also being
> reversed.

Thus hiding two very important changes underneath a summary that looks
completely unimportant.

If this patch only did as the title suggests, I would question whether
it should be included for 4.4, since it seems to have little benefit.
Are the other two changes bug fixes?

In any case, the summary should indicate to someone just browsing
through what important changes might be inside.

 -George
* Re: [PATCH 2/2] use return value of domain_adjust_tot_pages() where feasible
From: Jan Beulich @ 2013-11-27 15:46 UTC
To: George Dunlap; +Cc: Andrew Cooper, Tim Deegan, Keir Fraser, xen-devel

>>> On 27.11.13 at 15:44, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> On 11/27/2013 08:26 AM, Jan Beulich wrote:
>> This is generally cheaper than re-reading ->tot_pages.
>>
>> While doing so I also noticed an improper use (lacking error handling)
>> of get_domain() as well as lacks of ->is_dying checks in the memory
>> sharing code, which the patch fixes at once. In the course of doing
>> this I further noticed other error paths there pointlessly calling
>> put_page() et al with ->page_alloc_lock still held, which is also being
>> reversed.
>
> Thus hiding two very important changes underneath a summary that looks
> completely unimportant.
>
> If this patch only did as the title suggests, I would question whether
> it should be included for 4.4, since it seems to have little benefit.
> Are the other two changes bug fixes?

I'm sure the missing ->is_dying check is one. I'm not sure the
put vs unlock ordering is just inefficient or could actively cause
issues.

> In any case, the summary should indicate to someone just browsing
> through what important changes might be inside.

The issue is that you can't really put all three things in the title.
And hence I decided to keep the title what it was supposed to be
when I started correcting this code.

With - in my understanding - the memory sharing code still being
experimental, it also didn't really seem worthwhile splitting these
changes out into separate patches (I generally try to do so when
spotting isolated issues, but tend to keep things together when
in the course of doing other adjustments I recognize further small
issues in code that's not production ready anyway, i.e. not
needing to be backported, and hence the title not needing to hint
at that).

In any event, as to the freeze: The code was written and would
have been submitted before the code freeze if I hadn't been
required to hold it back until after XSA-74 went public (which
goes back to our [unwritten?] policy of not publishing changes
that were found necessary/desirable in the course of researching
a specific security issue).

Jan
* Re: [PATCH 2/2] use return value of domain_adjust_tot_pages() where feasible
From: George Dunlap @ 2013-11-27 16:51 UTC
To: Jan Beulich; +Cc: Andrew Cooper, Tim Deegan, Keir Fraser, xen-devel

On 11/27/2013 03:46 PM, Jan Beulich wrote:
>>>> On 27.11.13 at 15:44, George Dunlap <george.dunlap@eu.citrix.com> wrote:
>> On 11/27/2013 08:26 AM, Jan Beulich wrote:
>>> This is generally cheaper than re-reading ->tot_pages.
>>>
>>> While doing so I also noticed an improper use (lacking error handling)
>>> of get_domain() as well as lacks of ->is_dying checks in the memory
>>> sharing code, which the patch fixes at once. In the course of doing
>>> this I further noticed other error paths there pointlessly calling
>>> put_page() et al with ->page_alloc_lock still held, which is also being
>>> reversed.
>> Thus hiding two very important changes underneath a summary that looks
>> completely unimportant.
>>
>> If this patch only did as the title suggests, I would question whether
>> it should be included for 4.4, since it seems to have little benefit.
>> Are the other two changes bug fixes?
> I'm sure the missing ->is_dying check is one. I'm not sure the
> put vs unlock ordering is just inefficient or could actively cause
> issues.
>
>> In any case, the summary should indicate to someone just browsing
>> through what important changes might be inside.
> The issue is that you can't really put all three things in the title.
> And hence I decided to keep the title what it was supposed to be
> when I started correcting this code.

I think I would call it something like, "Various fix-ups to mm-related
code". That would allow anyone scanning it to know that there were a
number of fix-ups, and they were in the MM code; and would prompt them
to look further if it seemed like something they might be looking for
(either fixes to backport, or the source of a bug that was introduced).

> With - in my understanding - the memory sharing code still being
> experimental, it also didn't really seem worthwhile splitting these
> changes out into separate patches (I generally try to do so when
> spotting isolated issues, but tend to keep things together when
> in the course of doing other adjustments I recognize further small
> issues in code that's not production ready anyway, i.e. not
> needing to be backported, and hence the title not needing to hint
> at that).
>
> In any event, as to the freeze: The code was written and would
> have been submitted before the code freeze if I hadn't been
> required to hold it back until after XSA-74 went public (which
> goes back to our [unwritten?] policy of not publishing changes
> that were found necessary/desirable in the course of researching
> a specific security issue).

Unfortunately, the "unfairness" of having been held back for a security
embargo (or in the dom0 PVH case, having been submitted a year ago)
doesn't change the realities of the situation: that making changes
risks introducing bugs which delay the release, or worse, are not found
until after the release. That may be a reason to consider it "having
been submitted before the feature freeze", but it can't tilt the scales
of a cost/benefits analysis.

Anyway, we're only half-way through the code freeze, and these do look
like bug fixes; I'm inclined to say they're OK.

 -George
* Re: [PATCH 0/2] XSA-74 follow-ups
From: Tim Deegan @ 2013-11-27 10:42 UTC
To: Jan Beulich; +Cc: George Dunlap, xen-devel, Keir Fraser, Andrew Cooper

At 08:07 +0000 on 27 Nov (1385536046), Jan Beulich wrote:
> 1: fix locking in offline_page()
> 2: use return value of domain_adjust_tot_pages() where feasible
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Tim Deegan <tim@xen.org>
* Re: [PATCH 0/2] XSA-74 follow-ups
From: Keir Fraser @ 2013-12-03 9:20 UTC
To: Jan Beulich, xen-devel; Cc: George Dunlap, Andrew Cooper, Keir Fraser, Tim Deegan

On 27/11/2013 08:07, "Jan Beulich" <JBeulich@suse.com> wrote:
> 1: fix locking in offline_page()
> 2: use return value of domain_adjust_tot_pages() where feasible
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Keir Fraser <keir@xen.org>