* [PATCH 0/2] xen/x86: cmpxchg cleanup @ 2024-02-22 9:05 Roger Pau Monne 2024-02-22 9:05 ` [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop Roger Pau Monne 2024-02-22 9:05 ` [PATCH 2/2] x86/hpet: " Roger Pau Monne 0 siblings, 2 replies; 17+ messages in thread From: Roger Pau Monne @ 2024-02-22 9:05 UTC (permalink / raw) To: xen-devel Cc: Roger Pau Monne, Tamas K Lengyel, Jan Beulich, Andrew Cooper, George Dunlap, Wei Liu Hello, Following series replace a couple of cmpxchg loops with an atomic inc. The usage of such loops probably dates back to 32bit support, which didn't have an instruction to do an atomic 64bit addition. Thanks, Roger. Roger Pau Monne (2): x86/memsharing: use an atomic add instead of a cmpxchg loop x86/hpet: use an atomic add instead of a cmpxchg loop xen/arch/x86/hpet.c | 6 +----- xen/arch/x86/mm/mem_sharing.c | 8 +------- 2 files changed, 2 insertions(+), 12 deletions(-) -- 2.43.0 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop 2024-02-22 9:05 [PATCH 0/2] xen/x86: cmpxchg cleanup Roger Pau Monne @ 2024-02-22 9:05 ` Roger Pau Monne 2024-02-22 10:06 ` Jan Beulich ` (2 more replies) 2024-02-22 9:05 ` [PATCH 2/2] x86/hpet: " Roger Pau Monne 1 sibling, 3 replies; 17+ messages in thread From: Roger Pau Monne @ 2024-02-22 9:05 UTC (permalink / raw) To: xen-devel Cc: Roger Pau Monne, Tamas K Lengyel, Jan Beulich, Andrew Cooper, George Dunlap, Wei Liu The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same can be achieved with an atomic increment, which is both simpler to read, and avoid any need for a loop. The cmpxchg usage is likely a remnant of 32bit support, which didn't have an instruction to do an atomic 64bit add, and instead a cmpxchg had to be used. Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> --- xen/arch/x86/mm/mem_sharing.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c index 4f810706a315..fe299a2bf9aa 100644 --- a/xen/arch/x86/mm/mem_sharing.c +++ b/xen/arch/x86/mm/mem_sharing.c @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg) static shr_handle_t get_next_handle(void) { - /* Get the next handle get_page style */ - uint64_t x, y = next_handle; - do { - x = y; - } - while ( (y = cmpxchg(&next_handle, x, x + 1)) != x ); - return x + 1; + return arch_fetch_and_add(&next_handle, 1) + 1; } static atomic_t nr_saved_mfns = ATOMIC_INIT(0); -- 2.43.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop 2024-02-22 9:05 ` [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop Roger Pau Monne @ 2024-02-22 10:06 ` Jan Beulich 2024-02-22 18:03 ` Tamas K Lengyel 2024-02-22 11:12 ` Julien Grall 2024-02-29 7:37 ` Jan Beulich 2 siblings, 1 reply; 17+ messages in thread From: Jan Beulich @ 2024-02-22 10:06 UTC (permalink / raw) To: Roger Pau Monne, Tamas K Lengyel Cc: Andrew Cooper, George Dunlap, Wei Liu, xen-devel On 22.02.2024 10:05, Roger Pau Monne wrote: > The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same > can be achieved with an atomic increment, which is both simpler to read, and > avoid any need for a loop. > > The cmpxchg usage is likely a remnant of 32bit support, which didn't have an > instruction to do an atomic 64bit add, and instead a cmpxchg had to be used. > > Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> > Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> albeit ... > --- a/xen/arch/x86/mm/mem_sharing.c > +++ b/xen/arch/x86/mm/mem_sharing.c > @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg) > > static shr_handle_t get_next_handle(void) > { > - /* Get the next handle get_page style */ > - uint64_t x, y = next_handle; > - do { > - x = y; > - } > - while ( (y = cmpxchg(&next_handle, x, x + 1)) != x ); > - return x + 1; > + return arch_fetch_and_add(&next_handle, 1) + 1; > } ... the adding of 1 here is a little odd when taken together with next_handle's initializer. Tamas, you've not written that code, but do you have any thoughts towards the possible removal of either the initializer or the adding here? Plus that variable of course could very well do with moving into this function. Jan ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop 2024-02-22 10:06 ` Jan Beulich @ 2024-02-22 18:03 ` Tamas K Lengyel 2024-02-23 7:43 ` Jan Beulich 0 siblings, 1 reply; 17+ messages in thread From: Tamas K Lengyel @ 2024-02-22 18:03 UTC (permalink / raw) To: Jan Beulich Cc: Roger Pau Monne, Andrew Cooper, George Dunlap, Wei Liu, xen-devel On Thu, Feb 22, 2024 at 5:06 AM Jan Beulich <jbeulich@suse.com> wrote: > > On 22.02.2024 10:05, Roger Pau Monne wrote: > > The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same > > can be achieved with an atomic increment, which is both simpler to read, and > > avoid any need for a loop. > > > > The cmpxchg usage is likely a remnant of 32bit support, which didn't have an > > instruction to do an atomic 64bit add, and instead a cmpxchg had to be used. > > > > Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> > > Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> > > Reviewed-by: Jan Beulich <jbeulich@suse.com> > albeit ... > > > --- a/xen/arch/x86/mm/mem_sharing.c > > +++ b/xen/arch/x86/mm/mem_sharing.c > > @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg) > > > > static shr_handle_t get_next_handle(void) > > { > > - /* Get the next handle get_page style */ > > - uint64_t x, y = next_handle; > > - do { > > - x = y; > > - } > > - while ( (y = cmpxchg(&next_handle, x, x + 1)) != x ); > > - return x + 1; > > + return arch_fetch_and_add(&next_handle, 1) + 1; > > } > > ... the adding of 1 here is a little odd when taken together with > next_handle's initializer. Tamas, you've not written that code, but do > you have any thoughts towards the possible removal of either the > initializer or the adding here? Plus that variable of course could > very well do with moving into this function. I have to say I find the existing logic here hard to parse but by the looks I don't think we need the + 1 once we switch to arch_fetch_and_add. Also could go without initializing next_handle to 1. Moving it into the function would not really accomplish anything other than style AFAICT? Tamas ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop 2024-02-22 18:03 ` Tamas K Lengyel @ 2024-02-23 7:43 ` Jan Beulich 2024-02-28 10:53 ` Roger Pau Monné 0 siblings, 1 reply; 17+ messages in thread From: Jan Beulich @ 2024-02-23 7:43 UTC (permalink / raw) To: Tamas K Lengyel Cc: Roger Pau Monne, Andrew Cooper, George Dunlap, Wei Liu, xen-devel On 22.02.2024 19:03, Tamas K Lengyel wrote: > On Thu, Feb 22, 2024 at 5:06 AM Jan Beulich <jbeulich@suse.com> wrote: >> On 22.02.2024 10:05, Roger Pau Monne wrote: >>> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same >>> can be achieved with an atomic increment, which is both simpler to read, and >>> avoid any need for a loop. >>> >>> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an >>> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used. >>> >>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> >>> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> >> >> Reviewed-by: Jan Beulich <jbeulich@suse.com> >> albeit ... >> >>> --- a/xen/arch/x86/mm/mem_sharing.c >>> +++ b/xen/arch/x86/mm/mem_sharing.c >>> @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg) >>> >>> static shr_handle_t get_next_handle(void) >>> { >>> - /* Get the next handle get_page style */ >>> - uint64_t x, y = next_handle; >>> - do { >>> - x = y; >>> - } >>> - while ( (y = cmpxchg(&next_handle, x, x + 1)) != x ); >>> - return x + 1; >>> + return arch_fetch_and_add(&next_handle, 1) + 1; >>> } >> >> ... the adding of 1 here is a little odd when taken together with >> next_handle's initializer. Tamas, you've not written that code, but do >> you have any thoughts towards the possible removal of either the >> initializer or the adding here? Plus that variable of course could >> very well do with moving into this function. > > I have to say I find the existing logic here hard to parse but by the > looks I don't think we need the + 1 once we switch to > arch_fetch_and_add. Also could go without initializing next_handle to > 1. Moving it into the function would not really accomplish anything > other than style AFAICT? Well, limiting scope of things can be viewed as purely style, but I think it's more than that: It makes intentions more clear and reduces the chance of abuse (deliberate or unintentional). Jan ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop 2024-02-23 7:43 ` Jan Beulich @ 2024-02-28 10:53 ` Roger Pau Monné 2024-02-28 11:18 ` Jan Beulich 0 siblings, 1 reply; 17+ messages in thread From: Roger Pau Monné @ 2024-02-28 10:53 UTC (permalink / raw) To: Jan Beulich Cc: Tamas K Lengyel, Andrew Cooper, George Dunlap, Wei Liu, xen-devel On Fri, Feb 23, 2024 at 08:43:24AM +0100, Jan Beulich wrote: > On 22.02.2024 19:03, Tamas K Lengyel wrote: > > On Thu, Feb 22, 2024 at 5:06 AM Jan Beulich <jbeulich@suse.com> wrote: > >> On 22.02.2024 10:05, Roger Pau Monne wrote: > >>> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same > >>> can be achieved with an atomic increment, which is both simpler to read, and > >>> avoid any need for a loop. > >>> > >>> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an > >>> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used. > >>> > >>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> > >>> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> > >> > >> Reviewed-by: Jan Beulich <jbeulich@suse.com> > >> albeit ... > >> > >>> --- a/xen/arch/x86/mm/mem_sharing.c > >>> +++ b/xen/arch/x86/mm/mem_sharing.c > >>> @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg) > >>> > >>> static shr_handle_t get_next_handle(void) > >>> { > >>> - /* Get the next handle get_page style */ > >>> - uint64_t x, y = next_handle; > >>> - do { > >>> - x = y; > >>> - } > >>> - while ( (y = cmpxchg(&next_handle, x, x + 1)) != x ); > >>> - return x + 1; > >>> + return arch_fetch_and_add(&next_handle, 1) + 1; > >>> } > >> > >> ... the adding of 1 here is a little odd when taken together with > >> next_handle's initializer. Tamas, you've not written that code, but do > >> you have any thoughts towards the possible removal of either the > >> initializer or the adding here? Plus that variable of course could > >> very well do with moving into this function. > > > > I have to say I find the existing logic here hard to parse but by the > > looks I don't think we need the + 1 once we switch to > > arch_fetch_and_add. Also could go without initializing next_handle to > > 1. Moving it into the function would not really accomplish anything > > other than style AFAICT? > > Well, limiting scope of things can be viewed as purely style, but I > think it's more than that: It makes intentions more clear and reduces > the chance of abuse (deliberate or unintentional). I'm afraid that whatever is the outcome here, I will defer it to a further commit, since the purpose here is to be a non-functional change. Thanks, Roger. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop 2024-02-28 10:53 ` Roger Pau Monné @ 2024-02-28 11:18 ` Jan Beulich 2024-02-28 13:28 ` Roger Pau Monné 0 siblings, 1 reply; 17+ messages in thread From: Jan Beulich @ 2024-02-28 11:18 UTC (permalink / raw) To: Roger Pau Monné Cc: Tamas K Lengyel, Andrew Cooper, George Dunlap, Wei Liu, xen-devel On 28.02.2024 11:53, Roger Pau Monné wrote: > On Fri, Feb 23, 2024 at 08:43:24AM +0100, Jan Beulich wrote: >> On 22.02.2024 19:03, Tamas K Lengyel wrote: >>> On Thu, Feb 22, 2024 at 5:06 AM Jan Beulich <jbeulich@suse.com> wrote: >>>> On 22.02.2024 10:05, Roger Pau Monne wrote: >>>>> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same >>>>> can be achieved with an atomic increment, which is both simpler to read, and >>>>> avoid any need for a loop. >>>>> >>>>> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an >>>>> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used. >>>>> >>>>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> >>>>> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> >>>> >>>> Reviewed-by: Jan Beulich <jbeulich@suse.com> >>>> albeit ... >>>> >>>>> --- a/xen/arch/x86/mm/mem_sharing.c >>>>> +++ b/xen/arch/x86/mm/mem_sharing.c >>>>> @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg) >>>>> >>>>> static shr_handle_t get_next_handle(void) >>>>> { >>>>> - /* Get the next handle get_page style */ >>>>> - uint64_t x, y = next_handle; >>>>> - do { >>>>> - x = y; >>>>> - } >>>>> - while ( (y = cmpxchg(&next_handle, x, x + 1)) != x ); >>>>> - return x + 1; >>>>> + return arch_fetch_and_add(&next_handle, 1) + 1; >>>>> } >>>> >>>> ... the adding of 1 here is a little odd when taken together with >>>> next_handle's initializer. Tamas, you've not written that code, but do >>>> you have any thoughts towards the possible removal of either the >>>> initializer or the adding here? Plus that variable of course could >>>> very well do with moving into this function. >>> >>> I have to say I find the existing logic here hard to parse but by the >>> looks I don't think we need the + 1 once we switch to >>> arch_fetch_and_add. Also could go without initializing next_handle to >>> 1. Moving it into the function would not really accomplish anything >>> other than style AFAICT? >> >> Well, limiting scope of things can be viewed as purely style, but I >> think it's more than that: It makes intentions more clear and reduces >> the chance of abuse (deliberate or unintentional). > > I'm afraid that whatever is the outcome here, I will defer it to a > further commit, since the purpose here is to be a non-functional > change. That's fine with me, but an ack from Tamas is still pending, unless I missed something somewhere. Jan ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop 2024-02-28 11:18 ` Jan Beulich @ 2024-02-28 13:28 ` Roger Pau Monné 2024-02-28 13:38 ` Tamas K Lengyel 0 siblings, 1 reply; 17+ messages in thread From: Roger Pau Monné @ 2024-02-28 13:28 UTC (permalink / raw) To: Jan Beulich Cc: Tamas K Lengyel, Andrew Cooper, George Dunlap, Wei Liu, xen-devel On Wed, Feb 28, 2024 at 12:18:31PM +0100, Jan Beulich wrote: > On 28.02.2024 11:53, Roger Pau Monné wrote: > > On Fri, Feb 23, 2024 at 08:43:24AM +0100, Jan Beulich wrote: > >> On 22.02.2024 19:03, Tamas K Lengyel wrote: > >>> On Thu, Feb 22, 2024 at 5:06 AM Jan Beulich <jbeulich@suse.com> wrote: > >>>> On 22.02.2024 10:05, Roger Pau Monne wrote: > >>>>> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same > >>>>> can be achieved with an atomic increment, which is both simpler to read, and > >>>>> avoid any need for a loop. > >>>>> > >>>>> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an > >>>>> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used. > >>>>> > >>>>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> > >>>>> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> > >>>> > >>>> Reviewed-by: Jan Beulich <jbeulich@suse.com> > >>>> albeit ... > >>>> > >>>>> --- a/xen/arch/x86/mm/mem_sharing.c > >>>>> +++ b/xen/arch/x86/mm/mem_sharing.c > >>>>> @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg) > >>>>> > >>>>> static shr_handle_t get_next_handle(void) > >>>>> { > >>>>> - /* Get the next handle get_page style */ > >>>>> - uint64_t x, y = next_handle; > >>>>> - do { > >>>>> - x = y; > >>>>> - } > >>>>> - while ( (y = cmpxchg(&next_handle, x, x + 1)) != x ); > >>>>> - return x + 1; > >>>>> + return arch_fetch_and_add(&next_handle, 1) + 1; > >>>>> } > >>>> > >>>> ... the adding of 1 here is a little odd when taken together with > >>>> next_handle's initializer. Tamas, you've not written that code, but do > >>>> you have any thoughts towards the possible removal of either the > >>>> initializer or the adding here? Plus that variable of course could > >>>> very well do with moving into this function. > >>> > >>> I have to say I find the existing logic here hard to parse but by the > >>> looks I don't think we need the + 1 once we switch to > >>> arch_fetch_and_add. Also could go without initializing next_handle to > >>> 1. Moving it into the function would not really accomplish anything > >>> other than style AFAICT? > >> > >> Well, limiting scope of things can be viewed as purely style, but I > >> think it's more than that: It makes intentions more clear and reduces > >> the chance of abuse (deliberate or unintentional). > > > > I'm afraid that whatever is the outcome here, I will defer it to a > > further commit, since the purpose here is to be a non-functional > > change. > > That's fine with me, but an ack from Tamas is still pending, unless I > missed something somewhere. No, just wanted to clarify that I wasn't expecting to do further changes here, FTAOD. Not sure if Tamas was expecting me to further adjust the code. Thanks, Roger. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop 2024-02-28 13:28 ` Roger Pau Monné @ 2024-02-28 13:38 ` Tamas K Lengyel 0 siblings, 0 replies; 17+ messages in thread From: Tamas K Lengyel @ 2024-02-28 13:38 UTC (permalink / raw) To: Roger Pau Monné Cc: Jan Beulich, Andrew Cooper, George Dunlap, Wei Liu, xen-devel On Wed, Feb 28, 2024 at 8:28 AM Roger Pau Monné <roger.pau@citrix.com> wrote: > > On Wed, Feb 28, 2024 at 12:18:31PM +0100, Jan Beulich wrote: > > On 28.02.2024 11:53, Roger Pau Monné wrote: > > > On Fri, Feb 23, 2024 at 08:43:24AM +0100, Jan Beulich wrote: > > >> On 22.02.2024 19:03, Tamas K Lengyel wrote: > > >>> On Thu, Feb 22, 2024 at 5:06 AM Jan Beulich <jbeulich@suse.com> wrote: > > >>>> On 22.02.2024 10:05, Roger Pau Monne wrote: > > >>>>> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same > > >>>>> can be achieved with an atomic increment, which is both simpler to read, and > > >>>>> avoid any need for a loop. > > >>>>> > > >>>>> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an > > >>>>> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used. > > >>>>> > > >>>>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> > > >>>>> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> > > >>>> > > >>>> Reviewed-by: Jan Beulich <jbeulich@suse.com> > > >>>> albeit ... > > >>>> > > >>>>> --- a/xen/arch/x86/mm/mem_sharing.c > > >>>>> +++ b/xen/arch/x86/mm/mem_sharing.c > > >>>>> @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg) > > >>>>> > > >>>>> static shr_handle_t get_next_handle(void) > > >>>>> { > > >>>>> - /* Get the next handle get_page style */ > > >>>>> - uint64_t x, y = next_handle; > > >>>>> - do { > > >>>>> - x = y; > > >>>>> - } > > >>>>> - while ( (y = cmpxchg(&next_handle, x, x + 1)) != x ); > > >>>>> - return x + 1; > > >>>>> + return arch_fetch_and_add(&next_handle, 1) + 1; > > >>>>> } > > >>>> > > >>>> ... the adding of 1 here is a little odd when taken together with > > >>>> next_handle's initializer. Tamas, you've not written that code, but do > > >>>> you have any thoughts towards the possible removal of either the > > >>>> initializer or the adding here? Plus that variable of course could > > >>>> very well do with moving into this function. > > >>> > > >>> I have to say I find the existing logic here hard to parse but by the > > >>> looks I don't think we need the + 1 once we switch to > > >>> arch_fetch_and_add. Also could go without initializing next_handle to > > >>> 1. Moving it into the function would not really accomplish anything > > >>> other than style AFAICT? > > >> > > >> Well, limiting scope of things can be viewed as purely style, but I > > >> think it's more than that: It makes intentions more clear and reduces > > >> the chance of abuse (deliberate or unintentional). > > > > > > I'm afraid that whatever is the outcome here, I will defer it to a > > > further commit, since the purpose here is to be a non-functional > > > change. > > > > That's fine with me, but an ack from Tamas is still pending, unless I > > missed something somewhere. > > No, just wanted to clarify that I wasn't expecting to do further > changes here, FTAOD. Not sure if Tamas was expecting me to further > adjust the code. Acked-by: Tamas K Lengyel <tamas@tklengyel.com> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop 2024-02-22 9:05 ` [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop Roger Pau Monne 2024-02-22 10:06 ` Jan Beulich @ 2024-02-22 11:12 ` Julien Grall 2024-02-29 7:37 ` Jan Beulich 2 siblings, 0 replies; 17+ messages in thread From: Julien Grall @ 2024-02-22 11:12 UTC (permalink / raw) To: Roger Pau Monne, xen-devel Cc: Tamas K Lengyel, Jan Beulich, Andrew Cooper, George Dunlap, Wei Liu Hi Roger, On 22/02/2024 09:05, Roger Pau Monne wrote: > The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same > can be achieved with an atomic increment, which is both simpler to read, and > avoid any need for a loop. > > The cmpxchg usage is likely a remnant of 32bit support, which didn't have an > instruction to do an atomic 64bit add, and instead a cmpxchg had to be used. > > Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> > Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Julien Grall <jgrall@amazon.com> Cheers, -- Julien Grall ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop 2024-02-22 9:05 ` [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop Roger Pau Monne 2024-02-22 10:06 ` Jan Beulich 2024-02-22 11:12 ` Julien Grall @ 2024-02-29 7:37 ` Jan Beulich 2 siblings, 0 replies; 17+ messages in thread From: Jan Beulich @ 2024-02-29 7:37 UTC (permalink / raw) To: Roger Pau Monne Cc: Tamas K Lengyel, Andrew Cooper, George Dunlap, Wei Liu, xen-devel On 22.02.2024 10:05, Roger Pau Monne wrote: > The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same > can be achieved with an atomic increment, which is both simpler to read, and > avoid any need for a loop. > > The cmpxchg usage is likely a remnant of 32bit support, which didn't have an > instruction to do an atomic 64bit add, and instead a cmpxchg had to be used. > > Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> > Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> Hmm, same typo again, so my commit script again got the author wrong. I'm sorry again. I would consider adding an explicit confirmation step when there's no (well-formed) S-o-b, but I use this script elsewhere as well, and in e.g. binutils and gcc S-o-b are pretty uncommon. (And to avoid the question: I'm specifically avoiding From: itself in the common case, as for you and other people with non-ASCII characters in their names those typically don't properly show up there.) Jan ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 2/2] x86/hpet: use an atomic add instead of a cmpxchg loop 2024-02-22 9:05 [PATCH 0/2] xen/x86: cmpxchg cleanup Roger Pau Monne 2024-02-22 9:05 ` [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop Roger Pau Monne @ 2024-02-22 9:05 ` Roger Pau Monne 2024-02-22 10:10 ` Jan Beulich ` (2 more replies) 1 sibling, 3 replies; 17+ messages in thread From: Roger Pau Monne @ 2024-02-22 9:05 UTC (permalink / raw) To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu The usage of a cmpxchg loop in hpet_get_channel() is unnecessary, as the same can be achieved with an atomic increment, which is both simpler to read, and avoid any need for a loop. Note there can be a small divergence in the channel returned if next_channel overflows, but returned channel will always be in the [0, num_hpets_used) range, and that's fine for the purpose of balancing HPET channels across CPUs. This is also theoretical, as there's no system currently with 2^32 CPUs (as long as next_channel is 32bit width). Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> --- xen/arch/x86/hpet.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c index d982b0f6b2c9..4777dc859d96 100644 --- a/xen/arch/x86/hpet.c +++ b/xen/arch/x86/hpet.c @@ -458,11 +458,7 @@ static struct hpet_event_channel *hpet_get_channel(unsigned int cpu) if ( num_hpets_used >= nr_cpu_ids ) return &hpet_events[cpu]; - do { - next = next_channel; - if ( (i = next + 1) == num_hpets_used ) - i = 0; - } while ( cmpxchg(&next_channel, next, i) != next ); + next = arch_fetch_and_add(&next_channel, 1) % num_hpets_used; /* try unused channel first */ for ( i = next; i < next + num_hpets_used; i++ ) -- 2.43.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] x86/hpet: use an atomic add instead of a cmpxchg loop 2024-02-22 9:05 ` [PATCH 2/2] x86/hpet: " Roger Pau Monne @ 2024-02-22 10:10 ` Jan Beulich 2024-02-22 10:58 ` Roger Pau Monné 2024-02-22 11:16 ` Julien Grall 2024-02-26 9:29 ` Jan Beulich 2 siblings, 1 reply; 17+ messages in thread From: Jan Beulich @ 2024-02-22 10:10 UTC (permalink / raw) To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel On 22.02.2024 10:05, Roger Pau Monne wrote: > The usage of a cmpxchg loop in hpet_get_channel() is unnecessary, as the same > can be achieved with an atomic increment, which is both simpler to read, and > avoid any need for a loop. > > Note there can be a small divergence in the channel returned if next_channel > overflows, but returned channel will always be in the [0, num_hpets_used) > range, and that's fine for the purpose of balancing HPET channels across CPUs. > This is also theoretical, as there's no system currently with 2^32 CPUs (as > long as next_channel is 32bit width). The code change looks good to me, but I fail to see the connection to 2^32 CPUs. So it feels like I'm missing something, in which case I'd rather not offer any R-b. Jan > Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> > --- > xen/arch/x86/hpet.c | 6 +----- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c > index d982b0f6b2c9..4777dc859d96 100644 > --- a/xen/arch/x86/hpet.c > +++ b/xen/arch/x86/hpet.c > @@ -458,11 +458,7 @@ static struct hpet_event_channel *hpet_get_channel(unsigned int cpu) > if ( num_hpets_used >= nr_cpu_ids ) > return &hpet_events[cpu]; > > - do { > - next = next_channel; > - if ( (i = next + 1) == num_hpets_used ) > - i = 0; > - } while ( cmpxchg(&next_channel, next, i) != next ); > + next = arch_fetch_and_add(&next_channel, 1) % num_hpets_used; > > /* try unused channel first */ > for ( i = next; i < next + num_hpets_used; i++ ) ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] x86/hpet: use an atomic add instead of a cmpxchg loop 2024-02-22 10:10 ` Jan Beulich @ 2024-02-22 10:58 ` Roger Pau Monné 2024-02-22 11:02 ` Jan Beulich 0 siblings, 1 reply; 17+ messages in thread From: Roger Pau Monné @ 2024-02-22 10:58 UTC (permalink / raw) To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel On Thu, Feb 22, 2024 at 11:10:54AM +0100, Jan Beulich wrote: > On 22.02.2024 10:05, Roger Pau Monne wrote: > > The usage of a cmpxchg loop in hpet_get_channel() is unnecessary, as the same > > can be achieved with an atomic increment, which is both simpler to read, and > > avoid any need for a loop. > > > > Note there can be a small divergence in the channel returned if next_channel > > overflows, but returned channel will always be in the [0, num_hpets_used) > > range, and that's fine for the purpose of balancing HPET channels across CPUs. > > This is also theoretical, as there's no system currently with 2^32 CPUs (as > > long as next_channel is 32bit width). > > The code change looks good to me, but I fail to see the connection to > 2^32 CPUs. So it feels like I'm missing something, in which case I'd > rather not offer any R-b. The purpose of hpet_get_channel() is to distribute HPET channels across CPUs, so that each CPU gets assigned an HPET channel, balancing the number of CPUs that use each channel. This is done by (next_channel++ % num_hpets_used), so the value of next_channel after this change can be > num_hpets_used, which previously wasn't. This can lead to a different channel returned for the 2^32 call to the function, as the counter would overflow. Note calls to the function are restricted to the number of CPUs in the host, as per-CPU channel assignment is done only once (and the channel is then stored in a per-cpu variable). Feel free to adjust the commit message if you think all this is too much data, or too confusing. Thanks, Roger. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] x86/hpet: use an atomic add instead of a cmpxchg loop 2024-02-22 10:58 ` Roger Pau Monné @ 2024-02-22 11:02 ` Jan Beulich 0 siblings, 0 replies; 17+ messages in thread From: Jan Beulich @ 2024-02-22 11:02 UTC (permalink / raw) To: Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu, xen-devel On 22.02.2024 11:58, Roger Pau Monné wrote: > On Thu, Feb 22, 2024 at 11:10:54AM +0100, Jan Beulich wrote: >> On 22.02.2024 10:05, Roger Pau Monne wrote: >>> The usage of a cmpxchg loop in hpet_get_channel() is unnecessary, as the same >>> can be achieved with an atomic increment, which is both simpler to read, and >>> avoid any need for a loop. >>> >>> Note there can be a small divergence in the channel returned if next_channel >>> overflows, but returned channel will always be in the [0, num_hpets_used) >>> range, and that's fine for the purpose of balancing HPET channels across CPUs. >>> This is also theoretical, as there's no system currently with 2^32 CPUs (as >>> long as next_channel is 32bit width). >> >> The code change looks good to me, but I fail to see the connection to >> 2^32 CPUs. So it feels like I'm missing something, in which case I'd >> rather not offer any R-b. > > The purpose of hpet_get_channel() is to distribute HPET channels > across CPUs, so that each CPU gets assigned an HPET channel, balancing > the number of CPUs that use each channel. > > This is done by (next_channel++ % num_hpets_used), so the value of > next_channel after this change can be > num_hpets_used, which > previously wasn't. This can lead to a different channel returned for > the 2^32 call to the function, as the counter would overflow. Note > calls to the function are restricted to the number of CPUs in the > host, as per-CPU channel assignment is done only once (and the channel > is then stored in a per-cpu variable). That's once per CPU being brought up, not once per CPU in the system. > Feel free to adjust the commit message if you think all this is too > much data, or too confusing. I'll simply drop that last sentence then, without which Reviewed-by: Jan Beulich <jbeulich@suse.com> Jan ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] x86/hpet: use an atomic add instead of a cmpxchg loop 2024-02-22 9:05 ` [PATCH 2/2] x86/hpet: " Roger Pau Monne 2024-02-22 10:10 ` Jan Beulich @ 2024-02-22 11:16 ` Julien Grall 2024-02-26 9:29 ` Jan Beulich 2 siblings, 0 replies; 17+ messages in thread From: Julien Grall @ 2024-02-22 11:16 UTC (permalink / raw) To: Roger Pau Monne, xen-devel; +Cc: Jan Beulich, Andrew Cooper, Wei Liu Hi Roger, On 22/02/2024 09:05, Roger Pau Monne wrote: > The usage of a cmpxchg loop in hpet_get_channel() is unnecessary, as the same > can be achieved with an atomic increment, which is both simpler to read, and > avoid any need for a loop. > > Note there can be a small divergence in the channel returned if next_channel > overflows, but returned channel will always be in the [0, num_hpets_used) > range, and that's fine for the purpose of balancing HPET channels across CPUs. > This is also theoretical, as there's no system currently with 2^32 CPUs (as > long as next_channel is 32bit width). This would also happen if we decide to reduce the size next_channel. Reviewed-by: Julien Grall <jgrall@amazon.com> Cheers, -- Julien Grall ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] x86/hpet: use an atomic add instead of a cmpxchg loop 2024-02-22 9:05 ` [PATCH 2/2] x86/hpet: " Roger Pau Monne 2024-02-22 10:10 ` Jan Beulich 2024-02-22 11:16 ` Julien Grall @ 2024-02-26 9:29 ` Jan Beulich 2 siblings, 0 replies; 17+ messages in thread From: Jan Beulich @ 2024-02-26 9:29 UTC (permalink / raw) To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel On 22.02.2024 10:05, Roger Pau Monne wrote: > The usage of a cmpxchg loop in hpet_get_channel() is unnecessary, as the same > can be achieved with an atomic increment, which is both simpler to read, and > avoid any need for a loop. > > Note there can be a small divergence in the channel returned if next_channel > overflows, but returned channel will always be in the [0, num_hpets_used) > range, and that's fine for the purpose of balancing HPET channels across CPUs. > This is also theoretical, as there's no system currently with 2^32 CPUs (as > long as next_channel is 32bit width). > > Signed-of-by: Roger Pau Monné <roger.pau@citrix.com> Hmm, I'm sorry - it's now me who is recorded as the author of the patch, for my script not finding any Signed-off-by: here (note the typo). Jan ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2024-02-29 7:38 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-02-22 9:05 [PATCH 0/2] xen/x86: cmpxchg cleanup Roger Pau Monne 2024-02-22 9:05 ` [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop Roger Pau Monne 2024-02-22 10:06 ` Jan Beulich 2024-02-22 18:03 ` Tamas K Lengyel 2024-02-23 7:43 ` Jan Beulich 2024-02-28 10:53 ` Roger Pau Monné 2024-02-28 11:18 ` Jan Beulich 2024-02-28 13:28 ` Roger Pau Monné 2024-02-28 13:38 ` Tamas K Lengyel 2024-02-22 11:12 ` Julien Grall 2024-02-29 7:37 ` Jan Beulich 2024-02-22 9:05 ` [PATCH 2/2] x86/hpet: " Roger Pau Monne 2024-02-22 10:10 ` Jan Beulich 2024-02-22 10:58 ` Roger Pau Monné 2024-02-22 11:02 ` Jan Beulich 2024-02-22 11:16 ` Julien Grall 2024-02-26 9:29 ` Jan Beulich
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.