Re: [PATCH v2 11/19] powerpc/rtas: add work area allocator

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Nathan Lynch <nathanl@linux.ibm.com>
To: Michael Ellerman <mpe@ellerman.id.au>,
	Nathan Lynch via B4 Submission Endpoint
	<devnull+nathanl.linux.ibm.com@kernel.org>,
	Nicholas Piggin <npiggin@gmail.com>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	Kajol Jain <kjain@linux.ibm.com>,
	Laurent Dufour <ldufour@linux.ibm.com>,
	Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
	Andrew Donnellan <ajd@linux.ibm.com>,
	Nick Child <nnac123@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v2 11/19] powerpc/rtas: add work area allocator
Date: Wed, 08 Feb 2023 08:48:01 -0600	[thread overview]
Message-ID: <87r0v0nrha.fsf@linux.ibm.com> (raw)
In-Reply-To: <87o7q4wera.fsf@mpe.ellerman.id.au>

Michael Ellerman <mpe@ellerman.id.au> writes:
> Nathan Lynch via B4 Submission Endpoint
> <devnull+nathanl.linux.ibm.com@kernel.org> writes:
>> diff --git a/arch/powerpc/include/asm/rtas-work-area.h b/arch/powerpc/include/asm/rtas-work-area.h
>> new file mode 100644
>> index 000000000000..76ccb039cc37
>> --- /dev/null
>> +++ b/arch/powerpc/include/asm/rtas-work-area.h
>> @@ -0,0 +1,45 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +#ifndef POWERPC_RTAS_WORK_AREA_H
>> +#define POWERPC_RTAS_WORK_AREA_H
>
> The usual style would be _ASM_POWERPC_RTAS_WORK_AREA_H.

OK. (will change in all new headers)

>> +static struct rtas_work_area_allocator_state {
>> +	struct gen_pool *gen_pool;
>> +	char *arena;
>> +	struct mutex mutex; /* serializes allocations */
>> +	struct wait_queue_head wqh;
>> +	mempool_t descriptor_pool;
>> +	bool available;
>> +} rwa_state_ = {
>> +	.mutex = __MUTEX_INITIALIZER(rwa_state_.mutex),
>> +	.wqh = __WAIT_QUEUE_HEAD_INITIALIZER(rwa_state_.wqh),
>> +};
>> +static struct rtas_work_area_allocator_state *rwa_state = &rwa_state_;
>
> I assumed the pointer was so you could swap this out at runtime or
> something, but I don't think you do.
>
> Any reason not to drop the pointer and just use rwa_state.foo accessors?
> That would also allow the struct to be anonymous.
>
> Or if you have the pointer you can at least make it NULL prior to init
> and avoid the need for "available".

I think it's there because earlier versions of this that I never posted
had unit tests. I'll either resurrect those or reduce the indirection.


>> +/*
>> + * A single work area buffer and descriptor to serve requests early in
>> + * boot before the allocator is fully initialized.
>> + */
>> +static bool early_work_area_in_use __initdata;
>> +static char early_work_area_buf[SZ_4K] __initdata;
>
> That should be page aligned I think?

Yes. It happens to be safe in this version because ibm,get-system-parameter,
which has no alignment requirement, is the only function used early
enough to use the buffer. But that's too fragile.


>> +static struct rtas_work_area early_work_area __initdata = {
>> +	.buf = early_work_area_buf,
>> +	.size = sizeof(early_work_area_buf),
>> +};
>> +
>> +
>> +static struct rtas_work_area * __init rtas_work_area_alloc_early(size_t size)
>> +{
>> +	WARN_ON(size > early_work_area.size);
>> +	WARN_ON(early_work_area_in_use);
>> +	early_work_area_in_use = true;
>> +	memset(early_work_area.buf, 0, early_work_area.size);
>> +	return &early_work_area;
>> +}
>> +
>> +static void __init rtas_work_area_free_early(struct rtas_work_area *work_area)
>> +{
>> +	WARN_ON(work_area != &early_work_area);
>> +	WARN_ON(!early_work_area_in_use);
>> +	early_work_area_in_use = false;
>> +}
>> +
>> +struct rtas_work_area * __ref rtas_work_area_alloc(size_t size)
>> +{
>> +	struct rtas_work_area *area;
>> +	unsigned long addr;
>> +
>> +	might_sleep();
>> +
>> +	WARN_ON(size > RTAS_WORK_AREA_MAX_ALLOC_SZ);
>> +	size = min_t(size_t, size, RTAS_WORK_AREA_MAX_ALLOC_SZ);
>
> This seems unsafe.
>
> If you return a buffer smaller than the caller asks for they're likely
> to read/write past the end of it and corrupt memory.

OK, let's figure out another way to handle this.

> AFAIK genalloc doesn't have guard pages or anything fancy to save us
> from that - but maybe I'm wrong, I've never used it.

Yeah we would have to build our own thing on top of it. And I don't
think it could be something that traps on access, it would have to be a
check in rtas_work_area_free(), after the fact.

> There's only three callers in the end, seems like we should just return
> NULL if the size is too large and have callers check the return value.

There are more conversions to do, and a property I hope to maintain is
that requests can't fail. Existing users of rtas_data_buf don't have
error paths for failure to acquire the buffer.

I believe the allocation size passed to rtas_work_area_alloc() can be
known at build time in all cases. Maybe we could prevent inappropriate
requests from being built with a compile-time assertion (untested):

/* rtas-work-area.h */

static inline struct rtas_work_area *rtas_work_area_alloc(size_t sz)
{
	static_assert(sz < RTAS_WORK_AREA_MAX_ALLOC_SZ);
        return __rtas_work_area_alloc(sz);
}

I think this would be OK? If I can't make it work I'll fall back to
returning NULL as you suggest, but it will make for more churn (and
risk) in the conversions.


>> +	if (!rwa_state->available) {
>> +		area = rtas_work_area_alloc_early(size);
>> +		goto out;
>> +	}
>> +
>> +	/*
>> +	 * To ensure FCFS behavior and prevent a high rate of smaller
>> +	 * requests from starving larger ones, use the mutex to queue
>> +	 * allocations.
>> +	 */
>> +	mutex_lock(&rwa_state->mutex);
>> +	wait_event(rwa_state->wqh,
>> +		   (addr = gen_pool_alloc(rwa_state->gen_pool, size)) != 0);
>> +	mutex_unlock(&rwa_state->mutex);
>> +
>> +	area = mempool_alloc(&rwa_state->descriptor_pool, GFP_KERNEL);
>> +	*area = (typeof(*area)){
>> +		.size = size,
>> +		.buf = (char *)addr,
>> +	};
>
> That is an odd way to write that :)

yeah I'll change it.

>
>> +out:
>> +	pr_devel("%ps -> %s() -> buf=%p size=%zu\n",
>> +		 (void *)_RET_IP_, __func__, area->buf, area->size);
>
> Can we drop those? They need a recompile to enable, so if someone needs
> debugging they can just rewrite them - or use some sort of tracing
> instead.

Sure.


>> +machine_arch_initcall(pseries, rtas_work_area_allocator_init);
>
> Should it live in platforms/pseries then?

Yeah it probably ought to. I am pretty sure the "work area" construct is
PAPR-specific, and I haven't found any evidence that it's used on
non-pseries.


>> +/**
>> + * rtas_work_area_reserve_arena() - reserve memory suitable for RTAS work areas.
>> + */
>> +int __init rtas_work_area_reserve_arena(const phys_addr_t limit)
>> +{
>> +	const phys_addr_t align = RTAS_WORK_AREA_ARENA_ALIGN;
>> +	const phys_addr_t size = RTAS_WORK_AREA_ARENA_SZ;
>> +	const phys_addr_t min = MEMBLOCK_LOW_LIMIT;
>> +	const int nid = NUMA_NO_NODE;
>
> This should probably also be restricted to pseries?

Yes.

next prev parent reply	other threads:[~2023-02-08 14:49 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-06 18:54 [PATCH v2 00/19] RTAS maintenance Nathan Lynch
2023-02-06 18:54 ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 01/19] powerpc/rtas: handle extended delays safely in early boot Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-08 11:20   ` Michael Ellerman
2023-02-08 13:14     ` Nathan Lynch
2023-02-10  5:54       ` Michael Ellerman
2023-02-06 18:54 ` [PATCH v2 02/19] powerpc/perf/hv-24x7: add missing RTAS retry status handling Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 03/19] powerpc/pseries/lpar: " Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 04/19] powerpc/pseries/lparcfg: " Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 05/19] powerpc/pseries/setup: " Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 06/19] powerpc/pseries: drop RTAS-based timebase synchronization Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 07/19] powerpc/rtas: improve function information lookups Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-08 11:57   ` Michael Ellerman
2023-02-08 13:16     ` Nathan Lynch
2023-02-06 18:54 ` [PATCH v2 08/19] powerpc/rtas: strengthen do_enter_rtas() type safety, drop inline Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 09/19] powerpc/tracing: tracepoints for RTAS entry and exit Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 10/19] powerpc/rtas: add tracepoints around RTAS entry Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 11/19] powerpc/rtas: add work area allocator Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-08 11:58   ` Michael Ellerman
2023-02-08 14:48     ` Nathan Lynch [this message]
2023-02-10  6:07       ` Michael Ellerman
2023-02-06 18:54 ` [PATCH v2 12/19] powerpc/pseries/dlpar: use RTAS work area API Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 13/19] powerpc/pseries: PAPR system parameter API Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 14/19] powerpc/pseries: convert CMO probe to papr_sysparm API Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 15/19] powerpc/pseries/lparcfg: convert " Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 16/19] powerpc/pseries/hv-24x7: " Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 17/19] powerpc/pseries/lpar: " Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 18/19] powerpc/rtas: introduce rtas_function_token() API Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint
2023-02-08 12:09   ` Michael Ellerman
2023-02-08 15:44     ` Nathan Lynch
2023-02-06 18:54 ` [PATCH v2 19/19] powerpc/rtas: arch-wide function token lookup conversions Nathan Lynch
2023-02-06 18:54   ` Nathan Lynch via B4 Submission Endpoint

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r0v0nrha.fsf@linux.ibm.com \
    --to=nathanl@linux.ibm.com \
    --cc=ajd@linux.ibm.com \
    --cc=christophe.leroy@csgroup.eu \
    --cc=devnull+nathanl.linux.ibm.com@kernel.org \
    --cc=kjain@linux.ibm.com \
    --cc=ldufour@linux.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mahesh@linux.ibm.com \
    --cc=mpe@ellerman.id.au \
    --cc=nnac123@linux.ibm.com \
    --cc=npiggin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.