linuxppc-dev.lists.ozlabs.org archive mirror
From: Michael Ellerman <mpe@ellerman.id.au>
To: Christophe Lombard <clombard@linux.vnet.ibm.com>,
	imunsie@au1.ibm.com, andrew.donnellan@au1.ibm.com,
	fbarrat@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] cxl: Add a kernel thread to check the coherent platform function's state
Date: Tue, 19 Apr 2016 19:47:50 +1000
Message-ID: <1461059270.17661.1.camel@ellerman.id.au>
In-Reply-To: <1460984701-21490-1-git-send-email-clombard@linux.vnet.ibm.com>

On Mon, 2016-04-18 at 15:05 +0200, Christophe Lombard wrote:

> In the POWERVM environment, the PHYP CoherentAccel component manages

"PowerVM" is the correct spelling, I think.

> the state of the Coherent Accelerator Processor Interface adapter and
							   ^
							   (CAPI)
> virtualizes CAPI resources, handles CAPP, PSL, PSL Slice errors - and
> interrupts - and provides a new set of HCALLs for the OS APIs to utilize
					 ^
					 hcall (as below?)
> AFUs.

AFUs ? (you define it below)

> During the course of operation, a coherent platform function can
> encounter errors. Some possible reasons for errors are:
> • Hardware recoverable and unrecoverable errors
> • Transient and over-threshold correctable errors
> 
> PHYP implements its own state model for the coherent platform function.
> The current state of this Accelerator Function Unit (AFU) is available
> through a hcall.
> 
> In case of low-level trouble (or error injection), the PHYP component
> may reset the card and change the AFU state. The PHYP interface doesn't
> provide any way to be notified when that happens.

Ugh.

> The current implementation of the cxl driver, for the POWERVM
> environment, follows the general error recovery procedures required to

What are "the general error recovery procedures" ?

> reset operation of the coherent platform function. The platform firmware
> resets and reconfigures hardware when an external action is required -
> attach/detach a process, link ok, ....

Platform firmware does that at our request or by itself?

> The purpose of this patch is to interact with the external driver

What's an external driver?

> (where the AFU is shown) even if no action is required. A kernel thread

But no action is required, so why do we need to do anything?

> is needed to check every x seconds the current state of the AFU to see
> if we need to enter an error recovery path.


I don't really understand what this is doing or why we want it. It sounds like
we're waking the CPU up every 3 seconds and having it poll the hypervisor, once
for each AFU?
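
For reference, the per-AFU check described above boils down to something like
the following. This is only a rough userspace sketch in C, not the code from
the patch: the state names, struct afu_poller, afu_poller_tick() and the
scripted_query() stand-in for the state hcall are all hypothetical names made
up for illustration, and the real set of states is whatever PHYP reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical AFU states; the real values are defined by the hypervisor. */
enum afu_state {
	AFU_STATE_NORMAL,
	AFU_STATE_DISABLED,
	AFU_STATE_TEMP_ERROR,
	AFU_STATE_PERM_ERROR,
};

/* Stand-in for the hcall that returns the current AFU state. */
typedef enum afu_state (*query_state_fn)(void *ctx);

struct afu_poller {
	query_state_fn query;    /* how to read the state (assumed callback) */
	void *ctx;               /* opaque argument passed to query() */
	enum afu_state last;     /* state observed on the previous poll */
	unsigned int recoveries; /* how often recovery was requested */
};

static void afu_poller_init(struct afu_poller *p, query_state_fn query,
			    void *ctx)
{
	assert(p != NULL && query != NULL);
	p->query = query;
	p->ctx = ctx;
	p->last = AFU_STATE_NORMAL;
	p->recoveries = 0;
}

/*
 * One poll. Returns true only when the state has changed since the last
 * poll and the new state is not "normal", i.e. when the caller would
 * have to enter an error recovery path.
 */
static bool afu_poller_tick(struct afu_poller *p)
{
	enum afu_state now = p->query(p->ctx);
	bool changed = (now != p->last);

	p->last = now;
	if (changed && now != AFU_STATE_NORMAL) {
		p->recoveries++;
		return true;
	}
	return false;
}

/* Test double: replays a fixed sequence, repeating the final entry. */
struct scripted_states {
	const enum afu_state *seq;
	size_t len;
	size_t pos;
};

static enum afu_state scripted_query(void *ctx)
{
	struct scripted_states *s = ctx;
	enum afu_state st = s->seq[s->pos];

	if (s->pos + 1 < s->len)
		s->pos++;
	return st;
}
```

In this model every poll costs one hypervisor query per AFU whether or not
anything has changed, and afu_poller_tick() only reports a transition once,
on the first poll that observes it.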

As for the implementation, I can't see any reason why you need your own
kthreads. Can't you just use queue_work()?

cheers
