From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4F2C6E65D3F for ; Fri, 22 Nov 2024 09:55:52 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CE36710EB27; Fri, 22 Nov 2024 09:55:51 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="BGdP+vkf"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) by gabe.freedesktop.org (Postfix) with ESMTPS id BE4F910E22C for ; Fri, 22 Nov 2024 09:55:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1732269350; x=1763805350; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=Z+rSgWuN1QTOXN5n/4E6GS1asaXVmOvoqeR8GOAzRAk=; b=BGdP+vkfwXu4HWSI8kulUyIEpjOhSjG0ZocDvU0ZWHbbp9/yyDb28M/Y huJGNT/Po3n9q51HUWVwyRga+md8by8InTaMvCUFl1Hh2RGLHIaVmOAlO U8qOTn0TDsf7acNUCUJYs3BHVl4hQAhvyG9eqisJ+iJZeuj0eKi5CBj1Y pFYxB1mu2Gka+vkzICw6Wq4dly57ZAyxhpOUOZf+PBPyUMIIN+VTFas2C C/A3Pm4xgLr4sHVFk4qK/N6HGWFcy3SAU4esIAygQXHs+LS98Orp8ojS1 2lcMmBfViyKBnWardlj0GsCsF3aP5S1OONZ0FVZAhRWF9/E2G6kZtF9Py w==; X-CSE-ConnectionGUID: DWKazbtvROWc8ou78NvLBQ== X-CSE-MsgGUID: i7lgCMWpSla0EupQwe9bHA== X-IronPort-AV: E=McAfee;i="6700,10204,11263"; a="57819685" X-IronPort-AV: E=Sophos;i="6.12,175,1728975600"; d="scan'208";a="57819685" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Nov 2024 01:55:49 -0800 X-CSE-ConnectionGUID: p0WhUiG/S2ueF9lDV5TVAg== X-CSE-MsgGUID: l3bqrj4OQHKVMNnuOfyWag== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,199,1725346800"; d="scan'208";a="95587913" Received: from bergbenj-mobl1.ger.corp.intel.com (HELO [10.245.244.9]) ([10.245.244.9]) by ORVIESA003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Nov 2024 01:55:47 -0800 Message-ID: Date: Fri, 22 Nov 2024 10:55:43 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH i-g-t v2 4/4] tests/intel/xe_eudebug_online: Add read/write pagefault online tests To: Gwan-gyeong Mun , igt-dev@lists.freedesktop.org Cc: andrzej.hajda@intel.com, jonathan.cavitt@intel.com, mika.kuoppala@intel.com, dominik.grzegorzek@intel.com References: <20241121122230.451423-1-gwan-gyeong.mun@intel.com> <20241121122230.451423-5-gwan-gyeong.mun@intel.com> <5d6d89aa-8bd7-407a-ab68-351c7ce87bcf@intel.com> Content-Language: en-US From: "Manszewski, Christoph" Organization: Intel Technology Poland sp. z o.o. - ul. Slowackiego 173, 80-298 Gdansk - KRS 101882 - NIP 957-07-52-316 In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-BeenThere: igt-dev@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Development mailing list for IGT GPU Tools List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: igt-dev-bounces@lists.freedesktop.org Sender: "igt-dev" Hi Gwan-gyeong, On 22.11.2024 09:21, Gwan-gyeong Mun wrote: > > > On 11/21/24 7:12 PM, Manszewski, Christoph wrote: >> Hi Gwan-gyeong, >> >> On 21.11.2024 13:22, Gwan-gyeong Mun wrote: >>> Add read and write pagefault tests to xe_eudebug_online that checks if a >>> pagefault event is submitted by the KMD debugger when a pagefault >>> occurs. >>> >>> Test that read (load instruction) and write(store instruction) >>> attempt to >>> load or store access to unallocated memory, causing a pagefault. >>> Examine the address causing the page fault and the number of eu threads >>> causing the pagefault. >>> >>> v2: Refactor of output attention bits on pagefault event handling >>> (Andrzej) >>>      remove / update redudant code (Andrzej, Christoph) >>>      use igt_container_of() macro (Christoph) >>> >>> Co-developed-by: Jonathan Cavitt >>> Signed-off-by: Gwan-gyeong Mun >>> --- >>>   tests/intel/xe_eudebug_online.c | 178 +++++++++++++++++++++++++++++++- >>>   1 file changed, 173 insertions(+), 5 deletions(-) >>> >>> diff --git a/tests/intel/xe_eudebug_online.c b/tests/intel/ >>> xe_eudebug_online.c >>> index 0ef0d8093..a70d18ee4 100644 >>> --- a/tests/intel/xe_eudebug_online.c >>> +++ b/tests/intel/xe_eudebug_online.c >>> @@ -36,6 +36,8 @@ >>>   #define BB_IN_VRAM            (1 << 11) >>>   #define TARGET_IN_SRAM            (1 << 12) >>>   #define TARGET_IN_VRAM            (1 << 13) >>> +#define SHADER_PAGEFAULT_READ        (1 << 14) >>> +#define SHADER_PAGEFAULT_WRITE        (1 << 15) >>>   #define TRIGGER_UFENCE_SET_BREAKPOINT    (1 << 24) >>>   #define TRIGGER_RESUME_SINGLE_WALK    (1 << 25) >>>   #define TRIGGER_RESUME_PARALLEL_WALK    (1 << 26) >>> @@ -45,6 +47,7 @@ >>>   #define TRIGGER_RESUME_DSS        (1 << 30) >>>   #define TRIGGER_RESUME_ONE        (1 << 31) >>> +#define SHADER_PAGEFAULT    (SHADER_PAGEFAULT_READ | >>> SHADER_PAGEFAULT_WRITE) >>>   #define BB_REGION_BITMASK    (BB_IN_SRAM | BB_IN_VRAM) >>>   #define TARGET_REGION_BITMASK    (TARGET_IN_SRAM | TARGET_IN_VRAM) >>> @@ -61,6 +64,8 @@ >>>   #define CACHING_VALUE(n)    (CACHING_INIT_VALUE + (n)) >>>   #define SHADER_CANARY 0x01010101 >>> +#define BAD_CANARY 0xf1f1f1f >>> +#define BAD_OFFSET (0x12345678ull << 12) >>>   #define WALKER_X_DIM        4 >>>   #define WALKER_ALIGNMENT    16 >>> @@ -120,7 +125,7 @@ static struct intel_buf *create_uc_buf(int fd, >>> int width, int height, uint64_t r >>>   static int get_number_of_threads(uint64_t flags) >>>   { >>> -    if (flags & SHADER_MIN_THREADS) >>> +    if (flags & (SHADER_MIN_THREADS | SHADER_PAGEFAULT)) >>>           return 16; >>>       if (flags & (TRIGGER_RESUME_ONE | TRIGGER_RESUME_SINGLE_WALK | >>> @@ -179,6 +184,16 @@ static struct gpgpu_shader *get_shader(int fd, >>> const unsigned int flags) >>>               gpgpu_shader__common_target_write_u32(shader, s_dim.y + >>> i, CACHING_VALUE(i)); >>>           gpgpu_shader__nop(shader); >>>           gpgpu_shader__breakpoint(shader); >>> +    } else if (flags & SHADER_PAGEFAULT) { >>> +        if (flags & SHADER_PAGEFAULT_READ) >>> +            gpgpu_shader__read_a64_dword(shader, BAD_OFFSET); >>> +        else >>> +            gpgpu_shader__write_a64_dword(shader, BAD_OFFSET, >>> BAD_CANARY); >>> + >>> +        gpgpu_shader__label(shader, 0); >>> +        gpgpu_shader__write_dword(shader, SHADER_CANARY, 0); >>> +        gpgpu_shader__jump_neq(shader, 0, w_dim.y, STEERING_END_LOOP); >>> +        gpgpu_shader__write_dword(shader, SHADER_CANARY, 0); >> >> Now that I think about - do we need this to be a loop? Can't we just >> do the read/write instructions? This would simplify the code and I >> don't yet see why we need to loop within the shader. The SHADER_LOOP >> is used for interrupt-all because we want to interrupt the workload >> from the user/main igt thread. But here, similar to the >> basic-breakpoint test, we just submit a workload that will halt >> because of the hardware/kmd intervention. >> > the pagefault tests also need this concept. > > When a pagefault happened, KMD sets “Force Exception / Force External > Halt” in TD_CTL to cause the eu threads to enter SIP mode. > In the pagefault handling process of eudebug, kmd installs a null page > at the address where the pagefault happened and makes the halted eu > threads resume (make unhalt). > > It would be ideal if all unhalted eu threads immediately entered SIP > mode due to the FE/FEH settings, but it may not happen immediately. > Therefore, the purpose of using a loop is to ensure that the kernel > shader does not terminate until a pagefault event and attention event > occur by adding an additional instruction after the instruction that > causes the page fault. > Therefore, a loop is used to ensure that at least one eu thread must > enter SIP mode. Yeah if the count of processed instructions before the exception is not defined then indeed the loop has it's place here. But we still may reduce a little bit of code, see below. > The attention callback sets to exit this loop, so this code allows the > eu thread to terminate after the sip shader is processed. > > Br, > G.G. >>>       } >>>       gpgpu_shader__eot(shader); >>> @@ -217,6 +232,16 @@ static int count_set_bits(void *ptr, size_t size) >>>       return count; >>>   } >>> +static int eu_attentions_xor_count(const uint32_t *a, const uint32_t >>> *b, uint32_t size) >>> +{ >>> +    int count = 0; >>> + >>> +    for (int i = 0; i < size / 4 ; i++) >>> +        count += igt_hweight(a[i] ^ b[i]); >>> + >>> +    return count; >>> +} >>> + >>>   static int count_canaries_eq(uint32_t *ptr, struct dim_t w_dim, >>> uint32_t value) >>>   { >>>       int count = 0; >>> @@ -636,7 +661,7 @@ static void eu_attention_resume_trigger(struct >>> xe_eudebug_debugger *d, >>>           } >>>       } >>> -    if (d->flags & SHADER_LOOP) { >>> +    if (d->flags & (SHADER_LOOP | SHADER_PAGEFAULT)) { >> >> If we drop the loop we can drop also this. >> >>>           uint32_t threads = get_number_of_threads(d->flags); >>>           uint32_t val = STEERING_END_LOOP; >>> @@ -746,6 +771,44 @@ static void >>> eu_attention_resume_single_step_trigger(struct xe_eudebug_debugger * >>>               data->single_step_bitmask[i] &= ~att->bitmask[i]; >>>   } >>> +static void eu_attention_resume_pagefault_trigger(struct >>> xe_eudebug_debugger *d, >>> +                          struct drm_xe_eudebug_event *e) >>> +{ >>> +    struct drm_xe_eudebug_event_eu_attention *att = >>> igt_container_of(e, att, base); >>> +    struct online_debug_data *data = d->ptr; >>> +    uint32_t bitmask_size = att->bitmask_size; >>> +    uint8_t *bitmask; >>> + >>> +    if (data->last_eu_control_seqno > att->base.seqno) >>> +        return; >>> + >>> +    bitmask = calloc(1, att->bitmask_size); >>> +    igt_assert(bitmask); >>> + >>> +    eu_ctl_stopped(d->fd, att->client_handle, att->exec_queue_handle, >>> +               att->lrc_handle, bitmask, &bitmask_size); >>> +    igt_assert(bitmask_size == att->bitmask_size); >>> + >>> +    pthread_mutex_lock(&data->mutex); >>> + >>> +    if (d->flags & SHADER_PAGEFAULT) { >>> +        uint32_t threads = get_number_of_threads(d->flags); >>> +        uint32_t val = STEERING_END_LOOP; >>> + >>> +        igt_assert_eq(pwrite(data->vm_fd, &val, sizeof(uint32_t), >>> +                     data->target_offset + steering_offset(threads)), >>> +                  sizeof(uint32_t)); >>> +        fsync(data->vm_fd); >>> +    } >> >> We can also drop this when we remove the loop. Btw. why can't we just >> use 'eu_attention_resume_trigger' instead of this whole function? We could remove the 'eu_attention_resume_trigger' like so: ``` diff --git a/tests/intel/xe_eudebug_online.c b/tests/intel/xe_eudebug_online.c index a70d18ee4..c077795ee 100644 --- a/tests/intel/xe_eudebug_online.c +++ b/tests/intel/xe_eudebug_online.c @@ -622,7 +622,10 @@ static void eu_attention_resume_trigger(struct xe_eudebug_debugger *d, eu_ctl_stopped(d->fd, att->client_handle, att->exec_queue_handle, att->lrc_handle, bitmask, &bitmask_size); igt_assert(bitmask_size == att->bitmask_size); - igt_assert(memcmp(bitmask, att->bitmask, att->bitmask_size) == 0); + + /* No guarantee that all pagefaulting eu threads will raise attention */ + if (!(d->flags & SHADER_PAGEFAULT)) + igt_assert(memcmp(bitmask, att->bitmask, att->bitmask_size) == 0); pthread_mutex_lock(&data->mutex); if (igt_nsec_elapsed(&data->exception_arrived) < (MAX_PREEMPT_TIMEOUT + 1) * NSEC_PER_SEC && @@ -771,44 +774,6 @@ static void eu_attention_resume_single_step_trigger(struct xe_eudebug_debugger * data->single_step_bitmask[i] &= ~att->bitmask[i]; } -static void eu_attention_resume_pagefault_trigger(struct xe_eudebug_debugger *d, - struct drm_xe_eudebug_event *e) -{ - struct drm_xe_eudebug_event_eu_attention *att = igt_container_of(e, att, base); - struct online_debug_data *data = d->ptr; - uint32_t bitmask_size = att->bitmask_size; - uint8_t *bitmask; - - if (data->last_eu_control_seqno > att->base.seqno) - return; - - bitmask = calloc(1, att->bitmask_size); - igt_assert(bitmask); - - eu_ctl_stopped(d->fd, att->client_handle, att->exec_queue_handle, - att->lrc_handle, bitmask, &bitmask_size); - igt_assert(bitmask_size == att->bitmask_size); - - pthread_mutex_lock(&data->mutex); - - if (d->flags & SHADER_PAGEFAULT) { - uint32_t threads = get_number_of_threads(d->flags); - uint32_t val = STEERING_END_LOOP; - - igt_assert_eq(pwrite(data->vm_fd, &val, sizeof(uint32_t), - data->target_offset + steering_offset(threads)), - sizeof(uint32_t)); - fsync(data->vm_fd); - } - pthread_mutex_unlock(&data->mutex); - - data->last_eu_control_seqno = eu_ctl_resume(d->master_fd, d->fd, att->client_handle, - att->exec_queue_handle, att->lrc_handle, - bitmask, att->bitmask_size); - - free(bitmask); -} - static void open_trigger(struct xe_eudebug_debugger *d, struct drm_xe_eudebug_event *e) { @@ -1530,7 +1495,7 @@ static void test_pagefault_online(int fd, struct drm_xe_engine_class_instance *h xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION, eu_attention_debug_trigger); xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION, - eu_attention_resume_pagefault_trigger); + eu_attention_resume_trigger); xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger); xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA, create_metadata_trigger); ``` Does this look reasonable? I know it adds yet another path to 'eu_attention_resume_trigger' but you partially account for the pagefault shader in your current code anyway. Thanks, Christoph >> >>> +    pthread_mutex_unlock(&data->mutex); >>> + >>> +    data->last_eu_control_seqno = eu_ctl_resume(d->master_fd, d->fd, >>> att->client_handle, >>> +                            att->exec_queue_handle, att->lrc_handle, >>> +                            bitmask, att->bitmask_size); >>> + >>> +    free(bitmask); >>> +} >>> + >>>   static void open_trigger(struct xe_eudebug_debugger *d, >>>                struct drm_xe_eudebug_event *e) >>>   { >>> @@ -1015,7 +1078,7 @@ static void run_online_client(struct >>> xe_eudebug_client *c) >>>       struct intel_bb *ibb; >>>       struct intel_buf *buf; >>>       uint32_t *ptr; >>> -    int fd; >>> +    int fd, vm_flags; >>>       metadata[0] = calloc(2, sizeof(*metadata)); >>>       metadata[1] = calloc(2, sizeof(*metadata)); >>> @@ -1025,7 +1088,7 @@ static void run_online_client(struct >>> xe_eudebug_client *c) >>>       fd = xe_eudebug_client_open_driver(c); >>>       /* Additional memory for steering control */ >>> -    if (c->flags & SHADER_LOOP || c->flags & SHADER_SINGLE_STEP) >>> +    if (c->flags & SHADER_LOOP || c->flags & SHADER_SINGLE_STEP || >>> c- >flags & SHADER_PAGEFAULT) >>>           s_dim.y++; >>>       /* Additional memory for caching check */ >>>       if ((c->flags & SHADER_CACHING_SRAM) || (c->flags & >>> SHADER_CACHING_VRAM)) >>> @@ -1045,7 +1108,11 @@ static void run_online_client(struct >>> xe_eudebug_client *c) >>>                                  DRM_XE_DEBUG_METADATA_PROGRAM_MODULE, >>>                                  2 * sizeof(*metadata), metadata[1]); >>> -    create.vm_id = xe_eudebug_client_vm_create(c, fd, >>> DRM_XE_VM_CREATE_FLAG_LR_MODE, 0); >>> +    vm_flags = DRM_XE_VM_CREATE_FLAG_LR_MODE; >>> +    vm_flags |= c->flags & SHADER_PAGEFAULT ? >>> DRM_XE_VM_CREATE_FLAG_FAULT_MODE : 0; >>> + >>> +    create.vm_id = xe_eudebug_client_vm_create(c, fd, vm_flags, 0); >>> + >>>       xe_eudebug_client_exec_queue_create(c, fd, &create); >>>       ibb = xe_bb_create_on_offset(fd, create.exec_queue_id, >>> create.vm_id, bb_offset, bb_size, >>> @@ -1245,11 +1312,13 @@ match_attention_with_exec_queue(struct >>> xe_eudebug_event_log *log, >>>   static void online_session_check(struct xe_eudebug_session *s, int >>> flags) >>>   { >>>       struct drm_xe_eudebug_event_eu_attention *ea = NULL; >>> +    struct drm_xe_eudebug_event_pagefault *pf = NULL; >>>       struct drm_xe_eudebug_event *event = NULL; >>>       struct online_debug_data *data = s->client->ptr; >>>       bool expect_exception = flags & DISABLE_DEBUG_MODE ? false : true; >>>       int sum = 0; >>>       int bitmask_size; >>> +    int pagefault_threads = 0; >>>       xe_eudebug_session_check(s, true, >>> XE_EUDEBUG_FILTER_EVENT_VM_BIND | >>>                         XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP | >>> @@ -1265,6 +1334,17 @@ static void online_session_check(struct >>> xe_eudebug_session *s, int flags) >>>               igt_assert_eq(ea->bitmask_size, bitmask_size); >>>               sum += count_set_bits(ea->bitmask, bitmask_size); >>>               igt_assert(match_attention_with_exec_queue(s->debugger- >>> >log, ea)); >>> +        } else if (event->type == DRM_XE_EUDEBUG_EVENT_PAGEFAULT) { >>> +            uint32_t after_offset = bitmask_size / sizeof(uint32_t); >>> +            uint32_t resolved_offset = bitmask_size / >>> sizeof(uint32_t) * 2; >>> +            uint32_t *ptr = NULL; >>> + >>> +            pf = igt_container_of(event, pf, base); >>> +            ptr = (uint32_t *) pf->bitmask; >>> +            igt_assert_eq(pf->bitmask_size, bitmask_size * 3); >>> +            pagefault_threads += eu_attentions_xor_count(ptr + >>> after_offset, >>> +                                     ptr + resolved_offset, >>> +                                     bitmask_size); >>>           } >>>       } >>> @@ -1279,6 +1359,9 @@ static void online_session_check(struct >>> xe_eudebug_session *s, int flags) >>>           igt_assert(sum > 0); >>>       else >>>           igt_assert(sum == 0); >>> + >>> +    if (flags & SHADER_PAGEFAULT) >>> +        igt_assert(pagefault_threads > 0); >>>   } >>>   static void ufence_ack_trigger(struct xe_eudebug_debugger *d, >>> @@ -1302,6 +1385,43 @@ static void ufence_ack_set_bp_trigger(struct >>> xe_eudebug_debugger *d, >>>       } >>>   } >>> +static void pagefault_trigger(struct xe_eudebug_debugger *d, >>> +                  struct drm_xe_eudebug_event *e) >>> +{ >>> +    struct drm_xe_eudebug_event_pagefault *pf = igt_container_of(e, >>> pf, base); >>> +    uint32_t attn_size = pf->bitmask_size / 3; >>> +    int attn_size_as_u32 = attn_size / sizeof(uint32_t); >>> +    uint32_t *ptr = (uint32_t *) pf->bitmask; >>> +    uint32_t *ptrs[3] = {ptr, ptr + attn_size_as_u32, ptr + 2 * >>> attn_size_as_u32}; >>> +    const char * const name[3] = {"before", "after", "resolved"}; >>> +    int threads[3], pagefault_threads, idx; >>> + >>> +    for (idx = 0; idx < 3; idx++) >>> +        threads[idx] = count_set_bits(ptrs[idx], attn_size); >>> + >>> +    pagefault_threads = eu_attentions_xor_count(ptrs[1], ptrs[2], >>> attn_size); >>> + >>> +    igt_debug("EVENT[%llu] pagefault; threads[before=%d, after=%d, " >>> +          "resolved=%d, pagefault=%d] " >>> +          "client[%llu], exec_queue[%llu], lrc[%llu], >>> bitmask_size[%d], " >>> +          "pagefault_address[0x%llx]\n", >>> +          pf->base.seqno, threads[0], threads[1], threads[2], >>> +          pagefault_threads, pf->client_handle, pf->exec_queue_handle, >>> +          pf->lrc_handle, pf->bitmask_size, >>> +          pf->pagefault_address); >>> + >>> +    for (idx = 0; idx < 3; idx++) { >>> +        igt_debug("=== Attentions %s ===\n", name[idx]); >>> + >>> +        for (uint32_t i = 0; i < attn_size_as_u32; i += 2) >>> +            igt_debug("bitmask[%d] = 0x%08x%08x\n", i / 2, >>> +                  ptrs[idx][i], ptrs[idx][i + 1]); >>> +    } >>> + >>> +    igt_assert(pagefault_threads > 0); >>> +    igt_assert_eq_u64(pf->pagefault_address, BAD_OFFSET); >>> +} >>> + >>>   /** >>>    * SUBTEST: basic-breakpoint >>>    * Description: >>> @@ -1383,6 +1503,49 @@ static void test_set_breakpoint_online(int fd, >>> struct drm_xe_engine_class_instan >>>       online_debug_data_destroy(data); >>>   } >>> +/** >>> + * SUBTEST: pagefault-read >>> + * Description: >>> + *     Check whether KMD sends pagefault event for workload in debug >>> mode that >>> + *     triggers a read pagefault. >>> + * >>> + * SUBTEST: pagefault-write >>> + * Description: >>> + *     Check whether KMD sends pagefault event for workload in debug >>> mode that >>> + *     triggers a write pagefault. >>> + */ >>> +static void test_pagefault_online(int fd, struct >>> drm_xe_engine_class_instance *hwe, >>> +                  int flags) >>> +{ >>> +    struct xe_eudebug_session *s; >>> +    struct online_debug_data *data; >>> + >>> +    data = online_debug_data_create(hwe); >>> +    s = xe_eudebug_session_create(fd, run_online_client, flags, data); >>> + >>> +    xe_eudebug_debugger_add_trigger(s->debugger, >>> DRM_XE_EUDEBUG_EVENT_OPEN, >>> +                    open_trigger); >>> +    xe_eudebug_debugger_add_trigger(s->debugger, >>> DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE, >>> +                    exec_queue_trigger); >>> +    xe_eudebug_debugger_add_trigger(s->debugger, >>> DRM_XE_EUDEBUG_EVENT_EU_ATTENTION, >>> +                    eu_attention_debug_trigger); >>> +    xe_eudebug_debugger_add_trigger(s->debugger, >>> DRM_XE_EUDEBUG_EVENT_EU_ATTENTION, >>> +                    eu_attention_resume_pagefault_trigger); >>> +    xe_eudebug_debugger_add_trigger(s->debugger, >>> DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger); >>> +    xe_eudebug_debugger_add_trigger(s->debugger, >>> DRM_XE_EUDEBUG_EVENT_METADATA, >>> +                    create_metadata_trigger); >>> +    xe_eudebug_debugger_add_trigger(s->debugger, >>> DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE, >>> +                    ufence_ack_trigger); >>> +    xe_eudebug_debugger_add_trigger(s->debugger, >>> DRM_XE_EUDEBUG_EVENT_PAGEFAULT, >>> +                    pagefault_trigger); >> >> Removing the loop would make it possible to reduce this to 3 triggers. >> >> So again, I may be missing some detail that implies we need a loop in >> the shader. But for now it looks to me like we don't. >> >> Thanks, >> Christoph >> >>> + >>> +    xe_eudebug_session_run(s); >>> +    online_session_check(s, s->flags); >>> + >>> +    xe_eudebug_session_destroy(s); >>> +    online_debug_data_destroy(data); >>> +} >>> + >>>   /** >>>    * SUBTEST: preempt-breakpoint >>>    * Description: >>> @@ -2344,6 +2507,11 @@ igt_main >>>       igt_subtest("breakpoint-many-sessions-tiles") >>>           test_many_sessions_on_tiles(fd, true); >>> +    test_gt_render_or_compute("pagefault-read", fd, hwe) >>> +        test_pagefault_online(fd, hwe, SHADER_PAGEFAULT_READ); >>> +    test_gt_render_or_compute("pagefault-write", fd, hwe) >>> +        test_pagefault_online(fd, hwe, SHADER_PAGEFAULT_WRITE); >>> + >>>       igt_fixture { >>>           xe_eudebug_enable(fd, was_enabled); >