From: Matthew Auld matthew.auld@intel.com
stable inclusion from stable-5.10.52 commit 81dd2d60f677bbab622c52711a711f0f43d37458 bugzilla: 175542 https://gitee.com/openeuler/kernel/issues/I4DTKU
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 0abb33bfca0fb74df76aac03e90ce685016ef7be upstream.
We skip filling out the pt with scratch entries if the va range covers the entire pt, since we later have to fill it with the PTEs for the object pages anyway. However this might leave open a small window where the PTEs don't point to anything valid for the HW to consume.
When for example using 2M GTT pages this fill_px() showed up as being quite significant in perf measurements, and ends up being completely wasted since we ignore the pt and just use the pde directly.
Anyway, currently we have our PTE construction split between alloc and insert, which is probably slightly iffy nowadays, since the alloc doesn't actually allocate anything anymore, instead it just sets up the page directories and points the PTEs at the scratch page. Later when we do the insert step we re-program the PTEs again. Better might be to squash the alloc and insert into a single step, then bringing back this optimisation(along with some others) should be possible.
Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables") Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Jon Bloomfield jon.bloomfield@intel.com Cc: Chris Wilson chris.p.wilson@intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: stable@vger.kernel.org # v4.15+ Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Link: https://patchwork.freedesktop.org/patch/msgid/20210713130431.2392740-1-matth... (cherry picked from commit 8f88ca76b3942d82e2c1cea8735ec368d89ecc15) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Chen Jun chenjun102@huawei.com Acked-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Chen Jun chenjun102@huawei.com --- drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index f08e25e95746..be27f9889431 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -299,10 +299,7 @@ static void __gen8_ppgtt_alloc(struct i915_address_space * const vm, __i915_gem_object_pin_pages(pt->base); i915_gem_object_make_unshrinkable(pt->base);
- if (lvl || - gen8_pt_count(*start, end) < I915_PDES || - intel_vgpu_active(vm->i915)) - fill_px(pt, vm->scratch[lvl]->encode); + fill_px(pt, vm->scratch[lvl]->encode);
spin_lock(&pd->lock); if (likely(!pd->entry[idx])) {