[PATCH OLK-6.6 v10 0/5] SMT QOS
Cloud Service Providers deploy Best-Effort (BE) and Latency Sensitive (LS)
tasks on the same physical core to maximize resource utilization. We
observe that the LS task needs more cycles to complete the same workload
due to microarchitectural resource contention. This feature controls the
instruction throughput of the BE task into the pipeline, so that the LS
task running on the other SMT sibling can occupy more uarch resources and
reach a better IPC.

The first patch splits out QOS_LEVEL from QOS_SCHED so that SMT QoS can
reuse it.

The test results on 920G:

+----------------------------+----------+-------------+---------+
|                            | Baseline | Co-location | SMT QoS |
|                            |          | baseline    |         |
| sched_wfi_timeout_us       |    \     |      \      |   50    |
| sched_smt_offline_util_pct |    \     |      \      |   50    |
| P99                        |  0.201   |    0.292    |  0.212  |
| CPU utilization            |  29.3%   |   67.00%    | 63.30%  |
| P99 regression percentage  |    \     |   45.27%    |  5.47%  |
+----------------------------+----------+-------------+---------+

Changes in v10:
- Decouple the QOS_SCHED_SMT_EXPELLER and QOS_SCHED scheduling logic from
  SMT_QOS.
- Some cleanup.

Changes in v9:
- Also use the SMT sibling CPU utilization NUMA-level watermark for
  offline task CPU selection, to improve 920G CPU utilization.

Changes in v8:
- Use the SMT sibling CPU utilization NUMA-level watermark instead of the
  src CPU offline-task migrate watermark on every load balance.

Changes in v7:
- Fix the prefer_cpu and select_cpus save/restore.
- Extract the can_smt_qos_migrate_task() helper from can_migrate_task().
- Update the names.
- Update the arch code and remove the pmu code.
- Add some comments.

Changes in v6:
- Rename QOS_LABEL to QOS_LEVEL.
- Rename USER_WFXT to SMT_QOS.
- Rename TAG_PULL to SMT_TAG_PULL.
- Adjust the select cpu code which depends on QOS_SCHED_DYNAMIC_AFFINITY.
- Move distributing offline tasks to SMT sibling cores based on the
  configured proportion into a separate patch.
- Use cpumask_t for even_cpu_mask.
- Move smt_throttle to the arch patch.
- pmu_smt_update_status() -> smt_update_qos_level().
- >> smt_task_imbalance instead of / 1000.
- Remove the limit for odd -> even load balance, which solves the problem
  of one NUMA node's load being very high.
- Rebased on the newest OLK-6.6 code.
- Fix the build issue when QOS_LEVEL is selected but CGROUP_SCHED or
  CFS_BANDWIDTH is not.

Jinjie Ruan (5):
  sched: Split out QOS_LEVEL from QOS_SCHED for reuse
  sched: Add qos_sched_enabled() helper for future expansion
  sched/fair: Add SMT QoS sched core code
  arm64: Add arch code for SMT QoS
  config: Enable SMT_QOS

 arch/arm64/Kconfig.turbo               |  17 ++
 arch/arm64/configs/openeuler_defconfig |   1 +
 arch/arm64/include/asm/cpufeature.h    |   5 +
 arch/arm64/include/asm/xint.h          |  15 ++
 arch/arm64/kernel/Makefile             |   1 +
 arch/arm64/kernel/entry-common.c       |  16 ++
 arch/arm64/kernel/entry.S              |  14 +-
 arch/arm64/kernel/smp.c                |  23 ++
 arch/arm64/kernel/smt_qos.c            |  84 +++++++
 arch/arm64/kernel/xcall/entry.S        |  78 +++++++
 drivers/irqchip/irq-gic-v3.c           |  43 ++++
 include/linux/sched.h                  |  20 ++
 init/Kconfig                           |   5 +
 kernel/sched/core.c                    |  20 +-
 kernel/sched/fair.c                    | 289 ++++++++++++++++++++++++-
 kernel/sched/features.h                |   4 +
 kernel/sched/sched.h                   |  18 +-
 17 files changed, 618 insertions(+), 35 deletions(-)
 create mode 100644 arch/arm64/include/asm/xint.h
 create mode 100644 arch/arm64/kernel/smt_qos.c

--
2.34.1
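For reference, the P99 regression percentages in the table above follow
directly from the P99 rows, measured against the standalone baseline:

  regression = (P99_mixed - P99_baseline) / P99_baseline
  co-location baseline: (0.292 - 0.201) / 0.201 ~= 45.27%
  SMT QoS:              (0.212 - 0.201) / 0.201 ~=  5.47%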
hulk inclusion
category: feature
bugzilla: https://atomgit.com/openeuler/kernel/issues/8929

----------------------------------------

Refactor QOS_SCHED by decoupling the tag-related logic into a new
sub-config called "QOS_LEVEL". This new config implements cgroup tagging
and tag propagation, allowing it to be reused by "SMT QoS".

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 include/linux/sched.h | 20 ++++++++++++++++++++
 init/Kconfig          |  5 +++++
 kernel/sched/core.c   | 20 +++++++++++---------
 kernel/sched/fair.c   |  4 ++--
 kernel/sched/sched.h  | 18 ++----------------
 5 files changed, 40 insertions(+), 27 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index f5c80c372f74..ba35e2265893 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2679,4 +2679,24 @@ static inline bool smart_grid_used(void)
 	return false;
 }
 #endif
+
+#ifdef CONFIG_QOS_LEVEL
+#ifdef CONFIG_QOS_SCHED_MULTILEVEL
+enum task_qos_level {
+	QOS_LEVEL_OFFLINE_EX = -2,
+	QOS_LEVEL_OFFLINE = -1,
+	QOS_LEVEL_ONLINE = 0,
+	QOS_LEVEL_HIGH = 1,
+	QOS_LEVEL_HIGH_EX = 2
+};
+#else
+enum task_qos_level {
+	QOS_LEVEL_OFFLINE = -1,
+	QOS_LEVEL_ONLINE = 0,
+};
+#endif
+
+DECLARE_PER_CPU_ALIGNED(int, qos_smt_status);
+#endif
+
 #endif
diff --git a/init/Kconfig b/init/Kconfig
index b577cdeec6e5..a50e9c8a8cab 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1096,11 +1096,16 @@ menuconfig CGROUP_SCHED
 	  tasks.

 if CGROUP_SCHED
+config QOS_LEVEL
+	bool
+	depends on CGROUP_SCHED && CFS_BANDWIDTH
+
 config QOS_SCHED
 	bool "Qos task scheduling"
 	depends on CGROUP_SCHED
 	depends on CFS_BANDWIDTH
 	depends on SMP
+	select QOS_LEVEL
 	default n
 	help
 	  This option enable qos scheduler, and support co-location online
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0d071de3ffa5..db9e41f600b8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -121,7 +121,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp);
 DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
 DEFINE_PER_CPU(struct rnd_state, sched_rnd_state);

-#ifdef CONFIG_QOS_SCHED
+#ifdef CONFIG_QOS_LEVEL
 static void sched_change_qos_group(struct task_struct *tsk, struct task_group *tg);
 #endif

@@ -4881,7 +4881,7 @@ void sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
 				  struct task_group, css);
 		tg = autogroup_task_group(p, tg);
 		p->sched_task_group = tg;
-#ifdef CONFIG_QOS_SCHED
+#ifdef CONFIG_QOS_LEVEL
 		tg_qos = tg;
 #endif
 	}
@@ -4892,7 +4892,7 @@ void sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
 	 * so use __set_task_cpu().
 	 */
 	__set_task_cpu(p, smp_processor_id());
-#ifdef CONFIG_QOS_SCHED
+#ifdef CONFIG_QOS_LEVEL
 	sched_change_qos_group(p, tg_qos);
 #endif

@@ -7850,7 +7850,7 @@ static int __sched_setscheduler(struct task_struct *p,
 	}
 change:

-#ifdef CONFIG_QOS_SCHED
+#ifdef CONFIG_QOS_LEVEL
 	/*
 	 * If the scheduling policy of an offline task is set to a policy
 	 * other than SCHED_IDLE, the online task preemption and cpu resource
@@ -10524,7 +10524,7 @@ void ia64_set_curr_task(int cpu, struct task_struct *p)
 /* task_group_lock serializes the addition/removal of task groups */
 static DEFINE_SPINLOCK(task_group_lock);

-#ifdef CONFIG_QOS_SCHED
+#ifdef CONFIG_QOS_LEVEL
 static inline int alloc_qos_sched_group(struct task_group *tg,
 					struct task_group *parent)
 {
@@ -10551,7 +10551,9 @@ static void sched_change_qos_group(struct task_struct *tsk, struct task_group *t
 			__setscheduler_prio(tsk, normal_prio(tsk));
 	}
 }
+#endif

+#ifdef CONFIG_QOS_SCHED
 struct offline_args {
 	struct work_struct work;
 	struct task_struct *p;
@@ -10642,7 +10644,7 @@ struct task_group *sched_create_group(struct task_group *parent)
 	if (!alloc_fair_sched_group(tg, parent))
 		goto err;

-#ifdef CONFIG_QOS_SCHED
+#ifdef CONFIG_QOS_LEVEL
 	if (!alloc_qos_sched_group(tg, parent))
 		goto err;
 #endif
@@ -10737,7 +10739,7 @@ static void sched_change_group(struct task_struct *tsk, struct task_group *group
 {
 	tsk->sched_task_group = group;

-#ifdef CONFIG_QOS_SCHED
+#ifdef CONFIG_QOS_LEVEL
 	sched_change_qos_group(tsk, group);
 #endif

@@ -11655,7 +11657,7 @@ static int cpu_rebuild_affinity_domain_u64(struct cgroup_subsys_state *css,
 }
 #endif /* CONFIG_QOS_SCHED_SMART_GRID */

-#ifdef CONFIG_QOS_SCHED
+#ifdef CONFIG_QOS_LEVEL
 static int tg_change_scheduler(struct task_group *tg, void *data)
 {
 	int policy;
@@ -11988,7 +11990,7 @@ static struct cftype cpu_legacy_files[] = {
 		.write = cpu_uclamp_max_write,
 	},
 #endif
-#ifdef CONFIG_QOS_SCHED
+#ifdef CONFIG_QOS_LEVEL
 	{
 		.name = "qos_level",
 		.flags = CFTYPE_NOT_ON_ROOT,
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 38ee4c9c79bf..5af2793adae8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -165,8 +165,8 @@ static bool qos_smt_expelled(int this_cpu);
 static bool is_offline_task(struct task_struct *p);
 #endif

-#ifdef CONFIG_QOS_SCHED_SMT_EXPELLER
-static DEFINE_PER_CPU(int, qos_smt_status);
+#ifdef CONFIG_QOS_LEVEL
+DEFINE_PER_CPU_ALIGNED(int, qos_smt_status);
 #endif

 #ifdef CONFIG_QOS_SCHED_PRIO_LB
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e9a60a6295e4..bba4812d9e21 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -459,7 +459,7 @@ struct task_group {

 	struct cfs_bandwidth cfs_bandwidth;

-#ifdef CONFIG_QOS_SCHED
+#ifdef CONFIG_QOS_LEVEL
 	long qos_level;
 #endif

@@ -1583,20 +1583,6 @@ do {						\
 } while (0)

 #ifdef CONFIG_QOS_SCHED
-#ifdef CONFIG_QOS_SCHED_MULTILEVEL
-enum task_qos_level {
-	QOS_LEVEL_OFFLINE_EX = -2,
-	QOS_LEVEL_OFFLINE = -1,
-	QOS_LEVEL_ONLINE = 0,
-	QOS_LEVEL_HIGH = 1,
-	QOS_LEVEL_HIGH_EX = 2
-};
-#else
-enum task_qos_level {
-	QOS_LEVEL_OFFLINE = -1,
-	QOS_LEVEL_ONLINE = 0,
-};
-#endif
 void init_qos_hrtimer(int cpu);
 #endif

@@ -3483,7 +3469,7 @@ static inline bool is_per_cpu_kthread(struct task_struct *p)
 }
 #endif

-#ifdef CONFIG_QOS_SCHED
+#ifdef CONFIG_QOS_LEVEL
 static inline int qos_idle_policy(int policy)
 {
 	return policy == QOS_LEVEL_OFFLINE;
--
2.34.1
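With the tag logic split out, a consumer of CONFIG_QOS_LEVEL can classify
a task from its cgroup tag without pulling in the rest of the QOS_SCHED
machinery. A minimal sketch (the helper name is hypothetical; the SMT QoS
patch later in this series open-codes the same comparison via
task_group(p)->qos_level):

static inline bool qos_task_is_offline(struct task_struct *p)
{
	/*
	 * Hypothetical helper: < QOS_LEVEL_ONLINE covers QOS_LEVEL_OFFLINE
	 * and, with QOS_SCHED_MULTILEVEL, QOS_LEVEL_OFFLINE_EX as well.
	 */
	return task_group(p)->qos_level < QOS_LEVEL_ONLINE;
}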
hulk inclusion
category: feature
bugzilla: https://atomgit.com/openeuler/kernel/issues/8929

----------------------------------------

Add the qos_sched_enabled() helper to decouple the QOS_SCHED_SMT_EXPELLER
and QOS_SCHED scheduling logic, providing a hook for future extensions.

No functional changes.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 kernel/sched/fair.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5af2793adae8..250ef9a069c2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9321,6 +9321,13 @@ static int wake_soft_domain(struct task_struct *p, int target)
 }
 #endif

+#ifdef CONFIG_QOS_SCHED
+static __always_inline bool qos_sched_enabled(void)
+{
+	return true;
+}
+#endif
+
 /*
  * select_task_rq_fair: Select target runqueue for the waking task in domains
  * that have the relevant SD flag set. In practice, this is SD_BALANCE_WAKE,
@@ -9589,7 +9596,7 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_
 		return;

 #ifdef CONFIG_QOS_SCHED
-	if (unlikely(is_offline_task(curr) && !is_offline_task(p)))
+	if (qos_sched_enabled() && unlikely(is_offline_task(curr) && !is_offline_task(p)))
 		goto preempt;
 #endif

@@ -9829,6 +9836,9 @@ static int unthrottle_qos_cfs_rqs(int cpu)

 static bool check_qos_cfs_rq(struct cfs_rq *cfs_rq)
 {
+	if (!qos_sched_enabled())
+		return false;
+
 	if (unlikely(__this_cpu_read(qos_cpu_overload)))
 		return false;

@@ -9933,6 +9943,9 @@ static void start_qos_hrtimer(int cpu)
 	ktime_t time;
 	struct hrtimer *hrtimer = &(per_cpu(qos_overload_timer, cpu));

+	if (!qos_sched_enabled())
+		return;
+
 	time = ktime_add_ms(hrtimer->base->get_time(), (u64)sysctl_overload_detect_period);
 	hrtimer_set_expires(hrtimer, time);
 	hrtimer_start_expires(hrtimer, HRTIMER_MODE_ABS_PINNED);
@@ -9942,6 +9955,9 @@ void init_qos_hrtimer(int cpu)
 {
 	struct hrtimer *hrtimer = &(per_cpu(qos_overload_timer, cpu));

+	if (!qos_sched_enabled())
+		return;
+
 	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED);
 	hrtimer->function = qos_overload_timer_handler;
 }
@@ -9953,6 +9969,9 @@ void init_qos_hrtimer(int cpu)
  */
 static void qos_schedule_throttle(struct task_struct *p)
 {
+	if (!qos_sched_enabled())
+		return;
+
 	if (unlikely(current->flags & PF_KTHREAD))
 		return;

@@ -10009,7 +10028,7 @@ static bool qos_sched_idle_cpu(int this_cpu)

 static bool qos_smt_expelled(int this_cpu)
 {
-	if (!static_branch_likely(&qos_smt_expell_switch))
+	if (!static_branch_likely(&qos_smt_expell_switch) || !qos_sched_enabled())
 		return false;

 	/*
@@ -10068,7 +10087,7 @@ static void qos_smt_send_ipi(int this_cpu)

 static void qos_smt_expel(int this_cpu, struct task_struct *p)
 {
-	if (!static_branch_likely(&qos_smt_expell_switch))
+	if (!static_branch_likely(&qos_smt_expell_switch) || !qos_sched_enabled())
 		return;

 	if (qos_smt_update_status(p))
@@ -10077,7 +10096,7 @@ static void qos_smt_expel(int this_cpu, struct task_struct *p)

 static inline bool qos_smt_enabled(void)
 {
-	if (!static_branch_likely(&qos_smt_expell_switch))
+	if (!static_branch_likely(&qos_smt_expell_switch) || !qos_sched_enabled())
 		return false;

 	if (!sched_smt_active())
@@ -10200,7 +10219,7 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	if (!prev || prev->sched_class != &fair_sched_class) {
 #ifdef CONFIG_QOS_SCHED
-		if (cfs_rq->idle_h_nr_running != 0 && rq->online)
+		if (qos_sched_enabled() && cfs_rq->idle_h_nr_running != 0 && rq->online)
 			goto qos_simple;
 		else
 #endif
--
2.34.1
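For context, patch 3 in this series gives this hook a real condition, so
after the full series is applied the helper effectively reads:

#ifdef CONFIG_QOS_SCHED
static __always_inline bool qos_sched_enabled(void)
{
#ifdef CONFIG_SMT_QOS
	/*
	 * QOS_SCHED throttling and SMT expelling yield to SMT QoS
	 * whenever the SMT_TAG_PULL scheduler feature is enabled.
	 */
	if (sched_feat(SMT_TAG_PULL))
		return false;
#endif
	return true;
}
#endif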
hulk inclusion
category: feature
bugzilla: https://atomgit.com/openeuler/kernel/issues/8929

----------------------------------------

Reuse QOS_LEVEL to distinguish between online and offline tasks, and
reuse QOS_SCHED_DYNAMIC_AFFINITY to select the master SMT CPU for online
tasks.

Sample the CPU utilization of all slave SMT cores within a NUMA node when
collecting load balancing statistics, then select the CPU for offline
tasks and distribute them to SMT sibling cores based on the target SMT
sibling CPU utilization watermark:

+--------+                 +--------+
|        | online/offline  |        |
|  CPU0  |<--------------->|  CPU2  |
|        |        |        |        |
+--------+        |        +--------+
    |             |            |
    | offline     | offline    | offline
    \/            |            \/
+--------+        |        +---------+
|        |        \/       |         |
|  CPU1  |<--------------->|  CPU3   |
|        |     offline     |         |
+--------+                 +---------+

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/Kconfig.turbo |  17 ++++
 kernel/sched/fair.c      | 212 +++++++++++++++++++++++++++++++++++++++
 kernel/sched/features.h  |   4 +
 3 files changed, 233 insertions(+)

diff --git a/arch/arm64/Kconfig.turbo b/arch/arm64/Kconfig.turbo
index 778ea1025c2c..aa0af04cb2ab 100644
--- a/arch/arm64/Kconfig.turbo
+++ b/arch/arm64/Kconfig.turbo
@@ -84,4 +84,21 @@ config DYNAMIC_XCALL
 	  and a kernel module which provides customized implementation.

+config SMT_QOS
+	bool "Support userspace timer/wfi to reduce intra-core contention"
+	depends on SCHED_SMT
+	depends on FAST_IRQ
+	depends on QOS_SCHED_DYNAMIC_AFFINITY
+	depends on CFS_BANDWIDTH && CGROUP_SCHED
+	select QOS_LEVEL
+	default y
+	help
+	  Cloud Service Providers deploy Best-Effort and Latency Sensitive
+	  tasks on the same physical core to maximize resource utilization.
+	  We observe that the LS task needs more cycles to complete the same
+	  workload due to uarch resource contention. This feature controls
+	  the instruction throughput of the BE task into the pipeline, so
+	  that the LS task running on the other SMT sibling can occupy more
+	  uarch resources and reach a better IPC.
+
 endmenu # "Turbo features selection"
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 250ef9a069c2..3e9f0b8070b8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9321,9 +9321,179 @@ static int wake_soft_domain(struct task_struct *p, int target)
 }
 #endif

+#ifdef CONFIG_SMT_QOS
+static DEFINE_PER_CPU_ALIGNED(cpumask_t, smt_prefer_cpus);
+static unsigned long numa_smt_util[MAX_NUMNODES];
+/*
+ * Target SMT sibling CPU utilization watermark.
+ * Default range: 0-100.
+ */
+static unsigned int sched_smt_offline_util_pct = 50;
+static cpumask_t master_smt_cpumask;
+static cpumask_t slave_smt_cpumask;
+
+static struct ctl_table smt_util_pct_sysctl_table[] = {
+	{
+		.procname	= "sched_smt_offline_util_pct",
+		.data		= &sched_smt_offline_util_pct,
+		.maxlen		= sizeof(sched_smt_offline_util_pct),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE_HUNDRED,
+	},
+	{}
+};
+
+static int __init sched_init_smt_qos(void)
+{
+	int cpu;
+
+	if (!sched_smt_active())
+		return 0;
+
+	register_sysctl_init("kernel", smt_util_pct_sysctl_table);
+
+	cpumask_copy(&master_smt_cpumask, cpu_possible_mask);
+	for_each_possible_cpu(cpu) {
+		if (cpu != cpumask_first(cpu_smt_mask(cpu)))
+			cpumask_clear_cpu(cpu, &master_smt_cpumask);
+	}
+
+	cpumask_andnot(&slave_smt_cpumask, cpu_possible_mask, &master_smt_cpumask);
+	pr_info("Master SMT mask: %*pbl\n", cpumask_pr_args(&master_smt_cpumask));
+	pr_info("Slave SMT mask: %*pbl\n", cpumask_pr_args(&slave_smt_cpumask));
+
+	return 0;
+}
+late_initcall(sched_init_smt_qos);
+
+static __always_inline bool smt_qos_enabled(void)
+{
+	return sched_smt_active() && sched_feat(SMT_TAG_PULL);
+}
+
+static inline void smt_qos_set_task_select_cpus(struct task_struct *p,
+						const cpumask_t **backup_select_cpus,
+						int *idlest_cpu, int prev_cpu)
+{
+	cpumask_t *prefer_cpus = this_cpu_ptr(&smt_prefer_cpus);
+	cpumask_t *prefer_cpumask = &master_smt_cpumask;
+
+	if (!smt_qos_enabled())
+		return;
+
+	if (task_group(p)->qos_level < QOS_LEVEL_ONLINE) {
+		unsigned long smt_util = numa_smt_util[cpu_to_node(prev_cpu)];
+
+		if (smt_util < sched_smt_offline_util_pct)
+			prefer_cpumask = &slave_smt_cpumask;
+	}
+
+	if (*idlest_cpu != -1 && !cpumask_test_cpu(*idlest_cpu, prefer_cpumask))
+		*idlest_cpu = -1;
+
+	cpumask_copy(prefer_cpus, task_prefer_cpus(p));
+	if (cpumask_empty(prefer_cpus))
+		cpumask_and(prefer_cpus, p->cpus_ptr, prefer_cpumask);
+	else
+		cpumask_and(prefer_cpus, prefer_cpus, prefer_cpumask);
+
+	*backup_select_cpus = p->select_cpus;
+	p->select_cpus = prefer_cpus;
+}
+
+static inline void smt_qos_restore_task_select_cpus(struct task_struct *p,
+						    const cpumask_t *backup_select_cpus)
+{
+	if (!smt_qos_enabled())
+		return;
+
+	p->select_cpus = backup_select_cpus;
+}
+
+static inline void smt_qos_update_qos_level(int cpu, struct task_struct *p)
+{
+	int new_status;
+
+	if (!smt_qos_enabled())
+		return;
+
+	new_status = p ? task_group(p)->qos_level : QOS_LEVEL_OFFLINE;
+
+	if (likely(new_status == __this_cpu_read(qos_smt_status)))
+		return;
+
+	__this_cpu_write(qos_smt_status, new_status);
+}
+
+static inline bool is_slave_to_master(int src_cpu, int dst_cpu)
+{
+	return !cpumask_test_cpu(src_cpu, &master_smt_cpumask) &&
+	       cpumask_test_cpu(dst_cpu, &master_smt_cpumask);
+}
+
+static inline bool smt_qos_should_not_busiest(int src_cpu, int dst_cpu)
+{
+	if (!smt_qos_enabled())
+		return 0;
+
+	/*
+	 * Migration of tasks from SMT siblings to
+	 * the primary SMT CPU is restricted.
+	 */
+	return is_slave_to_master(src_cpu, dst_cpu);
+}
+
+static inline bool smt_qos_can_migrate_task(struct task_struct *p, int src_cpu,
+					    int dst_cpu)
+{
+	if (!smt_qos_enabled())
+		return 1;
+
+	/*
+	 * Only offline tasks are allowed to be migrated from
+	 * primary SMT CPUs to SMT siblings.
+	 */
+	if (cpumask_test_cpu(src_cpu, &master_smt_cpumask) &&
+	    !cpumask_test_cpu(dst_cpu, &master_smt_cpumask)) {
+		unsigned long smt_util;

+		if (task_group(p)->qos_level >= QOS_LEVEL_ONLINE)
+			return 0;
+
+		smt_util = numa_smt_util[cpu_to_node(dst_cpu)];
+		if (smt_util >= sched_smt_offline_util_pct)
+			return 0;
+	}
+
+	/*
+	 * Migration of tasks from SMT siblings to
+	 * the primary SMT CPU is restricted.
+	 */
+	return !is_slave_to_master(src_cpu, dst_cpu);
+}
+
+static inline void smt_qos_update_sd_ld_stats(struct sched_domain *sd, int dst_cpu,
+					      unsigned long total_smt_capacity,
+					      unsigned long total_smt_util)
+{
+	if (!smt_qos_enabled() || !total_smt_capacity)
+		return;
+
+	if (!(sd->flags & SD_NUMA) && (sd->parent && (sd->parent->flags & SD_NUMA)))
+		numa_smt_util[cpu_to_node(dst_cpu)] = (total_smt_util * 100) / total_smt_capacity;
+}
+#endif /* CONFIG_SMT_QOS */
+
 #ifdef CONFIG_QOS_SCHED
 static __always_inline bool qos_sched_enabled(void)
 {
+#ifdef CONFIG_SMT_QOS
+	if (sched_feat(SMT_TAG_PULL))
+		return false;
+#endif
+
 	return true;
 }
 #endif
@@ -9356,6 +9526,9 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
 	struct sched_migrate_ctx ctx;
 	int ret;
 #endif
+#ifdef CONFIG_SMT_QOS
+	const cpumask_t *backup_select_cpus;
+#endif

 	time = schedstat_start_time();

@@ -9367,6 +9540,9 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
 #ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
 	set_task_select_cpus(p, &idlest_cpu, sd_flag);
 #endif
+#ifdef CONFIG_SMT_QOS
+	smt_qos_set_task_select_cpus(p, &backup_select_cpus, &idlest_cpu, prev_cpu);
+#endif

 	if (wake_flags & WF_TTWU) {
 		record_wakee(p);
@@ -9461,6 +9637,10 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
 		schedstat_inc(p->stats.nr_wakeups_force_preferred_cpus);
 	}
 #endif
+
+#ifdef CONFIG_SMT_QOS
+	smt_qos_restore_task_select_cpus(p, backup_select_cpus);
+#endif
 	return new_cpu;
 }

@@ -10377,6 +10557,9 @@ done: __maybe_unused;
 	qos_smt_expel(this_cpu, p);
 #endif

+#ifdef CONFIG_SMT_QOS
+	smt_qos_update_qos_level(rq->cpu, p);
+#endif
 	return p;

 idle:
@@ -10436,6 +10619,10 @@ done: __maybe_unused;
 	qos_smt_expel(this_cpu, NULL);
 #endif

+#ifdef CONFIG_SMT_QOS
+	smt_qos_update_qos_level(rq->cpu, NULL);
+#endif
+
 	return NULL;
 }

@@ -10862,6 +11049,11 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	}
 #endif

+#ifdef CONFIG_SMT_QOS
+	if (!smt_qos_can_migrate_task(p, env->src_cpu, env->dst_cpu))
+		return 0;
+#endif
+
 	/*
 	 * We do not migrate tasks that are:
 	 * 1) throttled_lb_pair, or
@@ -11494,6 +11686,10 @@ struct sd_lb_stats {
 	struct sg_lb_stats busiest_stat;/* Statistics of the busiest group */
 	struct sg_lb_stats local_stat;	/* Statistics of the local group */
+#ifdef CONFIG_SMT_QOS
+	unsigned long total_smt_util;	/* Total utilization of all groups in sd */
+	unsigned long total_smt_capacity; /* Total capacity of all groups in sd */
+#endif
 };

 static inline void init_sd_lb_stats(struct sd_lb_stats *sds)
@@ -11924,6 +12120,12 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 		sgs->group_util += cpu_util_cfs(i);
 		sgs->group_runnable += cpu_runnable(rq);
 		sgs->sum_h_nr_running += rq->cfs.h_nr_running;
+#ifdef CONFIG_SMT_QOS
+		if (sched_smt_active() && !cpumask_test_cpu(i, &master_smt_cpumask)) {
+			sds->total_smt_util += cpu_util_cfs(i);
+			sds->total_smt_capacity += capacity_orig_of(i);
+		}
+#endif

 		nr_running = rq->nr_running;
 		sgs->sum_nr_running += nr_running;
@@ -12658,6 +12860,11 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 	}

 	update_idle_cpu_scan(env, sum_util);
+
+#ifdef CONFIG_SMT_QOS
+	smt_qos_update_sd_ld_stats(env->sd, env->dst_cpu, sds->total_smt_capacity,
+				   sds->total_smt_util);
+#endif
 }

 /**
@@ -13052,6 +13259,11 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 		if (!nr_running)
 			continue;

+#ifdef CONFIG_SMT_QOS
+		if (smt_qos_should_not_busiest(i, env->dst_cpu))
+			continue;
+#endif
+
 		capacity = capacity_of(i);

 		/*
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index c9ad8e72ecd0..446d136654d9 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -130,3 +130,7 @@ SCHED_FEAT(SOFT_QUOTA, false)
 #endif

 SCHED_FEAT(WA_SMT, false)
+
+#ifdef CONFIG_SMT_QOS
+SCHED_FEAT(SMT_TAG_PULL, false)
+#endif
--
2.34.1
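To make the watermark concrete: smt_qos_update_sd_ld_stats() caches, per
NUMA node, the slave-SMT utilization as a percentage of slave-SMT
capacity, and both wakeup placement and load balance compare it against
sched_smt_offline_util_pct (a sysctl, default 50; the whole path is also
gated on the SMT_TAG_PULL scheduler feature, default false). A worked
sketch with illustrative numbers, assuming capacity_orig_of() is 1024 per
CPU:

	/* Illustrative only: 4 slave SMT CPUs, capacity 1024 each. */
	unsigned long total_smt_capacity = 4 * 1024;	/* 4096 */
	unsigned long total_smt_util = 1638;		/* summed cpu_util_cfs() */
	unsigned long pct = total_smt_util * 100 / total_smt_capacity;	/* 39 */

	/*
	 * 39 < sched_smt_offline_util_pct (50), so offline wakeups on this
	 * node keep preferring the slave mask and master-to-slave offline
	 * migration stays allowed; at or above the watermark, both stop.
	 */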
hulk inclusion
category: feature
bugzilla: https://atomgit.com/openeuler/kernel/issues/8929

----------------------------------------

SMT QoS leverages lightweight dedicated IPIs to expedite WFI sleep for
offline tasks, ensuring they are promptly preempted upon online task
arrival.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/include/asm/cpufeature.h |  5 ++
 arch/arm64/include/asm/xint.h       | 15 ++++++
 arch/arm64/kernel/Makefile          |  1 +
 arch/arm64/kernel/entry-common.c    | 16 ++++++
 arch/arm64/kernel/entry.S           | 14 +++--
 arch/arm64/kernel/smp.c             | 23 ++++++++
 arch/arm64/kernel/smt_qos.c         | 84 +++++++++++++++++++++++++++++
 arch/arm64/kernel/xcall/entry.S     | 78 +++++++++++++++++++++++++++
 drivers/irqchip/irq-gic-v3.c        | 43 +++++++++++++++
 kernel/sched/fair.c                 | 44 +++++++++++++++
 10 files changed, 320 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm64/include/asm/xint.h
 create mode 100644 arch/arm64/kernel/smt_qos.c

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 6f73a51d2422..fed81fd3baf2 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -851,6 +851,11 @@ static __always_inline bool system_uses_xcall_xint(void)
 	       cpus_have_const_cap(ARM64_HAS_HW_XCALL_XINT);
 }

+static __always_inline bool system_uses_xint(void)
+{
+	return IS_ENABLED(CONFIG_FAST_IRQ) && cpus_have_const_cap(ARM64_HAS_XINT);
+}
+
 static __always_inline bool system_uses_irq_prio_masking(void)
 {
 	return IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) &&
diff --git a/arch/arm64/include/asm/xint.h b/arch/arm64/include/asm/xint.h
new file mode 100644
index 000000000000..00ab27b327fa
--- /dev/null
+++ b/arch/arm64/include/asm/xint.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_XINT_H
+#define __ASM_XINT_H
+
+#define NR_IPI_USER	7	// SGI
+
+#ifndef __ASSEMBLY__
+#include <linux/topology.h>
+
+extern void gic_handle_irq_noack(struct pt_regs *regs);
+extern void gic_handle_nmi_noack(struct pt_regs *regs);
+extern void arch_smp_send_ipi_user(int cpu);
+extern bool should_restrict(void);
+#endif /* __ASSEMBLY__ */
+#endif /* __ASM_XINT_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 300bfcb8a890..b61c72715daa 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -80,6 +80,7 @@ obj-y					+= vdso-wrap.o
 obj-$(CONFIG_COMPAT_VDSO)		+= vdso32-wrap.o
 obj-$(CONFIG_ARM64_ILP32)		+= vdso-ilp32/
 obj-$(CONFIG_FAST_SYSCALL)		+= xcall/
+obj-$(CONFIG_SMT_QOS)			+= smt_qos.o
 obj-$(CONFIG_UNWIND_PATCH_PAC_INTO_SCS)	+= patch-scs.o
 obj-$(CONFIG_IPI_AS_NMI)		+= ipi_nmi.o
 obj-$(CONFIG_HISI_VIRTCCA_GUEST)	+= virtcca_cvm_guest.o virtcca_cvm_tsi.o
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index c72993bb4563..55f416ea0303 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -26,6 +26,7 @@
 #include <asm/stacktrace.h>
 #include <asm/sysreg.h>
 #include <asm/system_misc.h>
+#include <asm/xint.h>

 /*
  * Handle IRQ/context state management when entering from kernel mode.
@@ -945,6 +946,21 @@ static void noinstr __el0_irq_handler_common(struct pt_regs *regs)
 	el0_interrupt(regs, ISR_EL1_IS, handle_arch_irq, handle_arch_nmi_irq);
 }

+#ifdef CONFIG_FAST_IRQ
+DECLARE_PER_CPU(u32, cpu_iar);
+/*
+ * The generic exception handler for SPIs and LPIs taken from EL0 on
+ * early CPUs before 920G. Most of the code comes from el0_interrupt(),
+ * except that it passes irqnr to the GIC driver via a per-cpu variable
+ * because the IRQ Ack is completed in entry code.
+ */
+asmlinkage void noinstr el0t_64_acked_irq_handler(struct pt_regs *regs, u32 irqnr)
+{
+	this_cpu_write(cpu_iar, irqnr);
+	el0_interrupt(regs, ISR_EL1_IS, gic_handle_irq_noack, gic_handle_nmi_noack);
+}
+#endif /* CONFIG_FAST_IRQ */
+
 asmlinkage void noinstr el0t_64_irq_handler(struct pt_regs *regs)
 {
 	__el0_irq_handler_common(regs);
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 039ec8d40899..d0518a31b689 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -554,7 +554,7 @@ tsk	.req	x28		// current thread_info

 	.text

-#ifdef CONFIG_FAST_SYSCALL
+#if defined(CONFIG_FAST_SYSCALL) || defined(CONFIG_FAST_IRQ)
 #include "xcall/entry.S"
 #endif

@@ -579,8 +579,12 @@ SYM_CODE_START(vectors)
 	sync_ventry				// Synchronous 64-bit EL0
 #else
 	kernel_ventry	0, t, 64, sync		// Synchronous 64-bit EL0
-#endif
+#endif /* CONFIG_FAST_SYSCALL */
+#ifdef CONFIG_FAST_IRQ
+	irq_ventry				// XINT 64-bit EL0
+#else
 	kernel_ventry	0, t, 64, irq		// IRQ 64-bit EL0
+#endif
 	kernel_ventry	0, t, 64, fiq		// FIQ 64-bit EL0
 	kernel_ventry	0, t, 64, error		// Error 64-bit EL0

@@ -607,8 +611,12 @@ SYM_CODE_START(vectors_xcall_xint)
 	sync_ventry				// Synchronous 64-bit EL0
 #else
 	kernel_ventry	0, t, 64, sync		// Synchronous 64-bit EL0
-#endif
+#endif /* CONFIG_FAST_SYSCALL */
+#ifdef CONFIG_FAST_IRQ
+	irq_ventry				// XINT 64-bit EL0
+#else
 	kernel_ventry	0, t, 64, irq		// IRQ 64-bit EL0
+#endif
 	kernel_ventry	0, t, 64, fiq		// FIQ 64-bit EL0
 	kernel_ventry	0, t, 64, error		// Error 64-bit EL0
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index dfdc7b2b3c3f..e0f450aea847 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -55,6 +55,7 @@
 #include <asm/tlbflush.h>
 #include <asm/ptrace.h>
 #include <asm/virt.h>
+#include <asm/xint.h>

 #include <trace/events/ipi.h>

@@ -78,6 +79,9 @@ enum ipi_msg_type {
 	IPI_TIMER,
 	IPI_IRQ_WORK,
 	IPI_WAKEUP,
+#ifdef CONFIG_SMT_QOS
+	IPI_USER,
+#endif
 	NR_IPI
 };

@@ -806,6 +810,9 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = {
 	[IPI_TIMER]	= "Timer broadcast interrupts",
 	[IPI_IRQ_WORK]	= "IRQ work interrupts",
 	[IPI_WAKEUP]	= "CPU wake-up interrupts",
+#ifdef CONFIG_SMT_QOS
+	[IPI_USER]	= "Userspace IPI",
+#endif
 };

 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr);
@@ -961,6 +968,11 @@ static void do_handle_IPI(int ipinr)
 			  cpu);
 		break;
 #endif
+#ifdef CONFIG_SMT_QOS
+	case IPI_USER:
+		/* Do nothing */
+		break;
+#endif

 	default:
 		pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr);
@@ -1062,6 +1074,17 @@ void tick_broadcast(const struct cpumask *mask)
 }
 #endif

+#ifdef CONFIG_SMT_QOS
+void arch_smp_send_ipi_user(int cpu)
+{
+	struct irq_desc *desc = ipi_desc[IPI_USER];
+	struct irq_data *data = irq_desc_get_irq_data(desc);
+	struct irq_chip *chip = irq_data_get_irq_chip(data);
+
+	chip->ipi_send_mask(data, cpumask_of(cpu));
+}
+#endif
+
 /*
  * The number of CPUs online, not counting this CPU (which may not be
  * fully online and so not counted in num_online_cpus()).
diff --git a/arch/arm64/kernel/smt_qos.c b/arch/arm64/kernel/smt_qos.c
new file mode 100644
index 000000000000..4e97d1fea7b7
--- /dev/null
+++ b/arch/arm64/kernel/smt_qos.c
@@ -0,0 +1,84 @@
+// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) "smt_qos: " fmt
+
+#include <linux/module.h>
+
+#include <asm/arch_gicv3.h>
+#include <asm/cpuidle.h>
+#include <asm/daifflags.h>
+#include <asm/timex.h>
+#include <asm/xint.h>
+
+#include <vdso/time64.h>
+
+static unsigned int sysctl_sched_wfi_timeout = 50;
+static DEFINE_STATIC_KEY_TRUE(split_mode);
+
+static void irq_complete(u32 irqnr)
+{
+	if (static_branch_likely(&split_mode))
+		write_gicreg(irqnr, ICC_EOIR1_EL1);
+	isb();
+}
+
+static void irq_deactive(u32 irqnr)
+{
+	if (static_branch_likely(&split_mode)) {
+		gic_write_dir(irqnr);
+	} else {
+		write_gicreg(irqnr, ICC_EOIR1_EL1);
+		isb();
+	}
+}
+
+static __always_inline void throttle_offline(void)
+{
+	cycles_t start, end;
+	u64 delta_us = 0;
+
+	local_daif_restore(DAIF_PROCCTX);
+
+	start = get_cycles();
+	while (delta_us < sysctl_sched_wfi_timeout && should_restrict()) {
+		cpu_do_idle();
+		end = get_cycles();
+		delta_us = (end - start) * USEC_PER_SEC / arch_timer_get_cntfrq();
+	}
+
+	local_daif_mask();
+}
+
+asmlinkage void el0_xint_ipi_handler(struct pt_regs *regs)
+{
+	irq_complete(NR_IPI_USER);
+	irq_deactive(NR_IPI_USER);
+	throttle_offline();
+}
+
+static struct ctl_table sched_wfi_timeout_sysctl_table[] = {
+	{
+		.procname	= "sched_wfi_timeout_us",
+		.data		= &sysctl_sched_wfi_timeout,
+		.maxlen		= sizeof(sysctl_sched_wfi_timeout),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE_THOUSAND,
+	},
+	{}
+};
+
+static int __init xint_init(void)
+{
+	if (!system_uses_xint())
+		return 0;
+
+	register_sysctl_init("kernel", sched_wfi_timeout_sysctl_table);
+
+	if (!is_hyp_mode_available())
+		static_branch_disable(&split_mode);
+
+	pr_info("GIC split mode enabled: %d\n", static_key_enabled(&split_mode));
+	return 0;
+}
+module_init(xint_init);
diff --git a/arch/arm64/kernel/xcall/entry.S b/arch/arm64/kernel/xcall/entry.S
index d5ed68db1547..eb5994352bd1 100644
--- a/arch/arm64/kernel/xcall/entry.S
+++ b/arch/arm64/kernel/xcall/entry.S
@@ -291,3 +291,81 @@ alternative_else_nop_endif
 	br	x20
 .org .Lventry_start\@ + 128	// Did we overflow the ventry slot?
 	.endm
+
+#ifdef CONFIG_FAST_IRQ
+#include <asm/xint.h>
+
+SYM_CODE_START_LOCAL(el0t_64_irq_entry)
+	ldp	x20, x21, [sp, #16 * 10]
+	kernel_entry 0, 64
+	mov	x0, sp
+	ldr	x1, [sp, #(S_SYSCALLNO - 8)]
+	bl	el0t_64_acked_irq_handler
+	b	ret_to_user
+SYM_CODE_END(el0t_64_irq_entry)
+
+#ifdef CONFIG_SMT_QOS
+SYM_CODE_START_LOCAL(el0_xint_ipi)
+	ldp	x20, x21, [sp, #16 * 10]
+	hw_xcall_save_base_regs
+	mov	x0, sp
+	bl	el0_xint_ipi_handler
+	hw_xcall_restore_base_regs
+SYM_CODE_END(el0_xint_ipi)
+#endif
+
+SYM_CODE_START_LOCAL(el0t_64_irq_table)
+	/* Add more of SGIs or PPIs handled in el0 here */
+	.rept	NR_IPI_USER
+	.word	el0t_64_irq_table - el0t_64_irq_entry
+	.endr
+#ifdef CONFIG_SMT_QOS
+	.word	el0t_64_irq_table - el0_xint_ipi
+#else
+	.word	el0t_64_irq_table - el0t_64_irq_entry
+#endif
+	.rept	31 - NR_IPI_USER
+	.word	el0t_64_irq_table - el0t_64_irq_entry
+	.endr
+SYM_CODE_END(el0t_64_irq_table)
+
+	.macro irq_ventry
+	.align 7
+.Lventry_start\@:
+	/*
+	 * This must be the first instruction of the EL0 vector entries. It is
+	 * skipped by the trampoline vectors, to trigger the cleanup.
+	 */
+	b	.Lskip_tramp_vectors_cleanup\@
+	mrs	x30, tpidrro_el0
+	msr	tpidrro_el0, xzr
+.Lskip_tramp_vectors_cleanup\@:
+	sub	sp, sp, #PT_REGS_SIZE
+alternative_if_not ARM64_HAS_XINT
+	b	el0t_64_irq
+alternative_else_nop_endif
+	stp	x20, x21, [sp, #16 * 10]
+alternative_if ARM64_USES_NMI
+	mrs	x21, isr_el1
+	tbz	x21, ISR_EL1_IS_SHIFT, 0f
+	mrs_s	x21, SYS_ICC_NMIAR1_EL1
+	dsb	sy
+	b	1f
+alternative_else_nop_endif
+0:
+	mrs	x21, icc_iar1_el1
+	dsb	sy
+1:
+	/* Save irqnr for use later */
+	str	x21, [sp, #(S_SYSCALLNO - 8)]
+	/* All SPI and LPI back to kernel native entry */
+	cmp	x21, 32
+	b.ge	el0t_64_irq_entry
+	/* Using jump table for different SGIs and PPIs */
+	adr	x20, el0t_64_irq_table
+	ldr	w21, [x20, x21, lsl #2]
+	sub	x20, x20, x21
+	br	x20
+.org .Lventry_start\@ + 128	// Did we overflow the ventry slot?
+	.endm
+#endif /* CONFIG_FAST_IRQ */
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 849d2e0db4fd..83f836609c13 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1003,6 +1003,49 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
 }

 #ifdef CONFIG_FAST_IRQ
+#include <asm/xint.h>
+
+/*
+ * Since the IRQ is taken from EL0 and the IRQ Ack is completed in entry
+ * code, there is no need to read IAR here. Most of the code comes from
+ * __gic_handle_irq_from_irqson().
+ */
+DEFINE_PER_CPU(u32, cpu_iar);
+asmlinkage void __exception_irq_entry gic_handle_irq_noack(struct pt_regs *regs)
+{
+	bool is_nmi;
+	u32 irqnr;
+
+	irqnr = this_cpu_read(cpu_iar);
+
+	is_nmi = gic_rpr_is_nmi_prio();
+
+	if (is_nmi) {
+		nmi_enter();
+		__gic_handle_nmi(irqnr, regs);
+		nmi_exit();
+	}
+
+	if (gic_prio_masking_enabled()) {
+		gic_pmr_mask_irqs();
+		gic_arch_enable_irqs();
+	} else if (has_v3_3_nmi()) {
+#ifdef CONFIG_ARM64_NMI
+		_allint_clear();
+#endif
+	}
+
+	if (!is_nmi)
+		__gic_handle_irq(irqnr, regs);
+}
+
+asmlinkage void __exception_irq_entry gic_handle_nmi_noack(struct pt_regs *regs)
+{
+	u32 irqnr = this_cpu_read(cpu_iar);
+
+	__gic_handle_nmi(irqnr, regs);
+}
+
 DECLARE_BITMAP(irqnr_xint_map, 1024);

 static bool can_set_xint(unsigned int hwirq)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3e9f0b8070b8..d79892871835 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9322,6 +9322,8 @@ static int wake_soft_domain(struct task_struct *p, int target)
 #endif

 #ifdef CONFIG_SMT_QOS
+#include <asm/xint.h>
+
 static DEFINE_PER_CPU_ALIGNED(cpumask_t, smt_prefer_cpus);
 static unsigned long numa_smt_util[MAX_NUMNODES];
 /*
@@ -9412,6 +9414,45 @@ static inline void smt_qos_restore_task_select_cpus(struct task_struct *p,
 	p->select_cpus = backup_select_cpus;
 }

+bool should_restrict(void)
+{
+	int this_cpu = smp_processor_id();
+	int cpu;
+
+	if (idle_cpu(this_cpu))
+		return false;
+
+	for_each_cpu(cpu, cpu_smt_mask(this_cpu)) {
+		if (cpu == this_cpu)
+			continue;
+
+		/* SMT master CPU is idle, need not throttle */
+		if (idle_cpu(cpu))
+			return false;
+
+		/* SMT master CPU has finished online task */
+		if (per_cpu(qos_smt_status, cpu) < QOS_LEVEL_ONLINE)
+			return false;
+	}
+
+	return true;
+}
+
+static void send_ipi_throttle_smt(int this_cpu)
+{
+	int cpu;
+
+	if (!system_uses_xint())
+		return;
+
+	for_each_cpu(cpu, cpu_smt_mask(this_cpu)) {
+		if (cpu == this_cpu)
+			continue;
+
+		arch_smp_send_ipi_user(cpu);
+	}
+}
+
 static inline void smt_qos_update_qos_level(int cpu, struct task_struct *p)
 {
 	int new_status;
@@ -9425,6 +9466,9 @@ static inline void smt_qos_update_qos_level(int cpu, struct task_struct *p)
 		return;

 	__this_cpu_write(qos_smt_status, new_status);
+
+	if (cpumask_test_cpu(cpu, &master_smt_cpumask))
+		send_ipi_throttle_smt(cpu);
 }

 static inline bool is_slave_to_master(int src_cpu, int dst_cpu)
--
2.34.1
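The runtime flow, as wired up above: when a master SMT CPU picks an
online task, smt_qos_update_qos_level() notices the status change and
send_ipi_throttle_smt() raises the dedicated IPI on the siblings; the
sibling takes it through the XINT EL0 vector into el0_xint_ipi_handler(),
and throttle_offline() then parks the CPU in WFI until
sched_wfi_timeout_us expires or should_restrict() reports the pressure is
gone. A sanity check on the cycle-to-microsecond conversion in that loop,
assuming a hypothetical 100 MHz generic timer:

	/* Illustrative only: conversion used by the WFI throttle loop. */
	u64 cntfrq = 100000000;	/* arch_timer_get_cntfrq(), assumed */
	u64 cycles = 5000;	/* get_cycles() delta since loop entry */
	u64 delta_us = cycles * USEC_PER_SEC / cntfrq;	/* = 50 us */

	/*
	 * 50 us equals the sysctl_sched_wfi_timeout default, so with this
	 * counter the loop would give up after roughly 5000 cycles of WFI
	 * even if should_restrict() stays true.
	 */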
hulk inclusion
category: feature
bugzilla: https://atomgit.com/openeuler/kernel/issues/8929

----------------------------------------

Enable SMT_QOS in openeuler_defconfig by default.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/configs/openeuler_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index e6d3b9b6788b..0d71c6c54d1d 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -402,6 +402,7 @@ CONFIG_DEBUG_FEATURE_BYPASS=y
 CONFIG_SECURITY_FEATURE_BYPASS=y
 CONFIG_ACTLR_XCALL_XINT=y
 CONFIG_DYNAMIC_XCALL=y
+CONFIG_SMT_QOS=y
 # end of Turbo features selection

 #
--
2.34.1
FeedBack:
The patch(es) which you have sent to kernel@openeuler.org mailing list has
been converted to a pull request successfully!
Pull request link:
https://atomgit.com/openeuler/kernel/merge_requests/22236
Mailing list address:
https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/4AH...