
On 08/09/2021 17.57, Jakub Kicinski wrote:
On Wed, 8 Sep 2021 18:26:35 +0300 Ilias Apalodimas wrote:
Normally I'd say put the stats in ethtool -S and the rest in debugfs, but I'm not sure about exposing pages_state_hold_cnt and pages_state_release_cnt directly. Those are short counters, and will very likely wrap. They are primarily meaningful for calculating page_pool_inflight(). Given this, I think their semantics may be too confusing for an average ethtool -S user.
Putting all the information in debugfs seems like a better idea.
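For readers unfamiliar with why two wrapping counters are still useful together: the in-flight value is the modular difference of the two, which stays correct across a wrap as long as fewer than 2^31 pages are outstanding. A minimal userspace sketch, assuming 32-bit counters; demo_inflight() is a hypothetical name for illustration and is not the kernel's page_pool_inflight() implementation:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for an "in-flight" calculation.
 * Unsigned subtraction wraps modulo 2^32, so the difference is still
 * correct after either counter has wrapped. The final cast to a signed
 * type is implementation-defined for large values in ISO C, but gives
 * the two's-complement result on mainstream compilers.
 */
static inline int32_t demo_inflight(uint32_t hold_cnt, uint32_t release_cnt)
{
    return (int32_t)(uint32_t)(hold_cnt - release_cnt);
}

/* Small self-check showing that a wrap of hold_cnt does not matter. */
static inline void demo_inflight_selfcheck(void)
{
    uint32_t release = UINT32_MAX - 4u;
    uint32_t hold = UINT32_MAX - 1u;

    assert(demo_inflight(hold, release) == 3);

    hold += 5u; /* wraps past zero, hold is now 3 */
    assert(demo_inflight(hold, release) == 8);
}
```

Read on their own, the raw values in this example (3 and 4294967291) say nothing useful, which is the concern raised above about showing them to an ethtool -S user.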
I can't really disagree on the aforementioned stats being confusing. However, at some point we'll want to add more useful page_pool stats (e.g. the percentage of the pages/page fragments that are hitting the recycling path). Would it still be 'ok' to have info split across ethtool and debugfs?
Possibly. We'll also see what Alex L comes up with for XDP stats. Maybe we can arrive at a netlink API for standard things (broken record).
You said percentage - even tho I personally don't like it - there is a small precedent of ethtool -S containing non-counter information (IOW not monotonically increasing event counters), e.g. some vendors rammed PCI link quality in there. So if all else fails ethtool -S should be fine.
I agree with Ilias that we ought to add some page_pool stats. *BUT* ONLY if this doesn't hurt performance!!!
We have explained before how this is possible, e.g. by keeping consumer vs. producer counters on separate cache-lines (internally in the page_pool struct, and likely per CPU for returning pages). The driver's ethtool functions can then request the page_pool to fill out a driver-provided stats area, such that the collection and aggregation of counters are not on the fast-path.
I definitely don't want to see pages_state_hold_cnt and pages_state_release_cnt being exposed directly. These were carefully designed to not hurt performance. An inflight counter can be deduced by the above ethtool-driver step and presented to userspace.
Notice that while developing page_pool, I've been using tracepoints and bpftrace scripts to inspect the behavior and internals of page_pool. See [1], and I've even written a page leak detector [2]. In principle you could write a bpftrace tool that extracts stats the same way. But I would only recommend doing this for the devel phase, because these tracepoints do add some overhead. Originally I wanted to push people to use this for stats, but I've realized that not having these stats easily available is annoying ;-)
-Jesper
[1] https://github.com/xdp-project/xdp-project/tree/master/areas/mem/bpftrace
[2] https://github.com/xdp-project/xdp-project/blob/master/areas/mem/bpftrace/pa...
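
The counter layout described in this mail (consumer and producer counters on separate cache lines, summed only when the driver asks for stats) could look roughly like the userspace sketch below. Everything here is an assumption made for illustration: the demo_* names, the 64-byte cache line, the fixed CPU count, and the counter semantics are not the page_pool API. The sketch is single-threaded; real per-CPU code would use the kernel's per-CPU helpers.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define DEMO_CACHELINE 64 /* assumed cache-line size */
#define DEMO_NR_CPUS   4  /* hypothetical fixed CPU count */

/* Consumer (allocation) side: only touched by the RX context. */
struct demo_alloc_stats {
    uint64_t allocated; /* pages handed out to the driver */
};

/* Producer (return) side: one slot per CPU, each on its own cache line. */
struct demo_return_stats {
    alignas(DEMO_CACHELINE) uint64_t recycled; /* returned and reused */
    uint64_t released;                         /* returned and freed */
};

static_assert(sizeof(struct demo_return_stats) == DEMO_CACHELINE,
              "each per-CPU slot should fill exactly one cache line");

struct demo_pool {
    alignas(DEMO_CACHELINE) struct demo_alloc_stats alloc;
    struct demo_return_stats ret[DEMO_NR_CPUS];
};

/* Driver-provided area that the pool fills in on request. */
struct demo_pool_stats {
    uint64_t allocated;
    uint64_t recycled;
    uint64_t released;
    uint64_t inflight; /* derived here, never stored in the pool */
};

/* Fast path: a plain increment on a line private to the consumer. */
static inline void demo_count_alloc(struct demo_pool *p)
{
    p->alloc.allocated++;
}

/* Fast path: a plain increment on the returning CPU's own line. */
static inline void demo_count_return(struct demo_pool *p, unsigned int cpu,
                                     int was_recycled)
{
    if (was_recycled)
        p->ret[cpu].recycled++;
    else
        p->ret[cpu].released++;
}

/* Slow path: aggregation happens only when stats are requested. */
static inline void demo_pool_get_stats(const struct demo_pool *p,
                                       struct demo_pool_stats *out)
{
    out->allocated = p->alloc.allocated;
    out->recycled = 0;
    out->released = 0;
    for (unsigned int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
        out->recycled += p->ret[cpu].recycled;
        out->released += p->ret[cpu].released;
    }
    out->inflight = out->allocated - (out->recycled + out->released);
}
```

In this sketch a driver's ethtool handler would call demo_pool_get_stats() with its own buffer, so userspace sees a derived inflight value and a recycle count, and never the raw internal counters.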