[...]
So add elevated refcnt support in page pool, and support allocating page frags on top of it, to enable multiple frames per page.
As the elevated refcnt is per page, and there is no space for it in "struct page" now, add a dynamically allocated "struct page_pool_info" to record the page pool pointer and the refcnt corresponding to a page for now. Later we can recycle the "struct page_pool_info" too, or use part of the page memory to record the pp_info.
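A minimal userspace sketch of that per-page bookkeeping, just to make the indirection concrete (field and function names here are illustrative, not the patch's actual layout):

```c
#include <assert.h>
#include <stdlib.h>

struct page_pool { int id; };            /* stub standing in for the real pool */

struct page_pool_info {
	struct page_pool *pp;            /* owning pool, found again on recycle */
	int pagecnt_bias;                /* elevated refcnt, paid back per frag */
};

struct page {
	struct page_pool_info *pp_info;  /* the extra level of indirection */
};

/* one slab-style allocation per page, as the patch description says */
static struct page_pool_info *pp_info_alloc(struct page_pool *pp, int bias)
{
	struct page_pool_info *info = malloc(sizeof(*info));

	if (info) {
		info->pp = pp;
		info->pagecnt_bias = bias;
	}
	return info;
}
```

Recycling a frag then has to chase page->pp_info->pp to find the pool, which is the extra dereference being objected to below.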
I'm not happy with allocating a memory (slab) object "struct page_pool_info" per page.
This also gives us an extra level of indirection.
I'm not happy with that either; if there is a better way to avoid it, I will be happy to change it :)
I think what we have to answer here is: do we want, and does it make sense for, page_pool to do the housekeeping of the buffer splitting, or are we better off having each driver do that? IIRC your previous patch on top of the original recycling patchset was just 'atomic' refcnts on top of page pool.
You are right that the driver was doing the buffer splitting in the previous patch.
The reason why I abandoned that is:

1. Currently the meta-data of a page in the driver is per desc, which means the driver cannot use the first half of a page for one desc and the second half of the same page for another desc. This ping-pong way of reusing the whole page for only one desc seems unnecessary, and wastes a lot of memory when there is already reusing in the page pool.

2. It makes the API easier for the driver to use: the driver calls page_pool_dev_alloc_frag() and page_pool_put_full_page() for the elevated refcnt case, corresponding to page_pool_dev_alloc_pages() and page_pool_put_full_page() for the non-elevated refcnt case, and does not need to worry about the meta-data of a page.
Ok that makes sense. We'll need the complexity anyway and I said I don't have any strong opinions yet, we might as well make page_pool responsible for it. What we need to keep in mind is that page_pool was primarily used for XDP packets. We need to make sure we have no performance regressions there. However I don't have access to > 10gbit NICs with XDP support. Can anyone apply the patchset and check the performance?
[...]
Aside from the performance improvement, there is a memory usage decrease for 64K page size kernels: a 64K page can back 32 descriptors with a 2K buffer size, which is a lot of memory saving for a 64K page size kernel compared to the current split-page reusing implemented in the driver.
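The arithmetic behind that claim, as a quick sanity check (page and buffer sizes taken from the numbers above):

```c
#include <assert.h>

/* With 64K pages and 2K rx buffers, frag-based sharing packs 32
 * descriptors into one page; the ping-pong scheme dedicates a whole
 * page to a single descriptor at a time.
 */
enum { PAGE_SIZE_64K = 64 * 1024, RX_BUF_SIZE = 2 * 1024 };

static int frags_per_page(int page_size, int buf_size)
{
	return page_size / buf_size;
}
```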
Whether the driver or page_pool itself keeps the meta-data, the outcome here won't change. We'll still be able to use page frags.
As above, it is the ping-pong way of reusing when the driver keeps the meta-data, and it is page-frag way of reusing when the page pool keeps the meta-data.
I am not sure the page-frag way of reusing is even possible if we keep the meta-data in the driver; it seems very complex at first thought.
Fair enough. It's complex in both scenarios so if people think it's useful I am not against adding it in the API.
Thanks /Ilias
__page_frag_cache_refill() + __page_frag_cache_drain() + page_frag_alloc_align()
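For reference, those functions follow a refill/alloc/drain pattern: refill takes one elevated refcount on a fresh page, each frag allocation consumes one reference, and drain releases whatever references were not handed out. A userspace simulation of that pattern (a plain counter stands in for page->_refcount; the sizes and bias value are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdlib.h>

#define CACHE_SIZE  4096
#define REFILL_BIAS 128   /* illustrative; the kernel derives its bias differently */

struct frag_cache {
	char  *va;        /* backing "page" */
	size_t offset;    /* next free byte */
	int    refcount;  /* stands in for page->_refcount */
};

/* analogous to __page_frag_cache_refill(): one elevated refcnt
 * up front covers many future frags */
static int cache_refill(struct frag_cache *fc)
{
	fc->va = malloc(CACHE_SIZE);
	if (!fc->va)
		return -1;
	fc->offset = 0;
	fc->refcount = REFILL_BIAS;
	return 0;
}

/* analogous to page_frag_alloc_align(): carve a frag, pay back
 * one reference */
static void *frag_alloc(struct frag_cache *fc, size_t size)
{
	void *p;

	if (fc->offset + size > CACHE_SIZE || fc->refcount <= 1)
		return NULL;    /* caller would refill a fresh page here */
	p = fc->va + fc->offset;
	fc->offset += size;
	fc->refcount--;
	return p;
}

/* analogous to __page_frag_cache_drain(): drop the unused references */
static void cache_drain(struct frag_cache *fc)
{
	fc->refcount = 0;
	free(fc->va);
	fc->va = NULL;
}
```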
[...]