
On 2021/6/1 19:45, John Garry wrote:
On 29/05/2021 02:31, chenxiang (M) wrote:
Hi John,
On 2021/5/28 22:23, John Garry wrote:
On 28/05/2021 08:12, chenxiang wrote:
From: Xiang Chen <chenxiang66@hisilicon.com>
The first patch releases those rcaches when the driver of the last device is removed (rmmod), to save memory.
Patches 2~6 add support for IOMMU debugfs entries related to IOVA, as follows: /sys/kernel/debug/iommu/iovad/iommu_domainx
Under the iommu_domainx dir, the debugfs files iova_rcache and drop_rcache are added.
From the debugfs file iova_rcache we can see how many cpu_rcache / shared rcache / IOVAs are in use, and we can also drop those rcaches through the debugfs file drop_rcache:
For cpu_rcache, [i]=x|y indicates that there are x IOVAs in the loaded iova_magazine and y IOVAs in the prev iova_magazine (128 at most). For the shared rcache, [i]=x indicates that there are x iova_magazines in use.
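For reference, a minimal sketch of how the iova_rcache file contents could be produced with a seq_file show routine. This is not the posted patch: it assumes the iova_rcache/iova_cpu_rcache layout in drivers/iommu/iova.c of this era (per-CPU 'loaded'/'prev' magazines plus a shared depot), would have to live in iova.c since those structs are private to it, and ignores locking for brevity.

    static int iova_rcache_show(struct seq_file *s, void *unused)
    {
            struct iova_domain *iovad = s->private;
            unsigned int cpu, i;

            for_each_online_cpu(cpu) {
                    seq_printf(s, "cpu%u ", cpu);
                    for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; i++) {
                            struct iova_cpu_rcache *cpu_rcache =
                                    per_cpu_ptr(iovad->rcaches[i].cpu_rcaches, cpu);

                            /* [i]=loaded|prev: IOVAs currently held per magazine */
                            seq_printf(s, "[%u]=%lu|%lu ", i,
                                       cpu_rcache->loaded ? cpu_rcache->loaded->size : 0UL,
                                       cpu_rcache->prev ? cpu_rcache->prev->size : 0UL);
                    }
                    seq_putc(s, '\n');
            }

            /* shared rcache: number of magazines sitting in each depot */
            for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; i++)
                    seq_printf(s, "share[%u]=%lu ", i, iovad->rcaches[i].depot_size);
            seq_putc(s, '\n');

            return 0;
    }
    DEFINE_SHOW_ATTRIBUTE(iova_rcache);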
estuary:/sys/kernel/debug/iommu/iovad/iommu_domain2$
How do we know the relation to the IOMMU group?
And it could be nice to show 'ls -l' output, even if you did mention it, above.
The domain id (actually there is no domain id in the IOMMU code) is the same as the group id, so "iommu_domain2" corresponds to "iommu_group2".
ok, but I would then just name the folder "iommu_group2", which makes it quite clear.
I need to check in the code how you set this name (which I'll do next), as we would need to support non-IOMMU-group IOVA domains as well.
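A minimal sketch of one way the folder could be named after the IOMMU group, assuming a struct device is available at the point where the debugfs directory is created; iova_debugfs_create_dir is a hypothetical helper, not part of the posted patches.

    static struct dentry *iova_debugfs_create_dir(struct dentry *parent,
                                                  struct device *dev)
    {
            struct iommu_group *group = iommu_group_get(dev);
            struct dentry *dir;
            char name[32];

            if (!group)
                    return NULL;

            /* name the directory after the group, e.g. "iommu_group2" */
            snprintf(name, sizeof(name), "iommu_group%d", iommu_group_id(group));
            dir = debugfs_create_dir(name, parent);
            iommu_group_put(group);

            return dir;
    }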
I would also like to see additional info, specifically allocation attempts per rcache range and also allocation attempts which were "too_big" to be cached.
I did add that additional info (including a retry count for IOVA allocations from the rbtree; you can see it in the attachment). But I found it had a small effect on performance (1750K -> 1700K), so I removed it.
hmmm... maybe we can have a per-cpu count for each item. But then it would be getting more complicated...
If we have only a "too_big", then it should not touch hotpath (as too big means going to IOVA RB tree, which can be slow).
Yes, if we only have "too_big", it should not affect performance, but I am not sure whether it is ok not to know the ratio too_big/total_cnt.
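A rough sketch of the per-cpu counting idea: count both total allocation attempts and "too_big" ones with this_cpu_inc() (no shared cacheline bouncing on the hot path), and only sum them when the debugfs file is read. The counter names (iova_stats, iova_count_alloc) are assumptions, not part of the posted patches; the size check mirrors the existing rcache limit in iova.c.

    struct iova_alloc_stats {
            unsigned long total;
            unsigned long too_big;
    };

    static DEFINE_PER_CPU(struct iova_alloc_stats, iova_stats);

    static inline void iova_count_alloc(unsigned long size)
    {
            this_cpu_inc(iova_stats.total);
            /* same check the rcache code uses to decide a size is uncacheable */
            if (order_base_2(size) >= IOVA_RANGE_CACHE_MAX_SIZE)
                    this_cpu_inc(iova_stats.too_big);
    }

    static int iova_stats_show(struct seq_file *s, void *unused)
    {
            unsigned long total = 0, too_big = 0;
            int cpu;

            /* summing per-cpu counters only happens on debugfs reads */
            for_each_possible_cpu(cpu) {
                    total += per_cpu(iova_stats.total, cpu);
                    too_big += per_cpu(iova_stats.too_big, cpu);
            }
            seq_printf(s, "total %lu too_big %lu\n", total, too_big);
            return 0;
    }
    DEFINE_SHOW_ATTRIBUTE(iova_stats);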
cat iova_rcache
[ 272.814457] cpu0 [0]=60|0 [1]=7|0 [2]=32|0 [3]=0|0 [4]=97|0 [5]=104|0
I suppose this is ok, but the output is becoming huge with many CPUs and possibly increasing rcache range.
You could possibly consider breaking it down into sub-files or folders, which may be better, like:
ls iovad/iommu_domain2:
0 1 2 3 4 ... too_big
Right, this is a good idea to break it into sub folders according to the size of rcache.
ls iovad/iommu_domain2/0
rcache allocations depot

more iovad/iommu_domain2/0/rcache
cpu0: 0|1 cpu1: 2|4 ... cpuMax: 0|0

more iovad/iommu_domain2/0/allocations
1244

more iovad/iommu_domain2/0/depot
28 44 22

ls iovad/iommu_domain2/too_big
allocations
As for the rcache file, you could even have separate per-cpu files in a rcache folder, like:
ls iovad/iommu_domain2/0/rcache
cpu0 cpu1 cpu2 ... cpu127

more iovad/iommu_domain2/0/rcache/cpu0
0|3
But then we would have more files and folders to examine. We need to find a good balance.
Yes, with a per-cpu file in a rcache folder there would be too many files to examine, and the user doesn't know which cpu to focus on, so they would have to be checked one by one. I prefer to print all of them at once, as you suggest:
more iovad/iommu_domain2/0/rcache
cpu0: 0|1 cpu1: 2|4 ... cpuMax: 0|0
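A rough sketch of how that preferred layout could be wired up with debugfs: one sub-directory per rcache size under the domain/group directory, each holding rcache/allocations/depot files, plus a too_big directory. iova_debugfs_populate and the *_fops names are placeholders, not the actual patch.

    static void iova_debugfs_populate(struct dentry *iommu_domain_dir,
                                      struct iova_domain *iovad)
    {
            struct dentry *dir;
            char name[16];
            int i;

            /* one sub-directory per cached size: 0, 1, 2, ... */
            for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; i++) {
                    snprintf(name, sizeof(name), "%d", i);
                    dir = debugfs_create_dir(name, iommu_domain_dir);
                    debugfs_create_file("rcache", 0444, dir, &iovad->rcaches[i],
                                        &iova_rcache_fops);
                    debugfs_create_file("allocations", 0444, dir,
                                        &iovad->rcaches[i], &iova_alloc_fops);
                    debugfs_create_file("depot", 0444, dir, &iovad->rcaches[i],
                                        &iova_depot_fops);
            }

            /* allocations too large to be cached get their own directory */
            dir = debugfs_create_dir("too_big", iommu_domain_dir);
            debugfs_create_file("allocations", 0444, dir, iovad,
                                &iova_too_big_fops);
    }

With that, reading iommu_domain2/0/rcache would print the one-line-per-cpu output shown above.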
.