On 2021/3/24 9:49, Cong Wang wrote:
On Sun, Mar 21, 2021 at 5:55 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
On 2021/3/20 2:15, Cong Wang wrote:
On Thu, Mar 18, 2021 at 12:33 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
On 2021/3/17 21:45, Jason A. Donenfeld wrote:
On 3/17/21, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
Cong Wang <xiyou.wangcong@gmail.com> writes:
On Mon, Mar 15, 2021 at 2:07 PM Jakub Kicinski <kuba@kernel.org> wrote:

I thought pfifo was supposed to be "lockless" and this change re-introduces a lock between producer and consumer, no?

It has never been truly lockless; it uses two spinlocks in the ring buffer implementation, and it introduced a q->seqlock recently. With this patch we now have priv->lock, 4 locks in total. So our "lockless" qdisc ends up having more locks than others. ;) I don't think we are going in the right direction...
Just a thought, have you guys considered adopting the lockless MPSC ring buffer recently introduced into Wireguard in commit:
8b5553ace83c ("wireguard: queueing: get rid of per-peer ring buffers")
Jason indicated he was willing to work on generalising it into a reusable library if there was a use case for it. I haven't quite thought through the details of whether this would be such a use case, but figured I'd at least mention it :)
That offer definitely still stands. Generalization sounds like a lot of fun.
Keep in mind though that it's an eventually consistent queue, not an immediately consistent one, so that might not match all use cases. It works with wg because we always trigger the reader thread anew when it finishes, but that doesn't apply to everyone's queueing setup.
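For the unfamiliar, the "eventually consistent" part comes from the queue being in the style of Dmitry Vyukov's intrusive MPSC queue: a producer publishes a node in two steps, so the consumer can briefly see the queue as empty while a push is mid-flight. A minimal userspace sketch of that style (illustrative names, not wireguard's actual code):

#include <stdatomic.h>
#include <stddef.h>

struct node {
        _Atomic(struct node *) next;
};

struct mpsc_queue {
        _Atomic(struct node *) head;    /* producers exchange this */
        struct node *tail;              /* consumer-private */
        struct node stub;               /* dummy node, never handed out */
};

static void mpsc_init(struct mpsc_queue *q)
{
        atomic_init(&q->stub.next, NULL);
        atomic_init(&q->head, &q->stub);
        q->tail = &q->stub;
}

/* Lock-free for any number of concurrent producers. */
static void mpsc_push(struct mpsc_queue *q, struct node *n)
{
        atomic_store_explicit(&n->next, NULL, memory_order_relaxed);
        struct node *prev = atomic_exchange_explicit(&q->head, n,
                                                     memory_order_acq_rel);
        /* Between the exchange above and the store below the queue is
         * momentarily inconsistent: n is in head but not linked yet. */
        atomic_store_explicit(&prev->next, n, memory_order_release);
}

/* Single consumer only. May return NULL while a push is mid-flight,
 * even though the queue is not logically empty. */
static struct node *mpsc_pop(struct mpsc_queue *q)
{
        struct node *tail = q->tail;
        struct node *next = atomic_load_explicit(&tail->next,
                                                 memory_order_acquire);

        if (tail == &q->stub) {         /* skip over the dummy node */
                if (!next)
                        return NULL;    /* truly empty */
                q->tail = next;
                tail = next;
                next = atomic_load_explicit(&tail->next,
                                            memory_order_acquire);
        }
        if (next) {
                q->tail = next;
                return tail;
        }
        /* tail looks last; if head disagrees, a push is in flight. */
        if (tail != atomic_load_explicit(&q->head, memory_order_acquire))
                return NULL;
        /* re-insert the dummy so the last real node can be unlinked */
        mpsc_push(q, &q->stub);
        next = atomic_load_explicit(&tail->next, memory_order_acquire);
        if (!next)
                return NULL;
        q->tail = next;
        return tail;
}

That NULL-while-non-empty window is exactly why wg re-triggers the reader when it finishes instead of trusting a single empty read.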
Thanks for mentioning this.
"multi-producer, single-consumer" seems to match the lockless qdisc's paradigm too, for now concurrent enqueuing/dequeuing to the pfifo_fast's queues() is not allowed, it is protected by producer_lock or consumer_lock.
So it would be good to have lockless concurrent enqueuing, while dequeuing stays protected by qdisc_lock() or q->seqlock; that fits the "multi-producer, single-consumer" paradigm. A rough sketch of what that could look like is below.
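This is hypothetical: mpsc_ring_produce() and band2ring() are placeholders for a generalized version of Jason's queue, not existing kernel APIs; the surrounding logic just mirrors today's pfifo_fast_enqueue().

/* Hypothetical pfifo_fast enqueue over a lockless MPSC ring. */
static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc,
                              struct sk_buff **to_free)
{
        int band = prio2band[skb->priority & TC_PRIO_MAX];
        struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
        struct mpsc_ring *q = band2ring(priv, band);    /* placeholder */

        /* any CPU may produce concurrently; no producer_lock taken */
        if (unlikely(mpsc_ring_produce(q, skb)))        /* placeholder */
                return qdisc_drop(skb, qdisc, to_free);

        qdisc_update_stats_at_enqueue(qdisc, qdisc_pkt_len(skb));
        return NET_XMIT_SUCCESS;
}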
I don't think so. Usually we have one queue for each CPU, so we can expect each CPU to have a lockless qdisc assigned, but we cannot assume this in the code, so we still have to deal with multiple CPUs sharing a lockless qdisc; and we usually enqueue and dequeue in process context, which means we could have multiple producers and multiple consumers.
For a lockless qdisc, dequeuing always happens between qdisc_run_begin() and qdisc_run_end(), so multiple consumers are serialized against each other by q->seqlock.
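(Paraphrasing the TCQ_F_NOLOCK path of qdisc_run_begin() from include/net/sch_generic.h, simplified:

static inline bool qdisc_run_begin(struct Qdisc *qdisc)
{
        if (qdisc->flags & TCQ_F_NOLOCK)
                return spin_trylock(&qdisc->seqlock);
        /* non-NOLOCK qdiscs serialize on the qdisc->running seqcount;
         * that path is elided here */
        return true;
}

Only the CPU that wins q->seqlock enters __qdisc_run(), so the dequeue side is effectively single-consumer.)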
So are you saying you will never go lockless for the lockless qdisc? I thought you really wanted to go lockless with Jason's proposal of MPMC ring buffer code.
I think we have different definitions of "lockless qdisc".
In my understanding, dequeuing happens between qdisc_run_begin() and qdisc_run_end(), so it is always protected by q->seqlock for a lockless qdisc currently. By "lockless qdisc" I have never meant lockless dequeuing, and I am not proposing lockless dequeuing now.
Currently, "lockless qdisc" for pfifo_fast only means there is no lock for protection between dequeuing and enqueuing, i.e. __qdisc_run() can be dequeuing an skb while another CPU is enqueuing one.
But enqueuing is protected by producer_lock in skb_array_produce(), so only one CPU can enqueue at a time; I am proposing to use Jason's design to let multiple CPUs enqueue concurrently without taking any lock.
For enqueuing, multiple producers are serialized by producer_lock; see pfifo_fast_enqueue() -> skb_array_produce() -> ptr_ring_produce().
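(Paraphrased from include/linux/ptr_ring.h and include/linux/skb_array.h:

static inline int ptr_ring_produce(struct ptr_ring *r, void *ptr)
{
        int ret;

        /* today's producers serialize here; this is the lock the
         * MPSC proposal would remove */
        spin_lock(&r->producer_lock);
        ret = __ptr_ring_produce(r, ptr);
        spin_unlock(&r->producer_lock);

        return ret;
}

static inline int skb_array_produce(struct skb_array *a, struct sk_buff *skb)
{
        return ptr_ring_produce(&a->ring, skb);
}
)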
I think you seriously misunderstand how we classify MPMC or MPSC: it is not about how we lock them, it is about whether we truly have a single consumer or multiple consumers, regardless of the locks used, because the goal is to go lockless.
I think I am relying only on MPSC (multi-producer & single-consumer), as explained above.
I am not sure whether a lockless MPSC can work in process context, but even if not, enqueuing is also protected by rcu_read_lock_bh(), which provides some kind of atomicity, so producer_lock can be removed when a lockless MPSC is used.
Not sure I even understand what you are saying here. Jason's code only disables preemption with a busy-wait; I can't see why it cannot be used in process context.
I am saying that q->enqueue() is protected by rcu_read_lock_bh(). rcu_read_lock_bh() disables preemption for us in most configurations; otherwise it would break the netdev_xmit_more() interface too, since that relies on the CPU not being preempted while using a per-CPU variable (softnet_data.xmit.more).
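(netdev_xmit_more(), paraphrased from include/linux/netdevice.h, is a bare per-CPU read, so the xmit path must stay on one CPU between setting and reading the flag; that is what local_bh_disable() inside rcu_read_lock_bh() guarantees:

static inline bool netdev_xmit_more(void)
{
        return __this_cpu_read(softnet_data.xmit.more);
}
)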
Thanks.