On Sat, 29 May 2021 09:44:57 +0800 Yunsheng Lin wrote:
@@ -3852,10 +3852,32 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 	qdisc_calculate_pkt_len(skb, q);
 
 	if (q->flags & TCQ_F_NOLOCK) {
+		if (q->flags & TCQ_F_CAN_BYPASS && nolock_qdisc_is_empty(q) &&
+		    qdisc_run_begin(q)) {
+			/* Retest nolock_qdisc_is_empty() within the protection
+			 * of q->seqlock to ensure qdisc is indeed empty.
+			 */
+			if (unlikely(!nolock_qdisc_is_empty(q))) {
This is just for the DRAINING case, right?
MISSED can be set at any moment, so testing MISSED here seems confusing.
MISSED is only set when there is lock contention, which means it is better not to take the qdisc bypass, to reduce the chance of out-of-order packets.
Avoid as in make less likely? Nothing guarantees the other thread is not interrupted after ->enqueue and before qdisc_run_begin().
TBH I'm not sure what out-of-order situation you're referring to; there is no ordering guarantee between separate threads trying to transmit, AFAIU.
IOW, this check is not required for correctness, right?
Another benefit is that we can also do batched dequeuing and transmitting of packets when there is lock contention.
No doubt, but did you see the flag get set often enough here to warrant the double-checking?
Is it really worth the extra code?
Currently DRAINING is only set when the netdev queue is stopped. If we used DRAINING alone to indicate that a qdisc is non-empty, we would need to set DRAINING everywhere MISSED is set; that is why I use both DRAINING and MISSED to indicate a non-empty qdisc.
+				rc = q->enqueue(skb, q, &to_free) &
+					NET_XMIT_MASK;
+				__qdisc_run(q);
+				qdisc_run_end(q);
+
+				goto no_lock_out;
+			}
+
+			qdisc_bstats_cpu_update(q, skb);
+			if (sch_direct_xmit(skb, q, dev, txq, NULL, true) &&
+			    !nolock_qdisc_is_empty(q))
+				__qdisc_run(q);
+
+			qdisc_run_end(q);
+			return NET_XMIT_SUCCESS;
+		}
+
 		rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
 		if (likely(!netif_xmit_frozen_or_stopped(txq)))
 			qdisc_run(q);
 
+no_lock_out:
 		if (unlikely(to_free))
 			kfree_skb_list(to_free);
 		return rc;
@@ -164,9 +166,13 @@ static inline void dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
 		skb = next;
 	}
-	if (lock)
+
+	if (lock) {
 		spin_unlock(lock);
-	__netif_schedule(q);
+		set_bit(__QDISC_STATE_MISSED, &q->state);
+	} else {
+		__netif_schedule(q);
Could we reorder qdisc_run_begin() with clear_bit(SCHED) in net_tx_action() and add SCHED to the NON_EMPTY mask?
Did you mean clearing SCHED after q->seqlock is taken?
The problem is that SCHED is also used to indicate whether a qdisc is on sd->output_queue, and the qdisc_run_begin() called by net_tx_action() cannot guarantee it will take q->seqlock (we use trylock for the lockless qdisc).
Ah, right. We'd need to do some more flag juggling in net_tx_action() to get it right.
+	}
 }
 
 static void try_bulk_dequeue_skb(struct Qdisc *q,

@@ -409,7 +415,11 @@ void __qdisc_run(struct Qdisc *q)
 	while (qdisc_restart(q, &packets)) {
 		quota -= packets;
 		if (quota <= 0) {
-			__netif_schedule(q);
+			if (q->flags & TCQ_F_NOLOCK)
+				set_bit(__QDISC_STATE_MISSED, &q->state);
+			else
+				__netif_schedule(q);
+
 			break;
 		}
 	}
@@ -680,13 +690,14 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
 	if (likely(skb)) {
 		qdisc_update_stats_at_dequeue(qdisc, skb);
 	} else if (need_retry &&
-		   test_bit(__QDISC_STATE_MISSED, &qdisc->state)) {
+		   READ_ONCE(qdisc->state) & QDISC_STATE_NON_EMPTY) {
Do we really need to retry based on DRAINING being set? Or is it just a convenient way of coding things up?
Yes, it is just a convenient way of coding things up. Only MISSED needs the retry.