From: Yunsheng Lin
Sent: 20 July 2021 03:22
As x86 and arm64 are the only two systems on which I can build and test the cpu_relax() implementation, only add cpu_relax() for x86 and arm64; other arches can be added easily when needed.
...
+#if defined(__i386__) || defined(__x86_64__)
+/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
+static __always_inline void rep_nop(void)
+{
+	asm volatile("rep; nop" ::: "memory");
+}
Beware: Intel increased the stall for 'rep nop' on some recent CPUs to, IIRC, about 200 cycles.
They even document that this might have a detrimental effect. It is basically far too long for the sort of thing it makes sense to busy-wait for.
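One way to limit the cost of a long PAUSE is to cap the number of relax iterations and let the caller fall back to something else (sleeping, yielding) once the budget runs out. The following is only a minimal userspace sketch, assuming GCC or Clang inline asm. The names spin_until_set() and max_spins are hypothetical, and cpu_relax() here is a stand-in rather than the implementation from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for cpu_relax(); not the patch's code. */
static inline void cpu_relax(void)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("rep; nop" ::: "memory");	/* encodes PAUSE */
#elif defined(__aarch64__)
	__asm__ __volatile__("yield" ::: "memory");
#else
	__asm__ __volatile__("" ::: "memory");		/* compiler barrier only */
#endif
}

/*
 * Spin until *flag becomes non-zero, but give up after max_spins
 * iterations so that a slow PAUSE cannot stall the caller indefinitely.
 * Returns true if the flag was observed set, false if the budget ran out.
 */
static inline bool spin_until_set(atomic_int *flag, unsigned int max_spins)
{
	assert(flag != NULL);

	for (unsigned int i = 0; i < max_spins; i++) {
		if (atomic_load_explicit(flag, memory_order_acquire))
			return true;
		cpu_relax();
	}

	/* One last check after the final relax. */
	return atomic_load_explicit(flag, memory_order_acquire) != 0;
}
```

A caller would choose max_spins so that the worst-case spin time (iterations times the per-PAUSE stall on the target CPU) stays shorter than the cost of blocking, and take the slow path when spin_until_set() returns false.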
David