**tuxd3v** · 26 November 2021, 06:02 PM

Originally posted by microcode

AMD64 chips are not the home market of computers; they are the most common server chips in the world, and they are the machines that move the most TCP in software, period.

What I was saying is that in heterogeneous environments, like mixed Little Endian and Big Endian environments, you have no chance of avoiding it.
Well, I don't know about TCP/IP, but it seems to me that a payload that is not in network byte order will not be understood by Big Endian machines.

Originally posted by microcode

What?

If you are in mixed environments, Big Endian/Little Endian, I count 2 or 3 operations just for that. It will translate into latency, and we are talking about 64 bits. Multiply that by many fields in a packet, or by tons of packets, and it adds up to latency whether we want it or not.
Also, compilers have evolved a lot in the last 10 years or a bit more.
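
For readers following along, the conversion being argued about can be sketched in a few lines. This is a minimal illustration in Python, chosen only for brevity; the names `bswap64` and `to_network_order` are made up for this example. It shows the shift-and-mask form of a 64-bit byte swap and why a big-endian reader misreads a little-endian payload. It says nothing about what the swap costs on real hardware.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def bswap64(value: int) -> int:
    """Reverse the byte order of a 64-bit unsigned integer with shifts and masks."""
    value &= MASK64
    # Swap adjacent bytes, then adjacent 16-bit halves, then the two 32-bit halves.
    value = ((value & 0x00FF00FF00FF00FF) << 8) | ((value >> 8) & 0x00FF00FF00FF00FF)
    value = ((value & 0x0000FFFF0000FFFF) << 16) | ((value >> 16) & 0x0000FFFF0000FFFF)
    return ((value << 32) | (value >> 32)) & MASK64


def to_network_order(value: int) -> bytes:
    """Serialize a 64-bit unsigned integer in network (big-endian) byte order."""
    return struct.pack("!Q", value)


value = 0x0102030405060708

# The bytes a little-endian host holds in memory for this value.
little_endian_payload = struct.pack("<Q", value)

# A big-endian reader interpreting those same bytes sees the swapped number.
misread = struct.unpack(">Q", little_endian_payload)[0]

print(hex(misread))                    # 0x807060504030201
print(to_network_order(value).hex())   # 0102030405060708
```

In C the same swap is normally a single builtin or library call such as `htobe64`, which is the part the rest of the thread argues about.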

**tuxd3v** · 26 November 2021, 06:05 PM

Originally posted by sinepgib

On another note, out of complete ignorance, doesn't IPv6 require a swap for addresses too? Those are bigger AFAIR.

I believe everybody avoids dealing with IPv6 at almost any cost, due to the processing power/latency required.
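
On the question of whether IPv6 addresses need a swap, a small sketch may help. It assumes the address is kept as 16 raw bytes, which is how `struct in6_addr` holds it in the sockets API; the address itself is just an example from the documentation prefix. In that form, filling a header field is a byte copy, and a numeric interpretation only matters if you do arithmetic on the address.

```python
import ipaddress

# 2001:db8::/32 is the documentation prefix; this address is only an example.
addr = ipaddress.IPv6Address("2001:db8::1")

# .packed is the 16-byte form in network byte order, as it appears on the wire.
wire_bytes = addr.packed

# A fixed IPv6 header is 40 bytes; the source address occupies bytes 8..23.
packet = bytearray(40)
packet[8:24] = wire_bytes          # plain byte copy, no per-field swap

print(wire_bytes.hex())

# Converting to a number is only needed when computing on the address.
as_number = int.from_bytes(wire_bytes, "big")
print(as_number == int(addr))
```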

**tuxd3v** · 26 November 2021, 06:19 PM

Originally posted by microcode

That depends on how you stored them to begin with, and whether you are computing on them as numbers (pretty uncommon); my understanding is that when you have a socket open like this, those IP fields are templated in.

Overall, the bigger problem with TCP would not be byte swaps, but the bit manip required to fill those odd-shaped fields.

Templated in maybe, but in Big Endian, at least by the standards. It's a stream of bytes, so the order matters.
All operations count; all of them add work.
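
The "odd-shaped fields" mentioned in the quote can be made concrete with the 16-bit word at offset 12 of the TCP header, which holds a 4-bit data offset, reserved bits, and the flag bits. The following is a rough sketch, not taken from any real stack; the helper names are hypothetical and the reserved bits are simply left at zero.

```python
import struct

# TCP flag bits, which sit in the low byte of the 16-bit word at header offset 12.
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def pack_offset_and_flags(header_words: int, flags: int) -> bytes:
    """Pack the 4-bit data offset and the flag bits into one big-endian 16-bit word."""
    if not 5 <= header_words <= 15:
        raise ValueError("data offset must be between 5 and 15 32-bit words")
    # Data offset goes in the top 4 bits; reserved bits in between stay zero.
    word = (header_words << 12) | (flags & 0x00FF)
    return struct.pack("!H", word)


def unpack_offset_and_flags(raw: bytes):
    """Return (data offset in 32-bit words, flag bits) from the 2-byte field."""
    (word,) = struct.unpack("!H", raw)
    return word >> 12, word & 0x00FF


field = pack_offset_and_flags(5, SYN | ACK)
print(field.hex())  # 5012
```

The shifting and masking here is needed on any host, whatever its byte order.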

**tuxd3v** · 26 November 2021, 08:57 PM

Originally posted by F.Ultra

Going by the tables at https://www.agner.org/optimize/instruction_tables.pdf, it looks like MOVBE is equal to MOV in both ops and reciprocal throughput (at least on Zen), so it does look like modern CPUs actually do this at zero cost (as long as the compiler optimizes to use MOVBE and not BSWAP+MOV).

From what I read there, in MOVBE only one operand can be a memory address and the other needs to be a register, so I think it will take 2 operations, one to swap and one to store in memory. So it will be 2 ops, plus a return from the builtin function. I don't know if MOVBE is standard in all processors.

**F.Ultra** · 27 November 2021, 12:18 AM

Originally posted by tuxd3v

From what I read there, in MOVBE only one operand can be a memory address and the other needs to be a register, so I think it will take 2 operations, one to swap and one to store in memory. So it will be 2 ops, plus a return from the builtin function. I don't know if MOVBE is standard in all processors.

It's part of the AMD64 mnemonics. And you have to read or store the value to memory at some point, so you either perform the swap at the load or at the store. It would be interesting to benchmark on some ARM systems that can run in both big and little endian and see if there is any real-world difference.

**sinepgib** · 27 November 2021, 04:41 AM

Originally posted by tuxd3v

From what I read there, in MOVBE only one operand can be a memory address and the other needs to be a register, so I think it will take 2 operations, one to swap and one to store in memory. So it will be 2 ops, plus a return from the builtin function. I don't know if MOVBE is standard in all processors.

Why would you want to move between registers swapping endianness? You generally want to either have native-endian values in registers or not take it into account (if you're only doing bitwise operations you may not care about byte order, for example). It only makes sense for reading from memory to a register or storing to memory from a register.

**microcode** · 27 November 2021, 01:14 PM

Originally posted by tuxd3v

Templated in maybe, but in Big Endian, at least by the standards. It's a stream of bytes, so the order matters.
All operations count; all of them add work.

No, like literally no operations differ between little endian and big endian in this case; they're both memcpy.
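
One way to picture the "templated in" and "both memcpy" argument is the sketch below. It is an assumption about what such a template could look like, not how the Linux kernel or any particular stack does it: the addresses are made-up values from the documentation ranges, the checksum is left at zero, and `build_template` and `header_for` are hypothetical names. The addresses are written into the template once, already in wire order, and each packet then copies the template and patches only the field that changes.

```python
import ipaddress
import struct

# Hypothetical endpoints from the documentation ranges, already in wire order.
SRC = ipaddress.IPv4Address("192.0.2.1").packed
DST = ipaddress.IPv4Address("198.51.100.7").packed


def build_template() -> bytes:
    """Build a 20-byte IPv4 header once per connection (checksum left at zero)."""
    version_ihl = (4 << 4) | 5          # IPv4, header length of 5 x 32-bit words
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0,                 # version/IHL, DSCP/ECN
        0,                              # total length, patched per packet
        0, 0,                           # identification, flags/fragment offset
        64, 6,                          # TTL, protocol 6 = TCP
        0,                              # header checksum, not computed in this sketch
        SRC, DST,                       # copied as bytes, never converted again
    )


TEMPLATE = build_template()


def header_for(payload_len: int) -> bytearray:
    """Per packet: copy the template and patch only the total-length field."""
    header = bytearray(TEMPLATE)        # the memcpy-like step
    struct.pack_into("!H", header, 2, len(TEMPLATE) + payload_len)
    return header


header = header_for(100)
print(header.hex())
```

In this sketch the per-packet path never touches the address bytes; only the length field is converted.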

**tuxd3v** · 27 November 2021, 01:53 PM

Originally posted by sinepgib

Why would you want to move between registers swapping endianness?

Ideally, with MOVBE we swap bytes either when loading from memory or when storing to memory.
Either you do it first or you do it later.
See F.Ultra's comment above.

**tuxd3v** · 27 November 2021, 01:56 PM

Originally posted by microcode

No, like literally no operations differ between little endian and big endian in this case; they're both memcpy.

The template and the number of places will not change, but the information needs to be swapped to become Big Endian; that was my point.

**microcode** · 27 November 2021, 07:44 PM

Originally posted by tuxd3v

The template and the number of places will not change, but the information needs to be swapped to become Big Endian; that was my point.

No it doesn't. An IP address is never stored in the wrong order to begin with.

Another Sizable Performance Optimization To Benefit Network Code With Linux 5.17
