Another Sizable Performance Optimization To Benefit Network Code With Linux 5.17

  • jacob
    Senior Member
    replied
    Originally posted by microcode

    In this particular case, you could easily just do a halfword swap when storing, as part of your data structure design. The cost of the swap is nothing or approximately nothing, even on m68k.
    Actually no. On the m68k a swap was 4 cycles, which is basically half of what you could save that way on average per node during a search. On an update the swap would have been a net penalty.


  • tuxd3v
    Senior Member
    replied
    Originally posted by microcode
    No it doesn't. An IP address is never stored in the wrong order to begin with.
    See this: https://www.cs.wcupa.edu/lngo/csc231...ing/index.html
    Check item 15, "IP addresses":
    • 32-bit IP addresses are stored in an IP address struct.
    • IP addresses are always stored in memory in network byte order (big-endian byte order).
    • This is true in general for any integer transferred in a packet header from one machine to another.
      • E.g., the port number used to identify an Internet connection.
    The network format is big-endian.
    You can't route packets between routers in little-endian; they don't understand it. The same goes for protocol ports and such: you need to convert.
    Last edited by tuxd3v; 27 November 2021, 09:48 PM.
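As a rough illustration of what "network byte order" means in practice, here is a minimal C sketch using the POSIX calls `inet_pton` and `htons`. The helper names `ipv4_bytes` and `port_bytes` are made up for this example and do not come from the linked course notes. The address bytes come out of `inet_pton` already in wire order, while a port held as a host integer is converted once with `htons`.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: parse dotted-quad text and expose the 4 bytes
   exactly as they sit in memory. Returns 1 on success, 0 on failure. */
int ipv4_bytes(const char *text, uint8_t out[4])
{
    struct in_addr addr;              /* s_addr is kept in network byte order */
    if (inet_pton(AF_INET, text, &addr) != 1)
        return 0;
    memcpy(out, &addr.s_addr, 4);     /* raw memory image, no conversion */
    return 1;
}

/* Hypothetical helper: a port number as it must appear in a header,
   i.e. big-endian regardless of the host CPU. */
void port_bytes(uint16_t host_port, uint8_t out[2])
{
    uint16_t net = htons(host_port);  /* swap on little-endian hosts, no-op on big-endian */
    memcpy(out, &net, 2);
}
```

On any host, "192.0.2.1" ends up as the bytes 192, 0, 2, 1 in that order, which is the sense in which the address is stored in big-endian order.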


  • microcode
    Senior Member
    replied
    Originally posted by tuxd3v
    The layout and the number of fields will not change, but the information needs to be swapped to become big-endian; that was my point.
    No it doesn't. An IP address is never stored in the wrong order to begin with.


  • tuxd3v
    Senior Member
    replied
    Originally posted by microcode
    No, like literally no operations differ between little endian and big endian in this case; they're both memcpy.
    The layout and the number of fields will not change, but the information needs to be swapped to become big-endian; that was my point.


  • tuxd3v
    Senior Member
    replied
    Originally posted by sinepgib
    Why would you want to move between registers swapping endianness?
    Ideally, with MOVBE you swap the bytes either when you load from memory or when you store to memory.
    Either you do it first or you do it later.
    See F.Ultra's comment above.


  • microcode
    Senior Member
    replied
    Originally posted by tuxd3v

    Templated in, maybe, but in big-endian, at least by the standards. It's a stream of bytes; the order matters.
    All operations count; all of them add work.
    No, like literally no operations differ between little endian and big endian in this case; they're both memcpy.
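A minimal sketch of the "both memcpy" point, assuming the connection keeps its addresses in wire order from the moment they are parsed. `struct conn` and `fill_ipv4_addrs` are hypothetical names for illustration, not kernel code; the offsets 12 and 16 are the source and destination address fields of a 20-byte IPv4 header.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical per-connection state: addresses are stored exactly as they
   appear on the wire (network byte order), e.g. as filled in by inet_pton. */
struct conn {
    uint8_t src[4];
    uint8_t dst[4];
};

/* Copy the stored addresses into an IPv4 header buffer.
   The same instructions run on little- and big-endian hosts:
   nothing is interpreted as a number, so nothing needs swapping. */
void fill_ipv4_addrs(uint8_t header[20], const struct conn *c)
{
    memcpy(header + 12, c->src, 4);  /* source address field */
    memcpy(header + 16, c->dst, 4);  /* destination address field */
}
```

A swap would only enter the picture if the code did arithmetic on the address as a host integer, which this sketch never does.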


  • sinepgib
    Senior Member
    replied
    Originally posted by tuxd3v
    From what I read there, only one MOVBE operand can be a memory address; the other has to be a register. So I think it will take 2 operations, a swap and then a store to memory, plus a return from the builtin function. I don't know if MOVBE is standard on all processors.
    Why would you want to move between registers swapping endianness? You generally want to either have native endian values in registers or not take it into account (if you're only doing bitwise operations you may not care about byte order, for example). It only makes sense for reading from memory to a register or storing to memory from a register.


  • F.Ultra
    Senior Member
    replied
    Originally posted by tuxd3v
    From what I read there, only one MOVBE operand can be a memory address; the other has to be a register. So I think it will take 2 operations, a swap and then a store to memory, plus a return from the builtin function. I don't know if MOVBE is standard on all processors.
    It's part of the AMD64 mnemonics. And you have to read or store the value to memory at some point, so either perform the swap at the load or at the store. It would be interesting to benchmark on some ARM systems that can run in both big- and little-endian modes and see if there is any real-world difference.
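The "swap at the load or at the store" idea can be sketched in portable C. `load_be32` and `store_be32` are hypothetical helper names chosen for this example. The assumption, worth verifying in the disassembly for a given compiler and flags, is that GCC and Clang recognise this shift pattern and may emit a single byte-swapping memory access (MOVBE when built with `-mmovbe` for a CPU that has it), or otherwise MOV plus BSWAP.

```c
#include <stdint.h>

/* Read a 32-bit big-endian value from memory into a native integer.
   The byte swap, if the host needs one, happens as part of the load. */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a native 32-bit integer to memory in big-endian order.
   The byte swap, if the host needs one, happens as part of the store. */
void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}
```

Because the code is written in terms of individual bytes, it gives the same result on little- and big-endian hosts; only the generated instructions differ.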


  • tuxd3v
    Senior Member
    replied
    Originally posted by F.Ultra
    Going by the tables at https://www.agner.org/optimize/instruction_tables.pdf it looks like MOVBE is equal in both ops and reciprocal throughput to MOV (at least on Zen), so it does look like modern CPUs actually do this with zero cost (as long as the compiler optimizes to use MOVBE and not BSWAP+MOV).
    From what I read there, only one MOVBE operand can be a memory address; the other has to be a register. So I think it will take 2 operations, a swap and then a store to memory, plus a return from the builtin function. I don't know if MOVBE is standard on all processors.


  • tuxd3v
    Senior Member
    replied
    Originally posted by microcode
    That depends on how you stored them to begin with, and whether you are computing on them as numbers (pretty uncommon); my understanding is that when you have a socket open like this, those IP fields are templated in.

    Overall, the bigger problem with TCP would not be byte swaps, but the bit manip required to fill those odd-shaped fields.
    Templated in, maybe, but in big-endian, at least by the standards. It's a stream of bytes; the order matters.
    All operations count; all of them add work.
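For the "odd-shaped fields" in the quoted remark, a small C sketch of one such field: bytes 12 and 13 of the TCP header, which hold a 4-bit data offset (header length in 32-bit words), reserved bits, and the flag bits. The function and enum names are hypothetical and only a subset of the flags is shown.

```c
#include <assert.h>
#include <stdint.h>

/* A subset of the TCP flag bits as they appear in header byte 13. */
enum {
    TCP_FIN = 0x01,
    TCP_SYN = 0x02,
    TCP_RST = 0x04,
    TCP_PSH = 0x08,
    TCP_ACK = 0x10
};

/* Pack the data offset and flags into header bytes 12..13.
   header_words is the header length in 32-bit words (5..15). */
void pack_tcp_offset_flags(uint8_t out[2], unsigned header_words, uint8_t flags)
{
    assert(header_words >= 5 && header_words <= 15);
    out[0] = (uint8_t)(header_words << 4);  /* high nibble: data offset; reserved bits stay 0 */
    out[1] = flags;                         /* low byte: flag bits */
}

/* Recover the header length in bytes from the packed field. */
unsigned tcp_header_bytes(const uint8_t field[2])
{
    return (unsigned)(field[0] >> 4) * 4;
}
```

The shifts and masks here run on every host regardless of endianness, which is the kind of work the quoted remark contrasts with byte swaps.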

