Announcement

Collapse
No announcement yet.

Samsung Dealing With Wayland "Zombie Apocalypse" Bug

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • #21
    Originally posted by CrystalGamma View Post
    I would have said it is pretty obvious that the client has to expect objects it has sent a delete request for to continue receiving events until the server confirms the deletion with the 'delete_id' event, including all that entails, like FDs in the message.
    That is the problem. Its not written in the protocol. So under Unix/Linux/Posix when you have transferred FD to another process you are in fact free to close it. This bring out one of the what the moments when server sends event client no longer receiving at all and server does not get any response.

    https://dvdhrm.wordpress.com/tag/memfd/

    Sealing on fds exist for the exact reason that you send over a fd and you never ever want to see it again. The protocol does not state that sealing should or should not be done or that fds have to remain standing.

    Basically inside Posix IPC you cannot believe that when you have received a fd that there is anything still there unless the protocol you are using states it has to be.

    The leaks and issues happened one due to client developers not believing they had to clean up their fd when it is a client side responsibility. Other issues happened because client code was cleaning up recycling a fd so now a fd that had not been closed has been used more than once. Read the protocol again nothing says you are not allowed to-do this.

    So you pretty obvious is wrong due to how people broke it. This is like the idea common sense its not that common. If you want a pretty obvious/common sense thing to be they way it is write it in the documentation or out of the million monkeys out there someone will stuff it up.

    Comment


    • #22
      Originally posted by oiaohm View Post
      .....
      So you pretty obvious is wrong due to how people broke it. This is like the idea common sense its not that common. If you want a pretty obvious/common sense thing to be they way it is write it in the documentation or out of the million monkeys out there someone will stuff it up.
      It's obvious to some who know about things like RCU, but not as obvious to others. In any case, there seems to have been a misunderstanding from the statement "we can't change the wire protocol", as some thought that meant that a "proper" solution requires changing the protocol, but that is not the case. The way it was eventually fixed is considered a "proper" fix by the author, in the comment section of the blog he wrote he was puzzled that it could be interpreted otherwise.

      And it sounds to me like changing the protocol would have been the wrong way to fix it, even if it were possible. The author maybe just sorted out that non-option before figuring out what the nature of a proper fix would be.

      Comment


      • #23
        Originally posted by oiaohm View Post
        ....
        The leaks and issues happened one due to client developers not believing they had to clean up their fd when it is a client side responsibility. Other issues happened because client code was cleaning up recycling a fd so now a fd that had not been closed has been used more than once. Read the protocol again nothing says you are not allowed to-do this.
        ....
        I believe this is about events that are still in the buffers after the delete event has been sent to the client application. These must be cleaned up by the library code handling the proxies on the client side. The reason being, if I understand it correctly, that the client application will not see these events anymore, and the library code is the only one who can handle them. (Trying to speak within the terms of that implementation.)

        Comment


        • #24
          Originally posted by indepe View Post

          It's obvious to some who know about things like RCU, but not as obvious to others. In any case, there seems to have been a misunderstanding from the statement "we can't change the wire protocol", as some thought that meant that a "proper" solution requires changing the protocol, but that is not the case. The way it was eventually fixed is considered a "proper" fix by the author, in the comment section of the blog he wrote he was puzzled that it could be interpreted otherwise.

          And it sounds to me like changing the protocol would have been the wrong way to fix it, even if it were possible. The author maybe just sorted out that non-option before figuring out what the nature of a proper fix would be.
          Please note I am not talking about changing the over the wire protocol. The wayland protocol documentation go read it does not say that fd are to be handled in RCU pattern either.

          Originally posted by indepe View Post
          I believe this is about events that are still in the buffers after the delete event has been sent to the client application. These must be cleaned up by the library code handling the proxies on the client side. The reason being, if I understand it correctly, that the client application will not see these events anymore, and the library code is the only one who can handle them. (Trying to speak within the terms of that implementation.)
          If you look at X11 protocols sections cover how you implement xlib and xcb. So another party implementing their own do the require clean ups.

          The current protocol documentation does not say that the client code has to clean this stuff up. Does not say libraries for the client code has to clean this stuff up either. So a conforming implementation can basically repeat this mistake.

          You could say what I am saying is the protocol need some extra clarification in it so those using are aware how fd and other bits are to be handled. Include basic idea who is reasonable and how clean ups should happen.

          Fixing libwayland-server and libwayland-client is one thing. To prevent someone writing their own implementation from repeating the same mistakes in future hopeful would be noting this handling requirement in the protocol documentation. So this is not that the fix is broken. Its that the protocol documentation need to be updated so others don't end up down the same path of hell.

          Comment


          • #25
            Originally posted by oiaohm View Post
            Fixing libwayland-server and libwayland-client is one thing. To prevent someone writing their own implementation from repeating the same mistakes in future hopeful would be noting this handling requirement in the protocol documentation. So this is not that the fix is broken. Its that the protocol documentation need to be updated so others don't end up down the same path of hell.
            On the other hand, this is the free software reference implementation of a free software protocol. If someone want to implement a wayland compositor there is a good chance that he will be using both the protocol definition and the reference implementation as documentation, so one could says that the problem and the fix (if he chose to handle it the same way) is already documented

            Documenting the problem and its solution within the protocol would be a bad idea, as

            1) not everyone is going to handle the problem the same way
            2) some programming languages will not even exhibit the problem at all
            3) it might give the impression that you need to implement your wayland compositor the same way

            The wayland protocol documentation go read it does not say that fd are to be handled in RCU pattern either.
            It doesn't have to: it states what happen to the FDs, not how you want to make sure that your implementation enforce the protocol rules.

            Comment


            • #26
              Originally posted by Emmanuel Deloget View Post
              It doesn't have to: it states what happen to the FDs, not how you want to make sure that your implementation enforce the protocol rules.
              Read the documentation it states what happens in create, in transfer, and skips over what happens in the delete/dispose stage of fd.

              Originally posted by Emmanuel Deloget View Post
              1) not everyone is going to handle the problem the same way
              This is partly related to number 2.

              Originally posted by Emmanuel Deloget View Post
              2) some programming languages will not even exhibit the problem at all
              The problem here is without stating how Fd are handled the language could choose to recycle that brought out one of the bugs.

              So without stating how disposal will be handled you cannot have a standard test for testing that disposal is handled right. Now if a language gets the right reposes implemented a different way that is fine. But if we have a case that we have two majorly different responses causing bugs that will not be fine.

              The reality with these transfer related things what should happen to messages sent to a closed off handle on client. Should the client just junk them or should i process them. Now this is two different responses. Now if people are implementing different we will have these points be different and cause bugs.

              So sections of compositor and client libraries does need to be implemented using the same kind of method or have issues.

              Comment


              • #27
                Originally posted by oiaohm View Post
                Read the documentation it states what happens in create, in transfer, and skips over what happens in the delete/dispose stage of fd.
                Taking a look at the documentation (not sure if we are looking at the same one), it appears that the responsibility to close an fd, for the receiver of a message, is described in the specific description of each protocol. I would think that in the case of the receiver having been "deleted", this responsibility always falls onto the library code: if not possible on the server side, due to implementation/conceptual restrictions, then on the client side. Removing the fd from the socket's ancillary data is probably always done by the client-side library code.

                Perhaps a guideline in that direction would be helpful for the library implementation. On the other hand, the need to keep track of an fd's lifecycle is probably a general one.

                Comment


                • #28
                  Originally posted by indepe View Post
                  Perhaps a guideline in that direction would be helpful for the library implementation. On the other hand, the need to keep track of an fd's lifecycle is probably a general one.
                  Problem here is this is POSIX there is more than 1 fd life-cycle possiblity.

                  1) the client library delete the fd from client as soon as it transferred. This is good in a protocol that uses fd sealing so the server knows it cannot send anything back by that fd.
                  2)the client library has to wait until informed by server that the fd is deleted.
                  2a) client has to process sent messages to that closed fd that the server placed on that fd after the close message was sent.
                  2b) client can ignore all messages sent to that closed fd after the closed message was sent.

                  This all alters how you are allowed to create you fd life-cycle. The protocol needs to give a fd life-cycle. How you implement that cycle is up to the implement but if different libraries create different life-cycle patterns hello problems.

                  Also depending on what the fd is being sent across different life-cycle responses might be perfectly acceptable but they need to be in the protocol so everyone is on the same page.

                  Comment


                  • #29
                    Originally posted by oiaohm View Post

                    Problem here is this is POSIX there is more than 1 fd life-cycle possiblity.

                    1) the client library delete the fd from client as soon as it transferred. This is good in a protocol that uses fd sealing so the server knows it cannot send anything back by that fd.
                    2)the client library has to wait until informed by server that the fd is deleted.
                    2a) client has to process sent messages to that closed fd that the server placed on that fd after the close message was sent.
                    2b) client can ignore all messages sent to that closed fd after the closed message was sent.

                    This all alters how you are allowed to create you fd life-cycle. The protocol needs to give a fd life-cycle. How you implement that cycle is up to the implement but if different libraries create different life-cycle patterns hello problems.

                    Also depending on what the fd is being sent across different life-cycle responses might be perfectly acceptable but they need to be in the protocol so everyone is on the same page.
                    As far as I understand, the bug is about fd's that are sent in the direction to the client. All I read is that specific protocols ask the client application to call close on specific fd's, after using them. Relevant to the bug is that if the client app will not receive those messages anymore, then the library (of course) has to take care of them, including removing them from the socket's ancillary data, and I suppose it should know how they need to be taken care of, since it is sending them from the server.

                    If there are multiple options there, the library needs to choose one of them, internally, for its own implementation, unless they become relevant for the client app. Do you see ambiguities for what the client application needs to do? If so, I suggest giving specific examples for specific protocols, and filings bugs for those. Unless I misunderstand something.

                    Comment

                    Working...
                    X