Wine Developers Appear Quite Apprehensive About Ubuntu's Plans To Drop 32-Bit Support


  • Originally posted by schmidtbag View Post
    But... you can't depend on closed-source programs being maintained like that. So, when you pre-package everything it needs in order to run, you solve the compatibility issues.
    No you don't... In -all- cases glibc sits at the bottom of that dependency graph, and if the glibc you compiled against is not compatible with the glibc on your system then it will not work. That is guaranteed to happen when the glibc you compiled against is significantly newer than the one you have available, and very old binaries can break against a much newer glibc as well. Your -only- option in cases like that is a chroot with a glibc that matches what the program was built against.
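
    For illustration, here is a minimal C sketch (glibc-specific, assumes a GNU toolchain) that prints the glibc version a binary was built against versus the one it is actually running on, which is where the mismatch described above shows up:

    /* Build-time macros come from <features.h> (pulled in by <stdio.h>);
     * the runtime version comes from gnu_get_libc_version(), a glibc-only call. */
    #include <stdio.h>
    #include <gnu/libc-version.h>

    int main(void)
    {
        printf("built against glibc %d.%d\n", __GLIBC__, __GLIBC_MINOR__);
        printf("running on glibc    %s\n", gnu_get_libc_version());
        /* If the binary references versioned symbols (e.g. GLIBC_2.28) that the
         * runtime glibc does not export, the dynamic linker refuses to start it. */
        return 0;
    }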

    Comment


    • Originally posted by schmidtbag View Post
      Who said anything about developers providing glibc and GPU drivers?

      Who said it had to be used for everything? I'm only suggesting to use it in a few niche cases where someone needs to run a closed-source 32-bit application in need of a handful of obscure 32-bit libraries...
      You seem to be taking everything I'm saying and contorting it to the absolute worst-case extreme scenario.

      Then mind explaining how I can so effortlessly install 32-bit libs for things like Wine? Because all they have to do is retain that functionality. They don't need the entire 32-bit repo to do that. It's just a few hundred MB of lib packages and maybe something like mesa or glibc. That's it. Any obscure 32-bit library Ubuntu doesn't come with can be shipped by the application itself, which is usually how it works, and it tends to work out just fine.

      Actually, I do understand what they're doing, but you don't understand what I'm saying. Frankly, I'm tired of trying to explain this. It's not that you and duby229 don't know what you're talking about because I trust that you do, but you're misinterpreting what I'm saying, and it really doesn't matter what I say at this point because Canonical is going to figure this out one way or another.
      Because you -have- to. Glibc is -NOT- a stable user-facing interface. You would have to provide a glibc, but that would only work in a chroot. And if you have to provide a glibc, then you have to provide an entire chroot. It won't work otherwise.

      EDIT: Canonical is -not- going to figure this out. The -literal- only option is going to be a 32bit chroot. There will be no other way.
      Last edited by duby229; 06-22-2019, 09:13 AM.

      Comment


      • Originally posted by duby229 View Post
        EDIT: Canonical is -not- going to figure this out. The -literal- only option is going to be a 32bit chroot. There will be no other way.
        And what's funny is that I have no problem with that. My problem is that their solution is DIY and that it appears to be unsupported.

        Why use a supposed noob distribution that may or may not need another distribution as a chroot* in a possibly unsupported manner over a distribution that actually supports multilib or chroots*?

        *or VMs or containers or whatever...I'm not being picky here.

        Comment


        • Originally posted by xfcemint View Post
          Um... by making the JIT compiler aggregate the cycle counter, you are also making every screen write an 'unpredictable conditional'. This is most obvious in the case of "10 BORDER 1": you don't know which horizontal line of the border is affected, because the JIT is missing accurate timing information.

          Consider this: if JIT trace execution is allowed to continue past "10 BORDER 1", then, by the same logic, the same JIT trace can be continued indefinitely (or for a very long time). In that case, the border never actually gets updated (or it gets updated rarely), so horizontal timing synchronization is lost. Therefore, a JIT with cycle counter aggregation cannot have proper horizontal synchronization.
          loop:
          OUT (254), 1    ; BORDER 1 (shorthand; the real Z80 form is LD A,1 / OUT (254),A)
          100 times NOP   ; delay, no screen access
          OUT (254), 2    ; BORDER 2
          100 times NOP   ; delay, no screen access
          JR loop

          At "OUT (254), 1" the JIT starts a trace and stores the current screen position into a variable P. Obviously, 100 NOPs allow cycle counter aggregation. At "OUT (254), 2" the JIT draws pixels of color "1" in the screen border starting from P to the current screen position. P is updated to the current screen position. Obviously, the 2nd 100 NOPs allow cycle counter aggregation. At "JR loop", which ends the basic block and the trace, the emulator draws border pixels of color "2" starting from P to the current screen position.

          The point is that "OUT (254), 2" does not end the trace.
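
          To make that concrete, here is a rough C sketch of what the compiled trace for the loop above could do; the names (clk, border_pos, flush_border) are made up for illustration, and the T-state counts are the standard Z80 ones (OUT (n),A = 11, NOP = 4, taken JR = 12):

          /* Illustrative emulator state. */
          static unsigned clk;          /* cycle counter */
          static unsigned border_pos;   /* position up to which the border has been drawn */
          static unsigned border_color; /* last color written to port 254 */

          /* Draw border pixels in the current color from border_pos up to the
           * position corresponding to cycle 'now', then switch to the new color. */
          static void flush_border(unsigned now, unsigned new_color)
          {
              /* ... plot pixels for cycles [border_pos, now) in border_color ... */
              border_pos = now;
              border_color = new_color;
          }

          /* One iteration of the compiled trace for:
           *   OUT (254),1 / 100 NOPs / OUT (254),2 / 100 NOPs / JR loop */
          static void trace_loop_body(void)
          {
              clk += 11;              /* OUT (254),1 */
              flush_border(clk, 1);   /* catch the border up, then switch to color 1 */
              clk += 100 * 4;         /* 100 NOPs aggregated in a single step */
              clk += 11;              /* OUT (254),2 */
              flush_border(clk, 2);
              clk += 100 * 4 + 12;    /* second NOP block plus the taken JR */
          }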

          Articles about superoptimization may also be of interest: the point is that we can only be sure further optimization is impossible after enumerating all possible shorter instruction sequences; without enumerating them all, it cannot be concluded that a basic block is unoptimizable.

          Comment


          • Originally posted by atomsymbol View Post
            loop:
            OUT (254), 1
            100 times NOP
            OUT (254), 2
            100 times NOP
            JR loop

            At "OUT (254), 1" the JIT starts a trace and stores the current screen position into a variable P. Obviously, 100 NOPs allow cycle counter aggregation. At "OUT (254), 2" the JIT draws pixels of color "1" in the screen border starting from P to the current screen position. P is updated to the current screen position. Obviously, the 2nd 100 NOPs allow cycle counter aggregation. At "JR loop", which ends the basic block and the trace, the emulator draws border pixels of color "2" starting from P to the current screen position.

            The point is that "OUT (254), 2" does not end the trace.
            Excellent example, except that in a game, you don't have "100 times NOPs". A game would be constantly writing to the screen. But let's concentrate on the border color.

            Let's write some random border colors:

            LD C, 100       ; C = 100
            LD A, 3         ; A = 3
            loop:
            SLA A           ; A *= 2
            INC A           ; A += 1
            AND 7           ; A = A % 8
            OUT (254), A    ; BORDER A
            DEC C           ; C -= 1
            JR NZ, loop     ; if C != 0 goto loop

            Let's call the cycle counter clkcnt, and the current border color bordercl.

            So, clkcnt needs to get updated on the "OUT (254), A" instruction. This is a requirement; otherwise the emulator can't figure out the position of the CRT electron beam on the screen. When clkcnt is calculated, the emulator also calculates the new beam position. Pixels are written to the emulated screen border between the old beam position and the new beam position, in color bordercl. Finally, bordercl is updated to the color from register A.

            Notice that the JIT cannot ever aggregate clkcnt over the "OUT (254), A" instruction. At this instruction, clkcnt needs to be known exactly.
            If that weren't the case, the JIT couldn't figure out the exact CRT beam position. And this does not apply just to the border color; it also applies to the screen color attribute array and the screen pixel array. Same thing.

            So every write to the screen necessitates calculating clkcnt exactly. There is no way around it. The clkcnt might remain unknown only between screen writes.

            As you have said, a typical game might perform a screen write every 3-5 instructions. Therefore, you can aggregate clkcnt only over those few instructions (if there is no unexpected branching in-between).

            The situation you are hoping for, where a game contains 100 NOPs (or any other non-screen-modifying instructions) in a row, is not realistic.

            Oh, I just figured out that I'm partially mistaken. When the CRT electron beam is not refreshing the screen pixels (at the start and at the end of each screen frame), the emulator would be free to do screen pixel array updates without the need to constantly update the clkcnt variable. At that point the JIT can freely aggregate clkcnt.

            But most of the time the emulator cannot aggregate over screen border color updates, because the screen border is being refreshed during almost the entire frame.

            So, yes, in fact, you can use JIT to significantly speed up cycle counting, at least during the time when screen pixels are not being refreshed.
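
            For reference, a rough C sketch of the bookkeeping described above; the timing constants are the usual 48K Spectrum values (224 T-states per scan line, 312 lines per frame), everything else is illustrative:

            #include <stdint.h>

            #define TSTATES_PER_LINE 224   /* 48K Spectrum: T-states per scan line */
            #define LINES_PER_FRAME  312

            static uint32_t clkcnt;      /* T-states since the start of the frame */
            static uint32_t border_clk;  /* T-state up to which the border is drawn */
            static uint8_t  bordercl;    /* current border color (0..7) */

            /* Handler for "OUT (254), A"; clkcnt must be exact at this point. */
            static void out_fe(uint8_t value)
            {
                /* Catch the border up in the old color, from border_clk to clkcnt. */
                for (uint32_t t = border_clk; t < clkcnt; t++) {
                    uint32_t line = (t / TSTATES_PER_LINE) % LINES_PER_FRAME;
                    uint32_t col  = t % TSTATES_PER_LINE;
                    /* ... plot the border pixel for (line, col) in color bordercl,
                     *     skipping the paper area and the blanking periods ... */
                    (void)line; (void)col;
                }
                border_clk = clkcnt;
                bordercl   = value & 0x07;  /* low three bits select the border color */
            }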

            Comment


            • Originally posted by xfcemint View Post

              Excellent example, except that in a game, you don't have "100 times NOPs". A game would be constantly writing to the screen. But let's concentrate on the border color.

              Let's write some random border colors:

              LD C, 100       ; C = 100
              LD A, 3         ; A = 3
              loop:
              SLA A           ; A *= 2
              INC A           ; A += 1
              AND 7           ; A = A % 8
              OUT (254), A    ; BORDER A
              DEC C           ; C -= 1
              JR NZ, loop     ; if C != 0 goto loop

              Let's call the cycle counter clkcnt, and the current border color bordercl.

              So, clkcnt needs to get updated on the "OUT (254), A" instruction. This is a requirement; otherwise the emulator can't figure out the position of the CRT electron beam on the screen. When clkcnt is calculated, the emulator also calculates the new beam position. Pixels are written to the emulated screen border between the old beam position and the new beam position, in color bordercl. Finally, bordercl is updated to the color from register A.

              Notice that the JIT cannot ever aggregate clkcnt over the "OUT (254), A" instruction. At this instruction, clkcnt needs to be known exactly.
              If that weren't the case, the JIT couldn't figure out the exact CRT beam position. And this does not apply just to the border color; it also applies to the screen color attribute array and the screen pixel array. Same thing.

              So every write to the screen necessitates calculating clkcnt exactly. There is no way around it. The clkcnt might remain unknown only between screen writes.

              As you have said, a typical game might perform a screen write every 3-5 instructions. Therefore, you can aggregate clkcnt only over those few instructions (if there is no unexpected branching in-between).

              The situation you are hoping for, where a game contains 100 NOPs (or any other non-screen-modifying instructions) in a row, is not realistic.

              Oh, I just figured out that I'm partially mistaken. When the CRT electron beam is not refreshing the screen pixels (at the start and at the end of each screen frame), the emulator would be free to do screen pixel array updates without the need to constantly update the clkcnt variable. At that point the JIT can freely aggregate clkcnt.

              But most of the time the emulator cannot aggregate over screen border color updates, because the screen border is being refreshed during almost the entire frame.

              So, yes, in fact, you can use JIT to significantly speed up cycle counting, at least during the time when screen pixels are not being refreshed.
              Cool response. Thanks.

                In your example, "A" would end up stuck at the value 7, and thus "OUT (254), 7" would be executed repeatedly - but this poses no fundamental issue and can be fixed by slightly changing the code.

                The idea of exploiting the fact that the visible screen area is not being updated at the start and end of each frame crossed my mind as well.

              The question is whether "OUT (254), A" would end the trace or end the basic block. In my opinion, the answer is that it doesn't need to end them.

                clkcnt is being used indirectly, through border_pos (border_pos is a new variable in our discussion). The "OUT (254), A" only needs to know border_pos. border_pos is an input value to the basic block starting at the label "loop". At "OUT (254), A" the emulator can use the value "border_pos + constant_offset" because the number of clocks elapsed from "loop" to "OUT (254), A" might be a constant. This means that at "OUT (254), A" you don't need to update border_pos and don't need to update clkcnt. There is no way to avoid computing "border_pos + constant_offset". At "JR NZ, loop" the emulator updates clkcnt (and border_pos as well, if at the label "loop" it cannot be derived from clkcnt).
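
                A compiled version of the loop body might then look roughly like this in C; the T-state offsets are computed at JIT time from the standard Z80 timings, and all names are made up for illustration:

                #include <stdint.h>

                /* T-states from the label "loop" to the end of "OUT (254),A",
                 * and for one whole loop iteration, both known at compile time. */
                #define CYCLES_LOOP_TO_OUT   (8 + 4 + 7 + 11)                /* SLA A, INC A, AND 7, OUT */
                #define CYCLES_PER_ITERATION (CYCLES_LOOP_TO_OUT + 4 + 12)   /* + DEC C, taken JR NZ */

                extern uint32_t clkcnt, border_pos;                  /* both counted in T-states */
                extern void draw_border_up_to(uint32_t pos, uint8_t new_color);

                /* One iteration of the JIT-compiled basic block starting at "loop". */
                static void compiled_loop_body(uint8_t *a, uint8_t *c)
                {
                    *a = (uint8_t)(((*a << 1) + 1) & 7);                     /* SLA A / INC A / AND 7 */
                    draw_border_up_to(border_pos + CYCLES_LOOP_TO_OUT, *a);  /* OUT (254),A */
                    (*c)--;                                                  /* DEC C */
                    /* Only one clkcnt/border_pos update per iteration, at the branch. */
                    clkcnt     += CYCLES_PER_ITERATION;
                    border_pos += CYCLES_PER_ITERATION;
                    /* "JR NZ, loop": if *c != 0 the dispatcher re-enters this block. */
                }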

              Comment


              • Originally posted by atomsymbol View Post
                Cool response. Thanks.
                My pleasure.

                Originally posted by atomsymbol View Post
                border_pos is an input value to the basic block starting at the label "loop". At "OUT (254), A" the emulator can use the value "border_pos + constant_offset" because the number of clocks elapsed from "loop" to "OUT (254), A" might be a constant. This means that at "OUT (254), A" you don't need to update border_pos and don't need to update clkcnt.
                You need to update clkcnt and border_pos once per loop iteration. It doesn't really matter whether that happens in the middle of the loop, at the beginning, or at the end.

                clkcnt and border_pos are mostly tied together. They are almost the same thing written down in two different ways.

                Originally posted by atomsymbol View Post
                In your example, "A" would end up stuck at the value 7, and thus "OUT (254), 7" would be executed repeatedly - but this poses no fundamental issue and can be fixed by slightly changing the code.
                Another great RNG by me, genius. Replace "INC A" with "ADD A,C". Don't laugh.

                Originally posted by atomsymbol View Post
                The question is whether "OUT (254), A" would end the trace or end the basic block. In my opinion, the answer is that it doesn't need to end them.
                When executing that instruction, you need to do something drastic because that instruction is dependent on accurate timing.

                But that is not the real problem.

                The real problem is that JIT compiling is not easy. You need to manage all the traces, compile them, manage exit points, manage self-modifying code, manage timing.

                I think you might be too much of a perfectionist. I can tell you, with regard to the performance of the cycle counter, it doesn't really matter in the grand scheme of old-CPU emulation. It is just a tiny detail, unworthy of serious effort.

                A JIT would certainly speed up the emulation, but the price is the development cost of the JIT. Is a JIT really necessary at this point, considering that we all have multi-GHz CPUs?

                I like the idea you mentioned of JIT-compiling to an intermediate byte-code (RISC-like, very simple). That would simplify things, and it is easier to optimize. Of course, it is not the fastest method, but it is more than good enough for the likes of the Z80 and 80386. Also, you get cross-platform portability for free.
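
                As a rough illustration of that idea (entirely hypothetical, not taken from any existing emulator), the intermediate byte-code could be as simple as a fixed-size three-operand format with a switch-based dispatcher in C:

                #include <stdint.h>

                /* A deliberately tiny RISC-like IR: fixed-size instructions, few ops. */
                enum ir_op { IR_LOADI, IR_ADD, IR_AND, IR_OUT, IR_ADDCLK, IR_JNZ, IR_END };

                struct ir_insn {
                    uint8_t op;    /* enum ir_op */
                    uint8_t dst;   /* destination virtual register */
                    uint8_t src;   /* source virtual register */
                    int32_t imm;   /* immediate / port / branch target / cycle count */
                };

                extern void io_write(uint16_t port, uint8_t value);  /* port-write hook */

                /* Interpret one translated block; regs[] are virtual registers and
                 * clk is the aggregated cycle counter. */
                static void run_block(const struct ir_insn *code, uint8_t *regs, uint32_t *clk)
                {
                    for (int pc = 0; ; pc++) {
                        const struct ir_insn *i = &code[pc];
                        switch (i->op) {
                        case IR_LOADI:  regs[i->dst] = (uint8_t)i->imm;           break;
                        case IR_ADD:    regs[i->dst] += regs[i->src];             break;
                        case IR_AND:    regs[i->dst] &= (uint8_t)i->imm;          break;
                        case IR_OUT:    io_write((uint16_t)i->imm, regs[i->src]); break;
                        case IR_ADDCLK: *clk += (uint32_t)i->imm;                 break;
                        case IR_JNZ:    if (regs[i->src] != 0) pc = i->imm - 1;   break;
                        case IR_END:    return;
                        }
                    }
                }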

                Comment


                • I really don't understand the problem. We have years to implement support in a more sustainable way. Why the extreme panic? Linux users are so incredibly emotional.

                  Explain to me why your 1990s game MUST at all times have the very newest version of a given library?

                  Comment


                  • I just remembered something else important: https://en.wikipedia.org/wiki/Year_2038_problem

                    The 2038 clock problem affects the Linux kernel's 32-bit syscalls. To use time namespaces to paper over it, applications need to be running inside cgroups/namespaces. So, in some ways, getting rid of 32-bit support from the default install and forcing those programs into Flatpak, Docker, or Snap is the correct solution.
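
                    For reference, the overflow instant itself is easy to compute; a small C example prints the last second a signed 32-bit time_t can represent:

                    #include <stdio.h>
                    #include <stdint.h>
                    #include <time.h>

                    int main(void)
                    {
                        /* Largest value a signed 32-bit time_t can hold. */
                        time_t last = (time_t)INT32_MAX;   /* 2147483647 seconds after the epoch */
                        struct tm *utc = gmtime(&last);
                        char buf[64];
                        strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S UTC", utc);
                        printf("32-bit time_t overflows after %s\n", buf);  /* 2038-01-19 03:14:07 UTC */
                        return 0;
                    }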

                    Comment


                    • Originally posted by the_scx View Post
                      You.
                      No, I really didn't.

                      Originally posted by the_scx View Post
                      Again, you said this. You literally said that it is their job to do this.
                      No... I literally didn't... I never used those words.
                      I said it's their job to pre-package libraries that will get their application running. I didn't say they needed to provide GPU drivers, glibc, or whatever else.
                      I have encountered many closed-source programs that required an old version of a library that didn't come with my distro. Those programs often ship their own, and it works fine.

                      Originally posted by the_scx View Post
                      There is no easy solution here. At least for 3rd party developers/packagers. For Canonical it is trivial.
                      Yes, there is. You just seem to keep thinking it involves something far more drastic than is necessary.

                      Originally posted by the_scx View Post
                      So? WINE developers are clearly not interested in fixing Ubuntu.
                      I can agree that the minimal multilib/multiarch setup would be fine, but it is Canonical's job to do this. No one will do this for them.
                      I agree the wine devs aren't interested in fixing Ubuntu, nor should they be. I also agree it's Canonical's job. That doesn't mean it's a difficult job.

                      Originally posted by the_scx View Post
                      The best they can do is to reverse this stupid decision. It is really easy, but it must be done by them, not someone else. I believe that it is still possible, especially looking at the comments from Valve and WINE developers.
                      Of course, they can drop the i386 port. But without multiarch support, it will be a disaster.
                      And the 2nd best they can do is provide a partial i386 repo that just contains the necessary packages to get these 32-bit programs working. Why is this so hard to understand?
                      Last edited by schmidtbag; 06-22-2019, 03:20 PM.

                      Comment
