Fedora To Stop Providing i686 Kernels, Might Also Drop 32-Bit Modular/Everything Repos

  • #41
    Originally posted by aht0 View Post
    It's fine for present day and future games but screws you over when you have legacy games as well and these happen to be closed source games.
    Really, this misses the long-term view.
    Legacy DOS games run on modern hardware in DOSBox using CPU emulation without issue.
    We have seen the same on Windows with Win16 applications and winevdm (https://virtuallyfun.com/wordpress/2018/07/15/winevdm/). Again, the Win16 games are now old enough that they run fine using software virtualisation on modern hardware.

    32-bit applications will cross the same line, where they no longer need host operating system support; some already have. Old 32-bit Loki games need a runtime different from the host's, and those run perfectly fine in qemu-usermode because the overhead of software emulation does not make a current-day CPU slower than the CPUs those games were designed for. This line with qemu-usermode has moved forward with MTTCG (multi-threaded TCG) and more parts of QEMU being threaded. Of course, this also frees you from needing an x86 CPU: you could use ARM, RISC-V or something else instead.

    The hard reality here is that legacy software always, sooner or later, ends up in a container of some form or in virtualisation.

    With the way things are going, and with how long companies want parties to support Ubuntu and RHEL, 10 years into the future you most likely will not want x86 32-bit host support, because by that time x86 32-bit applications are going to be inside virtualisation/emulation.

    Really, this reminds me of when 64-bit x86 was not going to have v86 mode and people said this breaks dosemu, so it is going to be too slow. Yes, back then DOSBox was too slow; 5 years later DOSBox was running old DOS games perfectly fine with no hardware support. I have no reason to think 32-bit games/programs are not going to go the same way.

    So far I have not seen any argument from those saying we need 32-bit support that is any different from the arguments of the dosemu group, which was proved wrong in time.

    Comment


    • #42
      Legacy 32-bit games are an order of magnitude more resource-hungry than any DOS game ever written. Also, the crushing majority of them are single-threaded.

      The performance difference between 1990 hardware and 2007-2010 hardware is immensely larger than the performance improvement between 2010 and 2019 hardware. Around 2010 was the time I started seeing the first 64-bit games.

      Following that train of logic: I cannot see later-era 32-bit games being normally playable on x64 hardware under emulation, like, ever. x64 has more or less reached its ceiling, and performance improvements come mostly in the form of more cores, which is pointless when you deal with single-threaded games.

      Comment


      • #43
        Originally posted by aht0 View Post
        Legacy 32-bit games are an order of magnitude more resource-hungry than any DOS game ever written. Also, the crushing majority of them are single-threaded.

        The performance difference between 1990 hardware and 2007-2010 hardware is immensely larger than the performance improvement between 2010 and 2019 hardware. Around 2010 was the time I started seeing the first 64-bit games.

        Following that train of logic: I cannot see later-era 32-bit games being normally playable on x64 hardware under emulation, like, ever. x64 has more or less reached its ceiling, and performance improvements come mostly in the form of more cores, which is pointless when you deal with single-threaded games.
        In fact, you have managed to get your logic completely wrong. The key thing you keep repeating is "single-threaded". That is in fact the cause of the problem.

        QEMU is adding multi-threading to its TCG system so that there is a single thread per vCPU. But this is not where the most performance can be had. There is no strict reason why the translation of 32-bit x86 instructions into 64-bit code has to run on the same core as the one executing the translated instructions.

        So a vCPU in QEMU technically needs to be split in two again. A single-threaded program will then use 2 cores instead of 1.

        Something you miss is that an emulator processing a single-threaded program can in fact scan that program for parts that can be multi-threaded. Prototypes of this have shown up to a 400% speed increase over native, at the cost of using a lot more cores, i.e. a 400% increase in speed using 8 cores to achieve it.

        Single-threaded performance has hit its limit. From this point forward, speed will come from being able to use multiple cores. If those single-threaded games cannot be rewritten or modified to do this, the next best thing will be emulation, with the emulator multi-threading them where possible.

        So your complete argument was kind of upside down.

        Yes, I grant that current-day open source emulation is nowhere near the best. The best have a single-threaded 32-bit program running faster in emulation on a 64-bit system, with more CPU cores to throw at it, than the same single-threaded 32-bit program running straight on the platform.

        If speed is what you are after, 32-bit to 64-bit emulation with multi-thread conversion of single-threaded programs is what you would want in future. Native 32-bit is slow.

        Comment


        • #44
          Originally posted by oiaohm View Post
          In fact, you have managed to get your logic completely wrong. The key thing you keep repeating is "single-threaded". That is in fact the cause of the problem.

          QEMU is adding multi-threading to its TCG system so that there is a single thread per vCPU. But this is not where the most performance can be had. There is no strict reason why the translation of 32-bit x86 instructions into 64-bit code has to run on the same core as the one executing the translated instructions.
          Nobody said anything about conversion. The fact is that the game itself is single-threaded. You cannot get around this without rewriting the game and recompiling it.

          Comment


          • #45
            Originally posted by Weasel View Post
            Nobody said anything about conversion. The fact is that the game itself is single-threaded. You cannot get around this without rewriting the game and recompiling it.
            Total garbage, Weasel. This time keep your mouth shut, because you have just said "cannot" about something that is absolutely doable.


            Back in 2011 came the first work on turning single-threaded programs into multi-threaded ones. Parabilis uses 8 cores to give a 1.51x speed-up over running the same single-threaded program on a single core, but this was only the first. These methods can be implemented in QEMU's TCG.

            If you find the later work, you will see that a 4x speed-up is achievable on an 8-core system without needing to rewrite the program source code. Even better, this later one took ARM bytecode into a JIT. We are talking about items that TCG in QEMU could search the x86 bytecode for and farm out to multiple CPUs from a single-threaded program, in effect giving the single-threaded program more processing time.

            The methods I am talking about have advantages and disadvantages. Under 8 cores you are pretty much screwed, because of how much processing the JIT needs in order to run fast enough, and over 16 cores there is no more performance to gain for a single-threaded program being converted to multi-CPU on the fly.

            Rewriting the program will on average give you more speed-up, and you will not have the 16-core wall. Still, 4 to 8 times more performance compared to running on a single core, without having to rewrite the program, is nothing to sneeze at.

            Sorry, the idea that these single-threaded programs need a rewrite to go faster is wrong. There is an upper limit on how much performance you gain by using automated methods to turn a single-threaded program multi-threaded. The scary part is that it is 4 to 8 times faster before you hit the hard wall.

            Comment


            • #46
              Originally posted by oiaohm View Post
              Total garbage
              Your links? Yeah. That link talks about compiler optimizations, which means source code, buddy. Perhaps you should:
              Originally posted by oiaohm View Post
              keep your mouth shut


              It is also not faster than OoO, and I'm pretty sure they didn't compare it with "native" speed.

              Originally posted by oiaohm View Post
              The methods I am talking about
              Exist in your dreams only.

              Comment


              • #47
                Originally posted by Weasel View Post
                Your links? Yeah. That link talks about compiler optimizations, which means source code, buddy. Perhaps you should:

                It is also not faster than OoO, and I'm pretty sure they didn't compare it with "native" speed.
                PDF | The performance of single-threaded programs and legacy binary code is of critical importance in many everyday applications. However, neither can... | Find, read and cite all the research you need on ResearchGate


                This is an older one from 2009. I cannot find the newer one.

                Yes, in this one they did compare to native speed. Yes, you can get faster.

                Normal out-of-order execution is restricted to the resources of one CPU. If you want to speed up a single-threaded binary, you take the out-of-order idea and process the binary so that it runs across multiple cores. Yes, in 2009 a 1.8x speed-up from processing a binary with 4 threads was demoed.

                Originally posted by Weasel View Post
                Exist in your dreams only.
                You wish. There are 2016 and 2017 ones out there as well, but those are all subscription access. I am giving you references because unless you are subscribed you cannot get the documents I have read, Weasel.

                Do note that even the 2009 one says your speed-up is capped. With the 2009 version, no matter how many resources you throw at it, you are not going to gain more than 3x in performance. The new version gives you 4x, and then you are tapped out.
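
                One rough way to compare the figures quoted in this thread is to divide each claimed speed-up by the number of cores it used. The sketch below does only that arithmetic. The function names and labels are made up for illustration, and the numbers are the ones claimed in the posts above, not independently verified results.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Speed-up per core used; 1.0 would mean every core adds a full core of work."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# (claimed speed-up, cores used), as quoted in the thread. Labels are illustrative.
CLAIMS = {
    "2011 work, 8 cores": (1.51, 8),
    "2009 work, 4 threads": (1.8, 4),
    "later work, 8 cores": (4.0, 8),
}


def efficiency_table(claims=CLAIMS):
    """Map each claim to its per-core efficiency, rounded to two decimals."""
    return {
        name: round(parallel_efficiency(speedup, cores), 2)
        for name, (speedup, cores) in claims.items()
    }


if __name__ == "__main__":
    for name, eff in efficiency_table().items():
        print(f"{name}: {eff:.2f} of a core gained per core used")
```

                By this measure even the most optimistic claim spends about two cores for each core's worth of single-threaded performance.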

                Comment


                • #48
                  Even IF you are right, such emulation will probably produce tons of annoying micro-stutter for the user in-game.

                  Comment


                  • #49
                    Originally posted by aht0 View Post
                    Even IF you are right, such emulation will probably produce tons of annoying micro-stutter for the user in-game.
                    The demo done using single-threaded native-binary Android games did not introduce any micro-stutter for the user in-game.

                    Micro stuttering is a quality defect which manifests as irregular delays between GPU frames.
                    This is micro stuttering. Multi-threading the single-threaded games cured CPU-caused micro-stutter. They used a method similar to how an emulated PlayStation game can run smoother on a PC than on the real hardware: tracking the frame rate and stabilising it.
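
                    Since micro-stutter is defined above as irregular delays between frames, it can be estimated from a list of frame timestamps. The sketch below is one possible way to do that, not the method used in the demo: the function names, the choice of standard deviation as the jitter measure, and the 1.5x spike threshold are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def frame_times(timestamps_ms):
    """Delays between consecutive frames, in milliseconds."""
    return [later - earlier for earlier, later in zip(timestamps_ms, timestamps_ms[1:])]


def stutter_report(timestamps_ms, spike_factor=1.5):
    """Summarise how irregular the frame delays are.

    jitter_ms is the population standard deviation of the delays;
    spikes counts delays longer than spike_factor times the average.
    Both metrics are assumptions chosen for this sketch.
    """
    deltas = frame_times(timestamps_ms)
    if not deltas:
        raise ValueError("need at least two frame timestamps")
    average = mean(deltas)
    spikes = [d for d in deltas if d > spike_factor * average]
    return {"avg_ms": average, "jitter_ms": pstdev(deltas), "spikes": len(spikes)}


if __name__ == "__main__":
    print(stutter_report([0, 16, 32, 48, 64]))   # evenly paced frames
    print(stutter_report([0, 10, 20, 50, 60]))   # one long frame in the middle
```

                    Two runs with the same average frame rate can differ a lot in jitter, which is why an average FPS figure alone does not settle the stutter question.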

                    Remember, these methods give you more speed than running on a single core without having to rebuild the binary from source. They are not CPU-efficient; in fact they are highly wasteful. To double performance you need to double the cores, and the sweet spot is 8 cores for 4x.

                    This is a horrible table.
                    2 cores equal 1 core's worth of performance for a single-threaded application.
                    4 cores equal 2 cores' worth of performance for a single-threaded application.
                    8 cores equal 4 cores' worth of performance for a single-threaded application.
                    9+ cores equal 4 cores' worth of performance for a single-threaded application.

                    Notice how it levels off flat at 9 cores. At that point, putting any more cores into an attempt to speed up a single-threaded application is not going to help.

                    You mentioned micro-stuttering; this explains why the method is not viable on a 4-core system.

                    You only have 2 cores' worth of performance. You have to give up about 0.5 of a core of performance to level out micro-stuttering, and then allow that the CPU cores will not boost to the same clock speed since they are all in use, so you lose another 0.5. Without micro-stutter you end up with 4 cores equalling 1 core's worth of performance, so it is not worth it.

                    At 8 cores things change: you have 4 cores' worth of performance. You lose 0.5 to levelling out micro-stutter and another 0.5 to lower CPU clock speed, and you are still sitting on a 3x speed gain.
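
                    The arithmetic in this post can be written down as a tiny model. This is only a sketch of the numbers claimed here: the function names are hypothetical, the "half a core per core, capped at 4" rule is read off the table, and treating in-between core counts (such as 6) as a straight line is an assumption the post does not make.

```python
def raw_speedup(cores: int, cap: float = 4.0) -> float:
    """Claimed yield: every 2 cores give 1 core's worth, flat once the cap is hit."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return min(cores / 2, cap)


def net_speedup(cores: int, stutter_cost: float = 0.5, clock_cost: float = 0.5) -> float:
    """Raw yield minus the two overheads named in the post.

    stutter_cost: performance given up to level out micro-stutter.
    clock_cost:   performance lost because fully loaded cores boost less.
    """
    return raw_speedup(cores) - stutter_cost - clock_cost


if __name__ == "__main__":
    for cores in (4, 8, 16):
        print(f"{cores} cores: raw {raw_speedup(cores)}x, net {net_speedup(cores)}x")
```

                    With the post's figures this gives a net 1x on 4 cores and a net 3x on 8 cores, and nothing further on 16.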

                    Please note that what I am talking about scales horribly as programs need more threads and you attempt to use this acceleration.
                    1-thread program: 8 cores
                    2-thread program: 16 cores
                    4-thread program: 32 cores
                    8-thread program: 64 cores
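
                    The list above is a straight multiplication by 8. A minimal sketch, assuming that factor holds for any thread count (the post only gives these four data points, and the names below are made up):

```python
CORES_PER_PROGRAM_THREAD = 8  # factor read off the list above, an assumption beyond it


def cores_needed(program_threads: int) -> int:
    """Host cores required to apply the acceleration to every program thread."""
    if program_threads < 1:
        raise ValueError("program_threads must be at least 1")
    return program_threads * CORES_PER_PROGRAM_THREAD


def max_program_threads(host_cores: int) -> int:
    """Largest program thread count a host could fully accelerate (0 if none)."""
    return max(host_cores, 0) // CORES_PER_PROGRAM_THREAD


if __name__ == "__main__":
    for threads in (1, 2, 4, 8):
        print(f"{threads}-thread program: {cores_needed(threads)} cores")
    print("16-core host handles up to", max_program_threads(16), "program threads")
```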

                    We are at the point where single-threaded programs can gain from this; multi-threaded programs, not much. In fact, a program that is designed to multi-thread properly will leave the method I am talking about in the dust most of the time.

                    The method I am talking about has not really been possible while general consumer computers have had fewer than 8 cores. Please note that hyperthreads really don't count for this.

                    Yes, there are some old 32-bit programs that are in fact designed for quad core. To use the stunt I am talking about on those you need a 32-core system; those are not common and don't look likely to become common in the short term.

                    A lot of tech is designed ahead of its time. How to speed up single-threaded programs on multi-core systems is one of those cases; the problem is that multi-core systems had to get enough cores, and become common enough, to make it a productive option.

                    Like it or not, there will come a point where running 32-bit programs emulated will be faster than running 32-bit programs natively on the CPU.

                    This is why I see the argument as no different from DOSBox vs dosemu: as CPUs improved, the overhead of DOSBox became less of a problem. In this case, instead of clock speed, it is core count that has to increase to cross a threshold.

                    Comment


                    • #50
                      Ehm.. Android games? Why test against something with such small performance requirements?

                      Look up the Arma series: the original Operation Flashpoint, ArmA: Armed Assault, or the later ArmA 2. A3 already has 64-bit support and pseudo-multithreading (certain sets of tasks are offloaded across cores). I guarantee ArmA 2 would crush a contemporary CPU when you get creative in its Mission Editor. It happens in ArmA 3 too, despite 64-bit, core offloading etc.: set some hundred AIs to fight each other OR go onto a large multiplayer server and the game's performance drops to complete shit. I've seen 25fps on my rig: Ryzen 5 [email protected] running 16GB 3333MHz DDR4 (overclocked 3200 kit)+512GB NVME SSD. Literally all it needs is CPU cycles for its scripting engine running on one of the cores. The GPU could be a potato; I am playing on a GTX660 and, according to MSI Afterburner, the game is still not using the whole GPU resource pool.

                      I'd be grateful as all %¤(/%&* if there was a way to get around this single-thread limitation business, ditch Windows AND still have a functional anticheat engine running inside the emulator too. Until then, color me sceptical.

                      I know what microstutter is. My point is, yeah, you could get it when running multi-GPU setups. AND you can have it when playing CPU-intensive games where you have 'funky things' going on with your CPU: increased latencies, thermal throttling, some other process trying to grab priority etc. I am not too sure that such a "synthesized single thread" would translate into all that smooth a gaming experience in a CPU-intensive game.
                      Last edited by aht0; 20 July 2019, 05:27 AM.

                      Comment
