The Ideal (Hypothetical) Gaming Processor


  • #21
    Oh geez... I can't find any good screenshots demonstrating an adaptive sampling grid. I may have to resurrect my years-old raytracing code and make some.

    When I said earlier that I wrote a raytracer, I should have said I wrote an *interactive* raytracer, specifically. It wasn't a very *good* interactive raytracer - it wasn't even multi-threaded, for example - but I was proud of it at the time, and I learned a lot in the process of writing it.
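
In the meantime, here's a rough sketch of the adaptive sampling idea in code. To be clear, this is not my old raytracer - trace() is just a stand-in test pattern (a bright disc on a dark background) so the example stands on its own:

Code:
#include <algorithm>
#include <cmath>
#include <cstdio>

struct Color { float r, g, b; };

// Stand-in for a real primary-ray cast: a bright disc on a dark background.
Color trace(float x, float y) {
    bool inside = (x - 0.5f) * (x - 0.5f) + (y - 0.5f) * (y - 0.5f) < 0.1f;
    return inside ? Color{1.0f, 0.9f, 0.6f} : Color{0.05f, 0.05f, 0.1f};
}

float diff(const Color& a, const Color& b) {
    return std::fabs(a.r - b.r) + std::fabs(a.g - b.g) + std::fabs(a.b - b.b);
}

static long g_samples = 0;   // count rays cast, to show the saving

// Shade the cell (x0,y0)-(x1,y1): sample the four corners and only subdivide
// where they disagree, so rays cluster around edges and detailed regions.
Color shadeCell(float x0, float y0, float x1, float y1, float thresh, int depth) {
    g_samples += 4;
    Color c[4] = { trace(x0, y0), trace(x1, y0), trace(x0, y1), trace(x1, y1) };

    float d = 0.0f;
    for (int i = 0; i < 4; ++i)
        for (int j = i + 1; j < 4; ++j)
            d = std::max(d, diff(c[i], c[j]));

    if (depth == 0 || d < thresh) {   // corners agree (or max depth hit): average and stop
        return { (c[0].r + c[1].r + c[2].r + c[3].r) * 0.25f,
                 (c[0].g + c[1].g + c[2].g + c[3].g) * 0.25f,
                 (c[0].b + c[1].b + c[2].b + c[3].b) * 0.25f };
    }

    float xm = 0.5f * (x0 + x1), ym = 0.5f * (y0 + y1);   // corners differ: split into four
    Color q0 = shadeCell(x0, y0, xm, ym, thresh, depth - 1);
    Color q1 = shadeCell(xm, y0, x1, ym, thresh, depth - 1);
    Color q2 = shadeCell(x0, ym, xm, y1, thresh, depth - 1);
    Color q3 = shadeCell(xm, ym, x1, y1, thresh, depth - 1);
    return { (q0.r + q1.r + q2.r + q3.r) * 0.25f,
             (q0.g + q1.g + q2.g + q3.g) * 0.25f,
             (q0.b + q1.b + q2.b + q3.b) * 0.25f };
}

int main() {
    const int grid = 32;   // coarse 32x32 starting grid, up to 4 levels of subdivision
    for (int gy = 0; gy < grid; ++gy)
        for (int gx = 0; gx < grid; ++gx)
            shadeCell(gx / float(grid), gy / float(grid),
                      (gx + 1) / float(grid), (gy + 1) / float(grid), 0.1f, 4);

    std::printf("adaptive samples cast: %ld (a uniform 512x512 grid would need %d)\n",
                g_samples, 512 * 512);
    return 0;
}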

    • #22
      Originally posted by Pyre Vulpimorph View Post
      Hi everyone. I've been wondering what the "best" types of gaming CPUs would be, and I would like to know what it takes to make an idealized gaming-oriented processor.

Suppose I had a fabless semiconductor company, and I was contracted to design the CPU for a new game console. The GPU was already determined to be something relatively powerful, like a die-shrunk Radeon HD 6870. The goal is to make the chip as small and cheap as possible while never bottlenecking the graphics processor.

What sort of difference does a processor's ISA make? Suppose I had licenses for MIPS, ARM, SPARC, and other RISC-based architectures. Is the ISA really that important, or is it the micro-architectural "plumbing" underneath that matters?

      What types of instructions do modern video games make the most use of? Do games find integer performance most important, floating point performance, or both? If floating-point calculations can be offloaded to the GPU, can the CPU's floating-point units be excised to make the chip smaller and cheaper, or would that harm system performance? If FP power is still important, would adding complex 256- or even 512-bit units be beneficial to total system performance, or just a waste of space?

      How important is cache size? Intel's SNB i5, i7, and SNB-E i7 processors have 1.5, 2.0, and 2.5 MiB of L3 cache per core, but looking at benchmarks from Anandtech and other places, there doesn't seem to be too much difference. At least, not enough difference to justify the added space and expense. How much cache per core is "enough"?

      As for the core count itself, would it be best to make a quad-core chip and call it a day? I know most game engines today simply do not scale past four cores, and simultaneous multithreading is pretty much useless for games. But, since consoles are expected to last around 5 years, would making a 6- or 8-core CPU prove beneficial in the future, so long as the chip stayed within the client's budget?

      I know this is just a lot of speculation, but I'm just curious what makes games tick.
To get back to the topic. Assume one would use OpenCL for physics and rendering. I think you could get away with 2-4 simple RISC cores (without FPU). The cores would be there to feed the GPU with data and take care of game logic, interrupts and other boring stuff. Make them as fast as you can afford. Make sure there are no bottlenecks or large latencies between the CPU cores and the GPU. Throw 8GB of shared memory with enough bandwidth into the mix and you should be good to go.

      And make sure to keep the production costs low and yields high. No experiments a la PS3.

      • #23
        Originally posted by Qaridarium
Oh man, ray tracing doesn't work in FRAMES!

Ray tracing works in rays per minute!

A real-time ray tracing engine doesn't have FRAMES per SECOND!

Because of this, all of your writing is wrong!
        (Rays per minute/60) / (Rays needed per scene to be considered complete) = Number of times the scene will be updated per second (FPS).
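
To put some made-up numbers on that (purely for illustration, nothing measured):

Code:
// Made-up figures, purely to illustrate the formula above.
#include <cstdio>

int main() {
    double raysPerMinute = 3.6e9;   // hypothetical engine: 60 million rays per second
    double raysPerScene  = 2.0e6;   // roughly one primary ray per pixel at 1920x1080
    double fps = (raysPerMinute / 60.0) / raysPerScene;
    std::printf("%.1f scene updates per second\n", fps);   // prints 30.0
    return 0;
}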

Everything in the computational world works in discrete steps: you have to update the scene at some point, and you can't do it over an infinitely small slice of time. Even the real world works like this; otherwise you would be leading yourself into a paradox.

You don't have much idea of what you're talking about, right?

        • #24
          Originally posted by Qaridarium
Now you're pointing out the "interactive" part... but you never get it: an iterative ray-tracing engine is not a "real-time ray-tracing engine".

Real time really means you only change the murmuring rate. ALL the time, at ALL costs!
Now you're just playing with words. "Real-time raytracing" does not have any hard and fast definition; it just means raytracing at a high enough frame rate and with enough detail for a human being to watch/interact with the scene whilst it is being rendered. I prefer the term "interactive raytracing" because, in other areas of computing, the term "real-time" means guaranteeing a response within a certain time frame, which is usually NOT what people mean when talking about "real-time raytracing".

          "Real-time" is NOT just another word for "high performance".

          • #25
            uh... what? I never mentioned anything about capitalism or buying products. I'm just pointing out that a lot of people say "real-time raytracing" when all they really mean is high-performance/interactive raytracing. Often the people who say "real-time raytracing" don't even know what real-time computing actually means.

You could write a raytracer which worked hard to try and guarantee a particular frame rate, and stopped casting any more rays once the time budget for the current frame had elapsed, but that still isn't hard real-time computing, because the correctness of the system does not rely on that frame rate being met. If the frame rate is not met, all that happens is that the performance drops for a moment. Such a system is at best soft real-time, using the definitions from the Wikipedia page under "Criteria for real-time computing".
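
A sketch of what such a budgeted renderer might look like - the two helpers are stand-ins, not any real engine's API:

Code:
// Soft real-time sketch: refine until the frame budget runs out, then present.
// castOneRayBatch() and presentImage() are stand-ins, not a real API.
#include <chrono>
#include <cstdio>
#include <thread>

static int g_batches = 0;

void castOneRayBatch() {                         // pretend to refine the image a bit
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
    ++g_batches;
}

void presentImage() {                            // pretend to hand the image to the display
    std::printf("frame presented after %d ray batches\n", g_batches);
    g_batches = 0;
}

void renderFrame() {
    using clock = std::chrono::steady_clock;
    const auto deadline = clock::now() + std::chrono::milliseconds(16);  // ~60 fps budget

    // Missing the deadline only costs quality or frame rate for a moment;
    // nothing becomes "incorrect", which is why this is soft, not hard, real-time.
    while (clock::now() < deadline)
        castOneRayBatch();

    presentImage();
}

int main() {
    for (int i = 0; i < 3; ++i)
        renderFrame();
    return 0;
}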

            "Interactive raytracing" just means raytracing at a frame rate high enough to interact with the scene. It's not very realistic to interact with a scene in a system which takes 10 minutes to render each frame, is it? "Interactive" doesn't imply any hard constraints, the speed needed to consider something "interactive" is subjective.

            • #26
              Originally posted by log0 View Post
To get back to the topic. Assume one would use OpenCL for physics and rendering. I think you could get away with 2-4 simple RISC cores (without FPU). The cores would be there to feed the GPU with data and take care of game logic, interrupts and other boring stuff. Make them as fast as you can afford. Make sure there are no bottlenecks or large latencies between the CPU cores and the GPU. Throw 8GB of shared memory with enough bandwidth into the mix and you should be good to go.

              And make sure to keep the production costs low and yields high. No experiments a la PS3.
If physics takes place entirely on the GPU, then your bi-directional communication between CPU and GPU needs to be rather good. Physics will generally trigger game logic events (depending on the game, of course), so while the GPU can handle the physics calculations faster, it's the need for a feedback system that destroys it for anything more than eye candy on current architectures. I have been curious how well AMD's Fusion systems could be made to work with that, but I don't really have time to delve into it in more than a theoretical capacity. At least, I don't have the time yet.

              • #27
                Originally posted by mirv View Post
If physics takes place entirely on the GPU, then your bi-directional communication between CPU and GPU needs to be rather good. Physics will generally trigger game logic events (depending on the game, of course), so while the GPU can handle the physics calculations faster, it's the need for a feedback system that destroys it for anything more than eye candy on current architectures. I have been curious how well AMD's Fusion systems could be made to work with that, but I don't really have time to delve into it in more than a theoretical capacity. At least, I don't have the time yet.
                If I think of a single simulation step:
                Prediction
                Broadphase
                Contact Generation
                Correction/Solver

Let's say the intermediate results from the last step are available to the CPU to tinker with at the same time. There will be a lag of at least one frame, but for game events it should be negligible.
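
Roughly what I mean, as a sketch - the stage functions are placeholders, not a real engine API:

Code:
// Sketch of the one-frame-lag idea: the solver produces results for frame N
// while game logic reads the results of frame N-1.  Everything here is a
// placeholder, not a real engine API.
#include <cstdio>

struct Contacts { int numEvents = 0; /* contact points, impulses, ... */ };

// Stand-in for the GPU-side step: prediction, broadphase, contact generation, solver.
Contacts runPhysicsStepOnGpu(int frame) { return Contacts{ frame % 3 }; }

// Stand-in for CPU-side game logic consuming last frame's contacts.
void fireGameEvents(const Contacts& c) { std::printf("events: %d\n", c.numEvents); }

int main() {
    Contacts buffers[2] = {};        // double buffer: previous and current results
    int current = 0;

    for (int frame = 0; frame < 5; ++frame) {
        // Kick off this frame's simulation (conceptually asynchronous on the GPU).
        buffers[current] = runPhysicsStepOnGpu(frame);

        // Game logic only ever touches last frame's results, so it never
        // waits on the solver - that's the one frame of lag mentioned above.
        fireGameEvents(buffers[current ^ 1]);

        current ^= 1;                // swap buffers for the next frame
    }
    return 0;
}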

                • #28
My ray tracer algorithm doesn't murmur, can I get pregnant? :P

                  • #29
                    Originally posted by log0 View Post
                    If I think of a single simulation step:
                    Prediction
                    Broadphase
                    Contact Generation
                    Correction/Solver

Let's say the intermediate results from the last step are available to the CPU to tinker with at the same time. There will be a lag of at least one frame, but for game events it should be negligible.
Reading back from the GPU is quite costly. You certainly want to avoid it as much as possible - unless you can share the memory with a zero-copy buffer (in theory). Sure, it's getting easier with current architectures and bus speeds for data readback, but I'm pretty sure it's still costly enough that you don't want to do it. This is why most games will only use particle effects, or physics-related calculations that are classified as "eye candy" and don't directly affect gameplay logic.
Also, graphics cards still need to do graphics.
I guess it depends on the game, how many physics calculations need to affect game logic (those are generally very simplistic compared to, say, cloth simulation) and where your bottleneck will be (calculations vs data transfer). It would be interesting to see just what kind of balance point can be found... maybe something like ants (for path-update AI code) combined with "dodge the particles". Sucks having a day job and not being able to explore such ideas properly.
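
If you do end up reading back, about the least painful pattern I know is a non-blocking read of last frame's results that you only wait on after the CPU has done its other work for the frame. A sketch in OpenCL terms, assuming a queue and buffer created elsewhere (error checking omitted):

Code:
// Sketch only: assumes the cl_command_queue and cl_mem were created elsewhere.
// The point is the CL_FALSE (non-blocking) read: enqueue the copy, overlap it
// with CPU work, and only synchronise when the data is actually needed.
#include <CL/cl.h>

void readBackPhysicsResults(cl_command_queue queue, cl_mem resultsBuf,
                            void* hostDst, size_t bytes)
{
    cl_event readDone;

    // Non-blocking read: returns immediately, the copy happens in the background.
    clEnqueueReadBuffer(queue, resultsBuf, CL_FALSE, 0, bytes, hostDst,
                        0, NULL, &readDone);

    // ... CPU does game logic, AI, audio for this frame here ...

    // Only block once the results are actually needed (ideally next frame).
    clWaitForEvents(1, &readDone);
    clReleaseEvent(readDone);
}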

                    • #30
                      Originally posted by mirv View Post
Reading back from the GPU is quite costly. You certainly want to avoid it as much as possible - unless you can share the memory with a zero-copy buffer (in theory). Sure, it's getting easier with current architectures and bus speeds for data readback, but I'm pretty sure it's still costly enough that you don't want to do it. This is why most games will only use particle effects, or physics-related calculations that are classified as "eye candy" and don't directly affect gameplay logic.
Also, graphics cards still need to do graphics.
I guess it depends on the game, how many physics calculations need to affect game logic (those are generally very simplistic compared to, say, cloth simulation) and where your bottleneck will be (calculations vs data transfer). It would be interesting to see just what kind of balance point can be found... maybe something like ants (for path-update AI code) combined with "dodge the particles". Sucks having a day job and not being able to explore such ideas properly.
                      I am assuming shared/unified memory in my proposal.
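
In OpenCL terms, something along these lines is what I'm picturing - a sketch only; whether the map is genuinely zero-copy depends on the driver and hardware (error checking omitted):

Code:
// Sketch of a zero-copy buffer on shared/unified memory: allocate with
// CL_MEM_ALLOC_HOST_PTR and map it, instead of copying back and forth.
// Assumes a context and queue created elsewhere; error checking omitted.
#include <CL/cl.h>

void* mapSharedResults(cl_context ctx, cl_command_queue queue,
                       size_t bytes, cl_mem* outBuf)
{
    cl_int err;

    // Driver-allocated, host-visible memory: on unified-memory hardware the
    // CPU and GPU can both touch this without an explicit transfer.
    *outBuf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR,
                             bytes, NULL, &err);

    // Map for CPU access; on a shared-memory part this should just hand
    // over a pointer rather than copy.
    void* ptr = clEnqueueMapBuffer(queue, *outBuf, CL_TRUE, CL_MAP_READ,
                                   0, bytes, 0, NULL, NULL, &err);

    return ptr;  // unmap with clEnqueueUnmapMemObject() before the GPU reuses it
}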
