Intel Publishes "X86-S" Specification For 64-bit Only Architecture


  • Originally posted by piotrj3 View Post
    Regarding this... I think it is not a bad idea if there were some sort of on-the-fly emulation. macOS uses Rosetta to run x86 on ARM; running x86_32 on X86-S with something similar should be painless.

    What I am skeptical about is how big an improvement this can be. Let's assume that CPU core logic can be shrunk by 20% (a very, very generous claim) by removing x86 baggage. The problem is that core logic is probably less than 30% of the entire die (look at 13900K die shots), so that's at most about 6% of the whole die. Cache is huge, the GPU is huge, the media engine is huge, the memory controller is huge, the fabric connecting everything is huge, I/O is huge, and none of those care about changes to the CPU architecture.

    Maybe there could be some energy-efficiency and performance gains from dropping some logic and simplifying things around it, but I don't expect, for example, that a 10-core CPU will become a 12-core CPU because of this change; that probably isn't going to happen.
    I'm guessing it's less about physical space on the core and more about having fewer things you need to design around, support, and test. We see this in software, like when the Mesa driver for 'i915 and Xe' branches into 'i915' and 'Xe', and the Xe driver devs are then free to focus on Gen12+ only without worrying about breaking the older stuff. A new subarch of x86 that's 'X86_64 only' would effectively be the same thing.
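
    A quick back-of-envelope with piotrj3's numbers above (the 20% core shrink and the ~30% core share of the die are assumptions from the quote, not measurements) makes the point concrete:

    #include <stdio.h>

    int main(void) {
        /* Assumptions quoted from the post above, not measured values. */
        double core_shrink    = 0.20; /* assumed shrink of the core logic itself */
        double core_die_share = 0.30; /* assumed share of the die that is core logic */

        /* 20% of less than 30% works out to under 6% of the total die area. */
        printf("Whole-die area saving: %.1f%%\n", core_shrink * core_die_share * 100.0);
        return 0;
    }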

    Comment


    • Originally posted by schmidtbag View Post
      I overall think this is fine if it helps reduce transistors. Emulation shouldn't be a big deal since you're stepping down. Besides, there aren't really many 32-bit binaries demanding enough for emulation to become a bottleneck. ARM can emulate x86 with pretty decent performance despite being a very different architecture that lacks many of its instructions.
      Kinda gets me thinking, though: since hybrid CPUs exist, what if E-cores were designed with 32-bit compatibility in mind? Then you'd get a hybrid architecture.
      No, we don't have existing mainstream OSes that support hybrid-ISA architectures. big.LITTLE was and is only about mixing efficiency and performance cores.

      Originally posted by schmidtbag View Post
      Eh, not quite. There are 2 reasons x86 survived as long as it did:
      1. Because there are shockingly still a lot of 32-bit W7 users out there (MS really should've enforced W7 as 64-bit only)
      2. Because Intel themselves continued to make 32-bit CPUs into the 2010s.
      Dropping the 32-bit edition at Windows 10 may be fine, but dropping 32-bit at Windows 7 would have been too aggressive. Windows 7 launched in 2009; the first x86-64/AMD64 CPU launched in 2003. Even if Intel had built every CPU since 2004 with 64-bit support, you would still be asking Microsoft to drop hardware support for perfectly working six-year-old computers.

      In a parallel universe without those 32-bit Atom "netbooks", Windows 10 could have been 64-bit only.

      Comment


      • Originally posted by PluMGMK View Post

        Come to think of it, I'm more worried about the killing off of the 67h address-size override prefix. I can imagine that creating a lot more headaches than the limitation of 32-bit segmentation…
        I can't imagine that. 67h has some use in 16-bit protected mode, where you could have had selectors with large (32-bit) limits, and 32-bit addressing into them had its uses. In 32-bit mode, though, 67h is basically useless: you already have 32-bit addressing, so what use is 16-bit addressing with 64K wraparound? I messed around with 32-bit DOS extenders a while ago - software that interoperates tightly with 16-bit protected mode and V86 mode - and I still can't remember ever seeing this prefix used in 32-bit code, even where you'd expect it most. Moreover, without based selectors, you'd be limited to 16-bit addressing into the FIRST 64K of the virtual address space, which is even more useless (granted, there are still FS and GS, and one could devise some evil plan of using 16-bit addressing through those).
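
        A minimal illustration of what the prefix actually changes in 32-bit mode (the byte sequences are standard encodings; they are only shown here as data, not executed):

        #include <stdio.h>

        int main(void) {
            /* In 32-bit protected mode, opcode 8B /r is MOV r32, r/m32.
             * The 67h prefix flips the address size for that one instruction. */
            unsigned char without_prefix[] = { 0x8B, 0x07 };       /* mov eax, [edi]  (32-bit addressing) */
            unsigned char with_prefix[]    = { 0x67, 0x8B, 0x07 }; /* mov eax, [bx]   (16-bit addressing, 64K wraparound) */

            /* Same ModRM byte (0x07), but with the prefix it decodes as [BX]
             * limited to the first 64K instead of the full [EDI] range. */
            printf("%zu vs %zu bytes\n", sizeof without_prefix, sizeof with_prefix);
            return 0;
        }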

        Comment


        • Originally posted by skeevy420 View Post

          Well, I very seriously doubt any non-free commercial OS older than Win11 will be updated to work with X86-S... Linux is the same way. Unless Intel partners with RHEL, Ubuntu, or SUSE ahead of time to backport a bunch of stuff to one of their LTS kernels, you'll have to be on Fedora or Arch running a mainline kernel to even use this when it's released (and for the foreseeable future).
          I think systems that have a native 64-bit EFI, bootloader, and kernel today would work on this. I'm not sure whether x86_64 Windows 10 or Linux 5.x on 64-bit UEFI firmware needs any explicitly 32-bit instructions, but it would likely take only a very minor 'cleanup' to get there. I'm guessing it would basically be a minor addition to the WOW64 subsystem on Windows; I'm not sure what the equivalent would be on Linux. Remember, this isn't a new arch, it's a subarch that just doesn't have the pre-x86_64 stuff in it. It's not the old days when systems booted a 16-bit BIOS and bootstrapped to 32-bit; my firmware, bootloader, kernel, and software are all x86_64 already.
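
          For what it's worth, on Linux you can already check whether the machine boots through 64-bit UEFI by reading a sysfs node; a small sketch (assumes an EFI-booted system, otherwise the node is absent):

          #include <stdio.h>

          int main(void) {
              /* On an EFI-booted Linux system this file contains "64" or "32",
               * i.e. the word size of the UEFI firmware the kernel was handed off from. */
              FILE *f = fopen("/sys/firmware/efi/fw_platform_size", "r");
              if (!f) {
                  puts("Not booted via UEFI (or sysfs node unavailable).");
                  return 1;
              }
              int bits = 0;
              if (fscanf(f, "%d", &bits) == 1)
                  printf("UEFI firmware word size: %d-bit\n", bits);
              fclose(f);
              return 0;
          }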

          Comment


          • Originally posted by billyswong View Post

            No, we don't have existing mainstream OSes that support hybrid-ISA architectures. big.LITTLE was and is only about mixing efficiency and performance cores.
            Did you forget about the Cell processor? That is a hybrid ISA architecture.

            Comment


            • Originally posted by ryao View Post
              Getting to the point where a 128-bit address space is not enough would imply that we have constructed computer memory more than 5e11 kilograms in mass, and that assumes every bit is made from a single hydrogen atom and does not consider how we would read or write them, or keep them in place.
              This implies a tightly packed virtual space. I can see some uses for a really vast but sparsely populated VM space. Look at IPv6 and how it "recklessly" throws heaps of addresses at anyone just because it can. E.g. (increasingly insane yet possibly achievable):
              - there are data structures that could be built on the assumption that VM space is enormous yet largely unpopulated. E.g. SuperMalloc allocates a 512 MiB vector for its internal use, and then usually touches only a handful of entries in that vector. Because of that, only a handful of physical pages gets allocated - it's essentially a trie implemented by the hardware! (see the sketch after this list)
              - preallocate a few TBs of VM space for _every_ allocation that could possibly grow, and never have to move your data around just because you've run out of address space.
              - drop address-space switching and go for a unified virtual memory view (e.g. allocate the high bits of any address to a process identifier), using HW memory protection for security - I could see new IPC schemes coming out of this
              - heck, let's add that IPv6 on top and have every machine in the world share the same virtual address space! No more downloading stuff; just mmap your URL and get a pointer into another server's area which you can read directly.
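
              A tiny Linux sketch of the "reserve huge, touch little" idea from the first bullet (sizes are illustrative and MAP_NORESERVE behaviour depends on the overcommit policy):

              #include <stdio.h>
              #include <sys/mman.h>

              int main(void) {
                  /* Reserve 512 MiB of virtual address space up front. Nothing is
                   * committed yet; pages are only backed on first touch. */
                  size_t reserve = 512u << 20;
                  unsigned char *slots = mmap(NULL, reserve, PROT_READ | PROT_WRITE,
                                              MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
                  if (slots == MAP_FAILED) { perror("mmap"); return 1; }

                  /* Touch only a handful of widely spaced entries. The page tables act
                   * as the sparse index: only these few 4 KiB pages get physical frames,
                   * the rest of the 512 MiB stays purely virtual. */
                  for (size_t i = 0; i < reserve; i += 64u << 20)
                      slots[i] = 1;

                  printf("Reserved %zu MiB, touched %zu pages\n",
                         reserve >> 20, reserve / (64u << 20));
                  munmap(slots, reserve);
                  return 0;
              }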

              Comment


              • Originally posted by ryao View Post

                Did you forget about the Cell processor? That is a hybrid ISA architecture.
                From a modern point of view, Cell is more like an SoC with an integrated GPU/accelerator. If we call Cell a hybrid, we might as well call every new chip with an extra VPU/NPU etc. a hybrid too.

                Comment


                • Originally posted by arteast View Post

                  This implies a tightly packed virtual space. I can see some uses for a really vast but sparsely populated VM space. Look at IPv6 and how it "recklessly" throws heaps of addresses at anyone just because it can. E.g. (increasingly insane yet possibly achievable):
                  - there are data structures that could be built on the assumption that VM space is enormous yet largely unpopulated. E.g. SuperMalloc allocates a 512 MiB vector for its internal use, and then usually touches only a handful of entries in that vector. Because of that, only a handful of physical pages gets allocated - it's essentially a trie implemented by the hardware!
                  - preallocate a few TBs of VM space for _every_ allocation that could possibly grow, and never have to move your data around just because you've run out of address space.
                  - drop address-space switching and go for a unified virtual memory view (e.g. allocate the high bits of any address to a process identifier), using HW memory protection for security - I could see new IPC schemes coming out of this
                  - heck, let's add that IPv6 on top and have every machine in the world share the same virtual address space! No more downloading stuff; just mmap your URL and get a pointer into another server's area which you can read directly.
                  All of the things you suggested could be achieved in a 128-bit address space. Most of them could be achieved in a 64-bit address space.

                  The industry adopted 64-bit support because growing physical memory capacities forced it to. They are not going to adopt larger address spaces for the niche use cases you describe; that would waste transistors better spent on making better processors.

                  Comment


                  • Originally posted by billyswong View Post

                    From a modern point of view, Cell is more like an SoC with an integrated GPU/accelerator. If we call Cell a hybrid, we might as well call every new chip with an extra VPU/NPU etc. a hybrid too.
                    I agree with that assessment. Therefore, it follows that operating systems that support hybrid architectures have existed for many years.

                    Comment


                    • Originally posted by ryao View Post

                      I had expected them to discontinue that after releasing the Mac Studio. It is nice to hear that it is still for sale. That should push back the end of amd64 support at Apple.



                      Apple implemented an impressive number of x86 extensions:



                      If software can run on Westmere, it likely can run in Rosetta 2. They also implemented CX16, which is a subset of what was to be SSE5 and had been introduced in Ivy Bridge.

                      Software that requires AVX, AVX2, AVX-512, RDRAND, SHA or AMX will crash, although anything depending on AVX-512 or AMX will crash on a number of recent Intel processors too. What Apple implemented is really enough to support most x86 software.

                      Edit: Interestingly, there is a good chance that macOS software that crashes in Rosetta 2 because it requires something unimplemented would also crash on the 2010 Mac Pro, which used Westmere and can run macOS 10.13. Apple dropped support for Westmere in macOS 10.14, one release before Rosetta 2 debuted. Userland software would still have supported 10.13 when 11.0 was released, and doing that meant supporting Westmere. Rosetta 2 just barely supported, in theory, the minimum necessary to run all x86 software written for macOS, excluding software written in a way that would already be broken on Westmere. Had Apple waited any longer, they would have needed to support AVX/AVX2, which would have been more difficult to implement using NEON.
                      If, if, if, if. The reality is that there were some pieces of software that did not run even with Rosetta 2. Rosetta is great, but not perfect. It's a stopgap, nothing more; that was the entirety of my point. Right now there's very little still-maintained software that doesn't work at least under Rosetta 2, and the vast majority is now ARM64 native, except some perennially bad actors like Steam who won't do jack until they're forced to.

                      The same thing occurs between Intel and AMD systems. One implements something one way, the other does it a different way. One ships a super nifty shiny feature that only halfway works; the other waits and implements it correctly later on. Expected output can differ, and hardware bugs happen. Over-dependence on a single vendor's implementation can be problematic and leads to vendor lock-in, but it is also a problem when that same vendor suddenly decides to drop a feature because it wants people to pay more for it... like Intel and AVX-512.
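
                      On the "software that requires AVX ... will crash" point above, well-behaved programs probe for those features at runtime instead of assuming them; a minimal sketch (the sysctl name is Apple's documented flag for detecting Rosetta translation, and the AVX checks use a GCC/Clang builtin):

                      #include <stdio.h>
                      #ifdef __APPLE__
                      #include <sys/sysctl.h>
                      #endif

                      int main(void) {
                      #ifdef __APPLE__
                          /* 1 when this x86_64 binary is being translated by Rosetta 2. */
                          int translated = 0;
                          size_t len = sizeof translated;
                          if (sysctlbyname("sysctl.proc_translated", &translated, &len, NULL, 0) == 0)
                              printf("Running under Rosetta 2: %s\n", translated ? "yes" : "no");
                      #endif
                          /* Runtime dispatch instead of assuming AVX/AVX2; under Rosetta 2 (or on
                           * Westmere) these report 0 and the program can fall back to SSE paths. */
                          printf("AVX:  %s\n", __builtin_cpu_supports("avx")  ? "yes" : "no");
                          printf("AVX2: %s\n", __builtin_cpu_supports("avx2") ? "yes" : "no");
                          return 0;
                      }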

                      Comment
