www.osnews.com
HP, Asus announce first Windows 10 ARM PCs
By Thom Holwerda on 2017-12-05 20:19:56

HP and Asus have announced the first Windows 10 PCs running on ARM - Snapdragon 835 - and they're boasting about instant-on, 22-hour battery life, and gigabit LTE. These machines run full Windows 10 - so not some crippled Windows RT nonsense - and support 32-bit x86 applications. Microsoft hasn't unveiled a whole lot just yet about their x86-on-ARM emulation, but Ars did compile some information:

The emulator runs on a just-in-time basis, converting blocks of x86 code to equivalent blocks of ARM code. This conversion is cached both in memory (so each given part of a program only has to be translated once per run) and on disk (so subsequent uses of the program should be faster, as they can skip the translation). Moreover, system libraries - the various DLLs that applications load to make use of operating system features - are all native ARM code, including the libraries loaded by x86 programs. Calling them "Compiled Hybrid Portable Executables" (or "chippie" for short), these libraries are ARM native code, compiled in such a way as to let them respond to x86 function calls.

While processor-intensive applications are liable to suffer a significant performance hit from this emulation - Photoshop will work in the emulator, but it won't be very fast - applications that spend a substantial amount of time waiting around for the user - such as Word - should perform with adequate performance. As one might expect, this emulation isn't available in the kernel, so x86 device drivers won't work on these systems. It's also exclusively 32-bit; software that's available only in a 64-bit x86 version won't be compatible.
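
The two-level caching described in the quote (in memory for the current run, on disk for later runs) can be sketched roughly as follows. This is only an illustration of the idea, not Microsoft's implementation: the `TranslationCache` class, the `fake_translate` stand-in, and keying blocks by a SHA-256 hash are all assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


class TranslationCache:
    """Toy two-level cache: in-memory dict first, then files on disk."""

    def __init__(self, disk_dir, translate):
        self.disk_dir = Path(disk_dir)
        self.disk_dir.mkdir(parents=True, exist_ok=True)
        self.translate = translate   # hypothetical x86 -> ARM translator
        self.memory = {}             # lives only for this "run"
        self.translations = 0        # how often we really had to translate

    def lookup(self, x86_block: bytes) -> bytes:
        # Assumption: a block is identified by a hash of its bytes.
        key = hashlib.sha256(x86_block).hexdigest()
        if key in self.memory:                # translated earlier in this run
            return self.memory[key]
        path = self.disk_dir / key
        if path.exists():                     # translated in a previous run
            arm_block = path.read_bytes()
        else:                                 # first time ever: translate and persist
            arm_block = self.translate(x86_block)
            self.translations += 1
            path.write_bytes(arm_block)
        self.memory[key] = arm_block
        return arm_block


def fake_translate(x86_block: bytes) -> bytes:
    # Placeholder only: a real translator would emit ARM machine code.
    return b"ARM:" + x86_block


cache_dir = tempfile.mkdtemp()

first_run = TranslationCache(cache_dir, fake_translate)
first_run.lookup(b"\x90\xc3")    # translated and stored
first_run.lookup(b"\x90\xc3")    # served from memory

second_run = TranslationCache(cache_dir, fake_translate)
second_result = second_run.lookup(b"\x90\xc3")   # served from disk
```

In this sketch the first run translates the block once, and the second run never translates it at all, which is the effect the quote attributes to the on-disk cache.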

I'm very curious about the eventual performance figures for this emulation, since the idea of running my garbage Win32 translation management software on a fast, energy-efficient laptop and external monitor seems quite appealing to me.

.
RE[2]: This is awesome
By poliorcetes on 2017-12-06 22:59:31
That is not correct
Permalink - Score: 1
.
RE[2]: Is it crippled?
By tsedlmeyer on 2017-12-06 23:05:01
> And "that reason" is a performance.

That reason is probably patents. Intel has already strongly threatened to use their patent portfolio against anyone trying to do x86 emulation. The 32bit instruction set can be mostly implemented without violating any Intel patents because any relevant ones that existed have expired. There are a few troublesome instructions but those were only implemented on later 32bit processors and were not available on the majority of 32bit x86 processors, so the impact of not implementing them is basically non-existent. The situation for the 64bit instruction set is very different. Intel has patents required to implement quite a few of the key instructions.
Permalink - Score: 3
.
RE[7]: Is it crippled?
By Alfman on 2017-12-06 23:41:40
I found a nice post about memory barriers on x86 that might help explain when the x86 memory model guarantees ordered semantics and when it doesn't. In particular, ordering is preserved with respect to a single memory location or a single CPU, but x86 cores are allowed to violate ordering across different memory addresses unless a barrier is used:

https://bartoszmilewski.com/2008/...


Here's an example:
> [ a ] = 0
[ b ] = 0

CPU0
[ a ] = 1
c = [ b ]

CPU1
[ b ] = 1
d = [ a ]


Since reads and writes to different memory locations can be reordered by x86, the result is that the reads on both cores can technically execute before the writes on both cores, resulting in c==d==0 even though there's no possible way for this code to produce that output when it's run sequentially (the first instruction would ALWAYS be one of the memory addresses being set to 1, so regardless of the timing of subsequent instructions, c or d MUST equal 1).
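
The argument above can be checked with a small brute-force model. The sketch below is a toy, not a simulation of real hardware: it assumes the only reordering allowed is a load moving ahead of an earlier store to a different address on the same CPU, and all names (`CPU0`, `CPU1`, `interleavings`, `run`, `outcomes`) are made up for the illustration. It enumerates every interleaving of the two CPUs and collects the possible (c, d) results.

```python
# Each operation is (kind, address, value_or_register).
CPU0 = [("store", "a", 1), ("load", "b", "c")]
CPU1 = [("store", "b", 1), ("load", "a", "d")]


def interleavings(p0, p1):
    """Yield every merge of p0 and p1 that keeps each list's own order."""
    if not p0:
        yield list(p1)
        return
    if not p1:
        yield list(p0)
        return
    for rest in interleavings(p0[1:], p1):
        yield [p0[0]] + rest
    for rest in interleavings(p0, p1[1:]):
        yield [p1[0]] + rest


def run(schedule):
    """Execute one global schedule against shared memory; return (c, d)."""
    mem = {"a": 0, "b": 0}
    regs = {}
    for kind, addr, x in schedule:
        if kind == "store":
            mem[addr] = x
        else:
            regs[x] = mem[addr]
    return regs["c"], regs["d"]


def outcomes(p0_orders, p1_orders):
    """Collect every (c, d) reachable from the allowed per-CPU orders."""
    results = set()
    for p0 in p0_orders:
        for p1 in p1_orders:
            for schedule in interleavings(p0, p1):
                results.add(run(schedule))
    return results


# Strict program order on both CPUs (sequential execution).
sequential = outcomes([CPU0], [CPU1])

# Toy reordering model: each CPU may also perform its load before its store.
reordered = outcomes([CPU0, CPU0[::-1]], [CPU1, CPU1[::-1]])

print(sorted(sequential))   # no (0, 0) here
print(sorted(reordered))    # (0, 0) shows up
```

With strict program order, (0, 0) never appears; once the loads are allowed to run ahead of the stores, it does.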

Note that with a single CPU, the x86 semantics guarantee that reordering will not change the values the code observes. However, device drivers still need special care, because memory-mapped bus devices could still produce side effects due to CPU reordering even if it doesn't affect the code. Linux has special primitives to generate memory barriers throughout the kernel to force the CPU to access memory in order.

https://www.kernel.org/doc/Docume...


There are a lot of nuances and subtle details that leave a lot of room for error, even for experienced programmers who've written a lot of multithreaded code.

https://stackoverflow.com/questio...

https://stackoverflow.com/questio...

Edited 2017-12-06 23:55 UTC
Permalink - Score: 3
.
Linux for ARM!
By gehersh on 2017-12-07 01:13:44
I would love to get one of these devices (with a Snapdragon 835, or 845 when available) and run Linux for ARM on it. No Intel and no Windows! Like a fairy tale come true. I don't know too much about Linux for ARM, though. I believe both Debian and Ubuntu have such an animal, but I wonder whether a generic 64-bit ARM version of Linux would run on any such chip (i.e., Snapdragon), and about the availability of applications recompiled for ARM (you can probably cross-compile them if push comes to shove?).
Permalink - Score: 1
.
RE[2]: This is awesome
By Morgan on 2017-12-07 01:59:44
I have many, many more 32-bit apps on Windows than I do 64-bit. Firefox, TightVNC, and a couple of games are x64; everything else is legacy because their devs haven't ported to x64 yet. UWP apps are, as the name implies, universal, so they'll run the same no matter what CPU is in there. There are practically zero 64-bit-only Windows apps in the wild.

Remember, this isn't Linux, where your apps have to be the same architecture as your OS; Windows is still the king of backward compatibility (for better or worse). I don't see it being a problem for a long time to come.
Permalink - Score: 4
.
RE[4]: Garbage in, garbage out.
By Brendan on 2017-12-07 04:12:12
Hi,

> I don't use qemu as an example because of it's performance, but rather because of it's cross-architecture support. Do you know of other cross-architecture emulators with 90% speed efficiency? I'd like to read about it if you've got a source.

That's like asking for a list of things that are both wet and dry. To get acceptable performance for a JIT it has to be tied directly to both the host architecture and the guest architecture. It can't be done in a portable way (without sacrificing most of the performance).

The most common high performance JIT is Sun's (Oracle's) Java virtual machine, which achieves around 90% of native.

Apple's Rosetta wasn't quite so well optimised and only achieved 60% to 80% of native speed.

Intel's "IA-32 Execution Layer" (for emulating 32-bit 80x86 on Itanium, after they removed 80x86 instruction set support from the CPU itself) achieved 50% to 70% of native speed; but Itanium was a peculiar beast - translating a "normal" instruction set to VLIW would've been much more challenging.

These are all at least 4 times faster than the fastest portable emulator that I know of (Qemu).

Of course these are all doing pure translation; without the benefit of storing/caching the translated code on disk to avoid "re-translation" the next time the same executable is executed. The latter 2 are also old (from around 2000?) and don't benefit from recent research/improvements.

- Brendan
Permalink - Score: 3
.
RE[5]: Garbage in, garbage out.
By Alfman on 2017-12-07 06:52:59
Brendan,

> That's like asking for a list of things that are both wet and dry. To get acceptable performance for a JIT it has to be tied directly to both the host architecture and the guest architecture. It can't be done in a portable way (without sacrificing most of the performance).

> The most common high performance JIT is Sun's (Oracle's) Java virtual machine, which achieves around 90% of native.


I do commend you for this answer, but such trickery, haha :)

Java programs aren't really emulated in modern JVMs; instead they are run natively, with bits and bobs to perform the JIT compilation on the fly. The .class files are merely a binary representation of the source and not an executable binary in the same sense as x86 or ARM code.

It would be conceptually very similar to taking a .c program, zipping it up into a "binary" .c.gz file, and then passing this binary file to a "C virtual machine" (which achieves 100% of native, btw). But the CVM is NOT emulating C, and neither is the Java virtual machine emulating Java(*)... both are compiling it down to native in order to run on bare metal.

* I am aware the original JVM implementations really did have virtual machine emulation, but this was very slow and not the kind of emulation that achieves "around 90% of native".


> Apple's Rosetta wasn't quite so well optimised and only achieved 60% to 80% of native speed.

This may in fact be the best example, although I see conflicting information about just how fast it was.

https://www.anandtech.com/show/20...

https://www.theguardian.com/techn...

http://www.mactech.com/articles/...

Unfortunately the authors didn't establish a native speed baseline on both sides; comparing times from two arbitrary machines doesn't give a good indication of the performance of the emulator itself.


> These are all at least 4 times faster than the fastest portable emulator that I know of (Qemu).

> Of course these are all doing pure translation; without the benefit of storing/caching the translated code on disk to avoid "re-translation" the next time the same executable is executed. The latter 2 are also old (from around 2000?) and don't benefit from recent research/improvements.



I think most of the interest in pursuing software emulation was lost with hardware virtualization, but if ARM PCs become more popular, it could stimulate R&D for software emulation again.


Just found this:
MS PowerPoint running on Linux on top of an ARM processor via WINE and QEMU.
https://www.youtube.com/watch?v=9...

Pretty cool, albeit slow.
Permalink - Score: 3
.
RE[2]: Garbage in, garbage out.
By The123king on 2017-12-07 09:03:33
It was aimed at Thom and his "idea of running [his] garbage Win32 translation management software on a fast, energy-efficient laptop"
Permalink - Score: 0
.
RE: Garbage in, garbage out.
By Alfman on 2017-12-07 14:26:56
The123king,

> It was aimed at Thom and his "idea of running [his] garbage Win32 translation management software on a fast, energy-efficient laptop"

Well, now I understand, but your original post seems to debunk claims that Thom never actually made about emulation being faster:

> I'm very curious about the eventual performance figures for this emulation, since the idea of running my garbage Win32 translation management software on a fast, energy-efficient laptop and external monitor seem quite appealing to me.

It could be that Thom simply wants a fast, efficient ARM laptop that also happens to run his garbage Win32 translation management software. Like many businesses, his Win32 apps may be a critical part of his workflow, but that doesn't mean he won't also be running fast native apps too.

Anyways until we have more data, we're mostly speculating about how good or bad this will be. I think we're all curious.

Edited 2017-12-07 14:27 UTC
Permalink - Score: 4
.
RE: Linux for ARM!
By The123king on 2017-12-07 15:59:12
Raspberry Pi
Permalink - Score: 0

© OSNews LLC 1997-2007. All Rights Reserved.
The readers' comments are owned and a responsibility of whoever posted them.