Intel released an explainer video about its upcoming XeSS AI upscaling technology and demonstrated how it works on its soon-to-launch Arc Alchemist GPUs. The demos use the Arc A770, the fastest card in the lineup, although the limited performance details shown make it hard to tell how it will compare to the best graphics cards.

If you’re at all familiar with Nvidia’s DLSS, which has been around for four years now in various incarnations, the video should evoke an acute sense of déjà vu. Tom Petersen, who used to work for Nvidia and gave some of the old DLSS presentations, goes over the basics of XeSS. In short, XeSS sounds a lot like a mirror image of Nvidia’s DLSS, except that it’s designed to work with Intel’s XMX deep learning cores rather than Nvidia’s tensor cores. The technology can also run on other GPUs using a DP4a mode, which could make it an interesting alternative to AMD’s FSR 2.0 upscaler.

In the demos shown by Intel, XeSS seemed to work well. Of course, it’s hard to say for sure when the output video is a 1080p compressed version of the actual content, but we’ll save the detailed image quality comparisons for another time. The performance boost looks similar to what we’ve seen with DLSS, with over 100% frame rate increases in some situations when using XeSS Performance mode.

How it works

If you already know how DLSS works, Intel’s solution is largely the same, with some minor changes. XeSS is an AI-accelerated upscaling algorithm designed to increase frame rates in video games.

It starts with training, the first step in most deep learning algorithms. The AI network takes lower resolution sample frames from a game and processes them, generating what should be an upscaled output image. The network then compares the results to the desired target image and backpropagates weight adjustments to try to correct any “errors”. The resulting images won’t look very good at first, but the AI algorithm slowly learns from its mistakes. After thousands (or more) of training images, the network eventually converges on the ideal weights that will “magically” generate the desired results.
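To make that loop concrete, here is a toy sketch in Python. It is not Intel’s network: it assumes a one-dimensional “image”, a model with just four weights, and plain gradient descent, and every name in it (make_pair, upscale, train) is hypothetical. It only shows the cycle described above: upscale a low resolution input, compare against the target, and nudge the weights to shrink the error.

```python
import math
import random


def make_pair(rng, n=16):
    """Return (low_res, high_res) samples of one random smooth 1D signal."""
    freq = rng.uniform(0.5, 2.0)
    phase = rng.uniform(0.0, 2.0 * math.pi)
    high = [
        math.sin(freq * 2.0 * math.pi * i / (2 * n) + phase)
        for i in range(2 * n + 1)
    ]
    low = high[::2]            # every other sample: the low resolution input
    return low, high[:2 * n]   # the full resolution target


def upscale(low, w):
    """2x upscale: each neighboring input pair produces two output samples."""
    out = []
    for a, b in zip(low, low[1:]):
        out.append(w[0] * a + w[1] * b)
        out.append(w[2] * a + w[3] * b)
    return out


def train(steps=2000, lr=0.05, seed=0):
    """Gradient descent on mean squared error; returns (weights, loss history)."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(4)]  # start from random weights
    losses = []
    for _ in range(steps):
        low, target = make_pair(rng)
        pred = upscale(low, w)
        grad = [0.0] * 4
        loss = 0.0
        for i, (a, b) in enumerate(zip(low, low[1:])):
            e0 = pred[2 * i] - target[2 * i]          # error on even output
            e1 = pred[2 * i + 1] - target[2 * i + 1]  # error on odd output
            loss += e0 * e0 + e1 * e1
            grad[0] += 2 * e0 * a
            grad[1] += 2 * e0 * b
            grad[2] += 2 * e1 * a
            grad[3] += 2 * e1 * b
        m = len(target)
        # Propagate the error back into the weights.
        w = [wi - lr * g / m for wi, g in zip(w, grad)]
        losses.append(loss / m)
    return w, losses
```

Calling train() returns the learned weights and the per-step loss, which starts high and falls as the weights settle. A real upscaler differs in every detail (2D frames, motion vectors, a deep network), but the feedback loop has the same shape.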

Once the algorithm is fully trained using samples from many different games, it can in theory take any input image from any video game and scale it up almost perfectly. As with DLSS (and FSR 2.0), the XeSS algorithm also takes on the role of anti-aliasing and replaces classical solutions such as temporal AA.

(Image credit: Intel)

Again, nothing particularly noteworthy so far. DLSS and FSR 2.0 and even the standard temporal AA algorithms have much of the same basic functionality, minus the AI stuff for FSR and TAA. Games will integrate XeSS into their rendering pipeline, usually after the main rendering and initial effects are done, but before the post-processing effects and GUI/HUD elements are drawn. In this way, the user interface remains clear while the difficult task of 3D rendering is performed at a lower resolution.
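The ordering can be sketched as a simple list of stages. This is only an illustration of the sequence described above, not the XeSS SDK: the function name frame_stages and the example resolutions (1080p render, 4K output) are assumptions made for the sake of the example.

```python
def frame_stages(render_res, output_res):
    """Return (stage, resolution) pairs in the order a frame passes through them."""
    return [
        ("main 3D rendering and initial effects", render_res),  # the expensive part
        ("upscaling and anti-aliasing", output_res),            # where XeSS would sit
        ("post-processing effects", output_res),
        ("GUI/HUD", output_res),                                # drawn at full resolution
    ]


# Assumed example: render at 1080p, present at 4K.
for stage, (width, height) in frame_stages((1920, 1080), (3840, 2160)):
    print(f"{stage}: {width}x{height}")
```

Only the first stage runs at the lower resolution, which is where the frame rate gain comes from.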

XeSS works with Intel’s Arc XMX cores, but can also work with other GPUs in a slightly different mode. The DP4a instructions are basically four INT8 (8-bit integer) calculations done using a single 32-bit register, which you normally access through a GPU shader core. Meanwhile, XMX cores natively support INT8 and can handle 128 values at once.
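For a rough picture of what a DP4a operation does, the sketch below emulates it in Python: four signed 8-bit values are packed into one 32-bit word, and the instruction multiplies two such words lane by lane and adds the four products to an accumulator. The helper names are hypothetical, and the emulation ignores hardware details such as accumulator overflow behavior.

```python
import struct


def pack_int8x4(values):
    """Pack four signed 8-bit integers (-128..127) into one 32-bit word."""
    return struct.unpack("<I", struct.pack("<4b", *values))[0]


def unpack_int8x4(word):
    """Split a 32-bit word back into its four signed 8-bit lanes."""
    return struct.unpack("<4b", struct.pack("<I", word))


def dp4a(a_word, b_word, acc=0):
    """Dot product of the four INT8 lanes of a and b, added to an accumulator."""
    a = unpack_int8x4(a_word)
    b = unpack_int8x4(b_word)
    return acc + sum(x * y for x, y in zip(a, b))


a = pack_int8x4([1, 2, 3, 4])
b = pack_int8x4([5, 6, 7, 8])
print(dp4a(a, b, acc=10))  # 10 + (1*5 + 2*6 + 3*7 + 4*8) = 80
```

A neural network layer quantized to INT8 boils down to long chains of these multiply-accumulate steps, which is why the instruction is a workable fallback on GPUs without dedicated matrix hardware.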

This may seem very one-sided, but as an example, the Arc A380 has 1024 shader cores, each of which can perform four INT8 operations simultaneously. Alternatively, the A380 has 128 XMX units, each of which can perform 128 INT8 operations. This makes XMX throughput four times faster than DP4a throughput, but obviously DP4a mode should still be sufficient for some level of XeSS goodness.
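The four-to-one figure follows directly from those unit counts. The snippet below just spells out the arithmetic, assuming both paths run at the same clock so that operations per clock can be compared directly; the constant names are made up for the example.

```python
# Arc A380 figures quoted above.
SHADER_CORES = 1024
INT8_OPS_PER_SHADER = 4   # one DP4a handles four INT8 values
XMX_UNITS = 128
INT8_OPS_PER_XMX = 128

# Assumption: same clock for both paths, so compare operations per clock.
dp4a_ops = SHADER_CORES * INT8_OPS_PER_SHADER  # 4096
xmx_ops = XMX_UNITS * INT8_OPS_PER_XMX         # 16384
ratio = xmx_ops / dp4a_ops                     # 4.0

print(f"DP4a: {dp4a_ops} INT8 ops, XMX: {xmx_ops} INT8 ops, ratio: {ratio:.0f}x")
```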

Note that DP4a mode appears to use a different trained network, one that is presumably less computationally intensive. How this will translate into real-world performance and image quality remains to be seen, and it looks like game developers will have to explicitly include support for both XMX and DP4a modes if they want to support non-Arc GPUs.

Intel XeSS performance expectations
