The NVIDIA Jetson TX2 Performance Has Evolved Nicely Since Launch
Originally posted by ldesnogu:
"That's not what Girolamo_Cavazzoni was talking about, I guess: Denver is using a JIT that improves performance as the benchmark runs by recompiling hot spots on the fly; that's much more dynamic than profile-driven compilation, where, as you say, you run the program twice and you're done. OTOH, I'm not sure the JIT engine of Denver will improve the performance of a program when it's run multiple times."
It would be an interesting test to actually measure the performance of consecutive runs until it plateaus. That would avoid the need for all this speculation; I always prefer to just try it out when possible.
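A minimal sketch of that experiment: time the same binary over consecutive runs and watch whether the numbers drop and then flatten out. The `./my_bench` path below is a placeholder for whatever benchmark binary you want to test; nothing here is specific to Denver.

```python
import subprocess
import time

def time_consecutive_runs(cmd, runs=10):
    """Run `cmd` repeatedly, returning the wall-clock time of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings

# Example (hypothetical benchmark binary). If Denver's dynamic code
# optimizer carries anything over between process runs, later timings
# should come out lower than the first and eventually plateau:
#   for i, t in enumerate(time_consecutive_runs(["./my_bench"]), 1):
#       print(f"run {i}: {t:.3f} s")
```

Note this only detects warm-up *across* runs; any recompilation happening within a single run is already folded into each measurement.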
Originally posted by ldesnogu:
"Another thing to take care of when benchmarking the TX2 is to make sure of where a program is running: the Denver cores or the A57 cores. When the board boots, the Denver cores are disabled and have to be explicitly enabled. The nvpmodel tool can be used to enable either or both clusters."
That said, a second set of benchmarks on the optimized configuration would be a bonus.
Originally posted by coder:
"Michael should consult the Nvidia docs, or at least run the benchmarks twice. In traditional profile-driven compilation, there's usually not much benefit to running them more than twice."
Another thing to take care of when benchmarking the TX2 is to make sure of where a program is running: the Denver cores or the A57 cores. When the board boots, the Denver cores are disabled and have to be explicitly enabled. The nvpmodel tool can be used to enable either or both clusters.
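On a stock L4T install, that cluster switching looks roughly like the sketch below. Treat it as a sketch to verify against the Jetson docs: the MAXN mode number and the cpu1/cpu2 numbering of the Denver cores are assumptions about the TX2's default configuration, not something confirmed in this thread.

```shell
# Query the current power model (the default TX2 mode leaves the
# Denver cluster offline):
sudo nvpmodel -q

# Mode 0 (MAXN) enables both the Denver and the A57 clusters:
sudo nvpmodel -m 0

# Check which CPUs are now online:
cat /sys/devices/system/cpu/online

# Then pin the benchmark so you know which cluster it actually ran on.
# On the TX2 the Denver cores usually enumerate as cpu1 and cpu2:
taskset -c 1,2 ./my_bench      # Denver only
taskset -c 0,3,4,5 ./my_bench  # A57 only
```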
Originally posted by Girolamo_Cavazzoni:
"I have a question regarding the Denver cores: how often are the benchmarks run? As far as I know, a software layer optimizes the code fed to the cores, which are very wide in-order designs. Processing speed should grow with each iteration until it hits a maximum."
In traditional profile-driven compilation, there's usually not much benefit to running them more than twice.
Originally posted by grok:
"These are the Quadros of ARM boards."
Originally posted by milkylainen:
"Does the Jetson TX2 have NVLinks somewhere?"
https://www.anandtech.com/show/11913...t-nextgen-gpus
Originally posted by schmidtbag:
"I'd like to get one of these but they're just so expensive."
For about $120, you get probably comparable CPU performance and a well-supported GPU (with open-source drivers) that's at least half of what the TX2 packs. Power consumption is comparable, but Gemini Lake is available in standard form factors.
- ASRock Super Alloy
- Intel Quad-Core Pentium Processor J5005 (up to 2.8 GHz)
- Supports DDR4 2133/2400 SO-DIMM
- 1 PCIe 2.0 x1, 1 M.2 (Key E)
- Graphics Output Options: D-Sub, HDMI, DVI-D
- 7.1 CH HD Audio (Realtek ALC892 Audio Codec), ELNA Audio Caps
- 4 SATA3
- 4 USB 3.1 Gen1 (2 Front, 2 Rear)
- Supports Full Spike Protection, ASRock Live Update & APP Shop
That particular board is passively-cooled and supports HDMI 2.0.
Best of all, it supports OpenCL (which Tegra SoCs do not)!

Last edited by coder; 30 August 2018, 01:30 AM.
Originally posted by milkylainen:
"Does the Jetson TX2 have NVLinks somewhere?
I'd like a full (and free, preferably) NVLink IP block to integrate into super-fast FPGAs.
That would enable me to move some serious data into the GPU.
PCIe is just not fast enough."
Nvidia Xavier gets around the problem by including tons of hardware on the die.
Future GPUs will use PCIe 4.0 (e.g. the coming AMD ones; no word on the RTX 2080 as far as I know), and future AMD Zen 2 may have PCIe 4.0, but that's speculation.
If you really need to keep things small, Ryzen APUs (ITX or embedded) are the closest thing to the Tegras, I guess, but they're stuck with a PCIe x8 slot (3.0 on current hardware, probably 4.0 on the next generation).
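To put rough numbers on "PCIe is just not fast enough": PCIe 3.0 signals at 8 GT/s per lane and PCIe 4.0 at 16 GT/s, both with 128b/130b line coding, so the raw per-direction bandwidth works out as below (real throughput is a bit lower once protocol overhead is included).

```python
def pcie_bandwidth_gb_s(gt_per_s, lanes):
    """Raw per-direction PCIe bandwidth in GB/s.

    gt_per_s: per-lane signaling rate in GT/s (8 for 3.0, 16 for 4.0).
    128b/130b coding carries 128 payload bits per 130 transferred;
    dividing by 8 converts bits to bytes.
    """
    return gt_per_s * lanes * (128 / 130) / 8

print(f"PCIe 3.0 x8:  {pcie_bandwidth_gb_s(8, 8):.2f} GB/s")   # ~7.88
print(f"PCIe 3.0 x16: {pcie_bandwidth_gb_s(8, 16):.2f} GB/s")  # ~15.75
print(f"PCIe 4.0 x8:  {pcie_bandwidth_gb_s(16, 8):.2f} GB/s")  # ~15.75
print(f"PCIe 4.0 x16: {pcie_bandwidth_gb_s(16, 16):.2f} GB/s") # ~31.51
```

For comparison, a single NVLink 2.0 link is specified at roughly 25 GB/s per direction, which is why PCIe 3.0 looks slow for feeding a GPU from an FPGA.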
Originally posted by Girolamo_Cavazzoni:
"I have a question regarding the Denver cores: how often are the benchmarks run? As far as I know, a software layer optimizes the code fed to the cores, which are very wide in-order designs. Processing speed should grow with each iteration until it hits a maximum."

That is the number once maximum number-crunching has occurred.