• Breaking News

    Tuesday, November 2, 2021

    Hardware support: Valve announces a Steamworks virtual conference for the Steam Deck, topics include the Steam Deck hardware, development without a dev-kit, and an APU deep dive with AMD.

    Posted: 01 Nov 2021 01:02 PM PDT

    [Gamers Nexus] 13 Years of Dell Getting Worse

    Posted: 01 Nov 2021 09:13 PM PDT

    Dell spins off $64 billion VMware as it battles debt hangover

    Posted: 01 Nov 2021 09:53 AM PDT

    HEXUS closes its doors after 23 years of operation

    Posted: 01 Nov 2021 09:31 AM PDT

    One year later - 50 Games Tested: GeForce RTX 3080 vs. Radeon RX 6800 XT

    Posted: 01 Nov 2021 05:56 PM PDT

    Western Digital to Ship 20TB OptiNAND HDDs in November 2021

    Posted: 01 Nov 2021 11:19 PM PDT

    [HUB] AMD Ryzen: Windows 11 vs. Windows 10, Faster Gaming Performance

    Posted: 01 Nov 2021 03:20 AM PDT

    "Qualcomm Announces Goal to Achieve Net-Zero Emissions by 2040"

    Posted: 02 Nov 2021 02:47 AM PDT

    PBKreviews: "Google Pixel 6 Pro Disassembly Teardown Repair Video Review. Can The Parts Be Replaced?? UPDATED*"

    Posted: 01 Nov 2021 07:15 PM PDT

    Google says Pixel 6 Pro ghostly display flickers will be fixed in December

    Posted: 02 Nov 2021 01:49 AM PDT

    6900XT Tensorflow ML/AI benchmarks using ROCm

    Posted: 01 Nov 2021 04:45 AM PDT

    Crossposting from /r/amd

    AMD recently rolled out ROCm support for their Navi 21 GPUs. I had been looking forward to this for a while, so I decided to benchmark the performance of my 6900XT versus two Nvidia cards I have access to: Titan V and V100. The three systems are configured as shown below:

             | CPU                          | GPU     | RAM                                       | OS
    System 1 | Ryzen 5900X                  | 6900XT  | 2x 16GB 3733CL16                          | Debian Bullseye
    System 2 | Threadripper 2990WX          | Titan V | 4x 16GB 3200CL14                          | Ubuntu 18.04
    System 3 | 2x IBM POWER9 20 core (SMT4) | V100    | 20x 16GB ECC RDIMM (don't know MT/s or latency) | Ubuntu 20.04

    I benchmarked the three systems on 4 cases, mostly using basic tutorial type of examples from Tensorflow/Huggingface transformers:

    • MLP Classifier: A 3 layer multilayer perceptron with ReLU activations, 2500 units in the intermediate layers and 10 output classes, trained on synthetic data. Source file
    • ResNet50 CIFAR10: The standard ResNet50 computer vision model on the CIFAR10 dataset. Source file
    • BERT IMDb: Implementation of the BERT natural language processing model from the Huggingface Transformers library trained on the IMDb dataset. Source file
    • BERT Glue/Mrpc: Again BERT, but with the Glue/Mrpc dataset. Source file
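
    The first case's architecture can be sketched as a plain numpy forward pass (the input width of 784 and batch size of 512 are assumptions for illustration; the actual training code is in the linked source file):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic data: 512 samples, 784 features (sizes are assumptions).
    X = rng.standard_normal((512, 784)).astype(np.float32)

    def relu(x):
        return np.maximum(x, 0.0)

    # 3-layer MLP: 784 -> 2500 -> 2500 -> 10, ReLU in the hidden layers.
    W1 = rng.standard_normal((784, 2500)).astype(np.float32) * 0.01
    W2 = rng.standard_normal((2500, 2500)).astype(np.float32) * 0.01
    W3 = rng.standard_normal((2500, 10)).astype(np.float32) * 0.01

    def forward(x):
        h1 = relu(x @ W1)       # first hidden layer, 2500 units
        h2 = relu(h1 @ W2)      # second hidden layer, 2500 units
        return h2 @ W3          # raw logits for 10 output classes

    logits = forward(X)
    print(logits.shape)  # (512, 10)
    ```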

    The performance in terms of time taken to complete an epoch is tabulated below (lower is better).

    Case             | 6900XT / System 1 | Titan V / System 2 | V100 / System 3
    MLP Classifier   | 26s (1254ms/step) | 19s (946ms/step)   | 20s (996ms/step)
    ResNet50 CIFAR10 | 15s (149ms/step)  | 9s (90ms/step)     | 12s (117ms/step)
    BERT IMDb        | 49s (393ms/step)  | 65s (525ms/step)   | -
    BERT Glue/Mrpc   | 51s (442ms/step)  | 73s (633ms/step)   | 92s (803ms/step)

    Note that the V100 system had an old version of TF/transformers set up, which was incompatible with the IMDb dataset file. I only noticed this after running that case on the first two systems, and upgrading the V100 system would have been a pain. So that I still have BERT data on all three systems, I added the Glue/Mrpc case as well.

    To summarize, the 6900XT seems to be a bit of a mixed bag at least with the current software stack. Its stellar performance in gaming thanks to the infinity cache does not translate to high performance in all ML/AI applications, though it seems to work well for the BERT model. I'd be interested to hear explanations of why the Navi 21 architecture performs well for BERT! Overall, considering the Titan V/V100 should be slightly faster than a 2080Ti for ML/AI, I expect the 3090 to handily outperform the 6900XT.

    Finally, to add a few observations:

    • I noticed that if I allow TF to allocate all of the card's VRAM, my desktop starts to occasionally freeze. Instructing TF not to allocate all VRAM up front resolved this.
    • Installation had a couple of quirks. I added AMD's apt repository, installed the packages, and then installed tensorflow-rocm using pip. This led to a few 'lib***.so not found' errors, which I fixed by adding the paths of these libraries to a file in /etc/ld.so.conf.d. But then, CUDA can be problematic in this regard too...
    • Once everything was set up, TF ran fairly well. I did not encounter any missing functionality or ROCm-specific bugs.
    • Monitoring temps and power usage while running training with rocm-smi, I noticed some frightening power spikes in excess of 450W. I suspect these are actually north of 500W, as the reported 'Max Graphics Package Power' in rocm-smi is 255W but the card's actual TDP is 300W.
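
    The VRAM workaround in the first bullet corresponds to TF's memory-growth option; a minimal sketch (the loop over detected GPUs is an assumption about the setup, and this must run before any GPU work starts):

    ```python
    import tensorflow as tf

    # Grow GPU memory on demand instead of reserving all VRAM at startup,
    # leaving headroom for the desktop compositor.
    for gpu in tf.config.list_physical_devices('GPU'):
        tf.config.experimental.set_memory_growth(gpu, True)

    # Alternative: cap the allocation at a fixed size (value in MB is illustrative):
    # tf.config.set_logical_device_configuration(
    #     gpu, [tf.config.LogicalDeviceConfiguration(memory_limit=14336)])
    ```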
    submitted by /u/cherryteastain

    Ex-Windows Div President: Apple’s Long Journey to the M1 Pro Chip

    Posted: 01 Nov 2021 03:48 AM PDT
