• Breaking News

    Tuesday, November 2, 2021

    Hardware support: Valve announces a Steamworks virtual conference for the Steam Deck, topics include the Steam Deck hardware, development without a dev-kit, and an APU deep dive with AMD.

    Posted: 01 Nov 2021 01:02 PM PDT

    [Gamers Nexus] 13 Years of Dell Getting Worse

    Posted: 01 Nov 2021 09:13 PM PDT

    Dell spins off $64 billion VMware as it battles debt hangover

    Posted: 01 Nov 2021 09:53 AM PDT

    HEXUS closes its doors after 23 years of operation

    Posted: 01 Nov 2021 09:31 AM PDT

    One year later - 50 Games Tested: GeForce RTX 3080 vs. Radeon RX 6800 XT

    Posted: 01 Nov 2021 05:56 PM PDT

    Western Digital to Ship 20TB OptiNAND HDDs in November 2021

    Posted: 01 Nov 2021 11:19 PM PDT

    [HUB] AMD Ryzen: Windows 11 vs. Windows 10, Faster Gaming Performance

    Posted: 01 Nov 2021 03:20 AM PDT

    "Qualcomm Announces Goal to Achieve Net-Zero Emissions by 2040"

    Posted: 02 Nov 2021 02:47 AM PDT

    PBKreviews: "Google Pixel 6 Pro Disassembly Teardown Repair Video Review. Can The Parts Be Replaced?? UPDATED*"

    Posted: 01 Nov 2021 07:15 PM PDT

    Google says Pixel 6 Pro ghostly display flickers will be fixed in December

    Posted: 02 Nov 2021 01:49 AM PDT

    6900XT Tensorflow ML/AI benchmarks using ROCm

    Posted: 01 Nov 2021 04:45 AM PDT

    Crossposting from /r/amd

    AMD recently rolled out ROCm support for their Navi 21 GPUs. I had been looking forward to this for a while, so I decided to benchmark the performance of my 6900XT versus two Nvidia cards I have access to: Titan V and V100. The three systems are configured as shown below:

             | CPU                          | GPU     | RAM                                       | OS
    System 1 | Ryzen 5900X                  | 6900XT  | 2x 16GB 3733CL16                          | Debian Bullseye
    System 2 | Threadripper 2990WX          | Titan V | 4x 16GB 3200CL14                          | Ubuntu 18.04
    System 3 | 2x IBM POWER9 20 core (SMT4) | V100    | 20x 16GB ECC RDIMM (don't know MT/s or latency) | Ubuntu 20.04

    I benchmarked the three systems on 4 cases, mostly using basic tutorial type of examples from Tensorflow/Huggingface transformers:

    • MLP Classifier: A 3 layer multilayer perceptron with ReLU activations, 2500 units in the intermediate layers and 10 output classes, trained on synthetic data. Source file
    • ResNet50 CIFAR10: The standard ResNet50 computer vision model on the CIFAR10 dataset. Source file
    • BERT IMDb: Implementation of the BERT natural language processing model from the Huggingface Transformers library trained on the IMDb dataset. Source file
    • BERT Glue/Mrpc: Again BERT, but with the Glue/Mrpc dataset. Source file
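
    The first case's architecture can be sketched as a plain numpy forward pass (the input width of 784 and batch size of 512 are assumptions for illustration; the actual training code is in the linked source file):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic data: 512 samples, 784 features (sizes are assumptions).
    X = rng.standard_normal((512, 784)).astype(np.float32)

    def relu(x):
        return np.maximum(x, 0.0)

    # 3-layer MLP: 784 -> 2500 -> 2500 -> 10, ReLU in the hidden layers.
    W1 = rng.standard_normal((784, 2500)).astype(np.float32) * 0.01
    W2 = rng.standard_normal((2500, 2500)).astype(np.float32) * 0.01
    W3 = rng.standard_normal((2500, 10)).astype(np.float32) * 0.01

    def forward(x):
        h1 = relu(x @ W1)       # first hidden layer, 2500 units
        h2 = relu(h1 @ W2)      # second hidden layer, 2500 units
        return h2 @ W3          # raw logits for 10 output classes

    logits = forward(X)
    print(logits.shape)  # (512, 10)
    ```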

    The performance in terms of time taken to complete an epoch is tabulated below (lower is better).

    Case             | 6900XT / System 1 | Titan V / System 2 | V100 / System 3
    MLP Classifier   | 26s (1254ms/step) | 19s (946ms/step)   | 20s (996ms/step)
    ResNet50 CIFAR10 | 15s (149ms/step)  | 9s (90ms/step)     | 12s (117ms/step)
    BERT IMDb        | 49s (393ms/step)  | 65s (525ms/step)   | -
    BERT Glue/Mrpc   | 51s (442ms/step)  | 73s (633ms/step)   | 92s (803ms/step)

    Note that the V100 system had an old version of TF/transformers set up, which was incompatible with the IMDb dataset file. I only noticed this after running that case on the first two systems, and upgrading the V100 system would have been a pain. So that I still have BERT data on all three systems, I added the Glue/Mrpc case as well.

    To summarize, the 6900XT seems to be a bit of a mixed bag at least with the current software stack. Its stellar performance in gaming thanks to the infinity cache does not translate to high performance in all ML/AI applications, though it seems to work well for the BERT model. I'd be interested to hear explanations of why the Navi 21 architecture performs well for BERT! Overall, considering the Titan V/V100 should be slightly faster than a 2080Ti for ML/AI, I expect the 3090 to handily outperform the 6900XT.

    Finally, to add a few observations:

    • I noticed that if I allow TF to allocate all of the card's VRAM, my desktop starts to occasionally freeze. Instructing TF not to allocate all VRAM up front resolved this.
    • Installation had a couple of quirks. I added AMD's apt repository, installed the packages, and then installed tensorflow-rocm using pip. This led to a few 'lib***.so not found' errors, which I fixed by adding the paths of these libraries to a file in /etc/ld.so.conf.d. But then, CUDA can be problematic in this regard too...
    • Once everything was set up, TF ran fairly well. I did not encounter any missing functionality or ROCm-specific bugs.
    • Monitoring temps and power usage while running training with rocm-smi, I noticed some frightening power spikes in excess of 450W. I suspect these are actually north of 500W, as the reported 'Max Graphics Package Power' in rocm-smi is 255W but the card's actual TDP is 300W.
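
    The VRAM workaround in the first bullet corresponds to TF's memory-growth option; a minimal sketch (the loop over detected GPUs is an assumption about the setup, and this must run before any GPU work starts):

    ```python
    import tensorflow as tf

    # Grow GPU memory on demand instead of reserving all VRAM at startup,
    # leaving headroom for the desktop compositor.
    for gpu in tf.config.list_physical_devices('GPU'):
        tf.config.experimental.set_memory_growth(gpu, True)

    # Alternative: cap the allocation at a fixed size (value in MB is illustrative):
    # tf.config.set_logical_device_configuration(
    #     gpu, [tf.config.LogicalDeviceConfiguration(memory_limit=14336)])
    ```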
    submitted by /u/cherryteastain

    Ex-Windows Div President: Apple’s Long Journey to the M1 Pro Chip

    Posted: 01 Nov 2021 03:48 AM PDT
