
Add ROCm support for GPUs on supported architectures

Description

  • These changes add ROCm support for GPUs.
  • HIP is enabled only if the device's architecture supports it, which preserves backward compatibility (see debian/rules).
  • The required libraries are found at runtime without user configuration, i.e., the user simply links against the rocblas library and the rest is handled automatically (see debian/rules).
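
The architecture gate described above can be sketched in shell as follows. This is only an illustration of the idea, not the contents of debian/rules: the function name hip_flag_for_arch, the flag name -DENABLE_HIP, and the list of architectures are all assumptions made for the example.

```shell
# Hypothetical helper: print a configure flag for a given Debian architecture.
# The architecture list and the -DENABLE_HIP flag name are illustrative
# assumptions; the real logic lives in debian/rules.
hip_flag_for_arch() {
    arch="$1"
    case "$arch" in
        amd64|arm64|ppc64el)
            # Architectures assumed (for this sketch) to have ROCm available.
            echo "-DENABLE_HIP=ON"
            ;;
        *)
            # Everything else builds without HIP, as it did before.
            echo "-DENABLE_HIP=OFF"
            ;;
    esac
}
```

On a Debian system the host architecture could be passed in with hip_flag_for_arch "$(dpkg-architecture -qDEB_HOST_ARCH)". Keeping the decision in one function means unsupported architectures fall through to the non-HIP build without further changes.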

Testing Environment

  • Hardware: AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx
  • Architecture: amd64

Setup

  • The built package was installed in a debian:latest docker container
  • The docker container had the following packages installed: cmake and librocblas-dev from sid
  • However, the host had amdgpu-dkms installed from the ROCm packages, because the Debian package for rocminfo failed with the following error:
ISA Info:                
rocminfo: ./src/core/runtime/amd_gpu_agent.cpp:858: virtual hsa_status_t rocr::AMD::GpuAgent::GetInfo(hsa_agent_info_t, void*) const: Assertion `cache_props_.size() > 0 && "GPU cache info missing."' failed.
Aborted
  • The container was run with elevated privileges to access the host GPU via docker run --rm --device=/dev/kfd --device=/dev/dri --group-add video --privileged -it --name my-debian my-debian
  • The package was built, installed and tested in the docker container itself.
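
Before testing in a setup like the one above, it can help to confirm that the device nodes passed with --device are actually visible inside the container. The following is a minimal sketch; the function name check_gpu_nodes and its optional root-prefix argument are hypothetical and exist only to make the check easy to try out.

```shell
# Hypothetical helper: report whether the GPU device nodes exist.
# $1 is an optional root prefix (empty by default, meaning the real /dev).
check_gpu_nodes() {
    root="${1:-}"
    status=0
    for node in /dev/kfd /dev/dri; do
        if [ -e "${root}${node}" ]; then
            echo "found: ${node}"
        else
            echo "missing: ${node}"
            status=1
        fi
    done
    # Return non-zero if any node was missing.
    return "$status"
}
```

Calling check_gpu_nodes with no argument inside the container checks /dev/kfd and /dev/dri directly. A non-zero exit status suggests the --device options did not take effect.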
Edited by Spaarsh Thakkar
