Add ROCm support for GPUs on supported architectures
Description
- The changes add ROCm support for GPUs.
- HIP is enabled only if the device's architecture supports it, which preserves backward compatibility (see debian/rules).
- The required libraries are found at runtime without user configuration, i.e., the user simply links the rocblas library and the rest is taken care of (see debian/rules).
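The architecture gate described above could be sketched in shell as follows. This is only an illustration, not the actual logic in debian/rules: the function name hip_flag_for_arch and the architecture list in HIP_ARCHES are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical list of Debian architectures on which HIP would be enabled.
# This list is an assumption for illustration; the real one is in debian/rules.
HIP_ARCHES="amd64 arm64 ppc64el"

# Print "ON" if the given architecture is in HIP_ARCHES, otherwise "OFF".
hip_flag_for_arch() {
    for arch in $HIP_ARCHES; do
        if [ "$arch" = "$1" ]; then
            echo "ON"
            return 0
        fi
    done
    echo "OFF"
}

# Usage: query the host architecture with dpkg-architecture when available.
if command -v dpkg-architecture >/dev/null 2>&1; then
    host_arch=$(dpkg-architecture -qDEB_HOST_ARCH 2>/dev/null) || host_arch=unknown
    echo "HIP for $host_arch: $(hip_flag_for_arch "$host_arch")"
fi
```

On an unsupported architecture the function prints OFF, so a build that consumes this value would fall back to the non-HIP configuration.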
Testing Environment
- Hardware: AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx
- Architecture: amd64
Setup
- The built package was installed in a debian:latest docker container
- The docker container had the packages librocblas-dev and cmake from sid
- However, the host had amdgpu-dkms installed from the rocm packages, because the debian package for rocminfo was failing with an error:
ISA Info:
rocminfo: ./src/core/runtime/amd_gpu_agent.cpp:858: virtual hsa_status_t rocr::AMD::GpuAgent::GetInfo(hsa_agent_info_t, void*) const: Assertion `cache_props_.size() > 0 && "GPU cache info missing."' failed.
Aborted
- The docker container was run with elevated privileges to access the host GPU via
docker run --rm --device=/dev/kfd --device=/dev/dri --group-add video --privileged -it --name my-debian my-debian
- The package was built, installed and tested in the docker container itself.
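Since the container depends on the host exposing /dev/kfd and /dev/dri, a small pre-flight check before launching it may save some debugging. The following is a minimal sketch: the function name check_gpu_devices and its optional root-prefix argument are hypothetical and exist only to make the check easy to try out.

```shell
#!/bin/sh
# Check that the GPU device nodes passed to `docker run` exist on the host.
# The optional first argument is a root prefix (hypothetical, for testing);
# it defaults to the empty string, i.e. the real /dev.
check_gpu_devices() {
    root="${1:-}"
    status=0
    # /dev/kfd is the kernel interface used by ROCm compute,
    # /dev/dri holds the DRM render nodes.
    for node in /dev/kfd /dev/dri; do
        if [ -e "${root}${node}" ]; then
            echo "found: ${node}"
        else
            echo "missing: ${node}"
            status=1
        fi
    done
    return "$status"
}
```

It could be chained in front of the command above, for example `check_gpu_devices && docker run ...`, so the container is only started when both nodes are present.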
Edited by Spaarsh Thakkar