Originally posted by antonyshen
Then I made /usr/bin/ollama a shell-script wrapper (the real binary is renamed to /usr/bin/ollama.bin). As you can see, it also sets overrides for some newer GPUs that are supposed to work. They seem to be working for others, so I included them in my script as much for documentation as anything else.
The distributed ollama binary includes a ROCm llama engine; if it finds a functional ROCm setup, it should work with the script. I built/installed ROCm myself since I want to compile from source. Ollama compiles with ROCm support quite easily if ROCm is installed correctly. I also made a patched version that supports UMA, to optimize for integrated graphics.
Your GPU needs to be configured with more than 512M of graphics buffer memory in the BIOS, otherwise ollama will ignore it. My Lenovo automatically sets 2G on a 16G machine (it can't be changed). My Asus motherboard defaults to 512M; I bumped that up to 4G. With the 4G video config, ollama will fully offload llama3 (it fits). On my notebook with 2G it only does a partial offload.
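If you want to check what the BIOS actually assigned before fiddling with ollama, amdgpu usually exposes the VRAM size in sysfs. A minimal sketch, assuming the usual amdgpu path (the card index may differ on your system):

```shell
# Check whether the BIOS-assigned graphics buffer exceeds the 512M
# threshold below which ollama ignores the GPU.
check_vram() {
    # Path is overridable so the logic can be tried without real hardware.
    local vram_file=${1:-/sys/class/drm/card0/device/mem_info_vram_total}
    local threshold=$((512 * 1024 * 1024))
    if [[ -r "$vram_file" ]]; then
        local bytes
        bytes=$(<"$vram_file")
        if (( bytes > threshold )); then
            echo "VRAM $((bytes / 1024 / 1024))M: large enough for ollama"
        else
            echo "VRAM $((bytes / 1024 / 1024))M: too small, ollama will ignore it"
        fi
    else
        echo "no amdgpu VRAM info at $vram_file"
    fi
}

check_vram
```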
One other gotcha: a lot of the budget motherboards seem to crash if overclocking is enabled. I had to disable D.O.C.P. on the Asus.
Code:
#!/bin/bash
if [[ -x /opt/rocm/bin/clinfo ]]; then
    if /opt/rocm/bin/clinfo | grep -qs 'Name.*gfx1103'; then
        # Radeon 780m
        export HSA_OVERRIDE_GFX_VERSION='11.0.0'
    elif /opt/rocm/bin/clinfo | grep -qs 'Name.*gfx1035'; then
        # Radeon 680m
        export HSA_OVERRIDE_GFX_VERSION='10.3.0'
    elif /opt/rocm/bin/clinfo | grep -qs 'Name.*gfx90c'; then
        export HSA_OVERRIDE_GFX_VERSION='9.0.0'
    fi
fi
export OLLAMA_TMPDIR='/var/tmp/OllamaTmp'
exec /usr/bin/ollama.bin "$@"
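The gfx-name matching in the wrapper can be tried out against canned clinfo output, without real hardware. A sketch, with the same patterns factored into a function; the sample `Name:` line is made up for illustration:

```shell
# Read clinfo-style output on stdin and print the HSA override (if any)
# that the wrapper script would set.
pick_override() {
    local out
    out=$(cat)
    if grep -qs 'Name.*gfx1103' <<<"$out"; then
        echo '11.0.0'   # Radeon 780m
    elif grep -qs 'Name.*gfx1035' <<<"$out"; then
        echo '10.3.0'   # Radeon 680m
    elif grep -qs 'Name.*gfx90c' <<<"$out"; then
        echo '9.0.0'
    fi
}

pick_override <<<'  Name:   gfx1103'   # prints 11.0.0
```

Anything that doesn't match one of the three patterns falls through with no override, so ROCm sees the GPU's real gfx version.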