Browser-native local GGUF inference.
Backend placement stays binary: the GPU backend offloads the whole model, the CPU backend offloads nothing, and there is no partial split.
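
A minimal sketch of that rule, for illustration only. The names `Backend` and `offloadedLayers` are hypothetical and not taken from this project, and expressing offload as a per-layer count is an assumption borrowed from how GGUF runtimes commonly describe it.

```typescript
// Hypothetical names: illustrative only, not this project's API.
type Backend = "gpu" | "cpu";

// Assumption: offload is expressed as a count of model layers.
// Returns how many layers are placed on the GPU for a given backend.
function offloadedLayers(backend: Backend, totalLayers: number): number {
  if (!Number.isInteger(totalLayers) || totalLayers < 0) {
    throw new RangeError("totalLayers must be a non-negative integer");
  }
  // Binary placement: every layer on the GPU, or none at all.
  return backend === "gpu" ? totalLayers : 0;
}
```

With a model of 32 layers, `offloadedLayers("gpu", 32)` yields 32 and `offloadedLayers("cpu", 32)` yields 0; no input produces a value in between, which is the point of keeping placement binary.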