Feature Request: Vulkan: Implement CPY op for quantized types #11127

Open
stduhpf opened this issue Jan 7, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@stduhpf (Contributor) commented Jan 7, 2025

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

This is mostly related to ggml, but I was advised to report the issue here.

Basically, this would require implementing quantization shaders for Vulkan (that's the easy part), and hooking them up in the C++ backend code.

Motivation

With stable-diffusion.cpp compiled with the Vulkan backend, attempting to load a LoRA on a quantized model (any non-float type) makes the program print "Missing CPY op for types: f32 q8_0" (for example) and crash at this line.

Having more ops implemented is a good thing, especially if it fixes a crash downstream.

Possible Implementation

I'm guessing something like this for the shaders (q8_0):

#version 450

#include "quant_head.comp" //do not exixt

layout(local_size_x = 256, local_size_y = 1, local_size_z = 1) in;

layout (binding = 0) readonly buffer A {float data_a[];};
layout (binding = 1) writeonly buffer D {block_q8_0 data_b[];};

void main() {
    // one 32-element block_q8_0 per invocation
    const uint ib = gl_GlobalInvocationID.x;
    if (ib >= p.nel / 32) { // p.nel: total element count, assumed to arrive via a push constant
        return;
    }

    // index of the first source element of this block
    const uint b_idx = 32 * ib;

    // per-block absolute maximum
    float absmax = 0.0;
    [[unroll]] for (uint j = 0; j < 32; ++j) {
        absmax = max(absmax, abs(data_a[b_idx + j]));
    }

    const float d = absmax / 127.0;
    const float id = d != 0.0 ? 1.0 / d : 0.0;

    data_b[ib].d = float16_t(d);
    [[unroll]] for (uint j = 0; j < 32; ++j) {
        data_b[ib].qs[j] = int8_t(round(clamp(data_a[b_idx + j] * id, -128.0, 127.0)));
    }
}
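
For the shader above to compile, the included quant_head.comp (which does not exist yet) would at least have to declare the q8_0 block type and the p.nel parameter it uses. A minimal sketch, assuming the same layout as ggml's CPU-side block_q8_0 (a float16 scale followed by 32 signed 8-bit quants) and a push constant carrying the element count:

// quant_head.comp (sketch, does not exist yet)
#extension GL_EXT_shader_16bit_storage : require
#extension GL_EXT_shader_8bit_storage : require
#extension GL_EXT_shader_explicit_arithmetic_types_int8 : require
#extension GL_EXT_shader_explicit_arithmetic_types_float16 : require
#extension GL_EXT_control_flow_attributes : enable

#define QUANT_K_Q8_0 32

// assumed to mirror ggml's CPU-side block_q8_0 layout
struct block_q8_0
{
    float16_t d;             // per-block scale
    int8_t qs[QUANT_K_Q8_0]; // quantized values
};

// assumed push constant carrying the total number of elements (p.nel above)
layout (push_constant) uniform parameter
{
    uint nel;
} p;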

I don't know how to proceed further in the implementation.

stduhpf added the enhancement (New feature or request) label on Jan 7, 2025
@jeffbolznv (Collaborator) commented

I can try to work on this soon. Can you share a command line for how to repro the unsupported op in sd?

@stduhpf (Contributor, Author) commented Jan 7, 2025

./build/bin/sd.exe -m ./models/model.gguf -p "<lora:lora_path:1>" --lora-model-dir "./models/loras/" --type q8_0
