
Concurrency in stable-diffusion image generation #1475

Draft · wants to merge 3 commits into master

Conversation

@dtrawins (Contributor) commented Jan 3, 2025

No description provided.

@github-actions bot added labels: category: text to image, category: cmake / build, category: samples, category: GenAI C++ API, no-match-files on Jan 3, 2025
@dtrawins dtrawins requested a review from ilya-lavrenov January 3, 2025 15:23
@ilya-lavrenov ilya-lavrenov self-assigned this Jan 6, 2025
@@ -29,6 +29,7 @@ class OPENVINO_GENAI_EXPORTS AutoencoderKL {
std::vector<size_t> block_out_channels = { 64 };

explicit Config(const std::filesystem::path& config_path);
Config() = default;
Contributor:
Why is this required?

I think you can initialize m_config in the copy constructor via the constructor initializer list instead.
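A minimal sketch of that suggestion (hypothetical; member names taken from the diffs in this PR):

// With m_config initialized in the initializer list, Config no longer
// needs a user-declared default constructor: the implicit copy
// constructor of Config is enough.
AutoencoderKL::AutoencoderKL(const AutoencoderKL& original_model)
    : m_config(original_model.m_config),
      m_encoder_model(original_model.m_encoder_model),
      m_decoder_model(original_model.m_decoder_model) {
    // remaining members copied here
}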

AutoencoderKL::AutoencoderKL(const AutoencoderKL& original_model) {
    encoder_compiled_model = original_model.encoder_compiled_model;
    decoder_compiled_model = original_model.decoder_compiled_model;
    m_decoder_request = original_model.decoder_compiled_model->create_infer_request();
Contributor:

What if the model is not compiled yet?

    }
    m_encoder_model = original_model.m_encoder_model;
    m_decoder_model = original_model.m_decoder_model;
    m_config = original_model.m_config;
Contributor:

It does not look safe that the copy constructor performs infer request creation. We have code like:

    StableDiffusionPipeline(
        PipelineType pipeline_type,
        const CLIPTextModel& clip_text_model,
        const UNet2DConditionModel& unet,
        const AutoencoderKL& vae)
        : StableDiffusionPipeline(pipeline_type) {
        m_clip_text_encoder = std::make_shared<CLIPTextModel>(clip_text_model); // LEADS TO RE_CREATION OF REQUEST
        m_unet = std::make_shared<UNet2DConditionModel>(unet); // LEADS TO RE_CREATION OF REQUEST
        m_vae = std::make_shared<AutoencoderKL>(vae); // LEADS TO RE_CREATION OF REQUEST

        const bool is_lcm = m_unet->get_config().time_cond_proj_dim > 0;
        const char * const pipeline_name = is_lcm ? "LatentConsistencyModelPipeline" : "StableDiffusionPipeline";
        initialize_generation_config(pipeline_name);
    }

which means the inference requests will be re-created, while we don't have such a goal.
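One way to address both review points above (a sketch, not the PR's code; it assumes m_decoder_request is changed to std::optional<ov::InferRequest>): create the infer request lazily on first use, guarded against an uncompiled model, so copying a pipeline component never creates requests as a side effect.

#include <optional>
#include "openvino/core/except.hpp"
#include "openvino/runtime/infer_request.hpp"

// Hypothetical accessor; m_decoder_request is assumed to be
// std::optional<ov::InferRequest> instead of a plain member.
ov::InferRequest& AutoencoderKL::get_decoder_request() {
    OPENVINO_ASSERT(decoder_compiled_model, "AutoencoderKL: model is not compiled yet");
    if (!m_decoder_request.has_value()) {
        // Created on first use, never as a side effect of copying.
        m_decoder_request = decoder_compiled_model->create_infer_request();
    }
    return *m_decoder_request;
}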

ov::CompiledModel encoder_compiled_model = core.compile_model(m_encoder_model, device, properties);
ov::genai::utils::print_compiled_model_properties(encoder_compiled_model, "Auto encoder KL encoder model");
m_encoder_request = encoder_compiled_model.create_infer_request();
encoder_compiled_model = std::make_shared<ov::CompiledModel>(core.compile_model(m_encoder_model, device, properties));
Contributor:

ov::CompiledModel already has shared semantics by its implementation: it is a wrapper around a p_impl, so the extra std::shared_ptr is unnecessary.
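To illustrate the point (not PR code): plain copies of ov::CompiledModel already share the same underlying implementation, so wrapping it in std::shared_ptr adds nothing.

#include "openvino/runtime/core.hpp"

int main() {
    ov::Core core;
    // "vae_encoder.xml" is a placeholder model path for illustration.
    ov::CompiledModel a = core.compile_model("vae_encoder.xml", "CPU");
    ov::CompiledModel b = a;  // copies the handle; both share one p_impl
    ov::InferRequest r1 = a.create_infer_request();  // two independent requests
    ov::InferRequest r2 = b.create_infer_request();  // on the same compiled model
    return 0;
}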

void runPipeline(std::string prompt, std::filesystem::path root_dir, ov::genai::CLIPTextModel& text_encoder, ov::genai::UNet2DConditionModel& unet, ov::genai::AutoencoderKL& vae, std::promise<ov::Tensor>& Tensor_prm) {
    std::cout << "create pipeline " << prompt << std::endl;
    auto scheduler = ov::genai::Scheduler::from_config(root_dir / "scheduler/scheduler_config.json");
    auto pipe2 = ov::genai::Text2ImagePipeline::stable_diffusion(scheduler, text_encoder, unet, vae);
Contributor:

The problem with such an approach is that it will be hard to apply LoRA adapters here in the generic case.
E.g. SD 1.5 has a simple LoRA configuration, while FLUX and other more complex models require code like this: https://github.com/openvinotoolkit/openvino.genai/pull/1602/files

An alternative approach is to have an API like:

Text2ImagePipeline pipeline( .. ); // similar to compile model
Text2ImagePipeline::GenerationRequest request = pipeline.create_generation_request(); // holds inference request
request.update_generation_config(guidance_scale(5.0));
Tensor image = request.generate("cat", callback(my_callback));

Text2ImagePipeline::GenerationRequest request2 = pipeline.create_generation_request(); // holds inference request
request2.update_generation_config(guidance_scale(6.0));
Tensor image2 = request2.generate("cat", width(200), height(200));

In this case, all the complexity with LoRA is hidden inside, and clients can even use the same API to generate different images with different LoRAs / alphas in parallel.
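For the concurrency scenario this PR targets, the proposed API could be driven from plain threads; a sketch of the proposal only, since create_generation_request() and GenerationRequest do not exist in openvino.genai today:

#include <thread>
#include "openvino/genai/image_generation/text2image_pipeline.hpp"

// Proposed API sketch: one compiled pipeline, two per-thread requests.
ov::genai::Text2ImagePipeline pipeline("stable-diffusion-v1-5", "CPU");  // placeholder path

auto request1 = pipeline.create_generation_request();  // proposed, not existing
auto request2 = pipeline.create_generation_request();

ov::Tensor image1, image2;
std::thread t1([&] { image1 = request1.generate("cat", ov::genai::guidance_scale(5.0f)); });
std::thread t2([&] { image2 = request2.generate("cat", ov::genai::width(200), ov::genai::height(200)); });
t1.join();
t2.join();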
