UBOS Asset Marketplace: StoryDiffusion for ComfyUI - Unleash Creative AI Image Generation
In the rapidly evolving landscape of AI-driven content creation, the ability to generate compelling and contextually rich images is becoming increasingly crucial. UBOS is at the forefront of this revolution, empowering users with cutting-edge tools that integrate seamlessly into their existing workflows. One such tool is StoryDiffusion for ComfyUI, now available on the UBOS Asset Marketplace.
What is StoryDiffusion?
StoryDiffusion, originating from research by HVision-NKU and MS-Diffusion, is an innovative approach to image generation that emphasizes consistency and coherence across long-range visual narratives. This means it’s particularly adept at creating images with multiple characters and maintaining their identity and relationships within a scene.
The ComfyUI implementation of StoryDiffusion, developed by smthemex, brings this powerful technology to a user-friendly node-based interface. This allows artists, designers, and developers to harness the capabilities of StoryDiffusion without needing extensive coding knowledge.
Use Cases: Where StoryDiffusion Shines
- Digital Storytelling: Create consistent characters and scenes for comics, graphic novels, and animated shorts.
- Marketing & Advertising: Generate visually engaging ad campaigns with consistent branding and character representation.
- Game Development: Design character concepts, environment art, and promotional materials.
- Content Creation: Produce unique and captivating visuals for blog posts, social media, and presentations.
- AI-Assisted Art: Augment your artistic process with AI-generated elements and refine them to your exact specifications.
- Education: Visualize abstract concepts and create educational materials with memorable characters and scenes.
- Therapy: Visualize scenarios for patients to understand and interact with them.
Key Features: Dive Deep into StoryDiffusion’s Capabilities
The ComfyUI_StoryDiffusion node offers a rich set of features, designed to provide granular control over the image generation process.
- Seamless ComfyUI Integration: The node-based interface makes it easy to incorporate StoryDiffusion into your existing ComfyUI workflows. Drag and drop, connect nodes, and experiment with different settings to achieve your desired results.
- Multi-Character Support: Generate images with multiple characters interacting within a scene. StoryDiffusion maintains character consistency, ensuring they look the same across different images.
- ControlNet Integration: Leverage ControlNet for precise control over image composition, pose, and style. Use edge maps, depth maps, and other control signals to guide the AI’s creative process.
- Photomaker V1 & V2 Support: Customize realistic human photos with PhotoMaker integration. Fine-tune character appearance, clothing, and accessories.
- Kolors Integration: Utilize the Kolors model for photorealistic text-to-image synthesis, allowing you to create stunning visuals from textual prompts.
- Flux Diffusers Pipeline: Experiment with cutting-edge diffusion techniques using the Flux pipeline. This requires substantial VRAM (16GB+) and potentially significant RAM (64GB+ if using CPU).
- Lora & HyperLora Support: Fine-tune your images with Lora models. HyperLora accelerates Lora training and allows you to achieve specific stylistic effects.
- Community Model Compatibility: Supports all SDXL-based diffusion models, as well as non-SD models. Select models from local diffusers folders or single-file SDXL community checkpoints.
- Offline Mode: Use StoryDiffusion offline by specifying the absolute path to your diffusion models.
- Dual-Role ControlNet: Adds ControlNet support for two characters in the same image, with multi-image input and the ability to save and load character models.
- Adjustable Parameters: Fine-tune numerous parameters, including attention layer degrees, image dimensions, character weights, Lora scales, and more.
Deep Dive into Specific Features:
1. Model Loading & Selection
The <Storydiffusion_Model_Loader> node is the central hub for selecting and configuring your diffusion models. It provides a wide range of options:
- Repo: Specify the Hugging Face repository for your desired diffusion model (e.g., “stabilityai/stable-diffusion-xl-base-1.0”).
- Ckptname: Select a community SDXL model from a dropdown menu.
- Vae_id: Choose a specific VAE (Variational Autoencoder) if required by your model.
- Character_weights: Load previously saved character weights to maintain consistency across images.
- Lora: Select a Lora model for fine-tuning the image style.
- Lora_scale: Adjust the strength of the Lora model.
- Trigger_words: Specify the trigger words for your Lora model.
- Scheduler: Choose a different scheduler to resolve errors when adding characters to the same frame in text and animation.
- Model_type: Select between txt2img (text-to-image) and img2img (image-to-image) modes.
- Idnumber: Specify the number of characters in the image (1 or 2).
- Sa32_degree/sa64_degree: Adjustable parameters for the attention layer.
- Img_width/img_height: Set the dimensions of the output image.
- Photomake_mode: Choose between PhotoMaker V1 and V2.
- Reset_txt2img: Enable this to fix continuous rendering errors when using MS diffusion.
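To make the relationships between these fields concrete, the loader's settings can be pictured as a plain settings map. The sketch below is illustrative only: the key names mirror the node's UI fields, the values (repo, dimensions, PhotoMaker version) are example choices, and the validate helper is hypothetical, not part of the node.

```python
# Illustrative sketch of the Model Loader's fields as a plain settings dict.
# Keys mirror the node UI; values are example choices, not defaults.
loader_settings = {
    "repo": "stabilityai/stable-diffusion-xl-base-1.0",
    "lora": "none",
    "lora_scale": 0.8,
    "model_type": "txt2img",   # or "img2img"
    "id_number": 2,            # 1 or 2 characters in the image
    "img_width": 768,
    "img_height": 768,
    "photomake_mode": "v2",    # PhotoMaker V1 or V2
}

def validate(settings: dict) -> list:
    """Hypothetical pre-render check for the most common misconfigurations."""
    errors = []
    if settings["model_type"] not in ("txt2img", "img2img"):
        errors.append("model_type must be txt2img or img2img")
    if settings["id_number"] not in (1, 2):
        errors.append("id_number must be 1 or 2")
    if settings["img_width"] % 8 or settings["img_height"] % 8:
        errors.append("dimensions should be multiples of 8")
    return errors

print(validate(loader_settings))  # [] — settings are consistent
```

In the actual workflow the node performs this wiring internally; the dict form is just a way to see which fields interact (e.g., id_number must match the number of character prompts you later supply to the sampler).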
2. Sampler & Prompting
The <Storydiffusion_Sampler> node controls the image generation process and allows you to specify prompts and styles:
- Pipe/info: Connect this to the output of the Model Loader node.
- Image: Connect this to the image output of the generation process. Use ComfyUI’s built-in image batch node for dual-role images.
- Character prompt: Specify the prompt for each character. Start with [character name] (or ['角色名'] for Chinese prompts). Add “img” to trigger PhotoMaker mode (e.g., “a man img”).
- Scene prompts: Describe the scene, starting with [character name]. Use [NC] for scenes without characters. Use (character A and character B) to enable MS-Diffusion’s dual-character mode.
- Split prompt: Specify a symbol to split long prompts into paragraphs.
- Negative prompt: Specify elements to avoid in the image (only effective when img_style is No_style).
- Seed/steps/cfg: Standard ComfyUI parameters for controlling randomness, detail, and adherence to the prompt.
- Ip-adapter_strength: Control the weight of the IP-Adapter in img2img (only for Kolors).
- Style_strength_ratio: Control the style weight and when the style takes effect.
- Encoder_repo: Specify the path to the CLIP encoder model for dual-character images.
- Role_scale: Control the weight of characters in dual-character images.
- Mask_threshold: Control the position of characters in dual-character images.
- Start_step: Control the starting steps for character positioning in dual-character images.
- Save_character: Save character weights for future use.
- Controlnet_modelpath: Select the ControlNet community model.
- Controlnet_scale: Control the ControlNet weight.
- Layout_guidance: Enable automatic layout for the scene.
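The prompt conventions above ([character name], [NC], and the dual-character wrapper) can be illustrated with a small classifier. This is a hedged sketch of the documented syntax only, not the node’s actual parser, and the character names are placeholders:

```python
# Sketch of the documented scene-prompt syntax: [name], [NC], (A and B).
# Mirrors the conventions described above; the node's internal parsing may differ.
import re

def classify_scene(prompt: str) -> str:
    if prompt.startswith("[NC]"):
        return "no-character"            # scene with no characters
    if re.match(r"\(.+ and .+\)", prompt):
        return "dual-character"          # MS-Diffusion dual-character mode
    if prompt.startswith("["):
        return "single-character"        # prompt opens with [character name]
    return "unknown"

scenes = [
    "[Ann] reading a book in a cafe",
    "[NC] a rainy city street at night",
    "(Ann and Ben) have lunch together",
]
print([classify_scene(s) for s in scenes])
# ['single-character', 'no-character', 'dual-character']
```

Keeping each scene prompt in exactly one of these three shapes avoids the most common sampler errors with multi-character workflows.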
3. Comic Panel Generation
The <Comic_Type> node helps generate comic panels with stylized text overlays:
- Fonts list: Specify custom fonts for the text (place font files in the fonts directory).
- Text_size: Set the size of the text.
- Comic_type: Choose the style of the comic panel.
- Split lines: Specify a symbol to split long lines of text.
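The split-lines behavior can be sketched as a simple divider: long caption text is cut at the chosen symbol into the lines the panel will render. A minimal illustration (the function name and default symbol are assumptions, not the node’s API):

```python
# Sketch of the "Split lines" idea: divide a long caption at a chosen symbol.
def split_caption(text: str, symbol: str = ";") -> list:
    # Strip whitespace around each piece and drop empty fragments.
    return [part.strip() for part in text.split(symbol) if part.strip()]

caption = "The hero arrives; The villain smirks; The storm breaks"
print(split_caption(caption))
# ['The hero arrives', 'The villain smirks', 'The storm breaks']
```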
4. Prompt Pre-Translation
The <Pre_Translate_prompt> node pre-processes prompts for translation:
- Keep_charactername: Specify whether to keep the character name in the translated text.
Working with Dual Characters:
StoryDiffusion excels at generating images with two characters interacting in a scene. To enable this, use the syntax (A and B) in your scene prompt, where A and B are the character names. You will also need to provide an encoder model (laion/CLIP-ViT-bigG-14-laion2B-39B-b160k) and an IP-Adapter fine-tuning model (ms-adapter.bin).
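A minimal sketch of assembling a valid dual-character scene prompt, assuming only the documented “(A and B)” convention; the helper and the character names are illustrative, not part of the node:

```python
# Sketch of building a dual-character scene prompt. Per the docs, the
# parentheses and the word "and" must stay intact for the mode to trigger.
def dual_scene(char_a: str, char_b: str, action: str) -> str:
    return f"({char_a} and {char_b}) {action}"

print(dual_scene("Alice", "Bob", "have lunch in the park"))
# (Alice and Bob) have lunch in the park
```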
Tips for Success:
- Character Consistency: Use saved character weights to ensure consistent character appearance across multiple images.
- Prompt Engineering: Craft detailed and specific prompts to guide the AI’s creative process.
- ControlNet: Leverage ControlNet for precise control over image composition and style.
- Experimentation: Don’t be afraid to experiment with different settings and models to discover new and exciting results.
- Dual Character Prompts: For two characters in the same frame, use the form “(character A and character B) have lunch”, where A and B are the character names; the word “and” and the parentheses must not be removed.
- Encoder Model: Dual-character same-frame mode requires an encoder model (laion/CLIP-ViT-bigG-14-laion2B-39B-b160k, which cannot be replaced with another) and an IP-Adapter fine-tuning model (ms-adapter.bin, which also cannot be replaced).
- Playground v2.5: Playground v2.5 can be effective for txt2img, but no Playground v2.5 style Lora is available, so accelerated Loras cannot be used with it.
- Style Consistency: In img2img, style consistency can be balanced by adjusting ip-adapter_strength and style_strength_ratio.
- Translation: For the prompt pre-translation node, refer to the example diagram for usage. (Remember to change the font when rendering Chinese or other East Asian characters.)
- Paragraphs: By default, a “;” at the end of each paragraph divides the paragraphs. After translation into Chinese, the half-width “;” may come back as a full-width “；”, so remember to change it back to “;”; otherwise the whole text is treated as one sentence.
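The semicolon pitfall in the last tip can be handled with a one-line normalization before the prompt is split. This is a hedged sketch, not part of the node:

```python
# After machine translation to Chinese, the half-width ";" separator often
# becomes full-width "；", which merges all paragraphs into one. Normalizing
# it back restores the paragraph splits.
def normalize_separators(text: str) -> str:
    return text.replace("；", ";")

translated = "第一段；第二段；第三段"  # example translated text
print(normalize_separators(translated).split(";"))
# ['第一段', '第二段', '第三段']
```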
UBOS: Your AI Agent Development Platform
UBOS is a full-stack AI Agent Development Platform designed to bring AI Agents to every business department. We offer a comprehensive suite of tools and services to help you:
- Orchestrate AI Agents: Manage and coordinate multiple AI Agents to achieve complex tasks.
- Connect with Enterprise Data: Integrate your AI Agents with your existing enterprise data sources.
- Build Custom AI Agents: Create custom AI Agents tailored to your specific needs, using your own LLM models.
- Develop Multi-Agent Systems: Build sophisticated Multi-Agent Systems that can solve complex problems and automate business processes.
Benefits of using UBOS:
- Accelerated Development: Reduce development time and costs with our pre-built components and intuitive interface.
- Enhanced Performance: Optimize your AI Agents for maximum performance and efficiency.
- Improved Collaboration: Facilitate collaboration between developers, data scientists, and business users.
- Increased Innovation: Empower your team to experiment with new AI technologies and create innovative solutions.
By leveraging the UBOS platform and integrating tools like StoryDiffusion, you can unlock the full potential of AI and transform your business. Visit https://ubos.tech to learn more.
ComfyUI_StoryDiffusion
Project Details
- 396001000/ComfyUI_StoryDiffusion
- Apache License 2.0
- Last Updated: 8/8/2024