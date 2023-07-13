Stability AI’s SD-XL (Stable Diffusion XL) is nearing completion, offering an exciting new feature for users. Originally, only ComfyUI was able to support SD-XL, but it has recently come to light that Vlad also supports it. Eager to try out this new functionality, I decided to give it a go.

For those who are unfamiliar, Vlad is a popular tool that provides various features related to Stable Diffusion. Users can find more information about Vlad in the articles located in the Vlad sub-tab of this topic. These articles detail the evolution of Automatic1111 and discuss Vlad Diffusion – Stable Diffusion.

However, it is important to note that both SD-XL and Vlad’s SD-XL support are still in the experimental stage. Therefore, this write-up does not cover the basics and assumes a certain level of familiarity with the tools.

To install Stability AI’s SD-XL using Vlad, it is necessary to update Vlad to the latest version. Open the command prompt in the automatic (or other installation folder) directory and enter the following command: git pull https://github.com/vladmandic/automatic. This will ensure that you have the most up-to-date version of Vlad.

Next, you will need to request access on HuggingFace to download the SD-XL .safetensors model. Both ComfyUI and Vlad require users to register with HuggingFace and request access to SD-XL. The request can be submitted at the following URL: https://huggingface.co/stabilityai/stable-diffusion-xl-base-0.9/tree/main. When you attempt to download the sd_xl_base_0.9.safetensors file for the first time, an access request form will appear. Simply fill in the required information and submit it. Access is granted immediately.

After obtaining permission, you will need to generate an Access Token in HuggingFace. This Access Token is required to download SD-XL in Vlad. To generate the Access Token, go to the Settings page on the HuggingFace website. Click on the “New Token” button to generate a token. Choose the “Read” option and provide a name for the token. Remember to copy the generated token and store it securely.

To install and download SD-XL 0.9, open the Vlad interface and navigate to the relevant sections. You will need to download two models: stabilityai/stable-diffusion-xl-base-0.9 and stabilityai/stable-diffusion-xl-refiner-0.9. Paste your Access Token in the Huggingface token field, and then click the “Download model” button. Please note that the file is quite large, so be prepared to wait for the download to finish.
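
For readers who prefer a script, the same two downloads can be sketched with the huggingface_hub Python library instead of Vlad’s download button. This is only a sketch under assumptions: the function name download_sdxl and the default target folder are illustrative, and whether Vlad recognizes models fetched this way has not been verified here. The repository names come from the step above, and the token is the “Read” token generated earlier.

```python
# Sketch: fetch the two gated SD-XL 0.9 repositories with huggingface_hub.
# Assumption: huggingface_hub is installed and access has already been granted.

SDXL_REPOS = (
    "stabilityai/stable-diffusion-xl-base-0.9",
    "stabilityai/stable-diffusion-xl-refiner-0.9",
)


def download_sdxl(token, target_dir="models/Diffusers", repos=SDXL_REPOS):
    """Download each repository and return the local snapshot paths.

    target_dir is a hypothetical default; adjust it to your installation.
    """
    if not token:
        raise ValueError("A HuggingFace access token is required for gated models.")

    # Imported lazily so the token check above works without the library.
    from huggingface_hub import snapshot_download

    paths = []
    for repo_id in repos:
        # cache_dir stores the files in the standard HuggingFace cache layout.
        path = snapshot_download(repo_id=repo_id, token=token, cache_dir=target_dir)
        paths.append(path)
    return paths
```

A typical call would be download_sdxl(token) with the token read from an environment variable rather than pasted into the script, so that it does not end up in version control.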

Once both models have been successfully downloaded, it is recommended to close Vlad’s cmd window and re-open Vlad. You will find the models presented as folders in the “models > Diffusers” section.

It is important to be aware of the prerequisites and settings for SD-XL. In Vlad, SD-XL runs on the Diffusers backend, which differs from the original backend commonly used for 1.5 models, and the available sampling methods vary accordingly. SD-XL was trained on 1024×1024 images, so features such as Text2Image cannot use the smaller image resolutions typical of 1.5. SD-XL image generation is also a two-stage process, which is why both the base and refiner models must be downloaded. SD-XL consumes more VRAM; the “--lowvram” and “--medvram” options can help when VRAM is limited, but they cannot match the capabilities of the 1.5 super plug-in. Certain features, such as ControlNet and LyCORIS, are not compatible with SD-XL. VAE, Textual Inversion, and LoRA can still be used, but they must be trained specifically for SD-XL. At this stage, these differences make SD-XL less customizable than 1.5.
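
To make the two-stage process more concrete, here is a minimal sketch that calls the diffusers library directly. It is not how Vlad implements SD-XL internally; it only illustrates the base-then-refiner flow described above. It assumes torch, diffusers, and accelerate are installed, a CUDA GPU is available, and access to the gated models has been set up (for example with huggingface-cli login). The function name generate_two_stage is illustrative.

```python
# Sketch: base model produces latents, refiner model turns them into the final image.
BASE_REPO = "stabilityai/stable-diffusion-xl-base-0.9"
REFINER_REPO = "stabilityai/stable-diffusion-xl-refiner-0.9"


def generate_two_stage(prompt, negative_prompt="", width=1024, height=1024):
    """Run the base and refiner stages and return a PIL image."""
    # The pipelines require dimensions divisible by 8; 1024x1024 matches training.
    if width % 8 or height % 8:
        raise ValueError("width and height must be multiples of 8")

    # Heavy imports are deferred so the check above runs without a GPU stack.
    import torch
    from diffusers import (
        StableDiffusionXLImg2ImgPipeline,
        StableDiffusionXLPipeline,
    )

    # Stage 1: the base model generates latents instead of a decoded image.
    base = StableDiffusionXLPipeline.from_pretrained(
        BASE_REPO, torch_dtype=torch.float16, use_safetensors=True
    )
    base.enable_model_cpu_offload()  # trades speed for lower VRAM usage
    latents = base(
        prompt=prompt,
        negative_prompt=negative_prompt,
        width=width,
        height=height,
        output_type="latent",
    ).images

    # Stage 2: the refiner takes the latents and produces the final image.
    refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
        REFINER_REPO, torch_dtype=torch.float16, use_safetensors=True
    )
    refiner.enable_model_cpu_offload()
    return refiner(
        prompt=prompt, negative_prompt=negative_prompt, image=latents
    ).images[0]
```

Loading both pipelines inside one function keeps the sketch short; in practice you would load them once and reuse them across prompts.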

To switch between 1.5 and SD-XL, you need to change the backend mode that Vlad runs in. While there are options available in Vlad’s Settings>Stable Diffusion tab, it is more convenient to create a webui-user.bat file and select the mode when starting Vlad. Create a .txt file, change the extension to .bat, and enter the content for your desired mode. For example:

1.5/2.0/2.1 mode:

@echo off
set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--backend original

call webui.bat

SD-XL mode:

@echo off
set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--backend diffusers

call webui.bat

Please note that the selected mode stays in effect until you manually change it back.
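
If you would rather not edit one file back and forth, a small script can write one launcher per mode. The sketch below generates two batch files with the content shown above, differing only in the --backend value. The file names webui-user-sd15.bat and webui-user-sdxl.bat are hypothetical choices for illustration, not names that Vlad expects.

```python
# Sketch: write one Windows launcher per backend mode.
from pathlib import Path

# Hypothetical file names mapped to the backend each launcher selects.
LAUNCHERS = {
    "webui-user-sd15.bat": "original",
    "webui-user-sdxl.bat": "diffusers",
}


def launcher_text(backend):
    """Return the batch file content for the given backend."""
    if backend not in ("original", "diffusers"):
        raise ValueError(f"unknown backend: {backend!r}")
    lines = [
        "@echo off",
        "set PYTHON=",
        "set GIT=",
        "set VENV_DIR=",
        f"set COMMANDLINE_ARGS=--backend {backend}",
        "",
        "call webui.bat",
    ]
    # Batch files conventionally use CRLF line endings.
    return "\r\n".join(lines) + "\r\n"


def write_launchers(folder):
    """Write both launchers into folder and return their paths."""
    folder = Path(folder)
    written = []
    for name, backend in LAUNCHERS.items():
        path = folder / name
        # write_bytes avoids newline translation when run on Windows.
        path.write_bytes(launcher_text(backend).encode("ascii"))
        written.append(path)
    return written
```

Calling write_launchers with the Vlad installation folder gives you two files to double-click, one per mode.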

While testing SD-XL, I encountered a memory leak in Vlad’s SD-XL mode: VRAM is not fully released after an image is generated, which leads to warnings about high VRAM usage. When VRAM is insufficient, the resulting image may be incomplete or entirely black. In such cases, it is necessary to close and reopen Vlad.
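
For context, this is roughly what releasing VRAM looks like in generic PyTorch code. The sketch is not a fix for Vlad’s leak: it only works from inside the process that holds the memory, and it cannot free tensors that are still referenced somewhere, which is usually what a leak means. The function name release_vram is illustrative.

```python
# Sketch: ask Python and PyTorch to give back unused GPU memory.
import gc


def release_vram():
    """Free cached GPU memory and return the bytes still allocated.

    Returns None when PyTorch or a CUDA device is not available.
    """
    # Drop unreachable Python objects first so their tensors can be freed.
    gc.collect()

    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    # Return cached, unused blocks from PyTorch's allocator to the driver.
    torch.cuda.empty_cache()
    # Memory still held by live tensors; a leak shows up as this value growing.
    return torch.cuda.memory_allocated()
```

If the returned value keeps growing between generations, live references are holding the memory, and restarting the application remains the practical workaround.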

There are several settings and notes that users should be aware of. For example, adjusting the DPM-related sampling algorithms can help manage VRAM usage. In addition, selecting the DPM solver algorithm is crucial when using SD-XL for the first time, as it helps avoid errors.

Overall, SD-XL calls for different habits than the 1.5 model. There are three main differences to note. First, set the negative prompts for the refiner model carefully, because writing too much can hurt the result. Second, Textual Inversion embeddings made for 1.5 are ineffective with SD-XL. Third, SD-XL 0.9 is better suited to natural language and covers a variety of styles; however, if the style-related prompts are not clear enough, it may be difficult to achieve the desired result.

It is worth mentioning that the SD-XL model, created by Americans, shows a strong preference for American styles. This is evident in the generation of realistic photos and the depiction of Asian faces based on American styles. It is expected that future derivative models will offer a wider range of styles.

One advantage that SD-XL has over the 1.5 model is its ability to make trade-offs. The 1.5 model often struggles to balance detail and clarity, which gives generated images an artificial look. SD-XL handles this better. For example, in a close-up headshot, the face is clear and detailed while the rest of the image gradually transitions to a blur, mimicking the depth-of-field effect seen in real-life photography. Even without explicit lighting prompts, SD-XL produces a sense of light and shadow that is well suited to outdoor scenes.

While I will continue to use the 1.5 model until SD-XL is officially launched with customized models, I am excited about the potential of SD-XL to bring us closer to realistic simulations in the future.

In conclusion, the completion of Stability AI’s SD-XL is an exciting development. With the support of tools like ComfyUI and Vlad, users can explore the capabilities of this new algorithm. SD-XL offers a different set of features and habits compared to the 1.5 model, and its ability to make trade-offs contributes to more realistic and detailed image generation. As we await the official launch of SD-XL with customized models, there is much anticipation for the future of simulation and image generation in the field of AI.

