
Conversation

@bghira (Contributor) commented Nov 6, 2024

What does this PR do?

Adds skip_layer parameter to the transformer model class for stable diffusion 3.

I can un-bundle the batched CFG change and also include the pipeline changes in this pull request if you require.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sayakpaul @yiyixuxu

@bghira bghira force-pushed the feature-sd3/skip-layer-guidance branch from cbeef91 to 21075fa Compare November 6, 2024 20:12
@yiyixuxu (Collaborator) commented Nov 7, 2024

hi @bghira
Thanks for the PR. Would you be able to provide a little bit of context on the parameter you are adding? For example, which layers would you skip, and what would the result look like?

cc @asomoza here too

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@bghira (Contributor, Author) commented Nov 7, 2024

this is part of the work to implement skip-layer guidance (SLG) for CFG in SD 3.5 Medium. the recommendation from SAI is to skip layers 7, 8, and 9 when doing negative guidance, so this will have to be altered for that to work.
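For reference, the SLG-adjusted CFG update from SAI's reference implementation can be sketched with plain numbers. The function name and the scalar values here are illustrative stand-ins for real latent tensors, not any diffusers API:

```python
def cfg_with_slg(pos_out, neg_out, skip_out, cfg_scale, slg_scale):
    """Combine classifier-free guidance with a skip-layer guidance term.

    pos_out / neg_out are the conditional / unconditional predictions;
    skip_out is the conditional prediction recomputed with the chosen
    layers (e.g. 7, 8, 9) skipped. The SLG term pushes the result away
    from the degraded skipped-layer prediction.
    """
    scaled = neg_out + (pos_out - neg_out) * cfg_scale
    return scaled + (pos_out - skip_out) * slg_scale

# Scalar stand-ins for real latent tensors.
print(cfg_with_slg(pos_out=1.0, neg_out=0.0, skip_out=0.5,
                   cfg_scale=5.0, slg_scale=2.8))  # 6.4
```

The extra term costs one additional conditional forward pass per step while the window is active.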

@yiyixuxu (Collaborator) commented Nov 7, 2024

cc @asomoza here
can you also take a look to see if this is the best way to support the "negative guidance skipping" feature?

@bghira (Contributor, Author) commented Nov 7, 2024

upstream implementation, for reference:

        # Run cond and uncond in a batch together
        batched = self.model.apply_model(
            torch.cat([x, x]),
            torch.cat([timestep, timestep]),
            c_crossattn=torch.cat([cond["c_crossattn"], uncond["c_crossattn"]]),
            y=torch.cat([cond["y"], uncond["y"]]),
        )
        # Then split and apply CFG Scaling
        pos_out, neg_out = batched.chunk(2)
        scaled = neg_out + (pos_out - neg_out) * cond_scale
        # Then run with skip layer
        if (
            self.slg > 0
            and self.step > (self.skip_start * self.steps)
            and self.step < (self.skip_end * self.steps)
        ):
            skip_layer_out = self.model.apply_model(
                x,
                timestep,
                c_crossattn=cond["c_crossattn"],
                y=cond["y"],
                skip_layers=self.skip_layers,
            )
            # Then scale according to skip-layer guidance
            scaled = scaled + (pos_out - skip_layer_out) * self.slg
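The step-window gating in the snippet above can be isolated as a runnable toy. The attribute names follow the snippet and the fractional window defaults are illustrative, not a public diffusers API:

```python
def slg_active(step, total_steps, slg_scale=2.8, skip_start=0.01, skip_end=0.2):
    # Mirrors the upstream condition: SLG only fires when the scale is
    # positive and the current step falls inside a fractional window of
    # the sampling schedule.
    return (
        slg_scale > 0
        and step > skip_start * total_steps
        and step < skip_end * total_steps
    )

# With 28 steps and a (0.01, 0.2) window, SLG fires on steps 1 through 5.
print([s for s in range(28) if slg_active(s, 28)])  # [1, 2, 3, 4, 5]
```

Restricting SLG to early steps keeps the extra conditional forward pass cheap, since composition is mostly decided early in the schedule.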

@vladmandic (Contributor):

link to #9819

@yiyixuxu (Collaborator) commented Nov 7, 2024

ohh got it
I will leave this open but we actually expect a PR for this feature soon:)

@bghira (Contributor, Author) commented Nov 7, 2024

ok well i've already completed the feature last night and it's available in simpletuner

vanilla diffusers, which looks awful:
[image omitted]

skipping layers 6, 7, 8, 9 with SLG scale of 5.6:
[image omitted]

skipping 7, 8, 9 with SLG scale of 2.8:
[image omitted]

it works better at 1024x1024*

@bghira (Contributor, Author) commented Nov 7, 2024

@vladmandic see the above pipeline commit if you're interested

@yiyixuxu (Collaborator) commented Nov 7, 2024

I've asked @Dango233 for a review, let's work on this PR and get it merged soon:)

@bghira (Contributor, Author) commented Nov 8, 2024

i think a possible improvement would be to dynamically determine whether to skip a layer and what scale to apply, but it's a bit hard to do.

@bghira bghira closed this Nov 16, 2024
@bghira (Contributor, Author) commented Nov 16, 2024

closing due to lack of updates.

@bghira (Contributor, Author) commented Nov 16, 2024

for anyone wanting this feature: it seems Candle has an interest in keeping up with community pull requests, while the diffusers project has been falling behind quite a bit lately in addressing development.

@bghira bghira deleted the feature-sd3/skip-layer-guidance branch November 16, 2024 14:07
@vladmandic (Contributor):

@bghira i agree diffusers have been falling behind lately, but why close this pr?

@yiyixuxu @sayakpaul what can we do here?

@bghira (Contributor, Author) commented Nov 16, 2024

they said some other pull request was coming, so i assume the preference is for that pull request.

at this point i've simply advocated for an internal fork of Diffusers that fulfils our needs, merely cherry-picking fixes from this project where it makes sense.

@yiyixuxu (Collaborator):

@bghira,

Initially, the author of SD3.5 was going to send a pull request for this feature around the same time; that's why I said this: #9880 (comment). I checked with him after we received your PR, and we agreed we should move forward with this PR and have him do a review instead; I said so here: #9880 (comment).

I was traveling for the past week and could not follow up. I understand your frustration. If you are willing to re-open the PR, we can continue to work on this and get it merged soon. Otherwise, we will add it in a new PR and add you as a co-author (cc @asomoza here )

@vladmandic (Contributor):

Regardless of whether we reopen this PR or implement it independently, we need to figure out how to keep diffusers current, at least when it comes to top models. SLG has been the default behavior in SD3.5 Medium for 3+ weeks now and we're still discussing it here. I understand that cannot be done for all models, but for something currently in the top 3 we should aim for parity much faster.

@bghira (Contributor, Author) commented Nov 17, 2024

as the branch has been deleted, the pull request can no longer be reopened

edit: found a hidden button

@bghira bghira restored the feature-sd3/skip-layer-guidance branch November 17, 2024 13:19
@bghira bghira reopened this Nov 17, 2024
@bghira (Contributor, Author) commented Nov 17, 2024

@vladmandic the strangest thing is that i thought this would be much harder because no one had gotten around to it yet, which is why it took me so long to even attempt it. when i saw how simple it was, i had it working in under an hour, and that left me greatly confused about why SD 3.5 Medium in Diffusers is so far behind, since even i was capable of doing it.

@yiyixuxu yiyixuxu requested a review from asomoza November 17, 2024 15:56
for index_block, block in enumerate(self.transformer_blocks):
    # Skip specified layers
    if skip_layers is not None and index_block in skip_layers:
        if block_controlnet_hidden_states is not None and block.context_pre_only is False:
Collaborator review comment:
can we skip the block of code that needs to be skipped instead of adding duplicated code here?
otherwise, if we have to change this part of the code that handles the controlnet residual in the future, we'd have to remember to change both places, which is not great
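One way to read the suggestion, sketched with toy blocks (all names here are hypothetical): a single early `continue` keeps the controlnet-residual handling in one place instead of duplicating it inside the skip branch.

```python
class ToyBlock:
    """Stand-in for a transformer block: just increments its input."""
    def __call__(self, hidden):
        return hidden + 1

def run_blocks(hidden, blocks, skip_layers=None, controlnet_residuals=None):
    for index_block, block in enumerate(blocks):
        if skip_layers is not None and index_block in skip_layers:
            continue  # skip the whole iteration; no duplicated residual code
        hidden = block(hidden)
        if controlnet_residuals is not None:
            hidden = hidden + controlnet_residuals[index_block]
    return hidden

# 12 blocks with 3 skipped -> 9 increments.
print(run_blocks(0, [ToyBlock() for _ in range(12)], skip_layers=[7, 8, 9]))  # 9
```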

)

self._guidance_scale = guidance_scale
self._skip_layer_guidance_scale = skip_layer_guidance_scale
@yiyixuxu (Collaborator) commented Nov 18, 2024:

need to add a decorator for this too, like this
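The requested decorator presumably mirrors how `guidance_scale` is already exposed: store the value on call and surface it through a read-only `@property`. A minimal sketch, with a hypothetical class name and defaults:

```python
class PipelineSketch:
    """Toy stand-in for the SD3 pipeline's scale bookkeeping."""

    def __call__(self, guidance_scale=7.0, skip_layer_guidance_scale=2.8):
        # The pipeline call stores the scales on private attributes...
        self._guidance_scale = guidance_scale
        self._skip_layer_guidance_scale = skip_layer_guidance_scale

    @property
    def guidance_scale(self):
        return self._guidance_scale

    @property
    def skip_layer_guidance_scale(self):
        # ...and a read-only property exposes the value set on the last call.
        return self._skip_layer_guidance_scale

pipe = PipelineSketch()
pipe(skip_layer_guidance_scale=3.0)
print(pipe.skip_layer_guidance_scale)  # 3.0
```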

@asomoza (Member) commented Nov 18, 2024

@bghira from what I take from the other discussion, you would prefer if we just took over this PR and finished it, right?

@yiyixuxu (Collaborator):

@asomoza I made the change I requested - feel free to merge after you test it and think it's ok.
we should add these parameters to img2img/inpaint/controlnet/PAG too (can do this in a separate PR and ask the community to test, etc.)

@asomoza (Member) commented Nov 19, 2024

So I did some tests with this. Sadly, I couldn't find any combination of layers and scale that makes the generation better; it seems to improve the text, but overall, in my experience, it made the results worse, or at least I personally didn't like them. I tested with SD 3.5 Medium since the feature was released with that model.

clip prompt:

high quality photo of a tea party set in an enchanted forest, long rustic table, teapots, cups, fox with a monocle, rabbit in a waistcoat, squirrel wearing a tiny top hat, fireflies provide a soft, glowing light, hanging from the trees are lanterns made from acorn shells.

t5 prompt:

high quality photo of a tea party set in an enchanted forest. The scene includes a long, rustic table adorned with an assortment of colorful teapots and cups. Seated around the table are various forest creatures: a fox with a monocle, a rabbit in a waistcoat, and a squirrel wearing a tiny top hat. Above, fireflies provide a soft, glowing light, and hanging from the trees are lanterns made from acorn shells. The background features towering trees with twisting branches, some of which have treehouses nestled in them.

original:
[image grid omitted: nine comparison generations]

Still a nice addition to have, and my example probably isn't the best to showcase this, but I like to test with more complex and realistic use cases rather than simple ones.

I'll give this another try once I build an SD 3.5 testing app that lets me iterate faster.

@asomoza (Member) commented Nov 19, 2024

thanks!

@bghira (Contributor, Author) commented Nov 19, 2024

remember that it's applying a skip to the negative prompt, so that part of things also impacts testing.

for the simpletuner use case, it's meant to improve validation results so that they match inference in ComfyUI, where skip-layer guidance is more often than not used to improve results. after introducing the feature, users often report that they don't want to disable the option anymore, because the results so much more closely match their inference results in ComfyUI.

@asomoza asomoza merged commit 99c0483 into huggingface:main Nov 19, 2024
13 of 15 checks passed
@bghira bghira deleted the feature-sd3/skip-layer-guidance branch November 19, 2024 20:24
@vladmandic (Contributor) commented Nov 19, 2024

thanks all!
btw, i've run a few tests and i think there is more to be squeezed quality-wise than just using the SAI-recommended values.
[image omitted]

sayakpaul pushed a commit that referenced this pull request Dec 23, 2024
* add skip_layers argument to SD3 transformer model class

* add unit test for skip_layers in stable diffusion 3

* sd3: pipeline should support skip layer guidance

* up

---------

Co-authored-by: bghira <[email protected]>
Co-authored-by: yiyixuxu <[email protected]>