Conditional Embedding Perturbation (CEP) #1235
Koratahiu wants to merge 7 commits into Nerogar:master
Conversation
This has been tested with SDXL, Chroma, and Zib, and it works very well.
# Conditional Embedding Perturbation (CEP)
cep_label = components.label(frame, 10, 0, "Conditional Embedding Perturbation (CEP)",
is there a gamma that is a no-op?
in that case we wouldn't need an enabled switch. This is how most other parameters in OneTrainer work: there is a 0.0 value that doesn't do anything, for example.
Yeah, 0 is a no-op.
1 is the paper's default value (slight noise based on the dimension of the TEs), 2 is double that, and so on.
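For a rough sense of scale, here is a standalone illustration (not code from this PR; `d = 2048` is just an example embedding width):

```python
import math

d = 2048  # example embedding dimension; real values depend on the text encoder
for gamma in (0.0, 1.0, 2.0):
    scale = math.sqrt(gamma / d)
    # gamma=0 gives scale=0 (no-op); the injected noise *variance* grows linearly
    # with gamma, so gamma=2 doubles the noise energy (the std grows by sqrt(2)).
    print(f"gamma={gamma}: scale={scale:.4f}")
```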
text_encoder_dropout_probability=config.text_encoder.dropout_probability,
)
if config.cep_enabled:
I'd prefer this call in Model.encode_text
this is where similar functionality is implemented (such as caption dropout)
Would model.encode_text do it on-the-fly without caching?
One benefit of this method is that it doesn't need re-caching
model.encode_text takes the cached output and returns it, but it can (and does) still modify the cached output before returning it. doesn't mean you have to cache the perturbation, it can be applied to the cached value.
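A minimal sketch of that pattern (`_get_cached_or_encode` and `config.cep_gamma` are illustrative assumptions, not OneTrainer's actual API; `_apply_conditional_embedding_perturbation` is the helper added in this PR):

```python
def encode_text(self, text, config):
    # hypothetical helper standing in for the existing caching logic:
    # return the cached encoder output if present, otherwise run the text encoder
    text_encoder_output = self._get_cached_or_encode(text)

    # the perturbation is applied to the value on the way out, so the cache itself
    # stays untouched and nothing has to be re-cached
    if config.cep_enabled:
        text_encoder_output = self._apply_conditional_embedding_perturbation(
            text_encoder_output, config.cep_gamma,
        )

    return text_encoder_output
```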
components.switch(frame, 9, 1, self.ui_state, "dynamic_timestep_shifting")
# Conditional Embedding Perturbation (CEP)
I think this option fits better near "Caption Dropout Probability"
I think it fits both: injected 'noise' applied to the TE conditioning. However, wouldn't TE settings require per-model setting application? I'm trying to avoid that
return noise
def _apply_conditional_embedding_perturbation(
# gamma controls perturbation magnitude (Paper uses gamma=1.0 as default baseline)
# Calculate scaling factor: sqrt(gamma / d)
scale = math.sqrt(gamma / d)
Yeah, you're right; I had (1/√d) in my mind when I wrote this
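For reference, a self-contained version of the perturbation with the sqrt(gamma / d) scaling (a sketch only; the actual PR code may differ in signature and in how multiple encoder outputs are handled):

```python
import math
import torch

def apply_conditional_embedding_perturbation(embedding: torch.Tensor, gamma: float) -> torch.Tensor:
    """Add dimension-scaled Gaussian noise to a conditional embedding (CEP)."""
    if gamma <= 0.0:
        return embedding  # gamma = 0 is a no-op
    d = embedding.shape[-1]         # embedding dimension of the text encoder output
    scale = math.sqrt(gamma / d)    # noise std: sqrt(gamma / d)
    return embedding + scale * torch.randn_like(embedding)
```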
)
if config.cep_enabled:
    text_encoder_output = self._apply_conditional_embedding_perturbation(
should CEP also be applied during validation? it currently is - validation uses the same predict().
theoretically I guess not, because you want validation to be deterministic and comparable across time. but the effect might be minor.
Isn't this also the case with caption dropout (which is in model.encode_text)?
good point, but that's definitely not good. I've added it here: #957 (comment)
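One way to keep validation deterministic would be to gate the perturbation on a training flag; a sketch reusing the function above (the `train` flag and wrapper name are assumptions for illustration, not existing OneTrainer code):

```python
def perturb_if_training(embedding, gamma: float, train: bool):
    # only inject CEP noise during training steps; validation gets the clean embedding
    if not train:
        return embedding
    return apply_conditional_embedding_perturbation(embedding, gamma)
```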
I think it might still need tuning per model, because the magnitude of the embeddings differs between text encoders.

This draft implements the Conditional Embedding Perturbation (CEP) strategy proposed in the paper:
Slight Corruption in Pre-training Data Makes Better Diffusion Models (NeurIPS 2024 spotlight)
This method aims to improve the generation quality and diversity of diffusion models by mitigating the impact of "perfect" overfitting to training pairs. The paper demonstrates theoretically that standard training can cause the generated distribution to collapse to the empirical distribution of the training data.
CEP addresses this by introducing slight, dimension-scaled noise to the conditional embeddings (e.g., text encoder outputs) during training. Optimizing against these perturbed conditions forces the model to learn a smoother conditional manifold, reducing the distance to the true data distribution and preventing memorization.
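In symbols (matching the sqrt(gamma / d) scaling used in the code, where d is the embedding dimension and gamma the user-set strength):

$$
\tilde{c} = c + \sqrt{\frac{\gamma}{d}}\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)
$$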
Implementation Details
Usage
- Enable Conditional Embedding Perturbation (CEP) (below timestep shifting)
- Set CEP Gamma to 1

TODO