fix init weights issue for critic/reward model #983
Merged
hwchen2017 merged 5 commits into deepspeedai:master on Jul 8, 2025
Conversation
Contributor
Hi @jouw, can you fix the format and DCO error?
Signed-off-by: Hongwei Chen <hongweichen@microsoft.com> Signed-off-by: jouw <jouw@foxmail.com>
Signed-off-by: jouw <jouw@foxmail.com>
Signed-off-by: raviguptaamd <ravi.gupta@amd.com> Signed-off-by: jouw <jouw@foxmail.com>
e26fe55 to
9a7062b
Contributor
Author
Hi @hwchen2017, I have fixed the error. Please help review the change, thanks!
Signed-off-by: jouw <jouw@foxmail.com>
Contributor
Author
Hi @hwchen2017, I have fixed the errors. Can you help merge the change? Thanks!
hwchen2017
approved these changes
Jul 8, 2025
Contributor
Hi @jouw, it seems this breaks our CI test using DS-Chat. Can you share more about the error you encountered?
Contributor
Let's revert this PR and fix it.
hwchen2017
added a commit
that referenced
this pull request
Jul 29, 2025
This reverts commit 3d83278.
hwchen2017
added a commit
that referenced
this pull request
Jul 29, 2025
This reverts commit 3d83278. Signed-off-by: Hongwei Chen <hongweichen@microsoft.com>
This change wraps model construction in `with no_init_weights():` to disable the weight initialization step. Without it, the model weights are initialized during construction and an error occurs. @hwchen2017
Detailed explanation below.
Taking Qwen3Model as an example, the call stack is:
Qwen3Model.__init__() -> Qwen3Model.post_init() -> PreTrainedModel.init_weights()
If `model = model_class.from_config(model_config)` is not wrapped in `with no_init_weights():`, the `_init_weights` flag will be true, which causes the error. See https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py