Description
I am running the mix-chord code. The values collected for the metric actor/expert/sft_phi_loss do not all have the same dimensions, so when reduce_metrics in verl/verl/utils/metric/utils.py is called, the line metrics[key] = np.mean(val) fails with: ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (4,) + inhomogeneous part. What could be causing this?
Here is my configuration:
sample_strategy_args:
  expert_data_ratio: 0.2 # 0.2 = train_batch_size_expert//ppo_mini_batch_size
policy_loss_fn_args: # feel free to change, we encourage you to try out different hyperparameters
  mu_warmup_steps: 200 # 0 for chord-mu and chord-phi
  mu_decay_steps: 400 # 200 for chord-mu and 0 for chord-phi
  mu_peak: 0.5 # 0.9 for chord-mu and 0.1 for chord-phi
  mu_valley: 0.02 # 0.05 for chord-mu and 0.1 for chord-phi
  enable_phi_function: true # false for chord-mu and true for chord-phi
  clip_range: 0.2
  use_token_level_loss_in_sft: true
  clip_range_low: 0.2
  clip_range_high: 0.28
  use_dynamic_bsz: true
  ppo_mini_batch_size: 160
  ppo_micro_batch_size_per_gpu: 2
  ngpus_trainer: 4
  train_batch_size_expert: 32 # train_batch_size_expert//ngpus_trainer
  train_batch_size_usual: 128 # 8 batchsize * 16 repeat times
model:
  model_path: Qwen/Qwen3-4B-Instruct-2507
  max_response_tokens: 2000
  max_model_len: 12000
cluster:
  node_num: 1
  gpu_per_node: 8
buffer:
  total_epochs: 2
  batch_size: 8
  train_batch_size: 160 # train_batch_size_usual + train_batch_size_expert
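To show what I think is happening, here is a minimal sketch that reproduces the same kind of failure outside of verl. It assumes the list collected for actor/expert/sft_phi_loss mixes entries of different lengths, which I have not confirmed. flatten_metric_values and safe_reduce_metrics are hypothetical helper names I made up for this illustration; they are not part of verl or of the mix-chord code.

```python
import numpy as np


def flatten_metric_values(val):
    """Flatten a possibly ragged list of scalars/sequences into a flat list of floats."""
    flat = []
    for item in val:
        # np.asarray handles both plain scalars and nested sequences.
        flat.extend(np.asarray(item, dtype=float).ravel().tolist())
    return flat


def safe_reduce_metrics(metrics):
    """Hypothetical stand-in for a mean reduction that tolerates ragged values."""
    return {key: float(np.mean(flatten_metric_values(val))) for key, val in metrics.items()}


# Assumed shape of the problem: four entries with different lengths.
ragged = {"actor/expert/sft_phi_loss": [[0.1, 0.2], [0.3], 0.4, [0.5, 0.6, 0.7]]}

try:
    np.mean(ragged["actor/expert/sft_phi_loss"])
except (ValueError, TypeError) as exc:
    # Recent NumPy versions refuse to build an array from ragged input.
    print("np.mean failed:", type(exc).__name__)

reduced = safe_reduce_metrics(ragged)
print(reduced)
```

Note that flattening weights every element equally, which is not the same as averaging one value per micro-batch, so this is only a way to check the diagnosis and not necessarily the right fix.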