mindformers.core.callback.MFLossMonitor
- class mindformers.core.callback.MFLossMonitor(learning_rate: Optional[Union[float, LearningRateSchedule]] = None, per_print_times: int = 1, micro_batch_num: int = 1, micro_batch_interleave_num: int = 1, origin_epochs: Optional[int] = None, dataset_size: Optional[int] = None, initial_epoch: int = 0, global_batch_size: int = 0, device_num: int = 0)
Loss Monitor for classification.
- Args:
learning_rate (Union[float, LearningRateSchedule], optional): The learning rate schedule. Default: None.
per_print_times (int): Interval, in steps, between printed log entries. Default: 1.
micro_batch_num (int): Micro-batch size for pipeline parallelism. Default: 1.
micro_batch_interleave_num (int): Number of splits of the batch size. Default: 1.
origin_epochs (int): Number of training epochs. Default: None.
dataset_size (int): Training dataset size. Default: None.
- Examples:
>>> from mindformers.core.callback import MFLossMonitor
>>> lr = [0.01, 0.008, 0.006, 0.005, 0.002]
>>> monitor = MFLossMonitor(per_print_times=10)
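To illustrate what `per_print_times=10` in the example above is meant to control, here is a minimal sketch of a step-gating rule. The helper `should_print` and the modulo rule are assumptions made for illustration; they are not taken from the MindFormers source, which may track the last printed step differently.

```python
def should_print(cur_step_num: int, per_print_times: int) -> bool:
    """Return True when a log line is due under the assumed modulo rule.

    Hypothetical helper: not part of mindformers.
    """
    if per_print_times <= 0:
        # Assumption: a non-positive interval disables printing.
        return False
    return cur_step_num % per_print_times == 0


# Steps are counted from 1, as training callbacks usually report them.
printed = [step for step in range(1, 31) if should_print(step, per_print_times=10)]
print(printed)  # prints [10, 20, 30]
```

With the default `per_print_times=1`, the same rule would log every step.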
- dump_info_to_modelarts(ma_step_num, ma_loss)
Dump ModelArts information for display on the evaluation result page.
- epoch_begin(run_context)
Record time at the beginning of epoch.
- Args:
run_context (RunContext): Context of the process running.
- epoch_end(run_context)
Print training info at the end of epoch.
- Args:
run_context (RunContext): Context of the process running.
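The `epoch_begin` / `epoch_end` pairing described above follows a common callback pattern: store a timestamp when the epoch starts, then compute the elapsed time when it ends. The sketch below shows that pattern with a hypothetical stand-in class, `EpochTimer`; it does not import or reproduce `MFLossMonitor`, and the injectable `clock` argument is an assumption added only to make the example deterministic.

```python
import time


class EpochTimer:
    """Hypothetical stand-in for an epoch-timing callback (not the MindFormers class)."""

    def __init__(self, clock=time.time):
        self._clock = clock          # injectable clock, an assumption for testability
        self.epoch_start = None
        self.epoch_seconds = None

    def epoch_begin(self, run_context=None):
        # Record the time at the beginning of the epoch.
        self.epoch_start = self._clock()

    def epoch_end(self, run_context=None):
        # Compute how long the epoch took and return it for reporting.
        self.epoch_seconds = self._clock() - self.epoch_start
        return self.epoch_seconds


# Fake clock: the epoch "starts" at t=100.0 s and "ends" at t=102.5 s.
ticks = iter([100.0, 102.5])
timer = EpochTimer(clock=lambda: next(ticks))
timer.epoch_begin()
elapsed = timer.epoch_end()
print(f"epoch took {elapsed:.1f}s")
```

In a real training run the framework calls these hooks and passes its own `run_context`; the stand-in ignores that argument.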
- print_output_info(cb_params, cur_epoch_num, origin_epochs, throughput, cur_step_num, steps_per_epoch, loss, per_step_seconds, overflow, scaling_sens, time_remain, percent)
Print output information.
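The page does not document how the `percent` and `time_remain` arguments are derived. As a rough sketch, they could be computed from the other arguments in the signature as shown below. The function `estimate_progress`, the 1-based epoch counter, the per-epoch step counter, and the unit of seconds are all assumptions, not the library's documented behaviour.

```python
def estimate_progress(cur_epoch_num, cur_step_num, steps_per_epoch,
                      origin_epochs, per_step_seconds):
    """Estimate training progress (hypothetical helper, not part of mindformers).

    Assumes cur_epoch_num starts at 1, cur_step_num counts steps within
    the current epoch, and per_step_seconds is the average step time in seconds.
    """
    total_steps = origin_epochs * steps_per_epoch
    done_steps = (cur_epoch_num - 1) * steps_per_epoch + cur_step_num
    percent = done_steps / total_steps * 100
    time_remain = (total_steps - done_steps) * per_step_seconds
    return percent, time_remain


# Epoch 2 of 4, step 50 of 100, at 0.5 s per step.
percent, time_remain = estimate_progress(
    cur_epoch_num=2, cur_step_num=50, steps_per_epoch=100,
    origin_epochs=4, per_step_seconds=0.5,
)
print(f"{percent:.1f}% done, about {time_remain:.0f}s remaining")
```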