
machine learning - TrainingArguments: Do "packing" and "group_by_length" counteract each other?


Hugging Face's TrainingArguments and TRL's SFTConfig (which inherits from TrainingArguments) offer two arguments that can be set when initializing SFTConfig():

  • group_by_length: Whether or not to group together samples of roughly the same length in the training dataset (to minimize padding applied and be more efficient). Only useful if applying dynamic padding.
  • packing: Whether to pack multiple sequences into a fixed-length format. Uses max_length to define sequence length.

from trl import SFTConfig

# Other arguments omitted.
config = SFTConfig(group_by_length=True,
                   packing=True)

Both arguments aim to reduce the amount of padding. However, when packing=True, every packed sequence already has the same fixed length, so setting group_by_length=True as well seems pointless. Should both be used together to improve training performance, or do they counteract each other?
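To make the concern concrete, here is a toy sketch in plain Python. It is not TRL's or Transformers' actual implementation: the helper names (padding_in_batches, packed_block_lengths), the sample lengths, and the "concatenate everything and cut into max_length blocks" packing model are all assumptions made for illustration. It counts pad tokens for unsorted batches, for length-grouped batches, and for batches built from packed blocks.

```python
def padding_in_batches(lengths, batch_size):
    """Count pad tokens when each batch is padded to its longest sample."""
    total = 0
    for i in range(0, len(lengths), batch_size):
        batch = lengths[i:i + batch_size]
        total += sum(max(batch) - n for n in batch)
    return total


def packed_block_lengths(lengths, max_length):
    """Toy packing model: concatenate all tokens, then cut into
    max_length blocks (the last block is padded up to max_length)."""
    n_blocks = -(-sum(lengths) // max_length)  # ceiling division
    return [max_length] * n_blocks


# Hypothetical token counts for eight training samples.
lengths = [12, 480, 35, 410, 50, 390, 20, 450]
batch_size = 2
max_length = 512

# Dynamic padding without grouping: short and long samples share batches.
unsorted_padding = padding_in_batches(lengths, batch_size)

# Rough stand-in for group_by_length: batch samples of similar length.
grouped_padding = padding_in_batches(sorted(lengths), batch_size)

# After packing, every block has length max_length, so batches of
# blocks need no further padding in any order.
blocks = packed_block_lengths(lengths, max_length)
packed_batch_padding = padding_in_batches(blocks, batch_size)

print(unsorted_padding)      # 1613
print(grouped_padding)       # 73
print(blocks)                # [512, 512, 512, 512]
print(packed_batch_padding)  # 0
```

In this toy model, grouping by length helps a lot when sequences have different lengths, but once the data is packed there is no length variation left for grouping to exploit. Whether the real libraries behave the same way when both flags are set is exactly what the question asks.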
