MCPcopy
hub / github.com/lm-sys/FastChat / __init__

Method __init__

fastchat/train/train_yuan2.py:319–330  ·  view source on GitHub ↗
(
        self, raw_data, data_args, tokenizer: transformers.PreTrainedTokenizer
    )

Source from the content-addressed store, hash-verified

317 """Dataset for supervised fine-tuning."""
318
319 def __init__(
320 self, raw_data, data_args, tokenizer: transformers.PreTrainedTokenizer
321 ):
322 super(SupervisedDataset, self).__init__()
323
324 rank0_print("Formatting inputs...")
325 sources = [example["conversations"] for example in raw_data]
326 data_dict = preprocess(sources, tokenizer, data_args)
327
328 self.input_ids = data_dict["input_ids"]
329 self.labels = data_dict["labels"]
330 self.attention_mask = data_dict["attention_mask"]
331
332 def __len__(self):
333 return len(self.input_ids)

Callers 1

__init__Method · 0.45

Calls 2

rank0_printFunction · 0.70
preprocessFunction · 0.70

Tested by

no test coverage detected