hub / github.com/zai-org/CogView / process_fn

Function process_fn

data_utils/datasets.py:104–108
(row)

Source from the content-addressed store, hash-verified

102  if dataset_type == 'TokenizedDataset':
103      # already tokenized when saved
104      def process_fn(row):
105          ret, attention_mask_sep = pad_to_len(row.flatten())
106          return {'text': ret,
107                  'loss_mask': np.array([1] * attention_mask_sep + [0] * (len(ret) - attention_mask_sep))
108                  }
109
110  elif dataset_type == 'TextCodeDataset':
111      def process_fn(row):
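
The listing above relies on pad_to_len, whose body is not shown here. The following is a minimal, self-contained sketch of how the TokenizedDataset branch appears to behave, under the assumption that pad_to_len pads or truncates a 1-D token array to a fixed length and returns the array together with the count of real (non-padding) tokens. MAX_LEN, PAD_ID, and this stand-in pad_to_len are hypothetical and only illustrate the shape of the result; the repository's own helper may differ.

```python
import numpy as np

MAX_LEN = 8  # hypothetical target length; the real value comes from the repo's configuration
PAD_ID = 0   # hypothetical padding token id; the real one comes from the tokenizer


def pad_to_len(tokens, max_len=MAX_LEN, pad_id=PAD_ID):
    """Assumed stand-in: pad or truncate to max_len, return (array, number of real tokens)."""
    tokens = np.asarray(tokens)
    n_real = min(len(tokens), max_len)
    if len(tokens) < max_len:
        padding = np.full(max_len - len(tokens), pad_id, dtype=tokens.dtype)
        tokens = np.concatenate((tokens, padding))
    return tokens[:max_len], n_real


def process_fn(row):
    # Same logic as lines 104-108 of the listing: flatten, pad, then build a
    # mask that is 1 over real tokens and 0 over padding.
    ret, attention_mask_sep = pad_to_len(row.flatten())
    loss_mask = np.array([1] * attention_mask_sep + [0] * (len(ret) - attention_mask_sep))
    return {'text': ret, 'loss_mask': loss_mask}


sample = process_fn(np.array([[5, 6, 7]]))
print(sample['text'])       # three real tokens followed by five PAD_ID entries
print(sample['loss_mask'])  # ones over the real tokens, zeros over the padding
```

With this stand-in, the loss mask always has the same length as the padded text, so padding positions can be excluded from the loss by elementwise multiplication.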

Callers 1

EncodeAsIds · Method · 0.85

Calls 2

pad_to_len · Function · 0.85
TextCodeTemplate · Function · 0.85

Tested by

no test coverage detected