MCPcopy
hub / github.com/evalplus/evalplus / script

Function script

evalplus/lecacy_sanitize.py:119–191  ·  view source on GitHub ↗
(
    samples: str,
    eofs: List[str] = [],
    inplace: bool = False,
    rm_prefix_lines: str = None,
    debug_task: str = None,
    mbpp_version: str = "default",
)

Source from the content-addressed store, hash-verified

117
118
119def script(
120 samples: str,
121 eofs: List[str] = [],
122 inplace: bool = False,
123 rm_prefix_lines: str = None,
124 debug_task: str = None,
125 mbpp_version: str = "default",
126):
127 # task_id -> entry_point
128 entry_point = {}
129 dataset = {**get_human_eval_plus(), **get_mbpp_plus(version=mbpp_version)}
130
131 for task_id, problem in dataset.items():
132 entry_point[task_id] = problem["entry_point"]
133
134 # make a new folder with "-sanitized" suffix
135 is_folder = os.path.isdir(samples)
136 target_path = pathlib.Path(samples)
137 if not inplace:
138 if is_folder:
139 new_name = target_path.name + "-sanitized"
140 else:
141 new_name = target_path.name.replace(".jsonl", "-sanitized.jsonl")
142 target_path = target_path.parent / new_name
143 target_path = str(target_path)
144
145 nsan = 0
146 ntotal = 0
147
148 new_solutions = []
149
150 for solution in tqdm(load_solutions(samples)):
151 task_id = solution["task_id"]
152 dbg_identifier = solution["_identifier"]
153 if debug_task is not None and task_id != debug_task:
154 continue
155
156 ntotal += 1
157 if "solution" in solution:
158 old_code = solution["solution"]
159 else:
160 assert "completion" in solution
161 old_code = dataset[task_id]["prompt"] + "\n" + solution["completion"]
162
163 old_code = old_code.strip()
164
165 new_code = sanitize(
166 old_code=old_code,
167 entry_point=entry_point[task_id],
168 rm_prefix_lines=rm_prefix_lines,
169 eofs=eofs,
170 ).strip()
171
172 # if changed, print the message
173 if new_code != old_code:
174 msg = "Sanitized: " + dbg_identifier
175 if is_folder:
176 msg += " -> " + dbg_identifier.replace(samples, target_path)

Callers

nothing calls this directly

Calls 6

get_human_eval_plusFunction · 0.90
get_mbpp_plusFunction · 0.90
load_solutionsFunction · 0.90
write_directoryFunction · 0.90
write_jsonlFunction · 0.90
sanitizeFunction · 0.70

Tested by

no test coverage detected