hub / github.com/allenai/open-instruct / get_successful_tests_stdio

Function get_successful_tests_stdio

open_instruct/code_utils/code_utils.py:222–263

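Same as above but for stdio format.

Parameters:
    program: a string representation of the Python program you want to run
    tests: a list of (input, output) pairs
    max_execution_time: the number of seconds each individual test can run before it is considered failed and terminated

Returns:
    a tuple of (results, runtimes). results is a list of 0/1 values indicating whether each test passed; runtimes is a list of execution times, one per test.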

(
    program: str, tests: list[Any], max_execution_time: float = 1.0
)
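
To make the (results, runtimes) contract concrete, here is a minimal sketch of what a stdio-format test run might look like. `run_stdio_tests_sketch` is a hypothetical stand-in, not the repository's `run_tests_stdio_helper`, whose body is not shown on this page. Two things are assumptions made for illustration: that a test passes when whitespace-stripped stdout matches, and that the program runs in a fresh interpreter per test.

```python
import subprocess
import sys
import time


def run_stdio_tests_sketch(program, tests, max_execution_time=1.0):
    """Hypothetical stand-in: run `program` once per (stdin, expected_stdout) pair."""
    results, runtimes = [], []
    for stdin_text, expected in tests:
        start = time.perf_counter()
        try:
            proc = subprocess.run(
                [sys.executable, "-c", program],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=max_execution_time,
            )
            # Assumption: pass means a clean exit and matching stripped stdout.
            passed = proc.returncode == 0 and proc.stdout.strip() == expected.strip()
        except subprocess.TimeoutExpired:
            # A test that exceeds its time budget counts as failed.
            passed = False
        results.append(int(passed))
        runtimes.append(time.perf_counter() - start)
    return results, runtimes


# One passing and one failing test for a program that doubles its input.
results, runtimes = run_stdio_tests_sketch(
    "print(int(input()) * 2)",
    [("2\n", "4\n"), ("3\n", "7\n")],
    max_execution_time=5.0,
)
```

Unlike the real function, which initializes runtimes to -1.0, this sketch records elapsed wall time even for failed tests.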

Source from the content-addressed store, hash-verified

def get_successful_tests_stdio(
    program: str, tests: list[Any], max_execution_time: float = 1.0
) -> tuple[list[int], list[float]]:
    """Same as above but for stdio format.
    Parameter:
        program: a string representation of the python program you want to run
        tests: a list of (input, output) pairs
        max_execution_time: the number of second each individual test can run before
            it is considered failed and terminated
    Return:
        a tuple of (results, runtimes). results is a list of 0/1 indicating
        passed or not, runtimes is a list of execution times for each test.
    """
    test_ct = len(tests)
    if test_ct == 0:
        return [], []
    if not should_execute(program=program, tests=tests):
        logger.info("Not executing program %s", program)
        return [0] * len(tests), [-1.0] * len(tests)

    stdio_test_results = multiprocessing.Array("i", test_ct)
    stdio_runtimes = multiprocessing.Array("d", test_ct)

    for i in range(test_ct):
        stdio_test_results[i] = 0  # Initialize results to 0 (failure)
        stdio_runtimes[i] = -1.0

    # Total timeout needs to account for all tests running sequentially.
    total_timeout = max_execution_time * test_ct + 5.0

    p = multiprocessing.Process(
        target=run_tests_stdio_helper, args=(program, tests, max_execution_time, stdio_test_results, stdio_runtimes)
    )
    p.start()
    p.join(timeout=total_timeout)

    if p.is_alive():
        p.kill()
        p.join()
    p.close()

    return [stdio_test_results[i] for i in range(test_ct)], [stdio_runtimes[i] for i in range(test_ct)]
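
The listing's core pattern can be sketched in isolation: pre-fill shared arrays with failure values, run the work in a child process, then kill the child if it outlives a total deadline. In this hedged example, `_worker_sketch` and `run_with_deadline` are hypothetical names, the worker only sleeps in place of running tests, and the "fork" start method is an assumption (POSIX only) so the example runs without a `__main__` guard.

```python
import multiprocessing
import time


def _worker_sketch(durations, results, runtimes):
    """Hypothetical worker: 'runs' each test by sleeping, then marks it passed."""
    for i, seconds in enumerate(durations):
        start = time.perf_counter()
        time.sleep(seconds)
        runtimes[i] = time.perf_counter() - start
        results[i] = 1


def run_with_deadline(durations, total_timeout):
    # Assumption: POSIX "fork" start method is available.
    ctx = multiprocessing.get_context("fork")
    n = len(durations)
    results = ctx.Array("i", n)  # zero-initialized, so every test starts as failed
    runtimes = ctx.Array("d", n)
    for i in range(n):
        runtimes[i] = -1.0  # sentinel: the test never finished

    p = ctx.Process(target=_worker_sketch, args=(durations, results, runtimes))
    p.start()
    p.join(timeout=total_timeout)

    if p.is_alive():
        # Deadline exceeded: stop the child and reap it before closing.
        p.kill()
        p.join()
    p.close()

    return [results[i] for i in range(n)], [runtimes[i] for i in range(n)]


# The first "test" finishes quickly; the second outlives the deadline.
results, runtimes = run_with_deadline([0.01, 30.0], total_timeout=1.0)
```

Because the arrays are filled with failure values before the child starts, any test the child never reaches keeps result 0 and runtime -1.0 after the kill. Tests that finished before the deadline keep their recorded values.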

Callers 1

test_program_stdio · Function · 0.90

Calls 3

should_execute · Function · 0.85
start · Method · 0.45
close · Method · 0.45

Tested by 1

test_program_stdio · Function · 0.72