hub / github.com/garrytan/gstack / skill-llm-eval.test.ts

File skill-llm-eval.test.ts

test/skill-llm-eval.test.ts

Source from the content-addressed store, hash-verified

/**
 * LLM-as-a-Judge evals for generated SKILL.md quality.
 *
 * Uses the Anthropic API directly (not Agent SDK) to evaluate whether
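
The header comment above describes an LLM-as-a-Judge evaluation. The full source is not shown here, so the following is only a minimal sketch of that general technique, not the file's actual implementation. Every name in it (`JudgeScore`, `ModelCall`, `buildJudgePrompt`, `parseJudgeReply`, `judgeSkillDoc`) and the 1-5 scoring scale are assumptions made for illustration. The model call is passed in as a plain function so that no particular API client signature is assumed.

```typescript
// Hypothetical sketch: none of these names come from skill-llm-eval.test.ts.

/** Scores the judge is asked to return (assumed 1-5 scale). */
interface JudgeScore {
  clarity: number;
  completeness: number;
  reasoning: string;
}

/** Any function that sends a prompt to a model and returns its text reply. */
type ModelCall = (prompt: string) => Promise<string>;

/** Build the grading prompt for one generated SKILL.md document. */
function buildJudgePrompt(skillDoc: string): string {
  return [
    "You are grading a generated SKILL.md file.",
    "Rate clarity and completeness from 1 (poor) to 5 (excellent).",
    'Reply with JSON only: {"clarity": n, "completeness": n, "reasoning": "..."}',
    "",
    "--- SKILL.md ---",
    skillDoc,
  ].join("\n");
}

/** Take the text from the first "{" to the last "}" and validate it. */
function parseJudgeReply(reply: string): JudgeScore {
  const match = reply.match(/\{[\s\S]*\}/);
  if (!match) throw new Error("judge reply contained no JSON object");
  const parsed = JSON.parse(match[0]) as Partial<JudgeScore>;
  for (const key of ["clarity", "completeness"] as const) {
    const value = parsed[key];
    if (typeof value !== "number" || value < 1 || value > 5) {
      throw new Error(`judge returned an invalid ${key} score`);
    }
  }
  return {
    clarity: parsed.clarity as number,
    completeness: parsed.completeness as number,
    reasoning: typeof parsed.reasoning === "string" ? parsed.reasoning : "",
  };
}

/** Ask the injected model to grade the document and return parsed scores. */
async function judgeSkillDoc(
  skillDoc: string,
  callModel: ModelCall,
): Promise<JudgeScore> {
  return parseJudgeReply(await callModel(buildJudgePrompt(skillDoc)));
}
```

In a unit test, `callModel` can be a stub that returns a canned JSON string. In a real eval it would wrap whichever API client the project uses. Keeping the prompt builder and the reply parser as pure functions lets both be checked without network access.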

Callers

nothing calls this directly

Calls 12

detectBaseBranch · Function · 0.90
getChangedFiles · Function · 0.90
selectTests · Function · 0.90
judge · Function · 0.90
extractGrepLines · Function · 0.85
runWorkflowJudge · Function · 0.85
addTest · Method · 0.80
create · Method · 0.80
finalize · Method · 0.80
describeIfSelected · Function · 0.70
testIfSelected · Function · 0.70
push · Method · 0.45

Tested by

no test coverage detected