Shared configuration for all extractors. ``run_dir``: single directory for all run artifacts (logs, debug files, failed responses). When set, debug artifacts land in ``markdown/``, ``prompts/``, ``responses/`` subdirs keyed by an encoded form of the input path (see ``repo_root``).
```python
@dataclass(frozen=True)
class ExtractorConfig:
    """Shared configuration for all extractors.

    ``run_dir``: single directory for all run artifacts (logs, debug
    files, failed responses). When set, debug artifacts land in
    ``markdown/``, ``prompts/``, ``responses/`` subdirs keyed by an
    encoded form of the input path (see ``repo_root``).

    ``repo_root``: when set, per-page artifact filenames encode the
    .gz file's repo-relative path (slashes replaced with ``__``) so
    same-basename pages from different distros/sections don't collide.
    When unset, falls back to the bare basename.

    ``debug``: when True, ``_finalize()`` writes full prompt/response
    artifacts (.md, .prompt.json, .response.txt) into *run_dir*.
    Failed responses are always written when *run_dir* is set.
    """

    model: str | None = None
    run_dir: str | None = None
    repo_root: str | None = None
    debug: bool = False


@runtime_checkable
```
no outgoing calls
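The filename encoding that the ``repo_root`` docstring describes can be illustrated with a minimal sketch. The helper names ``artifact_key`` and ``artifact_paths``, the example paths, and the pairing of each extension with a subdirectory are assumptions made for illustration only; the real logic lives in the extractors and may differ (for instance, in whether the ``.gz`` suffix is kept in the key).

```python
from __future__ import annotations

import posixpath


def artifact_key(gz_path: str, repo_root: str | None = None) -> str:
    """Hypothetical helper: derive a per-page artifact key from a .gz path."""
    if repo_root is None:
        # Fallback described in the docstring: the bare basename.
        return posixpath.basename(gz_path)
    # Repo-relative path with slashes replaced by "__", so pages that
    # share a basename in different directories get distinct keys.
    rel = posixpath.relpath(gz_path, repo_root)
    return rel.replace("/", "__")


def artifact_paths(run_dir: str, key: str) -> dict[str, str]:
    """Hypothetical layout: one file per subdirectory, all sharing one key.

    Assumes each extension listed in the docstring maps to the subdir
    with the matching name; the source does not state this explicitly.
    """
    return {
        "markdown": posixpath.join(run_dir, "markdown", key + ".md"),
        "prompt": posixpath.join(run_dir, "prompts", key + ".prompt.json"),
        "response": posixpath.join(run_dir, "responses", key + ".response.txt"),
    }


if __name__ == "__main__":
    # Two pages with the same basename under different (made-up) distro dirs.
    a = artifact_key("/repo/debian/man1/ls.1.gz", "/repo")
    b = artifact_key("/repo/fedora/man1/ls.1.gz", "/repo")
    print(a)  # debian__man1__ls.1.gz
    print(b)  # fedora__man1__ls.1.gz
    # Without repo_root both collapse to the same basename.
    print(artifact_key("/repo/debian/man1/ls.1.gz"))  # ls.1.gz
    print(artifact_paths("/tmp/run", a)["markdown"])
```

Under these assumptions, leaving ``repo_root`` unset makes both example pages map to ``ls.1.gz``, which is the collision the docstring warns about.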