Check if subset's description mentions any of the superset's extra flags. This guards against false-positive subset merges that occur when the LLM normalises dashless BSD options (e.g. ``U`` → ``-U``), creating spurious flag overlaps with genuinely different POSIX options (e.g. ``-U``/`
(subset: Option, extra_flags: frozenset[str])
def _subset_has_cross_reference(subset: Option, extra_flags: frozenset[str]) -> bool:
    """Check whether the subset's description mentions any of the superset's extra flags.

    This guards against false-positive subset merges that occur when the LLM
    normalises dashless BSD options (e.g. ``U`` → ``-U``), creating spurious
    flag overlaps with genuinely different POSIX options (e.g. ``-U``/``--User``).
    Legitimate duplicates typically cross-reference the other entry's flags
    ("Identical to -M", "same as --sort").

    A flag only matches when it is not immediately preceded or followed by a
    letter, digit, or hyphen, so that bare single-letter flags like ``k`` do
    not match inside ordinary words ("work", "make") and flag prefixes like
    ``-M`` do not match inside longer tokens.
    """
    desc = subset.text
    for flag in extra_flags:
        # Negative lookbehind/lookahead reject letters, digits, and hyphens on either side.
        pattern = r"(?<![a-zA-Z0-9\-])" + re.escape(flag) + r"(?![a-zA-Z0-9\-])"
        if re.search(pattern, desc):
            return True
    return False


def dedup_options(options: list[Option]) -> tuple[list[Option], int]:
no outgoing calls
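The boundary rule described in the docstring can be tried in isolation. The following is a minimal sketch, not the project's code: `Option` here is a hypothetical stand-in that carries only the `text` attribute read by the function, and `mentions_flag` and `subset_has_cross_reference` are illustrative helper names introduced for this example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """Hypothetical stand-in for the real Option type; only `text` is assumed."""
    text: str


def mentions_flag(desc: str, flag: str) -> bool:
    """Return True if `flag` appears in `desc` as a standalone token.

    The flag must not be directly preceded or followed by a letter,
    digit, or hyphen (same pattern as in the function above).
    """
    pattern = r"(?<![a-zA-Z0-9\-])" + re.escape(flag) + r"(?![a-zA-Z0-9\-])"
    return re.search(pattern, desc) is not None


def subset_has_cross_reference(subset: Option, extra_flags: frozenset[str]) -> bool:
    """Illustrative equivalent of the loop above, written with any()."""
    return any(mentions_flag(subset.text, flag) for flag in extra_flags)


# Cross-references of the kind the docstring quotes are detected.
print(mentions_flag("Identical to -M.", "-M"))         # True
print(mentions_flag("same as --sort", "--sort"))       # True

# A bare single-letter flag does not match inside ordinary words.
print(mentions_flag("make it work", "k"))              # False

# A flag prefix does not match inside a longer token.
print(mentions_flag("see --sort-keys", "--sort"))      # False
print(mentions_flag("see --M", "-M"))                  # False
```

Excluding the hyphen from both boundaries is what keeps `-M` from matching inside `--M` and `--sort` from matching inside `--sort-keys`; a plain `\b` word boundary would not give that behaviour, because `-` is itself a non-word character.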