Filter is an interface implemented by deduplication mechanisms

// Filter is an interface implemented by deduplication mechanisms.
type Filter interface {
	// Close closes the filter and releases associated resources.
	Close()
	// UniqueURL reports whether the URL is unique.
	UniqueURL(url string) bool
	// UniqueContent reports whether the content is unique.
	// Deduplication is done by hashing the response data.
	//
	// TODO: Consider Levenshtein distance / keyword-based hashing
	// to account for dynamic response content.
	UniqueContent(content []byte) bool
	// IsCycle attempts to detect whether the current URL is part of a cycle.
	// Until graph navigation is implemented, the only ways to discard a
	// potential loop are:
	// - enforcing a hard upper limit on the URL length (https://bugs.chromium.org/p/chromium/issues/detail?id=69227 => 2MB)
	// - heuristically finding the longest repeating substring and capping how many times it may repeat (e.g. 10)
	// TODO: This should be replaced with graph cycle detection => https://github.com/projectdiscovery/katana/pull/174
	IsCycle(url string) bool
}