```python
def markdowns_to_safe_html(markdown_strings, combine):
    """Convert multiple Markdown documents to one safe HTML document.

    One could also achieve this by calling `markdown_to_safe_html`
    multiple times and combining the results. Compared to that approach,
    this function may be faster, because HTML sanitization (which can be
    expensive) is performed only once rather than once per input. It may
    also be less precise: if one of the input documents has unsafe HTML
    that is sanitized away, that sanitization might affect other
    documents, even if those documents are safe.

    Args:
      markdown_strings: List of Markdown source strings to convert, as
        Unicode strings or UTF-8--encoded bytestrings. Markdown tables
        are supported.
      combine: Callback function that takes a list of unsafe HTML
        strings of the same shape as `markdown_strings` and combines
        them into a single unsafe HTML string, which will be sanitized
        and returned.

    Returns:
      A string containing safe HTML.
    """
    unsafe_htmls = []
    total_null_bytes = 0

    for source in markdown_strings:
        # Convert to utf-8 whenever we have a binary input.
        if isinstance(source, bytes):
            source_decoded = source.decode("utf-8")
            # Remove null bytes and warn if there were any, since it probably means
            # we were given a bad encoding.
            source = source_decoded.replace("\x00", "")
            total_null_bytes += len(source_decoded) - len(source)
        unsafe_html = _MARKDOWN_STORE.markdown.convert(source)
        unsafe_htmls.append(unsafe_html)

    unsafe_combined = combine(unsafe_htmls)
    sanitized_combined = _CLEANER_STORE.cleaner.clean(unsafe_combined)

    warning = ""
    if total_null_bytes:
        warning = (
            "<!-- WARNING: discarded %d null bytes in markdown string "
            "after UTF-8 decoding -->\n"
        ) % total_null_bytes

    return warning + sanitized_combined


def context(environ):
```
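
To make the `combine` contract and the decoding step more concrete, here is a minimal standard-library sketch. The names `combine_as_sections` and `decode_and_strip_nulls` are hypothetical and not part of the module. The Markdown conversion and the sanitizer are deliberately left out, so this only illustrates the shape of the data passed around, not the real pipeline.

```python
def combine_as_sections(unsafe_htmls):
    """Hypothetical `combine` callback: one <div> per input document.

    Receives a list of unsafe HTML strings (same length and order as
    `markdown_strings`) and returns a single unsafe HTML string.
    """
    return "\n".join("<div>%s</div>" % html for html in unsafe_htmls)


def decode_and_strip_nulls(source):
    """Illustrative mirror of the per-input decoding step.

    Returns the text and the number of null bytes removed. Text (str)
    input is returned unchanged; only bytes input is decoded and stripped.
    """
    if isinstance(source, bytes):
        decoded = source.decode("utf-8")
        stripped = decoded.replace("\x00", "")
        return stripped, len(decoded) - len(stripped)
    return source, 0


# Stand-in HTML fragments, as the Markdown converter might produce them.
combined = combine_as_sections(["<p>first</p>", "<p>second</p>"])
text, dropped = decode_and_strip_nulls(b"a\x00b\x00c")
```

A callback like this could presumably be passed as `markdowns_to_safe_html(docs, combine_as_sections)`. Because sanitization runs on the combined string, any wrapper tags the callback adds survive only if the sanitizer's allowlist permits them.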