A utility method that's useful for debugging mysterious Unicode. It breaks down a string, showing you for each codepoint its number in hexadecimal, its glyph, its category in the Unicode standard, and its name in the Unicode standard.

>>> explain_unicode('(╯°□°)╯︵ ┻━┻')
explain_unicode(text: str) -> None
# Imports this excerpt relies on. They normally sit at the top of the module;
# display_ljust is assumed here to come from ftfy.formatting.
import unicodedata

from ftfy.formatting import display_ljust


def explain_unicode(text: str) -> None:
    """
    A utility method that's useful for debugging mysterious Unicode.

    It breaks down a string, showing you for each codepoint its number in
    hexadecimal, its glyph, its category in the Unicode standard, and its name
    in the Unicode standard.

    >>> explain_unicode('(╯°□°)╯︵ ┻━┻')
    U+0028 (       [Ps] LEFT PARENTHESIS
    U+256F ╯       [So] BOX DRAWINGS LIGHT ARC UP AND LEFT
    U+00B0 °       [So] DEGREE SIGN
    U+25A1 □       [So] WHITE SQUARE
    U+00B0 °       [So] DEGREE SIGN
    U+0029 )       [Pe] RIGHT PARENTHESIS
    U+256F ╯       [So] BOX DRAWINGS LIGHT ARC UP AND LEFT
    U+FE35 ︵      [Ps] PRESENTATION FORM FOR VERTICAL LEFT PARENTHESIS
    U+0020         [Zs] SPACE
    U+253B ┻       [So] BOX DRAWINGS HEAVY UP AND HORIZONTAL
    U+2501 ━       [So] BOX DRAWINGS HEAVY HORIZONTAL
    U+253B ┻       [So] BOX DRAWINGS HEAVY UP AND HORIZONTAL
    """
    for char in text:
        # Printable characters are shown as-is; anything else is shown as an
        # ASCII escape sequence so that it stays visible in the output.
        if char.isprintable():
            display = char
        else:
            display = char.encode("unicode-escape").decode("ascii")
        print(
            "U+{code:04X} {display} [{category}] {name}".format(
                # Pad the glyph column to 7 display cells.
                display=display_ljust(display, 7),
                code=ord(char),
                category=unicodedata.category(char),
                # Characters without a Unicode name fall back to "<unknown>".
                name=unicodedata.name(char, "<unknown>"),
            )
        )
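
The docstring example only exercises printable characters, so the fallback branches are easy to miss. The following is a minimal standalone sketch of the same per-codepoint breakdown using only the standard library. `describe_codepoint` is a hypothetical helper, not part of the listing above, and it assumes plain `str.ljust` in place of `display_ljust`, so wide glyphs will not align the same way.

```python
import unicodedata


def describe_codepoint(char: str) -> str:
    """Return one descriptive line for a single character (hypothetical helper)."""
    # Printable characters are shown as-is; others are rendered as an escape
    # sequence (for example \x07 or \u200b) so they remain visible.
    if char.isprintable():
        display = char
    else:
        display = char.encode("unicode-escape").decode("ascii")
    # Assumption: str.ljust pads by character count, not by display width.
    return "U+{code:04X} {display} [{category}] {name}".format(
        code=ord(char),
        display=display.ljust(7),
        category=unicodedata.category(char),
        # Control characters have no Unicode name, so the default is used.
        name=unicodedata.name(char, "<unknown>"),
    )


# A letter, a zero-width space (format character), and a control character.
for ch in "a\u200b\x07":
    print(describe_codepoint(ch))
```

The zero-width space and the BEL control character both fail `isprintable()`, so they appear as escape sequences, and BEL has no entry in the Unicode name table, so its name column falls back to `<unknown>`.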
nothing calls this directly
no test coverage detected