hub / github.com/Mimino666/langdetect / unicode_block

Function unicode_block

langdetect/utils/unicode_block.py:449–465

Return the Unicode block name for ch, or None if ch has no block.

(ch)

Source from the content-addressed store, hash-verified

def unicode_block(ch):
    '''Return the Unicode block name for ch, or None if ch has no block.'''
    cp = ord(ch)
    # special case basic latin
    if cp <= 0x7F:
        return UNICODE_BASIC_LATIN
    # binary search for the correct block
    be, en = 0, NUM_BLOCKS - 1
    while be <= en:
        mid = (be+en) >> 1
        name, start, end = _unicode_blocks[mid]
        if start <= cp <= end:
            return name
        if cp < start:
            en = mid-1
        else:
            be = mid+1
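
The lookup above is a binary search over a table of (name, start, end) ranges sorted by start code point. The following is a minimal, self-contained sketch of that technique, not the library's code: `BLOCKS` is a hypothetical, heavily abbreviated stand-in for `_unicode_blocks`, the string names are an assumption made for readability, and `find_block` is an illustrative name. It also spells out the `return None` that the original reaches implicitly when the loop ends without a match.

```python
# Hypothetical, abbreviated table of (name, start, end) ranges.
# It must be sorted by start and non-overlapping for the search to work.
BLOCKS = [
    ("Basic Latin", 0x0000, 0x007F),
    ("Latin-1 Supplement", 0x0080, 0x00FF),
    ("Greek and Coptic", 0x0370, 0x03FF),
    ("Cyrillic", 0x0400, 0x04FF),
    ("Hiragana", 0x3040, 0x309F),
]


def find_block(ch, blocks=BLOCKS):
    """Return the name of the range containing ch, or None if none matches."""
    cp = ord(ch)
    lo, hi = 0, len(blocks) - 1
    while lo <= hi:
        mid = (lo + hi) >> 1
        name, start, end = blocks[mid]
        if start <= cp <= end:
            return name
        if cp < start:
            hi = mid - 1  # target lies in an earlier range
        else:
            lo = mid + 1  # target lies in a later range
    return None  # cp fell into a gap between ranges


print(find_block("a"))
print(find_block("\u0436"))  # Cyrillic small letter zhe
print(find_block("\u3042"))  # Hiragana letter a
print(find_block("\u0100"))  # in a gap of this abbreviated table
```

Because each step halves the candidate range, a lookup costs O(log n) comparisons in the number of table entries. The Basic Latin shortcut in the original skips the search entirely for ASCII input, presumably because such characters are the most common case.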

Callers 2

cleaning_text · Method · 0.90
normalize · Method · 0.90

Calls

no outgoing calls

Tested by

no test coverage detected
