hub / github.com/dask/dask / parse_bytes

Function parse_bytes

dask/utils.py:1582–1636

Parse a byte-size string into an integer number of bytes. For example, parse_bytes('100 MB') returns 100000000, parse_bytes('5kB') returns 5000, and a plain number such as parse_bytes('100') is returned as 100.

parse_bytes(s: float | str) -> int

Source from the content-addressed store, hash-verified

def parse_bytes(s: float | str) -> int:
    """Parse byte string to numbers

    >>> from dask.utils import parse_bytes
    >>> parse_bytes('100')
    100
    >>> parse_bytes('100 MB')
    100000000
    >>> parse_bytes('100M')
    100000000
    >>> parse_bytes('5kB')
    5000
    >>> parse_bytes('5.4 kB')
    5400
    >>> parse_bytes('1kiB')
    1024
    >>> parse_bytes('1e6')
    1000000
    >>> parse_bytes('1e6 kB')
    1000000000
    >>> parse_bytes('MB')
    1000000
    >>> parse_bytes(123)
    123
    >>> parse_bytes('5 foos')
    Traceback (most recent call last):
        ...
    ValueError: Could not interpret 'foos' as a byte unit
    """
    if isinstance(s, (int, float)):
        return int(s)
    s = s.replace(" ", "")
    if not any(char.isdigit() for char in s):
        s = f"1{s}"

    for i in range(len(s) - 1, -1, -1):
        if not s[i].isalpha():
            break
    index = i + 1

    prefix = s[:index]
    suffix = s[index:]

    try:
        n = float(prefix)
    except ValueError as e:
        raise ValueError(f"Could not interpret '{prefix}' as a number") from e

    try:
        multiplier = byte_sizes[suffix.lower()]
    except KeyError as e:
        raise ValueError(f"Could not interpret '{suffix}' as a byte unit") from e

    result = n * multiplier
    return int(result)


byte_sizes = {
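
The listing stops before the contents of byte_sizes, so the function cannot be run from this page alone. The following is a minimal standalone sketch of the same split-then-multiply approach. The names parse_bytes_sketch and BYTE_SIZES_SKETCH are hypothetical, and the unit table is a small assumed subset chosen to match the docstring examples, not the library's actual mapping.

```python
# Hypothetical, reduced unit table. The real `byte_sizes` mapping in
# dask/utils.py is truncated in the listing and is NOT reproduced here.
BYTE_SIZES_SKETCH = {
    "": 1,
    "b": 1,
    "k": 10**3,
    "m": 10**6,
    "kb": 10**3,
    "mb": 10**6,
    "gb": 10**9,
    "kib": 2**10,
    "mib": 2**20,
    "gib": 2**30,
}


def parse_bytes_sketch(s):
    """Sketch of the parsing steps: normalise, split number from unit, multiply."""
    # Numbers pass straight through, truncated to an int.
    if isinstance(s, (int, float)):
        return int(s)

    # Drop spaces so '100 MB' and '100MB' are treated alike.
    s = s.replace(" ", "")

    # A bare unit such as 'MB' is read as one of that unit.
    if not any(ch.isdigit() for ch in s):
        s = f"1{s}"

    # Walk back from the end over the trailing run of letters (the unit).
    index = len(s)
    while index > 0 and s[index - 1].isalpha():
        index -= 1
    prefix, suffix = s[:index], s[index:]

    try:
        n = float(prefix)
    except ValueError as e:
        raise ValueError(f"Could not interpret '{prefix}' as a number") from e

    try:
        multiplier = BYTE_SIZES_SKETCH[suffix.lower()]
    except KeyError as e:
        raise ValueError(f"Could not interpret '{suffix}' as a byte unit") from e

    return int(n * multiplier)
```

Because only the trailing letters are treated as the unit, an exponent such as the 'e' in '1e6' stays with the numeric part and is handled by float().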

Callers 14

read_bytes · Function · 0.90
read_text · Function · 0.90
repartition_size · Function · 0.90
_size · Method · 0.90
_fragment_to_table · Method · 0.90
read_pandas · Function · 0.90
read_sql_query · Function · 0.90
aggregate_row_groups · Function · 0.90
_infer_split_row_groups · Function · 0.90
test_parse_bytes · Function · 0.90
normalize_chunks · Function · 0.90
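
As an illustration of why callers might route a size argument through a parser like this, here is a hypothetical sketch of a function that accepts a block size as either an integer or a string and normalises it once before use. block_offsets and _to_bytes are invented for this example and are not dask functions. _to_bytes is a deliberately tiny stand-in so the sketch runs without dask installed; with dask available, dask.utils.parse_bytes would take its place.

```python
import string

# Assumed subset of units, for illustration only.
_UNITS = {"": 1, "b": 1, "kb": 10**3, "mb": 10**6, "kib": 2**10, "mib": 2**20}


def _to_bytes(size):
    """Tiny stand-in for a byte-size parser; NOT the library implementation."""
    if isinstance(size, (int, float)):
        return int(size)
    text = size.replace(" ", "")
    # Everything left after stripping trailing letters is the numeric part.
    number = text.rstrip(string.ascii_letters)
    unit = text[len(number):].lower()
    return int(float(number) * _UNITS[unit])


def block_offsets(total_size, blocksize):
    """Hypothetical caller: split total_size bytes into (offset, length) blocks."""
    step = _to_bytes(blocksize)  # accept 1000, '1 kB', '1MiB', ...
    if step <= 0:
        raise ValueError("blocksize must be positive")
    return [
        (start, min(step, total_size - start))
        for start in range(0, total_size, step)
    ]
```

Normalising at the boundary means the rest of the function only ever deals with an integer byte count, whichever form the caller passed in.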

Calls 2

any · Function · 0.85
replace · Method · 0.45

Tested by 1

test_parse_bytes · Function · 0.72
