def domain_to_idna(line):
    """
    Encode the domain present in a line to `idna`. This way we
    avoid most encoding issues.

    Parameters
    ----------
    line : str
        The line we have to encode/decode.

    Returns
    -------
    line : str
        The line in a converted format.

    Notes
    -----
    - This function encodes only the domain to `idna` format because in
      most cases, the encoding issue is due to a domain which looks like
      `b'\xc9\xa2oogle.com'.decode('idna')`.
    - About the splitting:
      We split because we only want to encode the domain and not the full
      line, which may cause some issues. Keep in mind that we split, but we
      still concatenate once the domain is encoded.

      - The following splits the prefix `0.0.0.0` or `127.0.0.1` of a line.
      - The following also splits the trailing comment of a given line.
    """

    # Comment lines are left untouched.
    if not line.startswith("#"):
        tabs = "\t"
        space = " "

        tabsposition, spaceposition = (line.find(tabs), line.find(space))

        # Pick whichever separator appears first in the line.
        if tabsposition > -1 and spaceposition > -1:
            if spaceposition < tabsposition:
                separator = space
            else:
                separator = tabs
        elif tabsposition != -1:
            separator = tabs
        elif spaceposition != -1:
            separator = space
        else:
            separator = ""

        if separator:
            splited_line = line.split(separator)

            try:
                # Skip the empty strings produced by repeated separators.
                index = 1
                while index < len(splited_line):
                    if splited_line[index]:
                        break
                    index += 1

                if "#" in splited_line[index]:
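The two ideas the docstring describes, choosing the first separator and encoding only the domain, can be illustrated in isolation. The following is a minimal sketch, not the function's actual remaining body (which is cut off above): `first_separator` and `to_idna` are hypothetical helper names introduced here, and the only library behavior relied on is Python's built-in `idna` codec.

```python
def first_separator(line):
    """Return whichever of tab or space occurs first in `line`, or "" if neither occurs.

    Hypothetical helper: a compact equivalent of the if/elif chain above.
    """
    positions = [(line.find(sep), sep) for sep in ("\t", " ") if sep in line]
    return min(positions)[1] if positions else ""


def to_idna(domain):
    """Encode a single domain with the built-in "idna" codec and return it as str.

    Hypothetical helper: assumes `domain` holds only the domain,
    with no IP prefix and no trailing comment.
    """
    return domain.encode("idna").decode("ascii")


if __name__ == "__main__":
    line = "0.0.0.0 \u0262oogle.com # homograph of google.com"
    print(repr(first_separator(line)))
    print(to_idna("\u0262oogle.com"))
```

Encoding the whole line instead of just the domain would push the IP prefix and the comment through the codec as well, which is the situation the docstring's note on splitting is meant to avoid.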