MCPcopy Index your code
hub / github.com/sqlmapproject/sqlmap / checkCharEncoding

Function checkCharEncoding

lib/request/basic.py:159–250  ·  view source on GitHub ↗

Checks encoding name, repairs common misspellings and adjusts to proper namings used in codecs module >>> checkCharEncoding('iso-8858', False) 'iso8859-1' >>> checkCharEncoding('en_us', False) 'utf8'

(encoding, warn=True)

Source from the content-addressed store, hash-verified

157
158@cachedmethod
159def checkCharEncoding(encoding, warn=True):
160 """
161 Checks encoding name, repairs common misspellings and adjusts to
162 proper namings used in codecs module
163
164 >>> checkCharEncoding('iso-8858', False)
165 'iso8859-1'
166 >>> checkCharEncoding('en_us', False)
167 'utf8'
168 """
169
170 if isinstance(encoding, six.binary_type):
171 encoding = getUnicode(encoding)
172
173 if isListLike(encoding):
174 encoding = unArrayizeValue(encoding)
175
176 if encoding:
177 encoding = encoding.lower()
178 else:
179 return encoding
180
181 # Reference: http://www.destructor.de/charsets/index.htm
182 translate = {"windows-874": "iso-8859-11", "utf-8859-1": "utf8", "en_us": "utf8", "macintosh": "iso-8859-1", "euc_tw": "big5_tw", "th": "tis-620", "unicode": "utf8", "utc8": "utf8", "ebcdic": "ebcdic-cp-be", "iso-8859": "iso8859-1", "iso-8859-0": "iso8859-1", "ansi": "ascii", "gbk2312": "gbk", "windows-31j": "cp932", "en": "us"}
183
184 for delimiter in (';', ',', '('):
185 if delimiter in encoding:
186 encoding = encoding[:encoding.find(delimiter)].strip()
187
188 encoding = encoding.replace("&quot", "")
189
190 # popular typos/errors
191 if "8858" in encoding:
192 encoding = encoding.replace("8858", "8859") # iso-8858 -> iso-8859
193 elif "8559" in encoding:
194 encoding = encoding.replace("8559", "8859") # iso-8559 -> iso-8859
195 elif "8895" in encoding:
196 encoding = encoding.replace("8895", "8859") # iso-8895 -> iso-8859
197 elif "5889" in encoding:
198 encoding = encoding.replace("5889", "8859") # iso-5889 -> iso-8859
199 elif "5589" in encoding:
200 encoding = encoding.replace("5589", "8859") # iso-5589 -> iso-8859
201 elif "2313" in encoding:
202 encoding = encoding.replace("2313", "2312") # gb2313 -> gb2312
203 elif encoding.startswith("x-"):
204 encoding = encoding[len("x-"):] # x-euc-kr -> euc-kr / x-mac-turkish -> mac-turkish
205 elif "windows-cp" in encoding:
206 encoding = encoding.replace("windows-cp", "windows") # windows-cp-1254 -> windows-1254
207
208 # name adjustment for compatibility
209 if encoding.startswith("8859"):
210 encoding = "iso-%s" % encoding
211 elif encoding.startswith("cp-"):
212 encoding = "cp%s" % encoding[3:]
213 elif encoding.startswith("euc-"):
214 encoding = "euc_%s" % encoding[4:]
215 elif encoding.startswith("windows") and not encoding.startswith("windows-"):
216 encoding = "windows-%s" % encoding[7:]

Callers 2

_basicOptionValidationFunction · 0.90
decodePageFunction · 0.85

Calls 10

getUnicodeFunction · 0.90
isListLikeFunction · 0.90
unArrayizeValueFunction · 0.90
getBytesFunction · 0.90
randomStrFunction · 0.90
singleTimeLogMessageFunction · 0.90
findMethod · 0.80
lookupMethod · 0.80
replaceMethod · 0.45
searchMethod · 0.45

Tested by

no test coverage detected

Used in the wild real call sites across dependent graphs

searching dependent graphs…