Bug report
Bug description:
Hi, when I use the mimetypes module and one of the known mime.types files includes a non-UTF-8 encoded comment, the operation fails with UnicodeDecodeError:
......
  File "/usr/lib/python3.9/urllib/request.py", line 1506, in open_local_file
    mtype = mimetypes.guess_type(filename)[0]
  File "/usr/lib/python3.9/mimetypes.py", line 289, in guess_type
    init()
  File "/usr/lib/python3.9/mimetypes.py", line 362, in init
    db.read(file)
  File "/usr/lib/python3.9/mimetypes.py", line 204, in read
    self.readfp(fp, strict)
  File "/usr/lib/python3.9/mimetypes.py", line 215, in readfp
    line = fp.readline()
  File "/usr/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 168: invalid start byte
The same error can be reproduced with:
    import mimetypes
    mimetypes.init(files=["mimefile"])
It occurs because the file is opened in text mode with UTF-8 encoding (cpython/Lib/mimetypes.py, line 215 in 2e098ab):
    with open(filename, encoding='utf-8') as fp:
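For anyone who wants to reproduce this without an existing broken file, here is a minimal sketch. The file contents and the `application/x-demo` type are made up for illustration. It calls `MimeTypes.read()` directly (the same call `init()` makes in the traceback) so the global type database is left alone. The outcome may differ between Python versions, so the script only reports what happened.

```python
import mimetypes
import os
import tempfile

# Hypothetical file contents: a comment containing byte 0x83 (not a valid
# UTF-8 start byte), followed by one ordinary type entry.
content = b"# comment with a stray byte: \x83\napplication/x-demo demo\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mime.types")
    with open(path, "wb") as f:
        f.write(content)

    db = mimetypes.MimeTypes()
    try:
        # read() opens the file in text mode and parses it line by line.
        db.read(path)
        outcome = "loaded without error"
    except UnicodeDecodeError as exc:
        outcome = f"UnicodeDecodeError: {exc.reason}"

print(outcome)
```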
I am not sure whether there is a convention for which encoding mime.types files use, but I feel that comments, at least, should be allowed in any encoding.
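As a possible workaround in user code, one could open the file with a lenient error handler and pass the file object to `MimeTypes.readfp()`. The sketch below is only an illustration, not a proposed patch: `read_mime_types_tolerantly` is a hypothetical helper name, and the file contents are made up. It assumes the type and extension fields themselves are plain ASCII, so only comments are affected by the replacement.

```python
import mimetypes
import os
import tempfile


def read_mime_types_tolerantly(db, filename, strict=True):
    """Hypothetical helper: load a mime.types file, tolerating non-UTF-8 bytes."""
    # errors="replace" turns undecodable bytes into U+FFFD instead of raising.
    with open(filename, encoding="utf-8", errors="replace") as fp:
        db.readfp(fp, strict)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mime.types")
    with open(path, "wb") as f:
        # Made-up contents: a non-UTF-8 comment, then one type entry.
        f.write(b"# comment with a stray byte: \x83\napplication/x-demo demo\n")

    db = mimetypes.MimeTypes()
    read_mime_types_tolerantly(db, path)
    guessed = db.guess_type("example.demo")[0]

print(guessed)
```

Since the parser discards everything from `#` to the end of the line, the replaced characters in the comment never reach the type map.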
CPython versions tested on:
3.9, 3.11
Operating systems tested on:
Linux, Other
Linked PRs