~mcepl/json_diff#15: 
cannot process UTF-8 files ('charmap' codec can't decode byte 0x81 in position 2071: character maps to <undefined>)

This is probably a problem with UTF-8 encoding, judging by a similar issue. I ran the script (installed with pip) on Windows 10 with "Python 3.11.0 (main, Oct 24 2022, 18:26:48) [MSC v.1933 64 bit (AMD64)] on win32" on two UTF-8 files containing Czech characters. One file had CRLF line endings, the other LF.
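
A minimal sketch of why the decode fails, assuming the file is read with the cp1250 codec that appears in the traceback: the Czech letter "Á" is stored in UTF-8 as the bytes 0xC3 0x81, and byte 0x81 has no mapping in cp1250. The helper name `decode_or_error` is made up for this illustration and is not part of json_diff.

```python
# "Á" (U+00C1) is encoded in UTF-8 as the two bytes 0xC3 0x81.
utf8_bytes = "Á".encode("utf-8")


def decode_or_error(data: bytes, encoding: str) -> str:
    """Decode data with the given codec; return the error text instead of raising."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"error: {exc}"


# Decoding as UTF-8 round-trips; decoding as cp1250 trips over byte 0x81.
print(ascii(decode_or_error(utf8_bytes, "utf-8")))
print(decode_or_error(utf8_bytes, "cp1250"))
```

The second call reports the same "'charmap' codec can't decode byte 0x81" message as the traceback, only at a different position.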

Full traceback:

Traceback (most recent call last):
  File "C:\Users\$me\AppData\Local\Programs\Python\Python311\Lib\site-packages\json_diff.py", line 153, in __init__
    self.obj1 = json.load(fn1)
                ^^^^^^^^^^^^^^
  File "C:\Users\$me\AppData\Local\Programs\Python\Python311\Lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
                 ^^^^^^^^^
  File "C:\Users\$me\AppData\Local\Programs\Python\Python311\Lib\encodings\cp1250.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 1255: character maps to <undefined>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\$me\AppData\Local\Programs\Python\Python311\Scripts\json_diff-script.py", line 33, in <module>
    sys.exit(load_entry_point('json-diff==1.5.0', 'console_scripts', 'json_diff')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\$me\AppData\Local\Programs\Python\Python311\Lib\site-packages\json_diff.py", line 369, in main
    diff = Comparator(old_file, new_file, options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\$me\AppData\Local\Programs\Python\Python311\Lib\site-packages\json_diff.py", line 155, in __init__
    raise BadJSONError("Cannot decode object from JSON.\n%s" %
json_diff.BadJSONError: Cannot decode object from JSON.
'charmap' codec can't decode byte 0x81 in position 1255: character maps to <undefined>

Status: REPORTED
Submitter: ~mcepl
Assigned to: No-one
Submitted: 9 months ago
Updated: 9 months ago
Labels: No labels applied.

~mcepl 9 months ago

The fix should be to pass an explicit encoding="utf-8" argument to open(), as stated in the similar issue mentioned above.
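
A sketch of what that could look like. This ticket does not show where json_diff opens its input files, so `load_json_utf8` is a hypothetical helper and not the project's actual code. The demo writes a small file with CRLF line endings and Czech characters, then reads it back.

```python
import json
import os
import tempfile


def load_json_utf8(path):
    """Read a JSON file as UTF-8, regardless of the locale's default encoding."""
    with open(path, encoding="utf-8") as fp:
        return json.load(fp)


# Demo data with Czech characters (made up for this example).
sample = {"název": "Ábel", "město": "Čáslav"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.json")
    # newline="\r\n" makes the file use CRLF line endings, as in the report.
    with open(path, "w", encoding="utf-8", newline="\r\n") as fp:
        json.dump(sample, fp, ensure_ascii=False, indent=2)
    loaded = load_json_utf8(path)

print(loaded == sample)  # prints True
```

Without the encoding argument, open() in text mode uses the locale's preferred encoding, which is why the same file can load on one machine and fail on a Windows machine set to cp1250.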

~mcepl 9 months ago

Hmm, after fixing the encoding this way, another error appears:

Traceback (most recent call last):
  File "C:\Users\$me\AppData\Local\Programs\Python\Python311\Scripts\json_diff-script.py", line 33, in <module>
    sys.exit(load_entry_point('json-diff==1.5.0', 'console_scripts', 'json_diff')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\$me\AppData\Local\Programs\Python\Python311\Lib\site-packages\json_diff.py", line 370, in main
    diff_res = diff.compare_dicts()
               ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\$me\AppData\Local\Programs\Python\Python311\Lib\site-packages\json_diff.py", line 305, in compare_dicts
    old_keys = set(old_obj.keys())
                   ^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'keys'

But I am leaving this one for someone else :)
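
For whoever picks this up: the traceback suggests that the old file has a JSON array at the top level, which json.load turns into a Python list, while compare_dicts calls .keys() on it. The sketch below reproduces the error and shows one possible guard. `require_object` is a hypothetical helper and the sample data is made up; whether json_diff should reject top-level arrays or learn to compare them is a separate design decision.

```python
import json


def require_object(value, label):
    """Return value unchanged if it is a dict, else raise a descriptive TypeError."""
    if not isinstance(value, dict):
        raise TypeError(
            f"{label}: expected a JSON object at the top level, "
            f"got {type(value).__name__}"
        )
    return value


# A top-level JSON array becomes a Python list, which has no .keys().
old_obj = json.loads('[{"id": 1}, {"id": 2}]')

try:
    set(old_obj.keys())
except AttributeError as exc:
    reproduced = str(exc)

# The guard turns the same situation into a clearer error message.
try:
    require_object(old_obj, "old file")
except TypeError as exc:
    guarded = str(exc)

print(reproduced)
print(guarded)
```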
