본문 바로가기

Computer Science/Python

[Python] json.dumps 한글깨짐 해결

반응형

기본적으로 json출력은 ascii 문자 이외에는 모두 이스케이프 처리 하기때문에 ascii 문자가 아니라면 flag 설정이 필요하다.

ensure_ascii=False 를 추가하면 된다.

json.dumps(data, indent=4, ensure_ascii=False)

 

 

자세한 스펙은 파이썬 공식문서를 참고하였다.

docs.python.org/3.7/library/json.html

 

Character Encodings

The RFC requires that JSON be represented using either UTF-8, UTF-16, or UTF-32, with UTF-8 being the recommended default for maximum interoperability.

As permitted, though not required, by the RFC, this module’s serializer sets ensure_ascii=True by default, thus escaping the output so that the resulting strings only contain ASCII characters.

Other than the ensure_ascii parameter, this module is defined strictly in terms of conversion between Python objects and Unicode strings, and thus does not otherwise directly address the issue of character encodings.

The RFC prohibits adding a byte order mark (BOM) to the start of a JSON text, and this module’s serializer does not add a BOM to its output. The RFC permits, but does not require, JSON deserializers to ignore an initial BOM in their input. This module’s deserializer raises a ValueError when an initial BOM is present.

The RFC does not explicitly forbid JSON strings which contain byte sequences that don’t correspond to valid Unicode characters (e.g. unpaired UTF-16 surrogates), but it does note that they may cause interoperability problems. By default, this module accepts and outputs (when present in the original str) code points for such sequences.

 

반응형