UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 4249: invalid continuation byte
A rather annoying problem: Notepad reports the file as 'utf-8', but reading it still fails.
df = pd.read_csv(r'...\11-23.txt',header=None, sep='\t',encoding='utf8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 142121: invalid continuation byte
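Before changing any read_csv options, it can help to look at the bytes that fail to decode. The following is a minimal sketch using only the standard library. The helper name find_invalid_utf8 and the sample bytes are made up for illustration and are not part of pandas.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, bad_bytes) for the first undecodable sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start / exc.end delimit the bytes the codec rejected
        return exc.start, data[exc.start:exc.end]
    return None


# Hypothetical sample: valid UTF-8 text followed by a stray 0xed byte
sample = "时间\t空间\n".encode("utf-8") + b"\xed\x41\n"
result = find_invalid_utf8(sample)
print(result)
```

For a real file, the same helper could be called on the result of open(path, 'rb').read(), where path is whatever file is being loaded. Printing a slice of the data around the returned offset usually shows which row is affected.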
error_bad_lines
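This did not seem to have any effect: error_bad_lines only controls what happens to rows with too many fields, and the decoding error is raised regardless.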
df = pd.read_csv(r'...\11-23.txt',header=None, sep='\t', error_bad_lines=False)
FutureWarning: The error_bad_lines argument has been deprecated and will be removed in a future version.
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 142121: invalid continuation byte
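The sketch below illustrates the difference between a malformed row and an undecodable byte. It assumes pandas 1.3 or later, where on_bad_lines replaces the deprecated error_bad_lines. The helper try_read and the in-memory sample data are hypothetical and only serve the demonstration.

```python
import io

import pandas as pd


def try_read(data: bytes, **kwargs):
    """Return the DataFrame, or the exception class name if decoding fails."""
    try:
        return pd.read_csv(io.BytesIO(data), header=None, sep="\t", **kwargs)
    except UnicodeDecodeError as exc:
        return type(exc).__name__


# Row 2 has an extra field; every byte is valid UTF-8
extra_field = b"1\t2\n3\t4\t5\n6\t7\n"
# Same layout, but the last row starts with an invalid byte sequence
invalid_byte = b"1\t2\n3\t4\t5\n\xed\x41\t7\n"

skipped = try_read(extra_field, on_bad_lines="skip")   # the 3-field row is dropped
failed = try_read(invalid_byte, on_bad_lines="skip")   # decoding still fails

print(skipped.shape)
print(failed)
```

Skipping bad lines helps with the first input only, because that option applies to rows with the wrong number of fields and not to bytes that cannot be decoded.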
After reading the source code, I found the encoding_errors parameter.
That solved the problem:
df = pd.read_csv(r'...\time_space_tag\11-23.txt',header=None, sep='\t', encoding_errors='ignore')
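As a small illustration of what encoding_errors does, the sketch below reads the same in-memory data with 'ignore' and with 'replace'. It assumes pandas 1.3 or later (the version that introduced encoding_errors), and the sample bytes are invented for the example.

```python
import io

import pandas as pd

# Hypothetical sample: the second row contains a stray 0xed byte
raw = b"ok\t1\nbad\xed\t2\n"

# 'ignore' silently drops bytes that cannot be decoded
ignored = pd.read_csv(io.BytesIO(raw), header=None, sep="\t",
                      encoding_errors="ignore")

# 'replace' substitutes U+FFFD, so affected cells stay visible
replaced = pd.read_csv(io.BytesIO(raw), header=None, sep="\t",
                       encoding_errors="replace")

print(repr(ignored.iloc[1, 0]))
print(repr(replaced.iloc[1, 0]))
```

With 'ignore' the offending bytes disappear without a trace, so 'replace' may be the better choice when you want to find the damaged cells afterwards.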
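Copyright notice: This is an original article by weixin_40548136, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.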