About UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 199: illegal multibyte sequence

1. Problem description: running Listing 4-5 from Machine Learning in Action raises this error:

UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 199: illegal multibyte sequence

2. Solution:

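    The cause is a character in 23.txt under email/ham that cannot be recognized (it shows up as "?" here). The 'gbk' in the message indicates that the file's bytes were being decoded as GBK. When the file is opened in PyCharm, the character is not displayed properly either, so the failure most likely happens while the file is being read.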

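To confirm which byte is responsible, the decode step can be reproduced by hand. The following is a minimal sketch, not the book's code: `find_undecodable` is a hypothetical helper, and the sample bytes are made up to stand in for the file's contents.

```python
def find_undecodable(data: bytes, encoding: str = "gbk"):
    """Return (offset, byte value) of the first decode failure, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte the codec rejected
        return err.start, data[err.start]
    return None


# Made-up sample: 0xae followed by a space is not a valid GBK sequence
sample = b"abc\xae def"
print(find_undecodable(sample))
```

For the real file, the bytes could be loaded with `pathlib.Path("email/ham/23.txt").read_bytes()`, assuming the script runs from the directory that contains the `email` folder.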

    Replacing that boxed "?" with an ordinary question mark fixes the problem.

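The same replacement can be done in code instead of by hand. This is only a sketch under the assumption that the file should stay readable as GBK; `replace_undecodable` is a hypothetical helper, not something from the book.

```python
def replace_undecodable(data: bytes, encoding: str = "gbk",
                        filler: bytes = b"?") -> bytes:
    """Replace every byte span that breaks decoding with `filler`."""
    out = bytearray(data)
    while True:
        try:
            out.decode(encoding)
            return bytes(out)
        except UnicodeDecodeError as err:
            # Overwrite only the rejected span; all other bytes stay untouched
            out[err.start:err.end] = filler


print(replace_undecodable(b"abc\xae def"))
```

The result could be written back with `pathlib.Path.write_bytes`, ideally after backing up the original file. An alternative that leaves the file unchanged is to pass `errors="replace"` (or an explicit `encoding`) to `open()` where the file is read, although that means modifying the listing itself.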

3. Check:


    The error no longer occurs.
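Besides rerunning the listing, the data folders can be scanned for any other file with the same problem. A rough sketch, assuming the emails are `.txt` files in a single folder; `files_failing_decode` is a hypothetical name.

```python
from pathlib import Path


def files_failing_decode(folder, encoding="gbk"):
    """Return the names of .txt files in `folder` that do not decode as `encoding`."""
    failing = []
    for path in sorted(Path(folder).glob("*.txt")):
        try:
            path.read_bytes().decode(encoding)
        except UnicodeDecodeError:
            failing.append(path.name)
    return failing
```

Calling `files_failing_decode("email/ham")` should return an empty list once every file in that folder decodes cleanly; the folder path is assumed from the file location mentioned above.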