I want to save data containing non-Latin characters from SPSS 25. I use this syntax:
.... /TYPE=CSV /ENCODING='UTF8'... or 'UTF16'
and then I want to read the CSV into Pandas:
p1 = pandas.read_csv(mypath, sep=';', header=0, quotechar='"', low_memory=False, usecols=cols)
and Pandas returns an error.
With ENCODING='UTF16':
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
When I open this CSV in Notepad++, it shows the encoding as UCS-2 LE BOM. If I manually change it to UTF16 and save the file, Pandas reads it without errors.
With ENCODING='UTF8':
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 254: unexpected end of data
(The beginning of the file contains only digits and Latin characters.)
When I open this CSV in Notepad++, it shows the encoding as UTF-8 BOM, and even if I change it to UTF-8, Pandas returns the same error.
How can I save to CSV from SPSS so that Pandas can read it directly, without first converting the file in a text editor?
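
For reference, here is a minimal sketch of how the byte order mark (BOM) at the start of the exported file could be checked from Python, and the matching codec passed to read_csv through its encoding parameter. This is relevant because byte 0xff at position 0 is what a UTF-16 BOM looks like when read as UTF-8. The helper names detect_bom_encoding and read_csv_with_bom are hypothetical, not part of Pandas, and the sketch assumes the file carries a BOM, as Notepad++ reports for both exports.

```python
import codecs


def detect_bom_encoding(path, default="utf-8"):
    """Hypothetical helper: pick a codec name from the file's BOM, if any."""
    with open(path, "rb") as f:
        head = f.read(4)
    # Check UTF-32 first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if head.startswith(codecs.BOM_UTF8):
        # 'utf-8-sig' decodes UTF-8 and drops the leading BOM.
        return "utf-8-sig"
    return default


def read_csv_with_bom(path, **kwargs):
    """Hypothetical wrapper: read a CSV with the encoding implied by its BOM."""
    import pandas  # imported here so the BOM check works without pandas

    return pandas.read_csv(path, encoding=detect_bom_encoding(path), **kwargs)


# Assumed usage, with mypath and cols as in the question:
# p1 = read_csv_with_bom(mypath, sep=';', header=0, quotechar='"',
#                        low_memory=False, usecols=cols)
```

For the UTF16 export this selects 'utf-16', so the BOM is read as a BOM and not as text. For the UTF8 export it selects 'utf-8-sig', which only removes the BOM; it does not by itself explain the "unexpected end of data" error at position 254, which may have a different cause.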