node.js - Convert RSS encoding from windows-1255 to UTF-8 in Node.js
I am trying to parse a Hebrew RSS feed like this one: http://rss.walla.co.il/?w=/3/0/12/@rss.e
I am using feedparser and request, and the problem is that the encoding is windows-1255, not UTF-8,
so I see text like ����� ������� and not regular Hebrew text.
I tried converters (like iconv-lite) but did not succeed.
This is my code:
function getall(url) {
    var request = require('request');
    var iconv = require('iconv-lite');
    request(url, function (error, response, body) {
        if (!error && response.statusCode == 200) {
            var allxml = body.substring(body.indexOf('<title>') + ('<title>').length, body.indexOf('</title>'));
            var text = iconv.decode(new Buffer(allxml), 'win1255');
            console.log("text = ", text);
        }
    });
}
And it prints: text = ן¿½ן¿½ן¿½ן¿½ן¿½! ן¿½ן¿½ן¿½ן¿½ן¿½ - ן¿½ן¿½ן¿½ן¿½ן¿½
You can use a module such as iconv or iconv-lite to convert between encodings, since Node's Buffer API natively supports only utf8, utf16le, latin1/binary, ascii, hex, and base64.
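One likely reason the code in the question still fails: by default request decodes the response body as UTF-8 before the callback runs, so every windows-1255 byte has already become the replacement character by the time iconv sees it. Re-encoding that character gives the bytes EF BF BD, which read as windows-1255 are exactly the ן¿½ in the output. The conversion has to run on the raw bytes instead. With request, passing encoding: null makes body a Buffer, and iconv.decode(body, 'win1255') can then be applied to it. Below is a minimal sketch of the same idea that runs without a network call. It assumes a Node.js build with full ICU (the default in official builds since v13) and uses the built-in TextDecoder in place of iconv-lite; decodeHebrew, firstTitle, and the sample bytes are hypothetical names and data made up for illustration.

```javascript
// Sketch: decode raw windows-1255 bytes first, then work with the string.
// Assumption: Node.js with full ICU, so TextDecoder accepts 'windows-1255'.
// With iconv-lite the equivalent call would be iconv.decode(rawBytes, 'win1255').
const { TextDecoder } = require('util');

// Hypothetical helper: takes the undecoded response body as a Buffer.
function decodeHebrew(rawBytes) {
  return new TextDecoder('windows-1255').decode(rawBytes);
}

// Hypothetical helper: returns the text of the first <title> element,
// or null if there is none. Expects an already-decoded string.
function firstTitle(xml) {
  const open = '<title>';
  const start = xml.indexOf(open);
  const end = xml.indexOf('</title>', start);
  if (start === -1 || end === -1) return null;
  return xml.slice(start + open.length, end);
}

// Stand-in for a response body: the word "שלום" in windows-1255
// (shin = 0xF9, lamed = 0xEC, vav = 0xE5, final mem = 0xED).
const sample = Buffer.concat([
  Buffer.from('<title>', 'ascii'),
  Buffer.from([0xf9, 0xec, 0xe5, 0xed]),
  Buffer.from('</title>', 'ascii'),
]);

console.log('text =', firstTitle(decodeHebrew(sample)));
```

In the original function this would mean calling request({ url: url, encoding: null }, callback), decoding body once, and only then searching for the title tags.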