[xmlsec] non us-ascii filenames in user locale
xmlsec at roumenpetrov.info
Wed Jun 23 02:23:13 PDT 2004
Aleksey Sanin wrote:
>> Agreed, application must deal with locales.
> Good :)
>> Current xmlsec don't allow me to replace file IO from an application.
> Of course, it does :)
xmlSecIORegisterCallbacks appends my callback to the end of the list.
Note that the default callbacks are already in the list.
Later xmlSecIOCallbackPtrListFind returns the default file IO callback,
because it comes before my callback.
Or am I missing something?
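
To make the ordering problem concrete, here is a minimal stand-alone sketch.
It is only a model: io_handler, handler_list, list_append, find_first and
find_last are hypothetical names invented for this illustration and are not
the xmlsec data structures or functions. It assumes, as described above, that
the lookup walks the list from the head and stops at the first handler whose
match function accepts the URI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an IO callback list (not the xmlsec types). */
typedef int (*match_fn)(const char *uri);

typedef struct {
    const char *name;   /* label used only for this illustration */
    match_fn    match;  /* returns non-zero if the handler accepts the URI */
} io_handler;

#define MAX_HANDLERS 8

typedef struct {
    io_handler items[MAX_HANDLERS];
    size_t     count;
} handler_list;

/* A handler that accepts every URI, like a plain file handler would. */
static int match_any(const char *uri)
{
    return uri != NULL;
}

/* Append a handler at the end of the list; returns 0 on success, -1 if full. */
static int list_append(handler_list *list, const char *name, match_fn match)
{
    assert(list != NULL);
    if (list->count >= MAX_HANDLERS) {
        return -1;
    }
    list->items[list->count].name = name;
    list->items[list->count].match = match;
    list->count++;
    return 0;
}

/* Head-first lookup: the earliest registered matching handler wins. */
static const io_handler *find_first(const handler_list *list, const char *uri)
{
    for (size_t i = 0; i < list->count; i++) {
        if (list->items[i].match(uri)) {
            return &list->items[i];
        }
    }
    return NULL;
}

/* Tail-first lookup: the most recently registered matching handler wins. */
static const io_handler *find_last(const handler_list *list, const char *uri)
{
    for (size_t i = list->count; i > 0; i--) {
        if (list->items[i - 1].match(uri)) {
            return &list->items[i - 1];
        }
    }
    return NULL;
}
```

If the defaults are registered first and the lookup is head-first, a callback
appended later is never reached for any URI the default also accepts. With a
tail-first lookup the appended callback would win instead.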
> If I understand you correctly, the problem is that you have a UCS2
> (or UCS4) unicode filename that has all these zeros in the middle.
The filename is in some charset, for example ISO-8859-1, where symbols are in
The locale charmap is the same.
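
For the ISO-8859-1 case the application-level conversion to UTF-8 is simple,
because every Latin-1 byte is the Unicode code point with the same value. A
minimal sketch follows; latin1_to_utf8 is a hypothetical helper written for
this illustration and is not part of xmlsec or libxml2. Other charsets would
need a real conversion library.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Convert a NUL-terminated ISO-8859-1 string to UTF-8.
 * 'cap' is the size of 'out' in bytes, including room for the terminator.
 * Returns the number of bytes written (without the terminator),
 * or -1 if 'out' is too small.
 */
static long latin1_to_utf8(const char *in, char *out, size_t cap)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    for (const unsigned char *p = (const unsigned char *)in; *p != 0; p++) {
        size_t need = (*p < 0x80) ? 1 : 2;

        if (n + need >= cap) {
            return -1;                       /* keep room for the terminator */
        }
        if (*p < 0x80) {
            out[n++] = (char)*p;             /* ASCII is copied unchanged */
        } else {
            out[n++] = (char)(0xC0 | (*p >> 6));    /* lead byte */
            out[n++] = (char)(0x80 | (*p & 0x3F));  /* continuation byte */
        }
    }
    if (n >= cap) {
        return -1;
    }
    out[n] = '\0';
    return (long)n;
}
```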
> Now you want to call LibXML2/xmlsec IO functions and you are in trouble
> because these functions need zero-terminated ASCII strings and
> later they use fopen() function that needs the ASCII string in the
> "current locale" (which might not be the locale that was used to
> create the filename).
> I guess, the only option I have for you is to do the following:
> 1) Convert the UCS filename into UTF8 on application level.
> 2) Replace the libxml2/xmlsec IO "file:" protocol handlers with custom
> ones that convert filenames back from UTF8 to UCS2/4 and then use
> "_wfopen" function instead of "fopen".
> Igor, it might make sense to do the item 2) on Windows platform
> all the time. I.e. LibXML2 assumes that the filename is in UTF8, then
> it converts it to unicode and uses _wfopen function instead of fopen
> (again, this is *only* for windows). This approach solves all the
> localization problems with "current locale" required for fopen.
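
The approach in item 2) can be sketched as follows. This is only an
illustration under stated assumptions: utf8_to_utf16 and fopen_utf8 are
hypothetical helpers written for this message, not libxml2 or xmlsec
functions, and the 1024-unit path buffer is an arbitrary choice. The decoder
is deliberately minimal and does not reject overlong encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#ifdef _WIN32
#include <wchar.h>
#endif

/*
 * Decode NUL-terminated UTF-8 into UTF-16 code units.
 * 'cap' is the size of 'out' in units, including room for the terminator.
 * Returns the number of units written (without the terminator),
 * or -1 on malformed input or if 'out' is too small.
 */
static long utf8_to_utf16(const char *in, uint16_t *out, size_t cap)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (cap == 0) {
        return -1;
    }
    while (*p != 0) {
        uint32_t cp;
        int extra;

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return -1;               /* stray continuation or invalid lead */
        p++;
        for (int i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80) {
                return -1;            /* also catches a premature NUL */
            }
            cp = (cp << 6) | (uint32_t)(*p & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return -1;                /* outside Unicode or a lone surrogate */
        }
        if (cp >= 0x10000) {          /* needs a surrogate pair */
            if (n + 2 >= cap) return -1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (n + 1 >= cap) return -1;
            out[n++] = (uint16_t)cp;
        }
    }
    out[n] = 0;
    return (long)n;
}

#ifdef _WIN32
/* Windows only: open a file whose name is given in UTF-8. */
static FILE *fopen_utf8(const char *path, const wchar_t *mode)
{
    uint16_t wide[1024];              /* arbitrary limit for this sketch */

    if (utf8_to_utf16(path, wide, 1024) < 0) {
        return NULL;
    }
    /* wchar_t is a 16-bit type on Windows, so the units can be passed as is */
    return _wfopen((const wchar_t *)wide, mode);
}
#endif
```

Note that a raw ISO-8859-1 byte such as 0xE9 is not valid UTF-8, so the
decoder returns -1 for it. A filename in the locale charset therefore has to
be converted to UTF-8 by the application before it reaches such a handler.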
I first discovered the problem on Linux.
After that I debugged it on Windows (9x) with a FAT32 filesystem.
> BTW, if you haven't seen this paper, I strongly recommend it as a brief
> introduction to Unicode mess:
This is just one of many articles about Unicode.
On the XFree86 site we can find a link to
http://www.cl.cam.ac.uk/~mgk25/unicode.html - Markus Kuhn explains
Unicode for UNIX/Linux in detail.
XFree86 4.x has support for Unicode, but Cyrillic in versions
up to 4.3 is not stable.
One day I will test X11R6.7 (XFree86 4.4), and when all is OK I will
change the user locales to use xx_YY.UTF-8.