[xmlsec] non us-ascii filenames in user locale

Roumen Petrov xmlsec at roumenpetrov.info
Wed Jun 23 02:23:13 PDT 2004


Aleksey Sanin wrote:

>
>> Agreed, application must deal with locales.
>
> Good :)
>
>> Current xmlsec don't allow me to replace file IO from an application.
>
> Of course, it does :)
>     http://xmlsoft.org/xmlio.html
>     http://www.aleksey.com/xmlsec/api/xmlsec-io.html

xmlSecIORegisterCallbacks appends my callback to the end of the list.
Note that the list already contains the default callbacks.

Later, xmlSecIOCallbackPtrListFind returns the default file IO callback, 
because it comes before my callback in the list.

Or am I missing something?

<http://www.aleksey.com/xmlsec/api/xmlsec-io.html#XMLSECIOREGISTERCALLBACKS> 
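
To make the ordering problem concrete, here is a small toy model in C. It is 
NOT the xmlsec source: the types and functions (io_handler, handler_list, 
handler_list_append, handler_list_find and the two winner_* helpers) are 
hypothetical names invented for this sketch. It only assumes what is 
described above: handlers are appended to the end of a list and the lookup 
returns the first one that matches.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical types, not the xmlsec implementation. */
typedef int (*match_fn)(const char *uri);

typedef struct {
    const char *name;   /* label used to see which handler was chosen */
    match_fn    match;  /* returns non-zero if the handler accepts the URI */
} io_handler;

#define MAX_HANDLERS 8

typedef struct {
    io_handler items[MAX_HANDLERS];
    size_t     count;
} handler_list;

/* Append a handler to the END of the list; 0 on success, -1 if full. */
static int handler_list_append(handler_list *list, const char *name,
                               match_fn match)
{
    if (list->count >= MAX_HANDLERS) return -1;
    list->items[list->count].name = name;
    list->items[list->count].match = match;
    list->count++;
    return 0;
}

/* First-match lookup: scan from the front, return the first acceptor. */
static const io_handler *handler_list_find(const handler_list *list,
                                           const char *uri)
{
    for (size_t i = 0; i < list->count; i++) {
        if (list->items[i].match(uri)) return &list->items[i];
    }
    return NULL;
}

/* Stand-in for a default file handler: accepts any name without "://". */
static int default_file_match(const char *uri)
{
    return strstr(uri, "://") == NULL;
}

/* Stand-in for the application's handler: accepts the same names. */
static int custom_file_match(const char *uri)
{
    return strstr(uri, "://") == NULL;
}

/* Defaults are registered first, the custom handler is appended after. */
static const char *winner_when_appended(const char *uri)
{
    handler_list list = { .count = 0 };
    handler_list_append(&list, "default-file", default_file_match);
    handler_list_append(&list, "custom-file", custom_file_match);
    const io_handler *h = handler_list_find(&list, uri);
    return h ? h->name : "none";
}

/* Contrast case: the list holds only the custom handler. */
static const char *winner_when_custom_only(const char *uri)
{
    handler_list list = { .count = 0 };
    handler_list_append(&list, "custom-file", custom_file_match);
    const io_handler *h = handler_list_find(&list, uri);
    return h ? h->name : "none";
}
```

In this model winner_when_appended() always picks "default-file" for a 
plain file name, so the appended handler is never reached. The custom 
handler is chosen only when no earlier entry matches, as in 
winner_when_custom_only().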


>     
> If I understand you correctly, the problem is that you have a UCS2
> (or UCS4) unicode filename that has all these zeros in the middle.

No.

The filename is in a single-byte charset, for example ISO-8859-1, where 
the characters are in the range 32-255.
The locale charmap is the same.
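
As an illustration of such a name: in ISO-8859-1 every character is one 
byte and there are no embedded zeros, so the name is an ordinary C string, 
but bytes above 127 are not valid UTF-8 on their own. The sketch below 
shows how such a name would look after conversion to UTF-8, should an 
application need that form. The helper name latin1_to_utf8 is hypothetical 
and is not part of libxml2 or xmlsec.

```c
#include <stddef.h>

/*
 * Convert a zero-terminated ISO-8859-1 string to UTF-8.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or (size_t)-1 if the output buffer is too small.
 */
static size_t latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    for (const unsigned char *p = (const unsigned char *)in; *p != 0; p++) {
        if (*p < 0x80) {
            /* US-ASCII range: copied unchanged (keep room for the NUL). */
            if (n + 1 >= out_size) return (size_t)-1;
            out[n++] = (char)*p;
        } else {
            /* 0x80-0xFF: code points U+0080-U+00FF, two bytes in UTF-8. */
            if (n + 2 >= out_size) return (size_t)-1;
            out[n++] = (char)(0xC0 | (*p >> 6));
            out[n++] = (char)(0x80 | (*p & 0x3F));
        }
    }
    if (n >= out_size) return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

For example, the 8-byte name "caf\xE9.xml" becomes the 9-byte sequence 
"caf\xC3\xA9.xml".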

> Now you want to call LibXML2/xmlsec IO functions and you are in trouble
> because these functions need zero-terminated ASCII strings and
> later they use fopen() function that needs the ASCII string in the
> "current locale" (which might not be the locale that was used to
> create the filename).
>
> I guess, the only option I have for you is to do the following:
> 1) Convert the UCS filename into UTF8 on application level.
> 2) Replace the libxml2/xmlsec IO "file:" protocol handlers with custom
> ones that convert filenames back from UTF8 to UCS2/4 and then use
> "_wfopen" function instead of "fopen".
>
> Igor, it might make sense to do the item 2) on Windows platform
> all the time. I.e. LibXML2 assumes that the filename is in UTF8, then
> it converts it to unicode and uses _wfopen function instead of fopen
> (again, this is *only* for windows). This approach solves all the
> localization problems with "current locale" required for fopen.

I first discovered the problem on Linux.
After that I debugged it on Windows (9x) with a FAT32 filesystem.

>
> BTW, if you haven't seen this paper, I strongly recommend it as a brief
> introduction to Unicode mess:
>     http://www.joelonsoftware.com/articles/Unicode.html

This is just one of many articles about Unicode.
On the XFree86 site there is a link to 
http://www.cl.cam.ac.uk/~mgk25/unicode.html, where Markus Kuhn explains 
Unicode for UNIX/Linux in detail.
XFree86 4.x has support for Unicode, but Cyrillic in versions up to 4.3 
is not stable.
One day I will test X11R6.7 (XFree86 4.4), and if all is OK I will 
change the user locales to use xx_YY.UTF-8.

>
>
> Best,
> Aleksey
>




