[darcs-users] OT: UTF-8 in filenames [was: darcs and non-ASCII characters]

Juliusz Chroboczek Juliusz.Chroboczek at pps.jussieu.fr
Sat Nov 5 23:32:59 UTC 2005


> The operating system part that handles files consists of two layers,
> Mac OS X and HFS+.

More precisely, you've got the kernel which sits above a number of
distinct FS layers (UFS, HFS+, isofs, NFS, webdav, vfat, etc.).
That's why you're able to plug in a Windows-formatted USB key and have
it just work.

> The upper layer accepts all byte sequences (up to limitations with
> "/" and "\000", probably),

That's exactly right.
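
To make that concrete, here is a rough Python sketch of the check the generic layer applies to a single path component. The function name kernel_accepts is made up for illustration, and the sketch ignores length limits and the special names "." and "..".

```python
def kernel_accepts(component: bytes) -> bool:
    """Rough model of the FS-independent layer: a path component is an
    arbitrary non-empty byte string, minus the separator and NUL."""
    if not component:
        return False
    # "/" separates components and NUL terminates a C string, so neither
    # can appear inside a name; every other byte value is allowed.
    return b"/" not in component and b"\x00" not in component


if __name__ == "__main__":
    for name in (b"plain.txt", b"\xff\xfe not text", b"a/b", b"a\x00b"):
        print(name, kernel_accepts(name))
```

Note that b"\xff\xfe not text" passes: at this level a name is just bytes, with no notion of encoding.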

> and the lower layer handles only UTF-8.

It's up to the particular FS to decide whether it'll accept a given
filename.  HFS+ requires UTF-8; UFS and NFS handle all byte sequences;
isofs... well, isofs is weird.
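
As a sketch of that per-FS decision, assuming Python's strict UTF-8 decoder is a fair stand-in for whatever validation HFS+ really performs (it may well differ in corner cases). The table REQUIRES_UTF8 and the function fs_accepts are hypothetical names, and isofs is deliberately left out.

```python
# Which filesystems insist on UTF-8 names, per the description above.
# Hypothetical table for illustration only; isofs is omitted on purpose.
REQUIRES_UTF8 = {"hfs+": True, "ufs": False, "nfs": False}


def is_valid_utf8(name: bytes) -> bool:
    """True if the bytes decode under Python's strict UTF-8 decoder."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def fs_accepts(fs: str, name: bytes) -> bool:
    """Model the filesystem-specific check that follows the generic one.
    Raises KeyError for a filesystem that is not in the table."""
    if REQUIRES_UTF8[fs]:
        return is_valid_utf8(name)
    return True


if __name__ == "__main__":
    latin1 = "café".encode("latin-1")  # b"caf\xe9", not valid UTF-8
    utf8 = "café".encode("utf-8")      # b"caf\xc3\xa9"
    for fs in ("hfs+", "ufs", "nfs"):
        print(fs, fs_accepts(fs, latin1), fs_accepts(fs, utf8))
```

So the same Latin-1 name that UFS or NFS will happily store is the one HFS+ turns down.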

> (One of the two -- probably also HFS+ -- is also case insensitive.)

Indeed, HFS+ is case-preserving case-insensitive.  Both UFS and NFS
are case-sensitive.  But you won't notice the difference unless you
use the shell, as the GUI cleverly hides the case-sensitivity from you.
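
In case "case-preserving case-insensitive" is unclear, here is a toy model of a single directory. The class name CaseInsensitiveDir is invented for this example, and Python's str.casefold() only stands in for the real HFS+ folding rules, which are its own.

```python
class CaseInsensitiveDir:
    """Toy directory: names are compared ignoring case, but the spelling
    used at creation time is what gets stored and listed."""

    def __init__(self):
        # Maps the case-folded name to the name as originally created.
        self._entries = {}

    def create(self, name: str) -> bool:
        """Add a name; fail if one differing only in case already exists."""
        key = name.casefold()
        if key in self._entries:
            return False
        self._entries[key] = name
        return True

    def lookup(self, name: str):
        """Return the stored spelling for any case variant, or None."""
        return self._entries.get(name.casefold())

    def listing(self):
        """Names exactly as they were created (case preserved)."""
        return sorted(self._entries.values())


if __name__ == "__main__":
    d = CaseInsensitiveDir()
    print(d.create("README"))   # first creation succeeds
    print(d.create("readme"))   # clashes with README
    print(d.lookup("ReadMe"))   # finds the stored spelling
    print(d.listing())
```

On a case-sensitive FS such as UFS, "README" and "readme" would simply be two distinct entries.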

Note that all of the above is only about the *kernel*.  It is quite
likely that the GUI libraries refuse to handle anything that's not
UTF-8.  So your initial point still stands.

                                        Juliusz



