From f63d6bf4a15147d764b0fefe0ced71d333d99e39 Mon Sep 17 00:00:00 2001
From: Vladimir 'phcoder' Serbinenko
Date: Sun, 25 Dec 2011 14:46:44 +0100
Subject: [PATCH] * docs/grub.texi (Filesystems): Clarify restrictions. (Regexp): Mention non-Unicode regexp behaviour. (Other): Mention non-Unicode matching behaviour.

---
 ChangeLog      |  8 +++++++-
 docs/grub.texi | 36 ++++++++++++++++++++++++++----------
 2 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index ccfe212e7..2fc512655 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,6 +1,12 @@
+2011-12-25 Vladimir Serbinenko
+
+	* docs/grub.texi (Filesystems): Clarify restrictions.
+	(Regexp): Mention non-Unicode regexp behaviour.
+	(Other): Mention non-Unicode matching behaviour.
+
 2011-12-24 Vladimir Serbinenko
 
-	Make HFS implementation to use MacRoman.
+	Make HFS implementation use MacRoman.
 
 	* grub-core/fs/hfs.c (MAX_UTF8_PER_MAC_ROMAN): New define.
 	(macroman): New const array.
diff --git a/docs/grub.texi b/docs/grub.texi
index f5a6de69e..ddd44b11c 100644
--- a/docs/grub.texi
+++ b/docs/grub.texi
@@ -3932,7 +3932,7 @@ appropriate representation is used. All text files (including config) are
 assumed to be encoded in UTF-8.
 
 @chapter Filesystems
-NTFS, JFS, UDF, HFS+, exFAT, long filesnames in FAT, Joliet part of
+NTFS, JFS, UDF, HFS+, exFAT, long filenames in FAT, Joliet part of
 ISO9660 are treated as UTF-16 as per specification. BFS is read as UTF-8,
 again according to specification. BtrFS, cpio, tar, squash4, minix, minix2,
 minix3, ROMFS, ReiserFS, XFS, ext2, ext3, ext4, FAT (short names),
@@ -3942,15 +3942,20 @@ but as long as the charset used is superset of ASCII you should be able to
 access ASCII-named files. And it's recommended to configure your system to use
 UTF-8 to access the filesystem, convmv may help with migration. AFFS and HFS
 never use unicode and GRUB assumes them to be in Latin1 and MacRoman
-respectively. NTFS, HFS+, FAT and exFAT are case-insensitive however no
-attempt is performed at case conversion of international characters so e.g.
-a file named lowercase greek alpha is treated as different from the one named
-as uppercase alpha. Also similar to POSIX systems GRUB make no attempt at check
-of canonical equivalence so a file name u-diaresis is treated as distinct from
-u+combining diaresis. This however means that in order to access file on
-HFS+ its name must be specified in normalisation form D. On ZFS subvolumes
-marked as case insensitive files containing lowercase international characters
-are inaccessible.
+respectively. GRUB handles filesystem case-insensitivity however no attempt
+is performed at case conversion of international characters so e.g. a file
+named lowercase greek alpha is treated as different from the one named
+as uppercase alpha. The filesystems in questions are NTFS (except POSIX
+namespace), HFS+ (by default), FAT, exFAT
+and ZFS (configurable on per-subvolume basis by property ``casesensitivity'',
+default sensitive). On ZFS subvolumes marked as case insensitive files
+containing lowercase international characters are inaccessible.
+Also like all supported filesystems except HFS+ and ZFS (configurable on
+per-subvolume basis by property ``normalization'', default none) GRUB makes
+no attempt at check of canonical equivalence so a file name u-diaresis is
+treated as distinct from u+combining diaresis. This however means that in
+order to access file on HFS+ its name must be specified in normalisation form D.
+On normalized ZFS subvolumes filenames out of normalisation are inaccessible.
 
 @chapter Output terminal
 Firmware output console ``console'' on ARC and IEEE1275 are limited to ASCII.
@@ -3985,6 +3990,17 @@ makes difficult to enter any text using non-Latin alphabet.
 @chapter Gettext
 GRUB supports being translated. For this you need to have language *.mo files
 in $prefix/locale, load gettext module and set ``lang'' variable.
+@chapter Regexp
+Regexps work on unicode characters, however no attempt at checking cannonical
+equivalence has been made. Moreover the classes like [:alpha:] match only
+ASCII subset.
+
+@chapter Other
+IEEE1275 aliases are matched case-insensitively except non-ASCII which is
+matched as binary. Similar behaviour is for matching OSBundleRequired.
+Since IEEE1275 aliases and OSBundleRequired don't contain any non-ASCII it
+should never be a problem in practice.
+
 
 @node Security
 @chapter Authentication and authorisation