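If you normalize text while it is an NSString, using any of the NFD, NFKD, NFC, or NFKC normalization forms, and then cast it to an AppleScript string with "as string", the normalized form is preserved.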
AppleScript name: Normalize Unicode characters
-- Created 2015-09-30 by Takaaki Naganoya
-- 2015 Piyomaru Software
use AppleScript version "2.4"
use scripting additions
use framework "Foundation"

--Reference:
--http://akisute.com/2010/05/utf-8-normalize.html
--http://nomenclator.la.coocan.jp/unicode/normalization.htm

set a to "がぎぐげご"
set aStr to current application's NSString's stringWithString:a
log hexDumpString(aStr)
--> {"E3", "81", "8C", "E3", "81", "8E", "E3", "81", "90", "E3", "81", "92", "E3", "81", "94"}

--NFD
set aNFD to aStr's decomposedStringWithCanonicalMapping()
--> (NSString) "がぎぐげご"
log hexDumpString(aNFD)
--> {"E3", "81", "8B", "E3", "82", "99", "E3", "81", "8D", "E3", "82", "99", "E3", "81", "8F", "E3", "82", "99", "E3", "81", "91", "E3", "82", "99", "E3", "81", "93", "E3", "82", "99"}

--NFKD
set aNFKD to aStr's decomposedStringWithCompatibilityMapping()
--> (NSString) "がぎぐげご"
log hexDumpString(aNFKD)
--> {"E3", "81", "8B", "E3", "82", "99", "E3", "81", "8D", "E3", "82", "99", "E3", "81", "8F", "E3", "82", "99", "E3", "81", "91", "E3", "82", "99", "E3", "81", "93", "E3", "82", "99"}

--NFC
set aNFC to aStr's precomposedStringWithCanonicalMapping()
--> (NSString) "がぎぐげご"
log hexDumpString(aNFC)
--> {"E3", "81", "8C", "E3", "81", "8E", "E3", "81", "90", "E3", "81", "92", "E3", "81", "94"}

--NFKC
set aNFKC to aStr's precomposedStringWithCompatibilityMapping()
--> (NSString) "がぎぐげご"
log hexDumpString(aNFKC)
--> {"E3", "81", "8C", "E3", "81", "8E", "E3", "81", "90", "E3", "81", "92", "E3", "81", "94"}

--Hex dump an NSString as a list of UTF-8 bytes
on hexDumpString(theNSString)
    set theNSData to theNSString's dataUsingEncoding:(current application's NSUTF8StringEncoding)
    --Assumes NSData's description has the form <e3818c e3818e ...>; newer macOS versions may format it differently
    set theString to (theNSData's |description|()'s uppercaseString())

    --Remove "<" ">" characters in head and tail
    set tLength to (theString's |length|()) - 2
    set aRange to current application's NSMakeRange(1, tLength)
    set theString2 to theString's substringWithRange:aRange

    --Replace Space Characters
    set aString to current application's NSString's stringWithString:theString2
    set bString to aString's stringByReplacingOccurrencesOfString:" " withString:""

    set aResList to splitString(bString, 2)
    --> {"E3", "81", "82", "E3", "81", "84", "E3", "81", "86", "E3", "81", "88", "E3", "81", "8A"}

    return aResList
end hexDumpString

--Split NSString in specified aNum characters
on splitString(aText, aNum)
    set aStr to current application's NSString's stringWithString:aText
    if aStr's |length|() ≤ aNum then return aText

    set anArray to current application's NSMutableArray's new()
    set mStr to current application's NSMutableString's stringWithString:aStr

    set aRange to current application's NSMakeRange(0, aNum)

    repeat while (mStr's |length|()) > 0
        if (mStr's |length|()) < aNum then
            anArray's addObject:(current application's NSString's stringWithString:mStr)
            mStr's deleteCharactersInRange:(current application's NSMakeRange(0, mStr's |length|()))
        else
            anArray's addObject:(mStr's substringWithRange:aRange)
            mStr's deleteCharactersInRange:aRange
        end if
    end repeat

    return (current application's NSArray's arrayWithArray:anArray) as list
end splitString
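
The script above only dumps the NSString values, so it does not directly exercise the "as string" claim made at the top. The following is a minimal sketch of one way to check it: normalize to NFD, cast to an AppleScript string, wrap the result in an NSString again, and compare it with the original NFD value. The variable names (nfcStr, nfdStr, nfdText, roundTrip) are illustrative and not from the original script, and the idea that a round trip through NSString is a fair test of the cast is an assumption.

```applescript
use AppleScript version "2.4"
use scripting additions
use framework "Foundation"

set a to "がぎぐげご"
set aStr to current application's NSString's stringWithString:a

-- Force a known starting point: NFC (one UTF-16 unit per kana here)
set nfcStr to aStr's precomposedStringWithCanonicalMapping()

-- NFD splits each kana into base character + combining dakuten (U+3099)
set nfdStr to nfcStr's decomposedStringWithCanonicalMapping()

-- Cast the normalized NSString to an AppleScript string
set nfdText to nfdStr as string

-- Wrap the AppleScript string in an NSString again for inspection
set roundTrip to current application's NSString's stringWithString:nfdText

log (nfcStr's |length|()) as integer -- length in UTF-16 units of the NFC form
log (nfdStr's |length|()) as integer -- length in UTF-16 units of the NFD form
log (roundTrip's |length|()) as integer -- should match the NFD length if the cast kept the decomposed form

-- isEqualToString: compares literally, so it distinguishes NFC from NFD
log (roundTrip's isEqualToString:nfdStr) as boolean
```

If the decomposed form survives the cast, the last two log lines should report the same length as nfdStr and a true comparison. Comparing with AppleScript's own "is equal to" would not be a reliable check here, since AppleScript text comparison may treat canonically equivalent strings as equal.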