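This AppleScript extracts, from the text elements stored in a 1D list (array), only those that consist of a given character kind.
Extracting data by character kind is a fairly common need, so I made this usable as a standalone routine. As the program shows, the character kind is specified with one of the following one-character keys:
Digits: "9"
Latin letters: "A"
Half-width symbols: "$"
Hiragana: "ひ"
Katakana: "カ"
Kanji: "漢"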
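The previous version merely combined routines I already had on hand, so it was wasteful overall and its processing speed was nothing to be proud of. This version is slightly faster because unnecessary computation has been removed from the loop.
However, there are many cases where several character kinds should be permitted at once, such as "allow hiragana plus katakana", so I don't think this has reached a practical level yet.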
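One way to approach the multi-kind case described above is to build a single regular-expression character class from all permitted kinds and test each element against it. The sketch below is not the author's routine: it swaps in NSPredicate with the MATCHES operator (which must match the whole string), and the handler names filterByCharKinds and classForKind are hypothetical. It assumes the same kana and kanji ranges as the listing below and leaves the "$" symbol kind out for brevity.

```applescript
use AppleScript version "2.4"
use scripting additions
use framework "Foundation"

set aList to {"Naganoya", "ながのや", "ナガノヤ", "長野谷", "ながのヤ"}
set aRes to filterByCharKinds(aList, {"ひ", "カ"}) of me -- permit hiragana and katakana

-- Hypothetical handler: keep elements made up solely of the permitted kinds
on filterByCharKinds(aList as list, kindList as list)
	-- Concatenate the ranges of every permitted kind into one character class
	set classBody to ""
	repeat with k in kindList
		set classBody to classBody & (my classForKind(contents of k))
	end repeat
	set aPattern to "[" & classBody & "]+"
	
	-- MATCHES succeeds only when the whole string matches the pattern
	set aPred to current application's NSPredicate's predicateWithFormat:"SELF MATCHES %@" argumentArray:{aPattern}
	set anArray to current application's NSArray's arrayWithArray:aList
	return (anArray's filteredArrayUsingPredicate:aPred) as list
end filterByCharKinds

-- Hypothetical handler: map a kind key to a regex range (assumed ranges)
on classForKind(aKind as string)
	if aKind is "9" then return "0-9"
	if aKind is "A" then return "A-Za-z"
	if aKind is "漢" then return "一-龠"
	if aKind is "ひ" then return "ぁ-ん"
	if aKind is "カ" then return "ァ-ヶ"
	error "Unsupported character kind: " & aKind
end classForKind
```

With these assumptions, the call is intended to keep the hiragana-only, katakana-only, and mixed-kana elements and drop the rest. Building one pattern per call also means each element is tested once, rather than once per character.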
AppleScript name: Extract only the elements of a 1D list that consist of the specified character kind
--
--  Created by: Takaaki Naganoya
--  Created on: 2018/12/20
--
--  Copyright © 2018 Piyomaru Software, All Rights Reserved
--
use AppleScript version "2.4"
use scripting additions
use framework "Foundation"

property NSString : a reference to current application's NSString
property NSScanner : a reference to current application's NSScanner
property NSNumber : a reference to current application's NSNumber
property NSDictionary : a reference to current application's NSDictionary
property NSCountedSet : a reference to current application's NSCountedSet
property NSCharacterSet : a reference to current application's NSCharacterSet
property NSMutableArray : a reference to current application's NSMutableArray
property NSNumberFormatter : a reference to current application's NSNumberFormatter
property NSMutableCharacterSet : a reference to current application's NSMutableCharacterSet
property NSRegularExpressionSearch : a reference to current application's NSRegularExpressionSearch
property NSNumberFormatterRoundUp : a reference to current application's NSNumberFormatterRoundUp
property NSStringTransformFullwidthToHalfwidth : a reference to current application's NSStringTransformFullwidthToHalfwidth

set aList to {"Naganoya", "ながのや", "ナガノヤ", "長野谷"} -- Alphabet, Hiragana, Katakana, Kanji

set aRes to filterByCharKind(aList, "A") of me -- extract only elements made up of alphabetic characters
--> {"Naganoya"}

set bRes to filterByCharKind(aList, "ひ") of me -- extract only elements made up solely of hiragana
--> {"ながのや"}

set cRes to filterByCharKind(aList, "カ") of me -- extract only elements made up solely of katakana
--> {"ナガノヤ"}

set dRes to filterByCharKind(aList, "漢") of me -- extract only elements made up solely of kanji
--> {"長野谷"}

-- Determine the character kinds and extract the elements made up solely of the specified kind
on filterByCharKind(aList as list, targCharKind as string)
	set dList to {}
	repeat with i in aList
		set j to contents of i
		set tmpPat to retAtrPatternFromStr(j) of me
		if tmpPat is equal to {targCharKind} then
			set the end of dList to j
		end if
	end repeat
	return dList
end filterByCharKind

-- Objective-C-like parameter style
on makeUniqueListOf:theList
	set theSet to current application's NSOrderedSet's orderedSetWithArray:theList
	return (theSet's array()) as list
end makeUniqueListOf:

-- Pure AppleScript-like parameter style
on makeUniqueListFrom(theList)
	set aList to my makeUniqueListOf:theList
	return aList
end makeUniqueListFrom

-- Sort a 1D list by string length v2
on sort1DListByStringLength(aList as list, sortOrder as boolean)
	set aArray to current application's NSArray's arrayWithArray:aList
	set desc1 to current application's NSSortDescriptor's sortDescriptorWithKey:"length" ascending:sortOrder
	set desc2 to current application's NSSortDescriptor's sortDescriptorWithKey:"self" ascending:true selector:"localizedCaseInsensitiveCompare:"
	set bArray to aArray's sortedArrayUsingDescriptors:{desc1, desc2}
	return bArray as list of string or string
end sort1DListByStringLength

-- Determine the character kinds contained in a string
on retAtrPatternFromStr(aText as string)
	set b1List to {"9", "A", "$", "漢", "ひ", "カ"} -- digit, alphabet, symbol, full-width kanji, full-width hiragana, full-width katakana
	--set cStr to zenToHan(aText) of me
	set outList to {}
	
	set cList to characters of (aText)
	repeat with i in cList
		set j to contents of i
		
		-- Each check yields 0 or its own index into b1List; the classes do not overlap
		set chk1 to ((my chkNumeric:j) as integer) * 1
		set chk2 to ((my chkAlphabet:j) as integer) * 2
		set chk3 to ((my chkSymbol:j) as integer) * 3
		set chk4 to ((my chkKanji:j) as integer) * 4
		set chk5 to ((my chkHiragana:j) as integer) * 5
		set chk6 to ((my chkKatakana:j) as integer) * 6
		
		set itemVal to (chk1 + chk2 + chk3 + chk4 + chk5 + chk6)
		-- itemVal is 0 when the character belongs to none of the six classes (e.g. a space);
		-- with the guard below disabled, "item 0" then raises an error
		--if itemVal > 0 then
		set aVal to (contents of item itemVal of b1List)
		if aVal is not in outList then
			set the end of outList to aVal
		end if
		--end if
	end repeat
	
	return outList
end retAtrPatternFromStr

-- Full-width to half-width conversion
on zenToHan(aStr)
	set aString to NSString's stringWithString:aStr
	return (aString's stringByApplyingTransform:(NSStringTransformFullwidthToHalfwidth) |reverse|:false) as string
end zenToHan

-- Is it a digit?
on chkNumeric:checkString
	set digitCharSet to NSCharacterSet's characterSetWithCharactersInString:"0123456789"
	set ret to my chkCompareString:checkString baseString:digitCharSet
	return ret as boolean
end chkNumeric:

-- Is it a symbol?
on chkSymbol:checkString
	-- addCharactersInString: belongs to NSMutableCharacterSet, so create a mutable set
	set muCharSet to NSMutableCharacterSet's alloc()'s init()
	muCharSet's addCharactersInString:"$\"!~&=#[]._-+`|{}?%^*/'@-/:;(),"
	set ret to my chkCompareString:checkString baseString:muCharSet
	return ret as boolean
end chkSymbol:

-- Is it kanji?
on chkKanji:aChar
	return detectCharKind(aChar, "[一-龠]") of me
end chkKanji:

-- Is it hiragana?
on chkHiragana:aChar
	return detectCharKind(aChar, "[ぁ-ん]") of me
end chkHiragana:

-- Is it katakana?
on chkKatakana:aChar
	return detectCharKind(aChar, "[ァ-ヶ]") of me
end chkKatakana:

-- Is it a half-width space?
on chkSpace:checkString
	set muCharSet to NSMutableCharacterSet's alloc()'s init()
	muCharSet's addCharactersInString:" " -- half-width space (20h)
	set ret to my chkCompareString:checkString baseString:muCharSet
	return ret as boolean
end chkSpace:

-- Is it an alphabetic character?
on chkAlphabet:checkString
	set aStr to NSString's stringWithString:checkString
	set allCharSet to NSMutableCharacterSet's alloc()'s init()
	allCharSet's addCharactersInRange:({location:97, |length|:26}) -- 97 = id of "a"
	allCharSet's addCharactersInRange:({location:65, |length|:26}) -- 65 = id of "A"
	set aBool to my chkCompareString:aStr baseString:allCharSet
	return aBool as boolean
end chkAlphabet:

-- True when the whole string consists of characters from the given set
on chkCompareString:checkString baseString:baseString
	set aScanner to NSScanner's localizedScannerWithString:checkString
	aScanner's setCharactersToBeSkipped:(missing value)
	aScanner's scanCharactersFromSet:baseString intoString:(missing value)
	return (aScanner's isAtEnd()) as boolean
end chkCompareString:baseString:

-- True when the string contains a match for the given regular expression
on detectCharKind(aChar, aPattern)
	set aChar to NSString's stringWithString:aChar
	set searchStr to NSString's stringWithString:aPattern
	set matchRes to aChar's rangeOfString:searchStr options:(NSRegularExpressionSearch)
	if matchRes's location() = (current application's NSNotFound) or (matchRes's location() as number) > 9.99999999E+8 then
		return false
	else
		return true
	end if
end detectCharKind
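
In retAtrPatternFromStr, the per-character classification indexes b1List with the sum of six checks, and that sum is 0 for a character outside all six classes (a space, for example). The sketch below shows one possible way to classify a single character so that such input yields a placeholder instead of an out-of-range index. It is only an illustration: the handler name kindOfChar and the "?" key are hypothetical, the symbol kind is omitted, and it uses NSPredicate's MATCHES operator rather than the checks in the listing.

```applescript
use AppleScript version "2.4"
use scripting additions
use framework "Foundation"

-- Hypothetical handler: return the kind key for one character, or "?" if none applies
on kindOfChar(aChar as string)
	-- Keys and patterns are parallel lists; the ranges are the same ones assumed in the listing
	set kindKeys to {"9", "A", "漢", "ひ", "カ"}
	set kindPatterns to {"[0-9]", "[A-Za-z]", "[一-龠]", "[ぁ-ん]", "[ァ-ヶ]"}
	
	set aStr to current application's NSString's stringWithString:aChar
	repeat with n from 1 to (count of kindKeys)
		set aPred to current application's NSPredicate's predicateWithFormat:"SELF MATCHES %@" argumentArray:{item n of kindPatterns}
		if ((aPred's evaluateWithObject:aStr) as boolean) then return item n of kindKeys
	end repeat
	
	return "?" -- placeholder for characters outside every class
end kindOfChar

set k1 to kindOfChar("カ") of me
set k2 to kindOfChar(" ") of me
```

Whether unclassified characters should be ignored or should disqualify the whole element is a design decision; returning a distinct key keeps that choice with the caller.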