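This is an improved version of an AppleScript that extracts, from a 1D list (array) of strings, only the items made up of the specified character kinds.
As the program shows, character kinds are specified with these marker characters:
Digits: "9"
Alphabet: "A"
Half-width symbols: "$"
Hiragana: "ひ"
Katakana: "カ"
Kanji: "漢"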
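Two routines are provided:
filterByMultipleCharKindStrictly: classifies the characters of each item and extracts the items whose set of character kinds exactly equals the specified list (strict match)
filterByMultipleCharKind: classifies the characters of each item and extracts the items composed of just one of the specified kinds
Given a list of several character kinds, the former extracts only the strings whose character-kind pattern equals that list.
The latter extracts only the strings composed entirely of any single one of those kinds.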
Given the strings
{"Naganoya", "ながのや", "ナガノヤ", "長野谷", "ぴよまるソフトウェア", "ぴよまるSoftware", "Piyomaru12345", "123456789", "ぴよまる1234"}
and the extraction pattern
{"ひ", "カ"}--Hiragana, Katakana
the two routines extract the following:
filterByMultipleCharKindStrictly --> {"ぴよまるソフトウェア"}
filterByMultipleCharKind --> {"ながのや", "ナガノヤ"}
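Setting speed aside, the functionality seems to be at a practical level.
To make it faster, the options are roughly:
(1) Do not use Cocoa at all (the script calls Cocoa on many tiny pieces of data, which is the worst possible pattern for getting speed out of Cocoa. It takes a long time relative to the amount of data, and plain AppleScript would actually be faster)
(2) Restructure the processing so Cocoa can handle it in one batch (for example, extracting with an NSPredicate that specifies the character kinds together)
For its length, this is not a very good program. It has little going for it beyond the fact that it works.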
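As a rough illustration of option (1), the per-character classification could be done in plain AppleScript by comparing Unicode code points instead of calling Cocoa. This is only a minimal sketch, not the script's actual method: the handler names charKindOf and kindsInString are hypothetical, the code-point ranges mirror the regular-expression ranges used in the script, and treating every remaining printable ASCII character as a symbol is an assumption that is broader than the script's own symbol set.

```applescript
-- Hypothetical helper: classify one character by its Unicode code point (no Cocoa calls).
-- Returns one of the marker characters used in this post, or "" for anything else.
on charKindOf(aChar)
	set c to id of aChar
	if c >= 48 and c <= 57 then return "9" -- 0-9
	if (c >= 65 and c <= 90) or (c >= 97 and c <= 122) then return "A" -- A-Z, a-z
	if c >= 12353 and c <= 12435 then return "ひ" -- U+3041 (ぁ) to U+3093 (ん)
	if c >= 12449 and c <= 12534 then return "カ" -- U+30A1 (ァ) to U+30F6 (ヶ)
	if c >= 19968 and c <= 40864 then return "漢" -- U+4E00 (一) to U+9FA0 (龠)
	if c >= 33 and c <= 126 then return "$" -- assumption: any other printable ASCII counts as a symbol
	return ""
end charKindOf

-- Hypothetical helper: list the distinct character kinds in a string, in order of first appearance.
on kindsInString(aText)
	set kindList to {}
	repeat with aChar in (characters of aText)
		set aKind to charKindOf(contents of aChar)
		if aKind is not in kindList then set end of kindList to aKind
	end repeat
	return kindList
end kindsInString

kindsInString("ぴよまるSoftware")
--> {"ひ", "A"}
```

The result is not sorted, so it would still need to be put into a fixed order before being compared against a requested list of kinds. Characters outside all of the ranges (a space, for instance) are reported as "", so they show up as an extra kind rather than being ignored.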
AppleScript name: Extract only the items of a 1D list composed of the specified character kinds (multiple kinds allowed) v2
--
-- Created by: Takaaki Naganoya
-- Created on: 2019/01/01
--
-- Copyright © 2019 Piyomaru Software, All Rights Reserved
--
use AppleScript version "2.4"
use scripting additions
use framework "Foundation"

property NSArray : a reference to current application's NSArray
property NSString : a reference to current application's NSString
property NSScanner : a reference to current application's NSScanner
property NSNumber : a reference to current application's NSNumber
property NSDictionary : a reference to current application's NSDictionary
property NSOrderedSet : a reference to current application's NSOrderedSet
property NSCountedSet : a reference to current application's NSCountedSet
property NSCharacterSet : a reference to current application's NSCharacterSet
property NSMutableArray : a reference to current application's NSMutableArray
property NSSortDescriptor : a reference to current application's NSSortDescriptor
property NSNumberFormatter : a reference to current application's NSNumberFormatter
property NSMutableCharacterSet : a reference to current application's NSMutableCharacterSet
property NSRegularExpressionSearch : a reference to current application's NSRegularExpressionSearch
property NSNumberFormatterRoundUp : a reference to current application's NSNumberFormatterRoundUp
property NSStringTransformFullwidthToHalfwidth : a reference to current application's NSStringTransformFullwidthToHalfwidth

set aList to {"Naganoya", "ながのや", "ナガノヤ", "長野谷", "ぴよまるソフトウェア", "ぴよまるSoftware", "Piyomaru12345", "123456789", "ぴよまる1234"} --Alphabet, Hiragana, Katakana, Kanji, Hiragana+Katakana, Hiragana + Alphabet, Alphabet + Numeric, Numeric

set aRes to filterByMultipleCharKind(aList, {"A"}) of me --extract only items composed of alphabetic characters
--> {"Naganoya"}

set bRes to filterByMultipleCharKind(aList, {"ひ"}) of me --extract only items composed solely of hiragana
--> {"ながのや"}

set cRes to filterByMultipleCharKind(aList, {"カ"}) of me --extract only items composed solely of katakana
--> {"ナガノヤ"}

set dRes to filterByMultipleCharKind(aList, {"漢"}) of me --extract only items composed solely of kanji
--> {"長野谷"}

set eRes1 to filterByMultipleCharKindStrictly(aList, {"ひ", "カ"}) of me --extract only items composed of hiragana + katakana
--> {"ぴよまるソフトウェア"}

set eRes2 to filterByMultipleCharKind(aList, {"ひ", "カ"}) of me --extract only items composed solely of hiragana or solely of katakana
--> {"ながのや", "ナガノヤ"}

set fRes1 to filterByMultipleCharKindStrictly(aList, {"A", "9"}) of me --extract only items composed of Alphabet + Numeric
--> {"Piyomaru12345"}

set fRes2 to filterByMultipleCharKind(aList, {"A", "9"}) of me --extract only items composed solely of Alphabet or solely of Numeric
--> {"Naganoya", "123456789"}

set gRes1 to filterByMultipleCharKindStrictly(aList, {"ひ", "9"}) of me --extract only items composed of Hiragana + Numeric
--> {"ぴよまる1234"}

--Classify characters and extract items whose character kinds exactly equal the specified kinds (strict match)
on filterByMultipleCharKindStrictly(aList as list, targCharKindList as list)
	set dList to {}
	set paramList to sort1DStringList(targCharKindList, true) of me
	
	repeat with i in aList
		set j to contents of i
		set tmpPat to retAtrPatternFromStr(j) of me
		
		if tmpPat is equal to paramList then
			set the end of dList to j
		end if
	end repeat
	
	return dList
end filterByMultipleCharKindStrictly

--Classify characters and extract items composed of just one of the specified kinds
on filterByMultipleCharKind(aList as list, targCharKindList as list)
	set dList to {}
	
	repeat with i in aList
		set j to contents of i
		set tmpPat to retAtrPatternFromStr(j) of me
		
		if tmpPat is in targCharKindList then
			set the end of dList to j
		end if
	end repeat
	
	return dList
end filterByMultipleCharKind

--Objective-C-style parameter notation
on makeUniqueListOf:theList
	set theSet to NSOrderedSet's orderedSetWithArray:theList
	return (theSet's array()) as list
end makeUniqueListOf:

--Pure AppleScript-style parameter notation
on makeUniqueListFrom(theList)
	set aList to my makeUniqueListOf:theList
	return aList
end makeUniqueListFrom

--Sort a 1D list by string length v2
on sort1DListByStringLength(aList as list, sortOrder as boolean)
	set aArray to NSArray's arrayWithArray:aList
	set desc1 to NSSortDescriptor's sortDescriptorWithKey:"length" ascending:sortOrder
	set desc2 to NSSortDescriptor's sortDescriptorWithKey:"self" ascending:true selector:"localizedCaseInsensitiveCompare:"
	set bArray to aArray's sortedArrayUsingDescriptors:{desc1, desc2}
	return bArray as list of string or string
end sort1DListByStringLength

--Sort a 1D list of strings: ascending when aBool is true, descending when false
on sort1DStringList(theList as list, aBool as boolean)
	set aDdesc to NSSortDescriptor's sortDescriptorWithKey:"self" ascending:aBool selector:"localizedCaseInsensitiveCompare:"
	set theArray to NSArray's arrayWithArray:theList
	return (theArray's sortedArrayUsingDescriptors:{aDdesc}) as list
end sort1DStringList

--Determine the character kinds contained in a string
on retAtrPatternFromStr(aText as string)
	set b1List to {"9", "A", "$", "漢", "ひ", "カ"} --digit, alphabet, symbol, full-width kanji, full-width hiragana, full-width katakana
	set outList to {}
	
	set cList to characters of (aText)
	repeat with i in cList
		set j to contents of i
		
		set chk1 to ((my chkNumeric:j) as integer) * 1
		set chk2 to ((my chkAlphabet:j) as integer) * 2
		set chk3 to ((my chkSymbol:j) as integer) * 3
		set chk4 to ((my chkKanji:j) as integer) * 4
		set chk5 to ((my chkHiragana:j) as integer) * 5
		set chk6 to ((my chkKatakana:j) as integer) * 6
		
		set itemVal to (chk1 + chk2 + chk3 + chk4 + chk5 + chk6)
		
		--if itemVal > 0 then
		set aVal to (contents of item itemVal of b1List)
		
		if aVal is not in outList then
			set the end of outList to aVal
		end if
		--end if
	end repeat
	
	set out2List to sort1DStringList(outList, true) of me
	return out2List
end retAtrPatternFromStr

--Convert full-width to half-width
on zenToHan(aStr)
	set aString to NSString's stringWithString:aStr
	return (aString's stringByApplyingTransform:(NSStringTransformFullwidthToHalfwidth) |reverse|:false) as string
end zenToHan

--Is it a digit?
on chkNumeric:checkString
	set digitCharSet to NSCharacterSet's characterSetWithCharactersInString:"0123456789"
	set ret to my chkCompareString:checkString baseString:digitCharSet
	return ret as boolean
end chkNumeric:

--Is it a symbol?
on chkSymbol:checkString
	set muCharSet to NSMutableCharacterSet's alloc()'s init()
	muCharSet's addCharactersInString:"$\"!~&=#[]._-+`|{}?%^*/'@-/:;(),"
	set ret to my chkCompareString:checkString baseString:muCharSet
	return ret as boolean
end chkSymbol:

--Is it kanji?
on chkKanji:aChar
	return detectCharKind(aChar, "[一-龠]") of me
end chkKanji:

--Is it hiragana?
on chkHiragana:aChar
	return detectCharKind(aChar, "[ぁ-ん]") of me
end chkHiragana:

--Is it katakana?
on chkKatakana:aChar
	return detectCharKind(aChar, "[ァ-ヶ]") of me
end chkKatakana:

--Is it a half-width space?
on chkSpace:checkString
	set muCharSet to NSMutableCharacterSet's alloc()'s init()
	muCharSet's addCharactersInString:" " --half-width space (20h)
	set ret to my chkCompareString:checkString baseString:muCharSet
	return ret as boolean
end chkSpace:

--Is it alphabetic?
on chkAlphabet:checkString
	set aStr to NSString's stringWithString:checkString
	set allCharSet to NSMutableCharacterSet's alloc()'s init()
	allCharSet's addCharactersInRange:({location:97, |length|:26}) --97 = id of "a"
	allCharSet's addCharactersInRange:({location:65, |length|:26}) --65 = id of "A"
	set aBool to my chkCompareString:aStr baseString:allCharSet
	return aBool as boolean
end chkAlphabet:

on chkCompareString:checkString baseString:baseString
	set aScanner to NSScanner's localizedScannerWithString:checkString
	aScanner's setCharactersToBeSkipped:(missing value)
	aScanner's scanCharactersFromSet:baseString intoString:(missing value)
	return (aScanner's isAtEnd()) as boolean
end chkCompareString:baseString:

on detectCharKind(aChar, aPattern)
	set aChar to NSString's stringWithString:aChar
	set searchStr to NSString's stringWithString:aPattern
	set matchRes to aChar's rangeOfString:searchStr options:(NSRegularExpressionSearch)
	if matchRes's location() = (current application's NSNotFound) or (matchRes's location() as number) > 9.99999999E+8 then
		return false
	else
		return true
	end if
end detectCharKind