AppleScript name: Extract link URLs from body text content
-- Created 2017-08-12 by Takaaki Naganoya
-- 2017 Piyomaru Software
use AppleScript version "2.5"
use scripting additions
use framework "Foundation"
use framework "Quartz"
use BPlus : script "BridgePlus" --https://www.macosxautomation.com/applescript/apps/Script_Libs.html#BridgePlus

property NSString : a reference to current application's NSString
property NSCharacterSet : a reference to current application's NSCharacterSet
property NSRegularExpression : a reference to current application's NSRegularExpression
property NSRegularExpressionAnchorsMatchLines : a reference to current application's NSRegularExpressionAnchorsMatchLines
property NSRegularExpressionDotMatchesLineSeparators : a reference to current application's NSRegularExpressionDotMatchesLineSeparators

script spdPDF
	property foundURLs : {}
	property aList : {}
	property outList : {}
end script

load framework

set (foundURLs of spdPDF) to {}

set theFile to POSIX path of (choose file of type "com.adobe.pdf")
set (aList of spdPDF) to textInPDFinEachPage(theFile) of me

set pCounter to 1
repeat with i in (aList of spdPDF)
	set aURL1 to extractLinksFromNaturalText(i as string) of me
	set aURL2 to (current application's SMSForder's arrayByDeletingBlanksIn:((aURL1) as list))
	
	set tmpOut to {}
	repeat with ii in aURL2
		set aURL to (ii's absoluteString()) as string
		if aURL begins with "http://piyocast.com/as/" then
			set the end of tmpOut to aURL
		end if
	end repeat
	
	if tmpOut is not equal to {} then
		set (outList of spdPDF) to (outList of spdPDF) & tmpOut
	end if
end repeat

set outStr to retArrowText(outList of spdPDF, return) of me

on textInPDFinEachPage(thePath)
	script textStorage
		property aList : {}
	end script
	
	set (aList of textStorage) to {}
	
	set anNSURL to (current application's |NSURL|'s fileURLWithPath:thePath)
	set theDoc to current application's PDFDocument's alloc()'s initWithURL:anNSURL
	
	set theCount to theDoc's pageCount() as integer
	
	repeat with i from 1 to theCount
		set thePage to (theDoc's pageAtIndex:(i - 1))
		set curStr to (thePage's |string|())
		set curStr2 to curStr's decomposedStringWithCanonicalMapping() --Normalize text to NFD (canonical decomposition)
		set targString to string id 13 & string id 10 & string id 32 & string id 65532 --Object Replacement Character
		set bStr to (curStr2's stringByTrimmingCharactersInSet:(current application's NSCharacterSet's characterSetWithCharactersInString:targString))
		set the end of (aList of textStorage) to (bStr as string)
	end repeat
	
	return contents of (aList of textStorage)
end textInPDFinEachPage

on extractLinksFromNaturalText(aString)
	set anNSString to current application's NSString's stringWithString:aString
	set {theDetector, theError} to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(reference)
	set theMatches to theDetector's matchesInString:anNSString options:0 range:{0, anNSString's |length|()}
	set theResults to theMatches's valueForKey:"URL"
	return theResults as list
end extractLinksFromNaturalText

--Convert a list to text, joining the items with the specified delimiter
on retStrFromArrayWithDelimiter(aList, aDelim)
	set anArray to current application's NSArray's arrayWithArray:aList
	set aRes to anArray's componentsJoinedByString:aDelim
	return aRes as text
end retStrFromArrayWithDelimiter

on retArrowText(aList, aDelim) --A handler name I use often in my own AppleScripts, so one with the same name is provided here
	return my retStrFromArrayWithDelimiter(aList, aDelim)
end retArrowText