├── .gitignore
├── Configuration
│   └── SampleCode.xcconfig
├── Documentation
│   └── StickyNote.jpg
├── LICENSE
│   └── LICENSE.txt
├── README.md
├── Vision+ML Example.xcodeproj
│   ├── .xcodesamplecode.plist
│   └── project.pbxproj
└── Vision+ML Example
    ├── AppDelegate.swift
    ├── Assets.xcassets
    │   └── AppIcon.appiconset
    │       └── Contents.json
    ├── Base.lproj
    │   ├── LaunchScreen.storyboard
    │   └── Main.storyboard
    ├── Info.plist
    ├── MNISTClassifier.mlmodel
    ├── Utilities.swift
    └── ViewController.swift

/.gitignore:
--------------------------------------------------------------------------------
# See LICENSE folder for this sample’s licensing information.
#
# Apple sample code gitignore configuration.

# Finder
.DS_Store

# Xcode - User files
xcuserdata/
*.xcworkspace

--------------------------------------------------------------------------------
/Configuration/SampleCode.xcconfig:
--------------------------------------------------------------------------------
//
// SampleCode.xcconfig
//

// The `SAMPLE_CODE_DISAMBIGUATOR` configuration makes it easier to build
// and run a sample code project. Once you set your project's development team,
// you'll have a unique bundle identifier. This is because the bundle identifier
// is derived from the 'SAMPLE_CODE_DISAMBIGUATOR' value. Do not use this
// approach in your own projects—it's only useful for sample code projects because
// they are frequently downloaded and don't have a development team set.
SAMPLE_CODE_DISAMBIGUATOR=${DEVELOPMENT_TEAM}

--------------------------------------------------------------------------------
/Documentation/StickyNote.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/josephzhang23/ImageClassificationwithVisionandCoreML/4a26b7cd4504e9a9133de10c0c87999bb77bfe4f/Documentation/StickyNote.jpg

--------------------------------------------------------------------------------
/LICENSE/LICENSE.txt:
--------------------------------------------------------------------------------
Copyright 2017 Apple Inc.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Classifying Images with Vision and Core ML

Demonstrates using Vision with Core ML to preprocess images and perform image classification.

## Overview

Among the many features of the Core ML framework is the ability to classify input data using a trained machine-learning model. The Vision framework works with Core ML to apply classification models to images, and to preprocess those images to make machine learning tasks easier and more reliable.

This sample app uses a model based on the public MNIST database (a collection of handwritten-digit samples) to identify handwritten digits found on rectangular objects in the image (such as sticky notes, as seen in the image below).

![StickyNote](Documentation/StickyNote.jpg)

## Getting Started

Vision and Core ML require macOS 10.13, iOS 11, or tvOS 11. This sample project runs only on iOS 11.

## Using the Sample App

Build and run the project, then use the buttons in the sample app's toolbar to take a picture or choose an image from your photo library. The sample app then:

1. Uses Vision to detect rectangular areas in the image,
2. Uses Core Image filters to prepare those areas for processing by the ML model,
3. Applies the model to produce an image classification result, and
4. Presents that result as a text label in the UI.

## Detecting Rectangles and Preparing for ML Processing

The example app's `ViewController` class provides a UI for choosing an image with the system-provided [`UIImagePickerController`](https://developer.apple.com/documentation/uikit/uiimagepickercontroller) feature. After the user chooses an image (in the [`imagePickerController(_:didFinishPickingMediaWithInfo:)`](https://developer.apple.com/documentation/uikit/uiimagepickercontrollerdelegate/1619126-imagepickercontroller) method), the sample runs a Vision request for detecting rectangles in the image:

``` swift
lazy var rectanglesRequest: VNDetectRectanglesRequest = {
    return VNDetectRectanglesRequest(completionHandler: self.handleRectangles)
}()
```
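
A request object doesn't run by itself. The view controller performs it with a [`VNImageRequestHandler`](https://developer.apple.com/documentation/vision/vnimagerequesthandler) once the user picks an image, dispatching the work to a background queue (abbreviated here from the `imagePickerController(_:didFinishPickingMediaWithInfo:)` implementation in `ViewController.swift`):

``` swift
// Abbreviated: `ciImage` is the picked photo and `orientation` is its
// CGImagePropertyOrientation, both prepared earlier in the delegate method.
let handler = VNImageRequestHandler(ciImage: ciImage, orientation: orientation)
DispatchQueue.global(qos: .userInteractive).async {
    do {
        try handler.perform([self.rectanglesRequest])
    } catch {
        print(error)
    }
}
```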

Vision detects the corners of a rectangular object in the image scene. Because that object might appear in perspective in the image, the sample app uses those four corners and the Core Image `CIPerspectiveCorrection` filter to produce a rectangular image more appropriate for image classification:

``` swift
func handleRectangles(request: VNRequest, error: Error?) {
    guard let observations = request.results as? [VNRectangleObservation]
        else { fatalError("unexpected result type from VNDetectRectanglesRequest") }
    guard let detectedRectangle = observations.first else {
        DispatchQueue.main.async {
            self.classificationLabel.text = "No rectangles detected."
        }
        return
    }
    let imageSize = inputImage.extent.size

    // Verify detected rectangle is valid.
    let boundingBox = detectedRectangle.boundingBox.scaled(to: imageSize)
    guard inputImage.extent.contains(boundingBox)
        else { print("invalid detected rectangle"); return }

    // Rectify the detected image and reduce it to inverted grayscale for applying model.
    let topLeft = detectedRectangle.topLeft.scaled(to: imageSize)
    let topRight = detectedRectangle.topRight.scaled(to: imageSize)
    let bottomLeft = detectedRectangle.bottomLeft.scaled(to: imageSize)
    let bottomRight = detectedRectangle.bottomRight.scaled(to: imageSize)
    let correctedImage = inputImage
        .cropped(to: boundingBox)
        .applyingFilter("CIPerspectiveCorrection", parameters: [
            "inputTopLeft": CIVector(cgPoint: topLeft),
            "inputTopRight": CIVector(cgPoint: topRight),
            "inputBottomLeft": CIVector(cgPoint: bottomLeft),
            "inputBottomRight": CIVector(cgPoint: bottomRight)
        ])
        .applyingFilter("CIColorControls", parameters: [
            kCIInputSaturationKey: 0,
            kCIInputContrastKey: 32
        ])
        .applyingFilter("CIColorInvert", parameters: [:])

    // Show the pre-processed image
    DispatchQueue.main.async {
        self.correctedImageView.image = UIImage(ciImage: correctedImage)
    }

    // Run the Core ML MNIST classifier -- results in handleClassification method
    let handler = VNImageRequestHandler(ciImage: correctedImage)
    do {
        try handler.perform([classificationRequest])
    } catch {
        print(error)
    }
}
```

## Classifying the Image with an ML Model

After rectifying the image, the sample app runs a Vision request that applies the bundled Core ML model to classify the image. Setting up that model requires only loading the ML model file from the app bundle. (Xcode generates the `MNISTClassifier` class automatically from the bundled `MNISTClassifier.mlmodel` file, so the project contains no handwritten model class.)

``` swift
lazy var classificationRequest: VNCoreMLRequest = {
    // Load the ML model through its generated class and create a Vision request for it.
    do {
        let model = try VNCoreMLModel(for: MNISTClassifier().model)
        return VNCoreMLRequest(model: model, completionHandler: self.handleClassification)
    } catch {
        fatalError("can't load Vision ML model: \(error)")
    }
}()
```

The ML model request's completion handler provides [`VNClassificationObservation`](https://developer.apple.com/documentation/vision/vnclassificationobservation) objects, indicating what classification the model applied to the image and its confidence in that classification:

``` swift
func handleClassification(request: VNRequest, error: Error?) {
    guard let observations = request.results as? [VNClassificationObservation]
        else { fatalError("unexpected result type from VNCoreMLRequest") }
    guard let best = observations.first
        else { fatalError("can't get best result") }

    DispatchQueue.main.async {
        self.classificationLabel.text = "Classification: \"\(best.identifier)\" Confidence: \(best.confidence)"
    }
}
```
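
The sample reports only the single best match. As a small sketch (not part of the sample code), the same `observations` array could instead be summarized to show the runner-up digits and their confidence values:

``` swift
import Foundation
import Vision

// Sketch only: formats the top three classification results,
// producing a string such as "7 (92%), 1 (5%), 9 (2%)".
func summary(of observations: [VNClassificationObservation]) -> String {
    return observations.prefix(3)
        .map { String(format: "%@ (%.0f%%)", $0.identifier, $0.confidence * 100) }
        .joined(separator: ", ")
}
```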
--------------------------------------------------------------------------------
/Vision+ML Example.xcodeproj/.xcodesamplecode.plist:
--------------------------------------------------------------------------------
1 | 2 | 3 | 4 | 5 | 6 |
--------------------------------------------------------------------------------
/Vision+ML Example.xcodeproj/project.pbxproj:
--------------------------------------------------------------------------------
1 | // !$*UTF8*$! 2 | { 3 | archiveVersion = 1; 4 | classes = { 5 | }; 6 | objectVersion = 48; 7 | objects = { 8 | 9 | /* Begin PBXBuildFile section */ 10 | 11063F621EC0EECC0033EE6D /* AppDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = 11063F611EC0EECC0033EE6D /* AppDelegate.swift */; }; 11 | 11063F641EC0EECC0033EE6D /* ViewController.swift in Sources */ = {isa = PBXBuildFile; fileRef = 11063F631EC0EECC0033EE6D /* ViewController.swift */; }; 12 | 11063F671EC0EECC0033EE6D /* Main.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = 11063F651EC0EECC0033EE6D /* Main.storyboard */; }; 13 | 11063F691EC0EECC0033EE6D /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 11063F681EC0EECC0033EE6D /* Assets.xcassets */; }; 14 | 11063F741EC0F5270033EE6D /* Utilities.swift in Sources */ = {isa = PBXBuildFile; fileRef = 11063F731EC0F5270033EE6D /* Utilities.swift */; }; 15 | 113695D81EC1151400CA6C43 /* LaunchScreen.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = 113695D61EC1151400CA6C43 /* LaunchScreen.storyboard */; }; 16 | 11E0E0661ED35A7600D15919 /* MNISTClassifier.mlmodel in Sources */ = {isa = PBXBuildFile; fileRef = 11E0E0651ED35A6C00D15919 /* MNISTClassifier.mlmodel */; }; 17 | /* End PBXBuildFile section */ 18 | 19 | /* Begin PBXFileReference section */ 20 | 01D0AD8ADE10E7B14306204D /* LICENSE.txt */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text; path = LICENSE.txt; sourceTree = ""; }; 21 | 11063F5E1EC0EECC0033EE6D /* Vision+ML Example.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = "Vision+ML Example.app"; sourceTree = BUILT_PRODUCTS_DIR; }; 22 | 11063F611EC0EECC0033EE6D /* AppDelegate.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = AppDelegate.swift; sourceTree = ""; }; 23 | 11063F631EC0EECC0033EE6D /* ViewController.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ViewController.swift; sourceTree = ""; }; 24 | 11063F661EC0EECC0033EE6D /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/Main.storyboard; sourceTree = ""; }; 25 | 11063F681EC0EECC0033EE6D /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = Assets.xcassets; sourceTree = ""; }; 26 | 11063F6D1EC0EECC0033EE6D /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = ""; }; 27 | 11063F731EC0F5270033EE6D /* Utilities.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = Utilities.swift; sourceTree = ""; }; 28 | 113695D71EC1151400CA6C43 /* Base */ = {isa =
PBXFileReference; lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/LaunchScreen.storyboard; sourceTree = ""; }; 29 | 11E0E0651ED35A6C00D15919 /* MNISTClassifier.mlmodel */ = {isa = PBXFileReference; lastKnownFileType = file.mlmodel; path = MNISTClassifier.mlmodel; sourceTree = ""; }; 30 | 6CF29CEB6EF7054AFBC2F7BE /* SampleCode.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = SampleCode.xcconfig; path = Configuration/SampleCode.xcconfig; sourceTree = ""; }; 31 | C858E5CC1ECA99CC006C5CAA /* README.md */ = {isa = PBXFileReference; lastKnownFileType = net.daringfireball.markdown; path = README.md; sourceTree = ""; }; 32 | /* End PBXFileReference section */ 33 | 34 | /* Begin PBXFrameworksBuildPhase section */ 35 | 11063F5B1EC0EECC0033EE6D /* Frameworks */ = { 36 | isa = PBXFrameworksBuildPhase; 37 | buildActionMask = 2147483647; 38 | files = ( 39 | ); 40 | runOnlyForDeploymentPostprocessing = 0; 41 | }; 42 | /* End PBXFrameworksBuildPhase section */ 43 | 44 | /* Begin PBXGroup section */ 45 | 11063F551EC0EECC0033EE6D = { 46 | isa = PBXGroup; 47 | children = ( 48 | C858E5CC1ECA99CC006C5CAA /* README.md */, 49 | 11063F601EC0EECC0033EE6D /* Vision+ML Example */, 50 | 11063F5F1EC0EECC0033EE6D /* Products */, 51 | E6AE83D83E941F9198E016B5 /* Configuration */, 52 | D16AC0324B3B050245ADF47E /* LICENSE */, 53 | ); 54 | sourceTree = ""; 55 | }; 56 | 11063F5F1EC0EECC0033EE6D /* Products */ = { 57 | isa = PBXGroup; 58 | children = ( 59 | 11063F5E1EC0EECC0033EE6D /* Vision+ML Example.app */, 60 | ); 61 | name = Products; 62 | sourceTree = ""; 63 | }; 64 | 11063F601EC0EECC0033EE6D /* Vision+ML Example */ = { 65 | isa = PBXGroup; 66 | children = ( 67 | 11063F611EC0EECC0033EE6D /* AppDelegate.swift */, 68 | 11063F631EC0EECC0033EE6D /* ViewController.swift */, 69 | 11063F731EC0F5270033EE6D /* Utilities.swift */, 70 | 11E0E0651ED35A6C00D15919 /* MNISTClassifier.mlmodel */, 71 | 11063F651EC0EECC0033EE6D /* Main.storyboard */, 72 | 11063F681EC0EECC0033EE6D /* Assets.xcassets */, 73 | 11063F6D1EC0EECC0033EE6D /* Info.plist */, 74 | 113695D61EC1151400CA6C43 /* LaunchScreen.storyboard */, 75 | ); 76 | path = "Vision+ML Example"; 77 | sourceTree = ""; 78 | }; 79 | D16AC0324B3B050245ADF47E /* LICENSE */ = { 80 | isa = PBXGroup; 81 | children = ( 82 | 01D0AD8ADE10E7B14306204D /* LICENSE.txt */, 83 | ); 84 | path = LICENSE; 85 | sourceTree = ""; 86 | }; 87 | E6AE83D83E941F9198E016B5 /* Configuration */ = { 88 | isa = PBXGroup; 89 | children = ( 90 | 6CF29CEB6EF7054AFBC2F7BE /* SampleCode.xcconfig */, 91 | ); 92 | name = Configuration; 93 | sourceTree = ""; 94 | }; 95 | /* End PBXGroup section */ 96 | 97 | /* Begin PBXNativeTarget section */ 98 | 11063F5D1EC0EECC0033EE6D /* Vision+ML Example */ = { 99 | isa = PBXNativeTarget; 100 | buildConfigurationList = 11063F701EC0EECC0033EE6D /* Build configuration list for PBXNativeTarget "Vision+ML Example" */; 101 | buildPhases = ( 102 | 11063F5A1EC0EECC0033EE6D /* Sources */, 103 | 11063F5B1EC0EECC0033EE6D /* Frameworks */, 104 | 11063F5C1EC0EECC0033EE6D /* Resources */, 105 | ); 106 | buildRules = ( 107 | ); 108 | dependencies = ( 109 | ); 110 | name = "Vision+ML Example"; 111 | productName = "Vision+ML Example"; 112 | productReference = 11063F5E1EC0EECC0033EE6D /* Vision+ML Example.app */; 113 | productType = "com.apple.product-type.application"; 114 | }; 115 | /* End PBXNativeTarget section */ 116 | 117 | /* Begin PBXProject section */ 118 | 11063F561EC0EECC0033EE6D /* Project object */ = { 119 | isa = 
PBXProject; 120 | attributes = { 121 | LastSwiftUpdateCheck = 0900; 122 | LastUpgradeCheck = 0900; 123 | ORGANIZATIONNAME = Apple; 124 | TargetAttributes = { 125 | 11063F5D1EC0EECC0033EE6D = { 126 | CreatedOnToolsVersion = 9.0; 127 | }; 128 | }; 129 | }; 130 | buildConfigurationList = 11063F591EC0EECC0033EE6D /* Build configuration list for PBXProject "Vision+ML Example" */; 131 | compatibilityVersion = "Xcode 8.0"; 132 | developmentRegion = en; 133 | hasScannedForEncodings = 0; 134 | knownRegions = ( 135 | en, 136 | Base, 137 | ); 138 | mainGroup = 11063F551EC0EECC0033EE6D; 139 | productRefGroup = 11063F5F1EC0EECC0033EE6D /* Products */; 140 | projectDirPath = ""; 141 | projectRoot = ""; 142 | targets = ( 143 | 11063F5D1EC0EECC0033EE6D /* Vision+ML Example */, 144 | ); 145 | }; 146 | /* End PBXProject section */ 147 | 148 | /* Begin PBXResourcesBuildPhase section */ 149 | 11063F5C1EC0EECC0033EE6D /* Resources */ = { 150 | isa = PBXResourcesBuildPhase; 151 | buildActionMask = 2147483647; 152 | files = ( 153 | 113695D81EC1151400CA6C43 /* LaunchScreen.storyboard in Resources */, 154 | 11063F691EC0EECC0033EE6D /* Assets.xcassets in Resources */, 155 | 11063F671EC0EECC0033EE6D /* Main.storyboard in Resources */, 156 | ); 157 | runOnlyForDeploymentPostprocessing = 0; 158 | }; 159 | /* End PBXResourcesBuildPhase section */ 160 | 161 | /* Begin PBXSourcesBuildPhase section */ 162 | 11063F5A1EC0EECC0033EE6D /* Sources */ = { 163 | isa = PBXSourcesBuildPhase; 164 | buildActionMask = 2147483647; 165 | files = ( 166 | 11063F741EC0F5270033EE6D /* Utilities.swift in Sources */, 167 | 11063F641EC0EECC0033EE6D /* ViewController.swift in Sources */, 168 | 11063F621EC0EECC0033EE6D /* AppDelegate.swift in Sources */, 169 | 11E0E0661ED35A7600D15919 /* MNISTClassifier.mlmodel in Sources */, 170 | ); 171 | runOnlyForDeploymentPostprocessing = 0; 172 | }; 173 | /* End PBXSourcesBuildPhase section */ 174 | 175 | /* Begin PBXVariantGroup section */ 176 | 11063F651EC0EECC0033EE6D /* Main.storyboard */ = { 177 | isa = PBXVariantGroup; 178 | children = ( 179 | 11063F661EC0EECC0033EE6D /* Base */, 180 | ); 181 | name = Main.storyboard; 182 | sourceTree = ""; 183 | }; 184 | 113695D61EC1151400CA6C43 /* LaunchScreen.storyboard */ = { 185 | isa = PBXVariantGroup; 186 | children = ( 187 | 113695D71EC1151400CA6C43 /* Base */, 188 | ); 189 | name = LaunchScreen.storyboard; 190 | sourceTree = ""; 191 | }; 192 | /* End PBXVariantGroup section */ 193 | 194 | /* Begin XCBuildConfiguration section */ 195 | 11063F6E1EC0EECC0033EE6D /* Debug */ = { 196 | isa = XCBuildConfiguration; 197 | baseConfigurationReference = 6CF29CEB6EF7054AFBC2F7BE /* SampleCode.xcconfig */; 198 | buildSettings = { 199 | ALWAYS_SEARCH_USER_PATHS = NO; 200 | CLANG_ANALYZER_NONNULL = YES; 201 | CLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE; 202 | CLANG_CXX_LANGUAGE_STANDARD = "gnu++14"; 203 | CLANG_CXX_LIBRARY = "libc++"; 204 | CLANG_ENABLE_MODULES = YES; 205 | CLANG_ENABLE_OBJC_ARC = YES; 206 | CLANG_WARN_BOOL_CONVERSION = YES; 207 | CLANG_WARN_CONSTANT_CONVERSION = YES; 208 | CLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR; 209 | CLANG_WARN_DOCUMENTATION_COMMENTS = YES; 210 | CLANG_WARN_EMPTY_BODY = YES; 211 | CLANG_WARN_ENUM_CONVERSION = YES; 212 | CLANG_WARN_INFINITE_RECURSION = YES; 213 | CLANG_WARN_INT_CONVERSION = YES; 214 | CLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR; 215 | CLANG_WARN_RANGE_LOOP_ANALYSIS = YES; 216 | CLANG_WARN_SUSPICIOUS_MOVE = YES; 217 | CLANG_WARN_UNGUARDED_AVAILABILITY = YES; 218 | CLANG_WARN_UNREACHABLE_CODE = YES; 219 | 
CLANG_WARN__DUPLICATE_METHOD_MATCH = YES; 220 | CODE_SIGN_IDENTITY = "iPhone Developer"; 221 | COPY_PHASE_STRIP = NO; 222 | DEBUG_INFORMATION_FORMAT = dwarf; 223 | ENABLE_STRICT_OBJC_MSGSEND = YES; 224 | ENABLE_TESTABILITY = YES; 225 | GCC_C_LANGUAGE_STANDARD = gnu11; 226 | GCC_DYNAMIC_NO_PIC = NO; 227 | GCC_NO_COMMON_BLOCKS = YES; 228 | GCC_OPTIMIZATION_LEVEL = 0; 229 | GCC_PREPROCESSOR_DEFINITIONS = ( 230 | "DEBUG=1", 231 | "$(inherited)", 232 | ); 233 | GCC_WARN_64_TO_32_BIT_CONVERSION = YES; 234 | GCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR; 235 | GCC_WARN_UNDECLARED_SELECTOR = YES; 236 | GCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE; 237 | GCC_WARN_UNUSED_FUNCTION = YES; 238 | GCC_WARN_UNUSED_VARIABLE = YES; 239 | IPHONEOS_DEPLOYMENT_TARGET = 11.0; 240 | MTL_ENABLE_DEBUG_INFO = YES; 241 | ONLY_ACTIVE_ARCH = YES; 242 | SDKROOT = iphoneos; 243 | SWIFT_ACTIVE_COMPILATION_CONDITIONS = DEBUG; 244 | SWIFT_OPTIMIZATION_LEVEL = "-Onone"; 245 | SWIFT_VERSION = 3.0; 246 | }; 247 | name = Debug; 248 | }; 249 | 11063F6F1EC0EECC0033EE6D /* Release */ = { 250 | isa = XCBuildConfiguration; 251 | baseConfigurationReference = 6CF29CEB6EF7054AFBC2F7BE /* SampleCode.xcconfig */; 252 | buildSettings = { 253 | ALWAYS_SEARCH_USER_PATHS = NO; 254 | CLANG_ANALYZER_NONNULL = YES; 255 | CLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE; 256 | CLANG_CXX_LANGUAGE_STANDARD = "gnu++14"; 257 | CLANG_CXX_LIBRARY = "libc++"; 258 | CLANG_ENABLE_MODULES = YES; 259 | CLANG_ENABLE_OBJC_ARC = YES; 260 | CLANG_WARN_BOOL_CONVERSION = YES; 261 | CLANG_WARN_CONSTANT_CONVERSION = YES; 262 | CLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR; 263 | CLANG_WARN_DOCUMENTATION_COMMENTS = YES; 264 | CLANG_WARN_EMPTY_BODY = YES; 265 | CLANG_WARN_ENUM_CONVERSION = YES; 266 | CLANG_WARN_INFINITE_RECURSION = YES; 267 | CLANG_WARN_INT_CONVERSION = YES; 268 | CLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR; 269 | CLANG_WARN_RANGE_LOOP_ANALYSIS = YES; 270 | CLANG_WARN_SUSPICIOUS_MOVE = YES; 271 | CLANG_WARN_UNGUARDED_AVAILABILITY = YES; 272 | CLANG_WARN_UNREACHABLE_CODE = YES; 273 | CLANG_WARN__DUPLICATE_METHOD_MATCH = YES; 274 | CODE_SIGN_IDENTITY = "iPhone Developer"; 275 | COPY_PHASE_STRIP = NO; 276 | DEBUG_INFORMATION_FORMAT = "dwarf-with-dsym"; 277 | ENABLE_NS_ASSERTIONS = NO; 278 | ENABLE_STRICT_OBJC_MSGSEND = YES; 279 | GCC_C_LANGUAGE_STANDARD = gnu11; 280 | GCC_NO_COMMON_BLOCKS = YES; 281 | GCC_WARN_64_TO_32_BIT_CONVERSION = YES; 282 | GCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR; 283 | GCC_WARN_UNDECLARED_SELECTOR = YES; 284 | GCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE; 285 | GCC_WARN_UNUSED_FUNCTION = YES; 286 | GCC_WARN_UNUSED_VARIABLE = YES; 287 | IPHONEOS_DEPLOYMENT_TARGET = 11.0; 288 | MTL_ENABLE_DEBUG_INFO = NO; 289 | SDKROOT = iphoneos; 290 | SWIFT_OPTIMIZATION_LEVEL = "-Owholemodule"; 291 | SWIFT_VERSION = 3.0; 292 | VALIDATE_PRODUCT = YES; 293 | }; 294 | name = Release; 295 | }; 296 | 11063F711EC0EECC0033EE6D /* Debug */ = { 297 | isa = XCBuildConfiguration; 298 | baseConfigurationReference = 6CF29CEB6EF7054AFBC2F7BE /* SampleCode.xcconfig */; 299 | buildSettings = { 300 | ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon; 301 | COREML_CODEGEN_LANGUAGE = Swift; 302 | DEVELOPMENT_TEAM = ""; 303 | INFOPLIST_FILE = "Vision+ML Example/Info.plist"; 304 | LD_RUNPATH_SEARCH_PATHS = "$(inherited) @executable_path/Frameworks"; 305 | PRODUCT_BUNDLE_IDENTIFIER = "com.example.apple-samplecode.Vision-ML-Example${SAMPLE_CODE_DISAMBIGUATOR}"; 306 | PRODUCT_NAME = "$(TARGET_NAME)"; 307 | PROVISIONING_PROFILE_SPECIFIER = ""; 308 | SWIFT_VERSION = 4.0; 309 | 
TARGETED_DEVICE_FAMILY = "1,2"; 310 | }; 311 | name = Debug; 312 | }; 313 | 11063F721EC0EECC0033EE6D /* Release */ = { 314 | isa = XCBuildConfiguration; 315 | baseConfigurationReference = 6CF29CEB6EF7054AFBC2F7BE /* SampleCode.xcconfig */; 316 | buildSettings = { 317 | ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon; 318 | COREML_CODEGEN_LANGUAGE = Swift; 319 | DEVELOPMENT_TEAM = ""; 320 | INFOPLIST_FILE = "Vision+ML Example/Info.plist"; 321 | LD_RUNPATH_SEARCH_PATHS = "$(inherited) @executable_path/Frameworks"; 322 | PRODUCT_BUNDLE_IDENTIFIER = "com.example.apple-samplecode.Vision-ML-Example${SAMPLE_CODE_DISAMBIGUATOR}"; 323 | PRODUCT_NAME = "$(TARGET_NAME)"; 324 | PROVISIONING_PROFILE_SPECIFIER = ""; 325 | SWIFT_VERSION = 4.0; 326 | TARGETED_DEVICE_FAMILY = "1,2"; 327 | }; 328 | name = Release; 329 | }; 330 | /* End XCBuildConfiguration section */ 331 | 332 | /* Begin XCConfigurationList section */ 333 | 11063F591EC0EECC0033EE6D /* Build configuration list for PBXProject "Vision+ML Example" */ = { 334 | isa = XCConfigurationList; 335 | buildConfigurations = ( 336 | 11063F6E1EC0EECC0033EE6D /* Debug */, 337 | 11063F6F1EC0EECC0033EE6D /* Release */, 338 | ); 339 | defaultConfigurationIsVisible = 0; 340 | defaultConfigurationName = Release; 341 | }; 342 | 11063F701EC0EECC0033EE6D /* Build configuration list for PBXNativeTarget "Vision+ML Example" */ = { 343 | isa = XCConfigurationList; 344 | buildConfigurations = ( 345 | 11063F711EC0EECC0033EE6D /* Debug */, 346 | 11063F721EC0EECC0033EE6D /* Release */, 347 | ); 348 | defaultConfigurationIsVisible = 0; 349 | defaultConfigurationName = Release; 350 | }; 351 | /* End XCConfigurationList section */ 352 | }; 353 | rootObject = 11063F561EC0EECC0033EE6D /* Project object */; 354 | } 355 | -------------------------------------------------------------------------------- /Vision+ML Example/AppDelegate.swift: -------------------------------------------------------------------------------- 1 | /* 2 | See LICENSE folder for this sample’s licensing information. 3 | 4 | Abstract: 5 | Empty app delegate. 6 | */ 7 | 8 | import UIKit 9 | 10 | @UIApplicationMain 11 | class AppDelegate: UIResponder, UIApplicationDelegate { 12 | 13 | var window: UIWindow? 14 | 15 | // Nothing to do here. See ViewController for main functionality. 
16 | 17 | } 18 | 19 | -------------------------------------------------------------------------------- /Vision+ML Example/Assets.xcassets/AppIcon.appiconset/Contents.json: -------------------------------------------------------------------------------- 1 | { 2 | "images" : [ 3 | { 4 | "idiom" : "iphone", 5 | "size" : "20x20", 6 | "scale" : "2x" 7 | }, 8 | { 9 | "idiom" : "iphone", 10 | "size" : "20x20", 11 | "scale" : "3x" 12 | }, 13 | { 14 | "idiom" : "iphone", 15 | "size" : "29x29", 16 | "scale" : "2x" 17 | }, 18 | { 19 | "idiom" : "iphone", 20 | "size" : "29x29", 21 | "scale" : "3x" 22 | }, 23 | { 24 | "idiom" : "iphone", 25 | "size" : "40x40", 26 | "scale" : "2x" 27 | }, 28 | { 29 | "idiom" : "iphone", 30 | "size" : "40x40", 31 | "scale" : "3x" 32 | }, 33 | { 34 | "idiom" : "iphone", 35 | "size" : "60x60", 36 | "scale" : "2x" 37 | }, 38 | { 39 | "idiom" : "iphone", 40 | "size" : "60x60", 41 | "scale" : "3x" 42 | }, 43 | { 44 | "idiom" : "ipad", 45 | "size" : "20x20", 46 | "scale" : "1x" 47 | }, 48 | { 49 | "idiom" : "ipad", 50 | "size" : "20x20", 51 | "scale" : "2x" 52 | }, 53 | { 54 | "idiom" : "ipad", 55 | "size" : "29x29", 56 | "scale" : "1x" 57 | }, 58 | { 59 | "idiom" : "ipad", 60 | "size" : "29x29", 61 | "scale" : "2x" 62 | }, 63 | { 64 | "idiom" : "ipad", 65 | "size" : "40x40", 66 | "scale" : "1x" 67 | }, 68 | { 69 | "idiom" : "ipad", 70 | "size" : "40x40", 71 | "scale" : "2x" 72 | }, 73 | { 74 | "idiom" : "ipad", 75 | "size" : "76x76", 76 | "scale" : "1x" 77 | }, 78 | { 79 | "idiom" : "ipad", 80 | "size" : "76x76", 81 | "scale" : "2x" 82 | }, 83 | { 84 | "idiom" : "ipad", 85 | "size" : "83.5x83.5", 86 | "scale" : "2x" 87 | } 88 | ], 89 | "info" : { 90 | "version" : 1, 91 | "author" : "xcode" 92 | } 93 | } -------------------------------------------------------------------------------- /Vision+ML Example/Base.lproj/LaunchScreen.storyboard: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 26 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | -------------------------------------------------------------------------------- /Vision+ML Example/Base.lproj/Main.storyboard: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 | 75 | 76 | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | -------------------------------------------------------------------------------- /Vision+ML Example/Info.plist: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | CFBundleDevelopmentRegion 6 | $(DEVELOPMENT_LANGUAGE) 7 | CFBundleExecutable 8 | $(EXECUTABLE_NAME) 9 | CFBundleIdentifier 10 | $(PRODUCT_BUNDLE_IDENTIFIER) 11 | CFBundleInfoDictionaryVersion 12 | 6.0 13 | CFBundleName 14 | $(PRODUCT_NAME) 15 | CFBundlePackageType 16 | APPL 17 | CFBundleShortVersionString 18 | 1.0 19 | CFBundleVersion 20 | 1 21 | LSRequiresIPhoneOS 22 | 23 | UILaunchStoryboardName 24 | LaunchScreen 25 | NSCameraUsageDescription 26 | Takes pictures to demonstrate image recognition. 
27 | UIMainStoryboardFile 28 | Main 29 | UIRequiredDeviceCapabilities 30 | 31 | armv7 32 | 33 | UISupportedInterfaceOrientations 34 | 35 | UIInterfaceOrientationPortrait 36 | UIInterfaceOrientationLandscapeLeft 37 | UIInterfaceOrientationLandscapeRight 38 | 39 | UISupportedInterfaceOrientations~ipad 40 | 41 | UIInterfaceOrientationPortrait 42 | UIInterfaceOrientationPortraitUpsideDown 43 | UIInterfaceOrientationLandscapeLeft 44 | UIInterfaceOrientationLandscapeRight 45 | 46 | 47 | 48 | -------------------------------------------------------------------------------- /Vision+ML Example/MNISTClassifier.mlmodel: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/josephzhang23/ImageClassificationwithVisionandCoreML/4a26b7cd4504e9a9133de10c0c87999bb77bfe4f/Vision+ML Example/MNISTClassifier.mlmodel -------------------------------------------------------------------------------- /Vision+ML Example/Utilities.swift: -------------------------------------------------------------------------------- 1 | /* 2 | See LICENSE folder for this sample’s licensing information. 3 | 4 | Abstract: 5 | Core Graphics utility extensions used in the sample code. 6 | */ 7 | 8 | import UIKit 9 | import CoreGraphics 10 | import ImageIO 11 | 12 | extension CGPoint { 13 | func scaled(to size: CGSize) -> CGPoint { 14 | return CGPoint(x: self.x * size.width, y: self.y * size.height) 15 | } 16 | } 17 | extension CGRect { 18 | func scaled(to size: CGSize) -> CGRect { 19 | return CGRect( 20 | x: self.origin.x * size.width, 21 | y: self.origin.y * size.height, 22 | width: self.size.width * size.width, 23 | height: self.size.height * size.height 24 | ) 25 | } 26 | } 27 | 28 | extension CGImagePropertyOrientation { 29 | init(_ orientation: UIImageOrientation) { 30 | switch orientation { 31 | case .up: self = .up 32 | case .upMirrored: self = .upMirrored 33 | case .down: self = .down 34 | case .downMirrored: self = .downMirrored 35 | case .left: self = .left 36 | case .leftMirrored: self = .leftMirrored 37 | case .right: self = .right 38 | case .rightMirrored: self = .rightMirrored 39 | } 40 | } 41 | } 42 | 43 | -------------------------------------------------------------------------------- /Vision+ML Example/ViewController.swift: -------------------------------------------------------------------------------- 1 | /* 2 | See LICENSE folder for this sample’s licensing information. 3 | 4 | Abstract: 5 | View controller for selecting images and applying Vision + Core ML processing. 6 | */ 7 | 8 | import UIKit 9 | import CoreML 10 | import Vision 11 | import ImageIO 12 | 13 | class ViewController: UIViewController, UIImagePickerControllerDelegate, UINavigationControllerDelegate { 14 | 15 | @IBOutlet weak var imageView: UIImageView! 16 | @IBOutlet weak var correctedImageView: UIImageView! 17 | @IBOutlet weak var classificationLabel: UILabel! 18 | 19 | @IBAction func takePicture(_ sender: Any) { 20 | let picker = UIImagePickerController() 21 | picker.delegate = self 22 | picker.sourceType = .camera 23 | picker.cameraCaptureMode = .photo 24 | present(picker, animated: true) 25 | } 26 | @IBAction func chooseImage(_ sender: Any) { 27 | // The photo library is the default source, editing not allowed 28 | let picker = UIImagePickerController() 29 | picker.delegate = self 30 | picker.sourceType = .savedPhotosAlbum 31 | present(picker, animated: true) 32 | } 33 | 34 | var inputImage: CIImage! // The image to be processed. 
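
    // MARK: - Image handling
    //
    // Flow of the sample: the image picker callback below orients the chosen photo and
    // performs `rectanglesRequest` on a background queue; its completion handler,
    // `handleRectangles(request:error:)`, crops and rectifies the detected rectangle and
    // performs `classificationRequest` on the result; `handleClassification(request:error:)`
    // then shows the best MNIST label and its confidence in the UI.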

    func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [String : Any]) {
        picker.dismiss(animated: true)
        classificationLabel.text = "Analyzing Image…"
        correctedImageView.image = nil

        guard let uiImage = info[UIImagePickerControllerOriginalImage] as? UIImage
            else { fatalError("no image from image picker") }
        guard let ciImage = CIImage(image: uiImage)
            else { fatalError("can't create CIImage from UIImage") }
        let orientation = CGImagePropertyOrientation(uiImage.imageOrientation)
        inputImage = ciImage.oriented(forExifOrientation: Int32(orientation.rawValue))

        // Show the image in the UI.
        imageView.image = uiImage

        // Run the rectangle detector, which upon completion runs the ML classifier.
        let handler = VNImageRequestHandler(ciImage: ciImage, orientation: orientation)
        DispatchQueue.global(qos: .userInteractive).async {
            do {
                try handler.perform([self.rectanglesRequest])
            } catch {
                print(error)
            }
        }
    }

    lazy var classificationRequest: VNCoreMLRequest = {
        // Load the ML model through its generated class and create a Vision request for it.
        do {
            let model = try VNCoreMLModel(for: MNISTClassifier().model)
            return VNCoreMLRequest(model: model, completionHandler: self.handleClassification)
        } catch {
            fatalError("can't load Vision ML model: \(error)")
        }
    }()

    func handleClassification(request: VNRequest, error: Error?) {
        guard let observations = request.results as? [VNClassificationObservation]
            else { fatalError("unexpected result type from VNCoreMLRequest") }
        guard let best = observations.first
            else { fatalError("can't get best result") }

        DispatchQueue.main.async {
            self.classificationLabel.text = "Classification: \"\(best.identifier)\" Confidence: \(best.confidence)"
        }
    }

    lazy var rectanglesRequest: VNDetectRectanglesRequest = {
        return VNDetectRectanglesRequest(completionHandler: self.handleRectangles)
    }()

    func handleRectangles(request: VNRequest, error: Error?) {
        guard let observations = request.results as? [VNRectangleObservation]
            else { fatalError("unexpected result type from VNDetectRectanglesRequest") }
        guard let detectedRectangle = observations.first else {
            DispatchQueue.main.async {
                self.classificationLabel.text = "No rectangles detected."
            }
            return
        }
        let imageSize = inputImage.extent.size

        // Verify detected rectangle is valid.
        let boundingBox = detectedRectangle.boundingBox.scaled(to: imageSize)
        guard inputImage.extent.contains(boundingBox)
            else { print("invalid detected rectangle"); return }

        // Rectify the detected image and reduce it to inverted grayscale for applying model.
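        // MNIST-style classifiers are trained on light digits against a dark background, so the
        // filter chain below desaturates the crop, boosts its contrast, and inverts it so that
        // dark ink on a light sticky note ends up resembling the model's training data.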
        let topLeft = detectedRectangle.topLeft.scaled(to: imageSize)
        let topRight = detectedRectangle.topRight.scaled(to: imageSize)
        let bottomLeft = detectedRectangle.bottomLeft.scaled(to: imageSize)
        let bottomRight = detectedRectangle.bottomRight.scaled(to: imageSize)
        let correctedImage = inputImage
            .cropped(to: boundingBox)
            .applyingFilter("CIPerspectiveCorrection", parameters: [
                "inputTopLeft": CIVector(cgPoint: topLeft),
                "inputTopRight": CIVector(cgPoint: topRight),
                "inputBottomLeft": CIVector(cgPoint: bottomLeft),
                "inputBottomRight": CIVector(cgPoint: bottomRight)
            ])
            .applyingFilter("CIColorControls", parameters: [
                kCIInputSaturationKey: 0,
                kCIInputContrastKey: 32
            ])
            .applyingFilter("CIColorInvert", parameters: [:])

        // Show the pre-processed image
        DispatchQueue.main.async {
            self.correctedImageView.image = UIImage(ciImage: correctedImage)
        }

        // Run the Core ML MNIST classifier -- results in handleClassification method
        let handler = VNImageRequestHandler(ciImage: correctedImage)
        do {
            try handler.perform([classificationRequest])
        } catch {
            print(error)
        }
    }

}

--------------------------------------------------------------------------------