Human-in-the-loop interfaces for machine learning provide a promising way to reduce the annotation effort required to obtain an accurate machine learning model, particularly when they are used with transfer learning to exploit existing knowledge gleaned from another domain. This paper explores a human-in-the-loop strategy designed to build a deep-learning image classification model iteratively, using successive batches of images that the user labels. Specifically, we examine whether class-agnostic object detection can improve performance by providing a focus area for image classification in the form of a bounding box. The goal is to reduce the effort required to label a batch of images by presenting the user with the model's current predictions on a new batch of data and requiring only that incorrect predictions be corrected. User effort is measured as the number of corrections made. Results show that the use of bounding boxes always leads to fewer corrections. A further benefit of a bounding box is that it provides feedback to the user, because it indicates whether the classification of the deep learning model is based on the appropriate part of the image. This has implications for the design of user interfaces in this application scenario.
Note: Title, author list, and abstract are shown as they appear in the camera-ready version of the paper provided to the conference committee. Small changes that may have occurred during processing by Springer may not be reflected here.