A Unified Framework for Tumor Proliferation Score Prediction in Breast Histopathology


The tumor proliferation score is an important biomarker indicative of breast cancer patients' prognosis. In this paper, we present a unified framework to predict tumor proliferation scores from whole slide images in breast histopathology. The proposed system offers a fully automated solution for predicting both a molecular-data-based and a mitosis-counting-based tumor proliferation score. The framework integrates three modules, each fine-tuned to maximize overall performance: an image processing component for handling whole slide images, a deep-learning-based mitosis detection network, and a proliferation score prediction module. We achieved a quadratic weighted Cohen's kappa of 0.567 in mitosis-counting-based score prediction and an F1-score of 0.652 in mitosis detection. On Spearman's correlation coefficient, which evaluates prediction of the molecular-data-based score, the system obtained 0.6171. Our system won first place in all three tasks of the Tumor Proliferation Assessment Challenge at MICCAI 2016, outperforming all other approaches.
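
For readers unfamiliar with the first metric quoted above, the following is a minimal sketch of how a quadratic weighted Cohen's kappa can be computed. It is not the challenge's evaluation code; the function name `quadratic_weighted_kappa`, the zero-based integer class encoding, and the `n_classes` parameter are assumptions made for illustration.

```python
import numpy as np


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with quadratic disagreement weights.

    Assumes labels are integers in [0, n_classes) and that the two label
    sets are not both constant (otherwise the denominator is zero).
    """
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)

    # Observed matrix: observed[i, j] counts items with true class i
    # that were predicted as class j.
    observed = np.zeros((n_classes, n_classes), dtype=float)
    for t, p in zip(y_true, y_pred):
        observed[t, p] += 1

    # Expected counts if the two label sets were statistically independent.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

    # Quadratic weights: the penalty grows with the squared class distance.
    idx = np.arange(n_classes)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (n_classes - 1) ** 2

    return 1.0 - (weights * observed).sum() / (weights * expected).sum()
```

Because the weights grow with the squared distance between classes, predicting a score two levels away from the reference is penalized four times as heavily as predicting an adjacent level, which suits ordinal proliferation scores.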

Semantic Noise Modeling for Better Representation Learning


Latent representations learned by multi-layered neural networks via hierarchical feature abstraction underlie the recent success of deep learning. Under the deep learning framework, generalization performance depends heavily on the learned latent representation, which is obtained from an appropriate training scenario with a task-specific objective on a designed network model. In this work, we propose a novel latent space modeling method to learn better latent representations. We designed a neural network model based on the assumption that a good base representation can be attained by maximizing the total correlation between the input, latent, and output variables. On top of the base model, we introduce a semantic noise modeling method that enables class-conditional perturbation of the latent space to enhance the representational power of the learned latent features. During training, the latent vector representation can be stochastically perturbed by a modeled class-conditional additive noise while maintaining its original semantic features. This implicitly has the effect of semantic augmentation on the latent space. The proposed model can be easily learned by back-propagation with common gradient-based optimization algorithms. Experimental results show that the proposed method achieves performance benefits over various previous approaches. We also provide empirical analyses of the proposed class-conditional perturbation process, including t-SNE visualization.

@article{DBLP:journals/corr/KimHC16, author = {Hyo{-}Eun Kim and Sangheum Hwang and Kyunghyun Cho}, title = {Semantic Noise Modeling for Better Representation Learning}, journal = {CoRR}, volume = {abs/1611.01268}, year = {2016}, url = {}, timestamp = {Thu, 01 Dec 2016 19:32:08 +0100}, biburl = {}, bibsource = {dblp computer science bibliography,} }



Recent advances in deep learning have achieved remarkable performance in various challenging computer vision tasks. In object localization especially, deep convolutional neural networks outperform traditional approaches by extracting data- and task-driven features instead of relying on hand-crafted features. Although location information for regions of interest (ROIs) gives a good prior for object localization, it requires heavy annotation effort from human annotators. For this reason, weakly supervised frameworks for object localization have been introduced. The term "weakly" means that the framework uses only image-level labeled datasets to train a network. With the help of transfer learning, which adopts the weight parameters of a pre-trained network, weakly supervised object localization performs well because the pre-trained network already has well-trained class-specific features. However, those approaches cannot be used for applications that lack pre-trained networks or well-localized large-scale images. Medical image analysis is a representative example of such applications because it is impossible to obtain such pre-trained networks. In this work, we present a "fully" weakly supervised framework for object localization, named self-transfer learning (STL); its "semi"-weakly supervised counterpart uses pre-trained filters for weakly supervised localization. STL jointly optimizes the classification and localization networks. By controlling the supervision level of the localization network, STL helps the localization network focus on correct ROIs without any type of prior. We evaluate the proposed STL framework on two medical image datasets, chest X-rays and mammograms, and achieve significantly better localization performance than previous weakly supervised approaches.

@ARTICLE{2016arXiv160201625H, author = {{Hwang}, S. and {Kim}, H.-E.}, title = "{Self-Transfer Learning for Fully Weakly Supervised Object Localization}", keywords = {Computer Science - Computer Vision and Pattern Recognition}, year = 2016 }

Pixel-level Domain Transfer


We present an image-conditional image generation model. The model transfers an input domain to a target domain at the semantic level and generates the target image at the pixel level. To generate realistic target images, we employ the real/fake discriminator of Generative Adversarial Nets, and we also introduce a novel domain discriminator to make the generated image relevant to the input image. We verify our model on the challenging task of generating a piece of clothing from an input image of a dressed person. We present a high-quality clothing dataset containing the two domains and demonstrate decent results.

@article{DBLP:journals/corr/YooKPPK16, author = {Donggeun Yoo and Namil Kim and Sunggyun Park and Anthony S. Paek and In{-}So Kweon}, title = {Pixel-Level Domain Transfer}, year = {2016} }



Mitosis counting is time-consuming and labor-intensive work, and it frequently shows inter-observer variability. Although deep convolutional neural networks, the most accurate image classification algorithms, have been used for detecting mitosis, they have been tested only on public data sets and have never been applied to routine histologic slide images. Recently, smartphone cameras with adaptors to the microscope have been tried for easier image acquisition, and they substantially lowered a barrier to applying computer algorithms to the analysis of histologic images. Histologic slides of 70 invasive ductal carcinomas of the breast were selected, and 1761 high-power field histologic images (400x) were acquired using a smartphone application with an adaptor, manufactured by us, attached to the microscope. Mitoses were annotated blindly by four pathologists. Agreement among more than three pathologists was regarded as true. A total of 2004 mitotic cells and 801600 non-mitotic cells from 60 cases were divided into 10 sets, and the algorithm was trained sequentially using a fine-tuning method. After training, images from ten patients were tested for concordance of detection with the pathologists. During training, sensitivity for mitosis detection ranged from 75% to 83%. Specificity for mitosis detection rose to 97% as the algorithm was trained with more images. The trained algorithm identified 189 mitoses in 748 images from 10 cases and showed 79% sensitivity and 96% specificity for detecting mitosis compared with the pathologists. The detected mitoses were displayed in the application within 14 seconds on average. The proposed deep convolutional neural network-based mitosis detection system showed remarkable sensitivity and specificity, and its performance improved as more images were used for training.
Together with the smartphone application and the adaptor we manufactured, it helps pathologists identify mitoses, reducing time and labor costs while supporting objective diagnosis.
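
The sensitivity and specificity figures quoted above follow the usual confusion-matrix definitions. Here is a minimal sketch of that arithmetic; the function name and the counts in the usage lines are hypothetical toy values, not data from the study, which does not report its confusion-matrix counts.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: mitoses correctly detected        fn: mitoses missed
    tn: non-mitoses correctly rejected    fp: non-mitoses flagged as mitoses
    """
    sensitivity = tp / (tp + fn)  # fraction of true mitoses that were found
    specificity = tn / (tn + fp)  # fraction of non-mitoses correctly rejected
    return sensitivity, specificity


# Hypothetical toy counts, for illustration only.
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=90, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Since non-mitotic cells vastly outnumber mitotic ones in this setting, even a specificity in the high nineties can correspond to many false positives in absolute terms, so it helps to read both numbers together.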



We propose an automatic TB screening system based on a deep CNN. Since a CNN by itself extracts the most discriminative features for the target objective from the given data, the proposed system does not require manually designed features for TB screening. We also show that transfer learning from the lower convolutional layers of pre-trained networks resolves the difficulties of handling high-resolution medical images and of training a huge number of parameters with a limited number of images. Experiments are conducted on three real field datasets, the KIT, MC and Shenzhen sets, and the results show that the proposed system has high screening performance in terms of AUC and accuracy.

@proceeding{doi:10.1117/12.2216198, author = {Hwang, Sangheum and Kim, Hyo-Eun and Jeong, Jihoon and Kim, Hee-Jin}, title = { A novel approach for tuberculosis screening based on deep convolutional neural networks }, journal = {Proc. SPIE}, volume = {9785}, pages = {97852W-97852W-8}, year = {2016}, URL = {} }



A weakly-supervised semantic segmentation framework with a tied deconvolutional neural network is presented. Each deconvolution layer in the framework consists of unpooling and deconvolution operations. 'Unpooling' upsamples the input feature map based on unpooling switches defined by the corresponding convolution layer's pooling operation. 'Deconvolution' convolves the unpooled input features using convolutional weights tied to the corresponding convolution layer's convolution operation. The unpooling-deconvolution combination helps to eliminate less discriminative features in the feature extraction stage, since the output features of the deconvolution layer are reconstructed from the most discriminative unpooled features instead of the raw ones. This reduces false positives in the pixel-level inference stage. The feature maps restored from all the deconvolution layers constitute a rich discriminative feature set across different abstraction levels. Those features are stacked so that they can be used selectively to generate class-specific activation maps. Under weak supervision (image-level labels), the proposed framework shows promising results on lesion segmentation in medical images (chest X-rays) and achieves state-of-the-art performance on the PASCAL VOC segmentation dataset under the same experimental conditions.
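
The unpooling step described above can be illustrated with a minimal NumPy sketch on a single 2D feature map. This is not the authors' implementation: the function names, the non-overlapping `k` x `k` pooling window, and the loop-based layout are assumptions chosen for readability, and the tied-weight deconvolution that follows unpooling is omitted.

```python
import numpy as np


def max_pool_with_switches(x, k=2):
    """Non-overlapping k x k max pooling that also records the switches,
    i.e. the flat index of the maximum inside each pooling window."""
    h, w = x.shape
    pooled = np.zeros((h // k, w // k), dtype=x.dtype)
    switches = np.zeros((h // k, w // k), dtype=int)
    for i in range(h // k):
        for j in range(w // k):
            window = x[i * k:(i + 1) * k, j * k:(j + 1) * k]
            switches[i, j] = int(np.argmax(window))
            pooled[i, j] = window.flat[switches[i, j]]
    return pooled, switches


def unpool(pooled, switches, k=2):
    """Upsample by writing each pooled value back to the position its
    switch recorded; every other position stays zero."""
    ph, pw = pooled.shape
    out = np.zeros((ph * k, pw * k), dtype=pooled.dtype)
    for i in range(ph):
        for j in range(pw):
            di, dj = divmod(int(switches[i, j]), k)
            out[i * k + di, j * k + dj] = pooled[i, j]
    return out
```

Only the maximum of each window survives the round trip, which is the sense in which unpooling keeps the most discriminative activations and zeroes out the rest.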

@ARTICLE{2016arXiv160204984K, author = {{Kim}, H.-E. and {Hwang}, S.}, title = "{Scale-Invariant Feature Learning using Deconvolutional Neural Networks for Weakly-Supervised Semantic Segmentation}", keywords = {Computer Science - Computer Vision and Pattern Recognition}, year = 2016 }



We present a novel detection method using a deep convolutional neural network (CNN), named AttentionNet. We cast the object detection problem as an iterative classification problem, which is the form most suitable for a CNN. AttentionNet provides quantized weak directions pointing toward a target object, and the ensemble of iterative predictions from AttentionNet converges to an accurate object bounding box. Since AttentionNet is a unified network for object detection, it detects objects without any separate models, from object proposal to post bounding-box regression. We evaluate AttentionNet on a human detection task and achieve state-of-the-art performance of 65% (AP) on PASCAL VOC 2007/2012 with only an 8-layer architecture.

@inproceedings{attentionNet, title={AttentionNet: Aggregating Weak Directions for Accurate Object Detection}, author={Donggeun Yoo and Sunggyun Park and Joon-Young Lee* and Anthony S. Paek and In So Kweon}, booktitle={Computer Vision (ICCV)}, year={2015} }