Activity

  • Trolle Roach posted an update 1 week ago

    To make the network more robust to unanticipated noise perturbations, we use projected gradient descent (PGD) to search for noise patterns that can trigger the network to give over-confident mistakes. By evaluating on two different benchmark datasets containing consensus annotations from three radiologists, we show that the proposed techniques can improve the detection performance on real CT data. To understand the limitations of both the conventional networks and the proposed augmented networks, we also perform stress-tests on the false positive reduction networks by feeding different types of artificially produced patches. We show that the augmented networks are more robust to under-represented nodules and more resistant to noise perturbations.

    The quantification of 3D shape aesthetics has so far focused on specific shape features and manually defined criteria, such as curvature and the rule of thirds, respectively. In this paper, we build a model of 3D shape aesthetics directly from human aesthetics preference data and show it to be well aligned with human perception of aesthetics. To build this model, we first crowdsource a large number of human aesthetics preferences by showing shapes in pairs in an online study, and then use these preferences to train a multi-view 3D shape deep neural network that learns a measure of 3D shape aesthetics. In comparison to previous approaches, we do not use any pre-defined notions of aesthetics to build our model. Our algorithmically computed measure of shape aesthetics is beneficial to a range of applications in graphics such as search, visualization and scene composition.

    We developed a minimum-cost circulation framework for solving the global data association problem, which plays a key role in the tracking-by-detection paradigm of multi-object tracking. The problem has been extensively studied under the minimum-cost flow framework, which is theoretically attractive because it is flexible and globally solvable.
However, the high computational burden has been a long-standing obstacle to its wide adoption in practice. While enjoying the same theoretical advantages and maintaining the same optimal solution as the minimum-cost flow framework, our new framework has a better theoretical complexity bound and leads to orders-of-magnitude improvements in practical efficiency. Exploiting the special property that an overwhelming majority of the vertices have unit capacity, we designed an implementation of the framework and proved that it has the best theoretical complexity so far. We evaluated our method with 40 experiments on five MOT benchmark data sets. Our method was always the most efficient and on average 53 to 1,192 times faster than the three state-of-the-art methods. When our method served as a sub-module for global data association methods using higher-order constraints, similar efficiency improvement was attained. We further illustrated through several case studies how the improved computational efficiency enables more sophisticated tracking models and yields better tracking accuracy.

Domain adaptation, which transfers knowledge from a label-rich source domain to unlabeled target domains, is a challenging task in machine learning. Prior domain adaptation methods focus on pairwise adaptation with a single source and a single target domain, while little work concerns the scenario of one source domain and multiple target domains. Applying pairwise adaptation methods to this setting may be suboptimal, as they fail to consider the semantic association among multiple target domains. In this work we propose a deep semantic information propagation approach in the novel context of multiple unlabeled target domains and one labeled source domain.
Our model aims to learn a unified subspace common to all domains with a heterogeneous graph attention network, whose transductive ability enables semantic propagation among related samples across multiple domains. In particular, the attention mechanism is applied to optimize the relationships among samples from multiple domains for better semantic transfer. Then, the pseudo labels of the target domains predicted by the graph attention network are used to learn domain-invariant representations by aligning labeled source centroids and pseudo-labeled target centroids. We test our approach on four challenging public datasets, and it outperforms several popular domain adaptation methods.

A densely-sampled light field (LF) is highly desirable in various applications. However, it is costly to acquire such data. Although many computational methods have been proposed to reconstruct a densely-sampled LF from a sparsely-sampled one, they still suffer from low reconstruction quality, low computational efficiency, or restrictions on the regularity of the sampling pattern. To this end, we propose a novel learning-based method, which accepts sparsely-sampled LFs with irregular structures and produces densely-sampled LFs with arbitrary angular resolution accurately and efficiently. We also propose a simple yet effective method for optimizing the sampling pattern. Our proposed method, an end-to-end trainable network, reconstructs a densely-sampled LF in a coarse-to-fine manner. Specifically, the coarse sub-aperture image (SAI) synthesis module first explores the scene geometry from an unstructured sparsely-sampled LF and leverages it to independently synthesize novel SAIs, in which a confidence-based blending strategy is proposed to fuse the information from different input SAIs, giving an intermediate densely-sampled LF.
Then, the efficient LF refinement module learns the angular relationship within the intermediate result to recover the LF parallax structure. Comprehensive experimental evaluations demonstrate the superiority of our method on both real-world and synthetic LF images when compared with state-of-the-art methods.

Built on deep networks, end-to-end optimized image compression has made impressive progress in the past few years. Previous studies usually adopt a compressive auto-encoder, where the encoder part first converts the image into latent features, and then quantizes the features before encoding them into bits. Both the conversion and the quantization incur information loss, making it difficult to optimally achieve an arbitrary compression ratio. We propose iWave++ as a new end-to-end optimized image compression scheme, in which iWave, a trained wavelet-like transform, converts images into coefficients without any information loss. Then the coefficients are optionally quantized and encoded into bits. Unlike the previous schemes, iWave++ is versatile: a single model supports both lossless and lossy compression, and also achieves an arbitrary compression ratio by simply adjusting the quantization scale. iWave++ also features a carefully designed entropy coding engine to encode the coefficients progressively, and a de-quantization module for lossy compression.
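
The iWave++ description above rests on two ideas: a transform that maps pixels to coefficients with no information loss, and an optional quantization step whose scale controls how lossy the result is. The sketch below illustrates only that principle. It is not the iWave++ architecture: a fixed integer Haar lifting step stands in for the trained transform (an assumption made for illustration), entropy coding is left out, and all function names are hypothetical.

```python
# Minimal sketch: a reversible integer transform plus optional quantization.
# Assumption: a fixed integer Haar lifting step replaces the trained iWave
# transform; every name here is hypothetical and chosen for illustration.

def forward_lift(x):
    """Split an even-length list of ints into (approx, detail) coefficients."""
    even, odd = x[0::2], x[1::2]
    detail = [o - e for e, o in zip(even, odd)]             # predict step
    approx = [e + (d >> 1) for e, d in zip(even, detail)]   # update step (floor of d/2)
    return approx, detail

def inverse_lift(approx, detail):
    """Undo forward_lift exactly by reversing the two lifting steps."""
    even = [a - (d >> 1) for a, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    x = [0] * (2 * len(even))
    x[0::2], x[1::2] = even, odd
    return x

def quantize(coeffs, scale):
    """Uniform quantization; scale=1 leaves integer coefficients unchanged."""
    return [round(c / scale) for c in coeffs]

def dequantize(levels, scale):
    """Map quantization levels back to coefficient values."""
    return [v * scale for v in levels]

def codec(x, scale=1):
    """Transform, quantize, dequantize, and invert (entropy coding omitted)."""
    approx, detail = forward_lift(x)
    approx_hat = dequantize(quantize(approx, scale), scale)
    detail_hat = dequantize(quantize(detail, scale), scale)
    return inverse_lift(approx_hat, detail_hat)

pixels = [12, 15, 200, 198, 7, 0, 255, 254]
lossless = codec(pixels, scale=1)   # reconstruction equals the input
lossy = codec(pixels, scale=8)      # coarser coefficients, approximate output
```

With scale set to 1 the round trip reproduces the input exactly, because each lifting step is undone with the same integer arithmetic. A larger scale discards coefficient precision, which is the single knob the abstract describes for moving between lossless and lossy operation.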