CRS4

MultiPanoWise: holistic deep architecture for multi-task dense prediction from a single panoramic image

Uzair Shah, Muhammad Tukur, Mahmood Alzubaidi, Giovanni Pintore, Enrico Gobbetti, Mowafa Househ, Jens Schneider, Marco Agus
Proc. OmniCV - IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) - 2024
Download the publication: s41095-023-0358-0.pdf [5.5 MB]
We present a novel holistic deep-learning approach for multi-task learning from a single indoor panoramic image. Our framework, named MultiPanoWise, extends vision transformers to jointly infer multiple pixel-wise signals, such as depth, normals, and semantic segmentation, as well as signals from intrinsic decomposition, such as reflectance and shading. Our solution leverages a specific architecture combining a transformer-based encoder-decoder with multiple heads, by introducing, in particular, a novel context adjustment approach, to enforce knowledge distillation between the various signals. Moreover, at training time we introduce a hybrid loss scalarization method based on an augmented Chebychev/hypervolume scheme. We demonstrate the capabilities of the proposed architecture on public-domain synthetic and real-world datasets. We showcase performance improvements with respect to the most recent methods specifically designed for single tasks, like, for example, individual depth estimation or semantic segmentation. To the best of our knowledge, this is the first architecture able to achieve state-of-the-art performance on the joint extraction of heterogeneous signals from single indoor omnidirectional images.
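
As a rough illustration of the loss scalarization idea mentioned in the abstract, the sketch below shows the textbook augmented Chebychev scalarization, which combines the worst weighted task loss with a small weighted-sum term. It is not the paper's hybrid Chebychev/hypervolume scheme: the function name `augmented_chebyshev`, the `rho` parameter, the uniform weights, and the loss values are all assumptions made for this example.

```python
def augmented_chebyshev(losses, weights, rho=0.05):
    """Scalarize per-task losses as max_i(w_i * l_i) + rho * sum_i(w_i * l_i).

    Hypothetical helper for illustration only; it does not reproduce
    the hybrid scheme used by MultiPanoWise.
    """
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    weighted = [w * l for w, l in zip(weights, losses)]
    # The max term focuses on the worst task; the rho term keeps every
    # task contributing to the scalarized value.
    return max(weighted) + rho * sum(weighted)


# Made-up per-task loss values, one per signal named in the abstract.
task_losses = {
    "depth": 0.8,
    "normals": 0.5,
    "semantic": 1.2,
    "reflectance": 0.3,
    "shading": 0.4,
}
task_weights = {name: 1.0 for name in task_losses}  # uniform weights (assumption)

total = augmented_chebyshev(
    list(task_losses.values()), list(task_weights.values()), rho=0.1
)
print(f"scalarized loss: {total:.4f}")
```

With uniform weights the value is dominated by the largest task loss (here the semantic term), while the `rho` term still lets the remaining tasks influence the result.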

BibTeX references

@InProceedings{STAPGHSA24,
  author       = {Shah, U. and Tukur, M. and Alzubaidi, M. and Pintore, G. and Gobbetti, E. and Househ, M. and Schneider, J. and Agus, M.},
  title        = {MultiPanoWise: holistic deep architecture for multi-task dense prediction from a single panoramic image},
  booktitle    = {Proc. OmniCV - IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
  year         = {2024},
  note         = {To appear},
  keywords     = {deep-learning, indoor panoramic},
  url          = {https://publications.crs4.it/pubdocs/2024/STAPGHSA24},
}

Other publications in the database

» Uzair Shah
» Muhammad Tukur
» Mahmood Alzubaidi
» Giovanni Pintore
» Enrico Gobbetti
» Mowafa Househ
» Jens Schneider
» Marco Agus