• Title/Summary/Keyword: 2D-3D visual feature ensemble

Class-Agnostic 3D Mask Proposal and 2D-3D Visual Feature Ensemble for Efficient Open-Vocabulary 3D Instance Segmentation (효율적인 개방형 어휘 3차원 개체 분할을 위한 클래스-독립적인 3차원 마스크 제안과 2차원-3차원 시각적 특징 앙상블)

  • Sungho Song; Kyungmin Park; Incheol Kim
    • The Transactions of the Korea Information Processing Society / v.13 no.7 / pp.335-347 / 2024
  • Open-vocabulary 3D point cloud instance segmentation (OV-3DIS) is a challenging visual task that segments a 3D scene point cloud into object instances of both base and novel classes. In this paper, we propose Open3DME, a novel model for OV-3DIS that addresses important design issues and overcomes limitations of existing approaches. First, to improve the quality of class-agnostic 3D masks, our model uses T3DIS, an advanced Transformer-based 3D point cloud instance segmentation model, as its mask proposal module. Second, to obtain semantically text-aligned visual features for each point cloud segment, our model extracts both 2D and 3D features from the point cloud and the corresponding multi-view RGB images by using pretrained CLIP and OpenSeg encoders, respectively. Last, to make effective use of both the 2D and 3D visual features of each point cloud segment during label assignment, our model adopts a unique feature ensemble method. To validate our model, we conducted both quantitative and qualitative experiments on the ScanNet-V2 benchmark dataset, demonstrating significant performance gains.
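
The label assignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the ensemble rule, so the code below substitutes a simple weighted average of cosine-similarity scores as a stand-in. The function names `cosine` and `assign_label`, the `weight_2d` parameter, and the toy three-dimensional vectors are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign_label(feat_2d, feat_3d, text_embeddings, weight_2d=0.5):
    """Pick the class whose text embedding best matches one segment.

    ASSUMPTION: scores from the 2D and 3D features are mixed with a fixed
    weight. This is only a placeholder for the paper's ensemble method,
    which the abstract does not describe.
    """
    scores = {}
    for name, text_vec in text_embeddings.items():
        score_2d = cosine(feat_2d, text_vec)
        score_3d = cosine(feat_3d, text_vec)
        scores[name] = weight_2d * score_2d + (1.0 - weight_2d) * score_3d
    best = max(scores, key=scores.get)
    return best, scores


# Toy data: real text-aligned features would be high-dimensional embeddings.
text_embeddings = {"chair": [1.0, 0.0, 0.0], "table": [0.0, 1.0, 0.0]}
label, scores = assign_label([0.9, 0.1, 0.0], [0.6, 0.4, 0.0], text_embeddings)
print(label, scores)
```

Because the class names enter only through their text embeddings, adding a novel class in this sketch means adding one more entry to `text_embeddings`, which is the general idea behind open-vocabulary labeling with text-aligned features.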