Multimodal Data Recognition
Research Team
Research Summary
Our goal is a comprehensive understanding of the environment around a robot. This includes object recognition, 3D scene recognition, and human activity recognition through signal processing and pattern recognition on multimodal sensory data. In particular, we focus on recognizing unknown events and objects.
- Main Research Fields
- Computer Vision
- Robot Vision
- Multimodal Recognition
- Keywords
- Object Recognition
- Activity Recognition
- Spatio-temporal Environmental Understanding
- Perception of Unknown Event/Object
- Scene Graph Generation
- Research Themes
- Unknown Object Recognition
- Recognition from a Skeleton Sequence
- Scene Change Detection
- Human Behavior Change Detection
Yasutomo Kawanishi
History
- 2006
- Bachelor of Engineering, Kyoto University
- 2008
- Master of Informatics, Kyoto University
- 2011
- Ph.D. in Informatics, Kyoto University
Awards
- 2009
- Best Paper Award
- 2016
- IEEE ITS Society Nagoya Chapter Young Researcher Award
Members
- Motoharu Sonogashira
- Research Scientist
- Vijay John
- Research Scientist
- Itthisak Phueaksri
- Postdoctoral Researcher
- Christiane Mietzsch
- Special technical staff
- Shohei Nobuhara
- Visiting Scientist
- Tomohiro Fujita
- Visiting Scientist
- Akira Kohjin
- Administrative Part-time Worker II and Student Trainee
- Tingwei Liu
- Administrative Part-time Worker II and Student Trainee
- Daiju Kanaoka
- Student Trainee
- Diego Hernandez Rodriguez
- Student Trainee
- Da Huo
- Student Trainee
- Nguyen Trung Thanh
- Student Trainee
- Taiyo Tamaki
- Student Trainee
- Yuga Yano
- Student Trainee
- Ozaki Airi
- Student Trainee
- Murakawa Toshikazu
- Student Trainee
- Hiei Satoshi
- Student Trainee
- Yamada Shion
- Student Trainee
- Yo-hsin Fang
- Student Trainee
- Hao-yu Hou
- Student Trainee
- Jia-yi Chen
- Student Trainee
- Yu-chen Lai
- Student Trainee
- Hirakawa Hayato
- Student Trainee
Former Members
- Ziqi Li
- Student Trainee (2022/12-2024/3)
- Joy Battocchio
- Research Intern (2023/09-2023/10)
- Hayato Yumiya
- Research Intern (2021/07-2021/08)
- Masaya Mizuno
- Research Intern (2021/08-2021/09)
- Thomas Reolon
- Research Intern (2022/12-2023/01)
- Kotaro Fujishiro
- Research Intern (2023/9)
- Haruto Kugo
- Research Intern (2023/9)
- Daijiro Suzuki
- Research Intern (2023/9)
Research results
Unknown object recognition and description
When we humans see an unknown object, we can still recognize it as an object even if we do not know what it is. We can also describe its relationships with other objects, e.g., that an unknown object is on the table and beside the laptop.
Robots, in contrast, can only detect objects that their object detectors have been trained on, and they cannot estimate how an unknown object relates to other objects. Our team is researching object recognition that includes unknown objects, together with relationship estimation.
Recognition that includes unknown objects is called the open-set recognition problem, and it has recently attracted much attention in the computer vision field. Separately, the problem of recognizing relationships among objects and describing them in a graph structure is called scene graph generation (SGG). Our team has named the problem of describing a scene that contains unknown objects in a graph structure open-set scene graph generation (Open-set SGG).
We have formulated the problem setting, proposed experimental protocols and evaluation metrics, and proposed a baseline method for the problem.
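As a rough illustration of the output that open-set SGG aims for, the table-and-laptop example above can be written as a small graph. This is only a minimal sketch of the data structure, not our team's method: the `SceneGraph` class, the `UNKNOWN` label, and the example class names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field

UNKNOWN = "unknown"  # hypothetical label for objects outside the known classes


@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)  # node id -> class label
    edges: list = field(default_factory=list)  # (subject id, predicate, object id)

    def add_object(self, node_id, label, known_classes):
        # Open-set step: a label outside the known classes is stored as "unknown".
        self.nodes[node_id] = label if label in known_classes else UNKNOWN

    def add_relation(self, subject_id, predicate, object_id):
        self.edges.append((subject_id, predicate, object_id))

    def triplets(self):
        # Resolve node ids to labels: (subject label, predicate, object label).
        return [(self.nodes[s], p, self.nodes[o]) for s, p, o in self.edges]


known = {"table", "laptop"}
graph = SceneGraph()
graph.add_object(0, "table", known)
graph.add_object(1, "laptop", known)
graph.add_object(2, "gadget", known)  # not a known class, so it becomes "unknown"
graph.add_relation(2, "on", 0)
graph.add_relation(2, "beside", 1)
print(graph.triplets())
```

The point of the sketch is that the unknown object still becomes a node and still takes part in relations, which a closed-set detector would not provide.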
Human pose prediction from short-term observations
Observing a person's activities and predicting their current state and their pose a few seconds into the future is important for many applications, such as proactive support by robots. Our team is working on predicting a person's future poses from short-term observations of their behavior.
Recent advances in pose estimation techniques have led to many studies of human behavior based on sequences of human skeletons. A skeleton sequence is often treated as a graph: vertices hold the locations of body joints, and edges represent the connectivity between joints. Graph-convolution-based methods have therefore been proposed. However, in the future pose prediction task, some body motions cannot be distinguished from the skeleton sequence alone. In our study, we have proposed a method that predicts future motions by using additional information, such as the person's surroundings.
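The graph view of a skeleton described above can be sketched in a few lines. The example below is a generic single graph-convolution layer on one toy pose, written in plain Python; it is not our proposed method and it does not use surrounding information. The five-joint skeleton, the joint coordinates, and the identity weight matrix (standing in for learned weights) are all assumptions made for illustration.

```python
# Toy skeleton: 5 joints with 2D positions (hypothetical values).
JOINTS = ["head", "neck", "left_hand", "right_hand", "hip"]
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]  # connectivity between joints


def normalized_adjacency(num_joints, edges):
    """Row-normalized adjacency with self-loops: D^-1 (A + I)."""
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in a]


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]


def graph_conv(features, adj, weight):
    """One layer: average over each joint's neighborhood, linear map, ReLU."""
    aggregated = matmul(adj, features)
    return [[max(0.0, v) for v in row] for row in matmul(aggregated, weight)]


pose = [[0.0, 2.0], [0.0, 1.5], [-1.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity stands in for learned weights
adj = normalized_adjacency(len(JOINTS), EDGES)
out = graph_conv(pose, adj, weight)
print(out[0])  # head feature: average of the head and neck positions
```

Each joint's output mixes its own position with those of the joints it is connected to, which is how graph convolution exploits the skeleton's connectivity.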
Selected Publications
-
Motoharu Sonogashira, Masaaki Iiyama, Yasutomo Kawanishi
“Relationship-Aware Unknown Object Detection for Open-Set Scene Graph Generation”
IEEE Access, vol.12, pp.122513 - 122523, (2024) (open access). -
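Nobuhiro Ueda, Hideko Habe, Yoko Matsui, Akishige Yuguchi, Seiya Kawano, Yasutomo Kawanishi, Sadao Kurohashi, Koichiro Yoshino
“J-CRe3: A Japanese Dialogue Dataset for Real-world Reference Resolution” (in Japanese)
Journal of Natural Language Processing, vol. 31, no. 3, (2024) (open access). -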
Vijay John, Yasutomo Kawanishi
“Frame-Level Latent Embedding using Weak Labels for Multi-view Action Recognition”
IEEE International Conference on Multimedia Information Processing and Retrieval, (2024). -
Tingwei Liu, Yasutomo Kawanishi, Takahiro Komamizu, Ichiro Ide
“Tracking Small Birds by Detection Candidate Region Filtering and Detection History-aware Association”
CV4Animals: Computer Vision for Animal Behavior Tracking and Modeling, In conjunction with Computer Vision and Pattern Recognition 2024, (2024). -
Yoshimitsu Kajiwara, Wanwan Zheng, Yasutomo Kawanishi
“Iconographic analysis of ancient roof tiles using a data science approach”
The Indonesian Journal of Social Studies, vol. 7, no. 2, pp.41-49, (2024) (open access). -
Trung Thanh Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, Ichiro Ide
“One-stage open-vocabulary temporal action detection leveraging temporal multi-scale and action label features”
Proceedings of the 18th IEEE International Conference on Automatic Face and Gesture Recognition, (2024). -
Shun Inadumi, Seiya Kawano, Akishige Yuguchi, Yasutomo Kawanishi, Koichiro Yoshino
“A Gaze-grounded Visual Question Answering Dataset for Clarifying Ambiguous Japanese Questions”
The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, (2024). -
Nobuhiro Ueda, Hideko Habe, Akishige Yuguchi, Seiya Kawano, Yasutomo Kawanishi, Sadao Kurohashi, Koichiro Yoshino
“J-CRe3: A Japanese Conversation Dataset for Real-world Reference Resolution”
The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, (2024). -
Yukinori Kawae, Yasutomo Kawanishi, Ichiroh Kanaya, Yoshihiro Yasumuro
“3D Survey of the Menkaure Pyramid”
Virtual Annual Meeting, American Research Center in Egypt, (2024). -
Trung Thanh Nguyen, Phi Le Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, Ichiro Ide
“Zero-Shot Pill-Prescription Matching With Graph Convolutional Network and Contrastive Learning”
IEEE Access, vol. 12, pp. 55889-55904, (2024) (open access). -
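Ryusei Hata, Daisuke Deguchi, Takatsugu Hirayama, Yasutomo Kawanishi, Hiroshi Murase
“Eye-contact Transformer: Eye-contact Detection for Distant Pedestrians Considering Scene Context” (in Japanese)
IEICE Transactions on Information and Systems (Japanese Edition), Vol.J107-D, No.04, pp.231-242, (2024). -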
Chihaya Matsuhira, Marc Aurel Kastner, Takahiro Komamizu, Takatsugu Hirayama, Keisuke Doman, Yasutomo Kawanishi, Ichiro Ide
“Interpolating the Text-to-Image Correspondence Based on Phonetic and Phonological Similarities for Nonword-to-Image Generation”
IEEE Access, vol.12, pp.41299 - 41316, (2024) (open access). -
Masaya Mizuno, Tomohiro Fujita, Yasutomo Kawanishi, Daisuke Deguchi, Hiroshi Murase
“Subjective Baggage-Weight Estimation based on Human Walking Behavior”
IEEE Access, Vol. 12, pp. 39390 - 39398, (2024) (open access) -
Hiroki Tatemichi, Yasutomo Kawanishi, Daisuke Deguchi, Ichiro Ide, Hiroshi Murase
“Category-level Object Pose Estimation in Heavily Cluttered Scenes by Generalized Two-stage Shape Reconstructor”
IEEE Access, vol. 12, pp. 33440-33448, (2024) (open access). -
Naoya Kawamura, Wataru Sato, Koh Shimokawa, Tomohiro Fujita, Yasutomo Kawanishi
“Machine learning-based interpretable modeling for subjective emotional dynamics sensing using facial EMG”
Sensors, vol. 24, no. 5, 1536, (2024) (open access). -
Angel Garcia Contreras, Seiya Kawano, Yasutomo Kawanishi, Yutaka Nakamura, Saito Satoru, Koichiro Yoshino
“Examining the Impact of a Forgetful Multi-store Memory System in a Cognitive Assistive Robot”
The 14th International Workshop on Spoken Dialogue Systems Technology, (2024). -
Hiroto Murakami, Jialei Chen, Daisuke Deguchi, Takatsugu Hirayama, Yasutomo Kawanishi, Hiroshi Murase
“Pedestrian's Gaze Object Detection in Traffic Scene”
Proceedings of the 19th International Conference on Computer Vision Theory and Applications (VISAPP), (2024). -
Itthisak Phueaksri, Marc A. Kastner, Yasutomo Kawanishi, Takahiro Komamizu, Ichiro Ide
“Image-Collection Summarization Using Scene-Graph Generation With External Knowledge”
IEEE Access, vol.12, pp. 17499 - 17512, (2024) (open access) -
Itthisak Phueaksri, Marc A. Kastner, Yasutomo Kawanishi, Takahiro Komamizu, Ichiro Ide
“An Approach to Generate a Caption for an Image Collection Using Scene Graph Generation”
IEEE Access, vol.11, pp. 128245 - 128260, (2023) (open access) -
Daiju Kanaoka, Hakaru Tamukoh, Motoharu Sonogashira, Yasutomo Kawanishi
“ManifoldNeRF: View-dependent Image Feature Supervision for Few-shot Neural Radiance Fields”
In Proceedings of the 34th British Machine Vision Conference, (2023) -
Shu Nakamura, Yasutomo Kawanishi, Shohei Nobuhara, Ko Nishino
“DeePoint: Visual Pointing Recognition and Direction Estimation”
In Proceedings of the 19th International Conference on Computer Vision, (2023) -
Tomohiro Fujita, Yasutomo Kawanishi
“Human Pose Prediction by Progressive Generation in Multi-scale Frequency Domain”
In Proceedings of the 18th International Conference on Machine Vision Applications, (2023) -
Vijay John, Yasutomo Kawanishi
“Combining Knowledge Distillation and Transfer Learning for Sensor Fusion in Visible and Thermal Camera-based Person Classification”
In Proceedings of the 18th International Conference on Machine Vision Applications, (2023) -
Vijay John, Yasutomo Kawanishi
“Multimodal Cascaded Framework with Metric Learning Robust to Missing Modalities for Person Classification”
In Proceedings of the 14th ACM Multimedia Systems Conference, (2023) (open access) -
Vijay John, Yasutomo Kawanishi
"Progressive Learning of a Multimodal Classifier Accounting for Different Modality Combinations"
Sensors 2023, 23(10), 4666 (2023) (open access) -
Masaya Mizuno, Tomohiro Fujita, Yasutomo Kawanishi, Daisuke Deguchi, Hiroshi Murase
"Subjective Baggage-Weight Estimation from Gait ---Can you estimate how heavy the person feels?---"
In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP), (2023) -
Hayato Yumiya, Yasutomo Kawanishi, Daisuke Deguchi, Hiroshi Murase
"End-to-End Gaze Grounding of a Person Pictured from Behind"
In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP), (2023) -
Tomohiro Fujita, Yasutomo Kawanishi
"Future Pose Prediction from 3D Human Skeleton Sequence with Surrounding Situation"
Sensors 2023, 23(2), 876 (2023) (open access) -
Itthisak Phueaksri, Marc A. Kastner, Yasutomo Kawanishi, Takahiro Komamizu, Ichiro Ide
"Towards Captioning an Image Collection from a Combined Scene Graph Representation Approach"
In Proceedings of the 29th International Conference on MultiMedia Modeling (2023) -
Vijay John, Yasutomo Kawanishi
"Audio-Visual Sensor Fusion Framework using Person Attributes Robust to Missing Visual Modality for Person Recognition"
In Proceedings of the 29th International Conference on MultiMedia Modeling (2023) -
Jiaxin Li, Yasutomo Kawanishi, Daisuke Deguchi, Hiroshi Murase
"A Preliminary Study on View Independent Panoptic Scene Change Detection"
In Proceedings of the 2023 International Workshop on Advanced Image Technology (2023) -
Vijay John, Yasutomo Kawanishi
"A Multimodal Sensor Fusion Framework Robust to Missing Modalities for Person Recognition"
In Proceedings of the ACM Multimedia Asia 2022 (2022) -
Yasutomo Kawanishi, Ichiro Ide, Baidong Chu, Chihaya Matsuhira, Marc A. Kastner, Takahiro Komamizu, Daisuke Deguchi
"Detection of Birds in a 3D Environment Referring to Audio-Visual Information"
In Proceedings of the 18th IEEE International Conference on Advanced Video and Signal-based Surveillance (2022) -
Vijay John, Yasutomo Kawanishi
"Audio and Video-Based Emotion Recognition Using Multimodal Transformers"
In Proceedings of the 26th International Conference on Pattern Recognition (2022). -
Yasutomo Kawanishi
"Label-Based Multiple Object Ensemble Tracking with Randomized Frame Dropping"
In Proceedings of the 26th International Conference on Pattern Recognition (2022). -
Tomohiro Fujita, Yasutomo Kawanishi
"Toward Surroundings-aware Temporal Prediction of 3D Human Skeleton Sequence"
In Proceedings of the 26th ICPR Workshop: Towards a Complete Analysis of People: From Face and Body to Clothes (2022). -
Motoharu Sonogashira, Masaaki Iiyama, Yasutomo Kawanishi
"Towards Open-Set Scene Graph Generation with Unknown Objects"
IEEE Access, Vol.10, pp.11574-11583 (2022) (open access) -
Mahmud Dwi Sulistiyo, Yasutomo Kawanishi, Daisuke Deguchi, Ichiro Ide, Takatsugu Hirayama, Hiroshi Murase
"ColAtt-Net: In Reducing the Ambiguity of Pedestrian Orientations on Attribute-aware Semantic Segmentation Task"
IEEJ Transactions on Electronics, Information and Systems, Vol. 16, Issue 2, (2021). -
Yasutomo Kawanishi, Daisuke Deguchi, Ichiro Ide, Hiroshi Murase
"Ω-GAN: Object Manifold Embedding GAN for Image Generation by Disentangling Parameters into Pose and Shape Manifolds"
In Proceedings of the 25th International Conference on Pattern Recognition (2020). -
Hiroki Tatemichi, Yasutomo Kawanishi, Daisuke Deguchi, Ichiro Ide, Hiroshi Murase
"Median-shape Representation Learning for Category-level Object Pose Estimation in Cluttered Environments"
In Proceedings of the 25th International Conference on Pattern Recognition (2020). -
Saki Iwata, Yasutomo Kawanishi, Daisuke Deguchi, Ichiro Ide, Hiroshi Murase
"LFIR2Pose: Pose Estimation from an Extremely Low-Resolution FIR Image Sequence"
In Proceedings of the 25th International Conference on Pattern Recognition (2020). -
Hitoshi Nishimura, Kazuyuki Tasaka, Yasutomo Kawanishi, Hiroshi Murase
"Multiple Human Tracking with Alternately Updating Trajectories and Multi-Frame Action Features"
ITE Transactions on Media Technology and Applications, Vol. 8, No.4, pp. 269-279, (2020). -
Hitoshi Nishimura, Kazuyuki Tasaka, Yasutomo Kawanishi, Hiroshi Murase
"Multiple Human Tracking using an Omnidirectional Camera with Local Rectification and World Coordinates Representation"
IEICE Transactions on Information and Systems, Vol. E103-D, No. 6, pp.1745-1361, (2020). -
Naoki Nishida, Yasutomo Kawanishi, Daisuke Deguchi, Ichiro Ide, Hiroshi Murase, Jun Piao
"SOANets: Encoder-Decoder based Skeleton Orientation Alignment Network for White Cane User Recognition from 2D Human Skeleton Sequence"
In Proceedings of the 15th International Conference on Computer Vision Theory and Applications, pp. 435-443, 2020. -
Yasutomo Kawanishi, Hiroshi Murase, Jianfeng Xu, Kazuyuki Tasaka, Hiromasa Yanagihara
"Which Content is he/she Reading? --Reading Content Estimation using an Indoor Surveillance Camera--"
In Proceedings of the 24th International Conference on Pattern Recognition, pp. 1731-1736, (2018). -
Yasutomo Kawanishi, Daisuke Deguchi, Ichiro Ide, Hiroshi Murase
"Trajectory Ensemble: Multiple Persons Consensus Tracking across Non-overlapping Multiple Cameras over Randomly Dropped Camera Networks"
In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshop, pp. 56-62, (2017). -
Brahmastro Kresnaraman, Yasutomo Kawanishi, Daisuke Deguchi, Tomokazu Takahashi, Yoshito Mekada, Ichiro Ide, Hiroshi Murase
"Human Wearable Attribute Recognition using Probability-Map-based Decomposition of Thermal Infrared Images"
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol.E100-A Issue 3, pp.854-864, (2017).
Links
Yasutomo Kawanishi
Multimodal Data Recognition Research Team (RIKEN)
Contact Information
yasutomo.kawanishi [at] riken.jp