Multimodal Machine Learning models do not work. Here is why. Part 1/2 – The SYMPTOMS
May 26, 2022
|
44 views
AI Coffee Break with Letitia
📺 Watch on YouTube 😃
Have you ever wondered where the problems with multimodal integration of vision and language lie? This is the first part of Ms. Coffee Bean’s quest to uncover what’s going wrong with multimodal vision and language integration. ▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀ 🔥 Optionally, buy us a coffee to boost our Coffee Bean production! ☕ Patreon:
https://www.patreon.com/AICoffeeBreak
Ko-fi:
https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀ 📺 Ms. Coffee Bean explains PROBING:
https://youtu.be/fL22NAtMNYo
📺 Ms. Coffee Bean defines MULTIMODALITY:
https://youtu.be/jReaoJWdO78
Outline:
* 00:00 Visual Question Answering
* 01:04 Visual Dialog Demo
* 02:30 The symptom
* 04:06 Multimodal stress test 1
* 06:35 Multimodal stress test 2

Papers:
📄 Patro, Badri, and Vinay P. Namboodiri. "Differential attention for visual question answering." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7680-7688. 2018.
https://openaccess.thecvf.com/content_cvpr_2018/papers/Patro_Differential_Attention_for_CVPR_2018_paper.pdf
📄 Shekhar, Ravi, Ece Takmaz, Raquel Fernández, and Raffaella Bernardi. "Evaluating the Representational Hub of Language and Vision Models." IWCS 2019: 211.
https://www.aclweb.org/anthology/W19-0418.pdf
📄 Caglayan, Ozan, Pranava Madhyastha, Lucia Specia, and Loïc Barrault. "Probing the Need for Visual Context in Multimodal Machine Translation." In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4159-4170. 2019.
https://arxiv.org/pdf/1903.08678.pdf
Intro music: Discovery Hit by Kevin MacLeod is licensed under a Creative Commons Attribution license (
https://creativecommons.org/licenses/by/4.0/
) Source:
http://incompetech.com/music/royalty-free/index.html?isrc=USUAN1300023
Artist:
http://incompetech.com/
Video contains emojis designed by OpenMoji – the open-source emoji and icon project. License: CC BY-SA 4.0 ----------------------------------------------------- 🔗 Links: YouTube:
https://www.youtube.com/AICoffeeBreak
Twitter:
https://twitter.com/AICoffeeBreak
Reddit:
https://www.reddit.com/r/AICoffeeBreak/
#AICoffeeBreak #MsCoffeeBean #multimodality #multimodal #MachineLearning #AI #research #ComputerVision #NLP
Category: Research Talk