{"id":778,"date":"2026-04-24T10:51:23","date_gmt":"2026-04-24T10:51:23","guid":{"rendered":"https:\/\/iberspeech.tech\/?page_id=778"},"modified":"2026-04-24T11:02:20","modified_gmt":"2026-04-24T11:02:20","slug":"keynotes","status":"publish","type":"page","link":"https:\/\/iberspeech.tech\/2026\/keynotes\/","title":{"rendered":"KEYNOTES"},"content":{"rendered":"<p>[et_pb_section fb_built=&#8221;1&#8243; custom_padding_last_edited=&#8221;off|desktop&#8221; admin_label=&#8221;Hero&#8221; _builder_version=&#8221;4.24.2&#8243; background_enable_color=&#8221;off&#8221; custom_margin=&#8221;|||&#8221; custom_padding=&#8221;50px||0px|||&#8221; custom_padding_tablet=&#8221;130px||130px|&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_row _builder_version=&#8221;4.27.4&#8243; _module_preset=&#8221;default&#8221; width=&#8221;100%&#8221; max_width=&#8221;1280px&#8221; custom_padding=&#8221;|15px|3px|15px|false|false&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_column type=&#8221;4_4&#8243; _builder_version=&#8221;4.24.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_text _builder_version=&#8221;4.24.2&#8243; _module_preset=&#8221;default&#8221; header_font=&#8221;|||on|||||&#8221; custom_margin=&#8221;||1px|||&#8221; global_colors_info=&#8221;{}&#8221;]<\/p>\n<h1>KEYNOTE SPEAKERS<\/h1>\n<p>[\/et_pb_text][et_pb_divider divider_weight=&#8221;3px&#8221; _builder_version=&#8221;4.24.2&#8243; _module_preset=&#8221;default&#8221; width=&#8221;25%&#8221; module_alignment=&#8221;left&#8221; global_colors_info=&#8221;{}&#8221;][\/et_pb_divider][\/et_pb_column][\/et_pb_row][et_pb_row _builder_version=&#8221;4.27.4&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_column type=&#8221;4_4&#8243; _builder_version=&#8221;4.27.4&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_team_member name=&#8221;Professor Jo\u00e3o Magalh\u00e3es&#8221; position=&#8221;Full Professor at the Computer Science Dep. at Universidade NOVA de Lisboa and national co-Director of the CMU Portugal partnership.&#8221; image_url=&#8221;https:\/\/iberspeech.tech\/2026\/wp-content\/uploads\/2026\/04\/Joao-Magalhaes-profile.jpeg&#8221; _builder_version=&#8221;4.27.4&#8243; _module_preset=&#8221;default&#8221; hover_enabled=&#8221;0&#8243; global_colors_info=&#8221;{}&#8221; header_font_size=&#8221;28px&#8221; sticky_enabled=&#8221;0&#8243; position_font_size=&#8221;18px&#8221; custom_margin=&#8221;0px||||false|false&#8221;]<\/p>\n<p><span data-teams=\"true\"><\/span><\/p>\n<p><span data-teams=\"true\">Jo\u00e3o Magalh\u00e3es holds a Ph.D. degree (2008) from Imperial College London, UK. His research aims to move vision and language AI closer to the way humans understand it and communicate. He has made scientific contributions to the fields of multimedia search and summarization, multimodal conversational AI, data mining and multimodal information representation. He is currently coordinating the creation of the sovereign LLM AMALIA, and, in the past, has coordinated and participated in several research projects (national, EU-FP7 and H2020) where he pursues robust and generalizable methods in different domains. He is regularly involved in review panels, organization of international conferences and program committees. His work and the work of his group has been awarded, or nominated for, several honours and distinctions, most notably the 1st prize in the Amazon Alexa Taskbot Challenge 2022. 
He was the General Chair of ECIR 2020 and ACM Multimedia 2022, Honorary Chair for ACM Multimedia Asia 2021 and will be the PC chair of ACM Multimedia 2026.<\/span><\/p>\n<p>[\/et_pb_team_member][\/et_pb_column][\/et_pb_row][et_pb_row _builder_version=&#8221;4.27.4&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_column type=&#8221;4_4&#8243; _builder_version=&#8221;4.27.4&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_toggle title=&#8221;Title of the talk: Multimodal Conversational Assistance of Complex Manual Tasks&#8221; open=&#8221;on&#8221; toggle_icon=&#8221;&#x43;||divi||400&#8243; _builder_version=&#8221;4.27.4&#8243; _module_preset=&#8221;default&#8221; title_level=&#8221;h2&#8243; title_font=&#8221;|||||on|||&#8221; body_font_size=&#8221;15px&#8221; custom_margin=&#8221;||30px|||&#8221; hover_enabled=&#8221;0&#8243; global_colors_info=&#8221;{}&#8221; sticky_enabled=&#8221;0&#8243;]<\/p>\n<h4><span data-teams=\"true\">Abstract<\/span><\/h4>\n<p><span data-teams=\"true\">Conversational agents have become an integral part of our daily routines, aiding humans in various tasks. Helping users in real-world manual tasks is a complex and challenging paradigm, where it is necessary to leverage multiple information sources, provide several multimodal stimuli, and be able to correctly ground the conversation in a helpful and robust manner. In this talk I will describe TWIZ, a conversational AI assistant that is helpful, multimodal, knowledgeable, and engaging, and designed to guide users towards the successful completion of complex manual tasks. To achieve this, we focused our efforts on three main research questions: (1) Humanly-Shaped Conversations, by providing information in a knowledgeable way; (2) Multimodal Stimulus, making use of various modalities including voice, images, and videos; and (3) Zero-shot Conversational Flows, to improve the robustness of the interaction to unseen scenarios. TWIZ is an assistant capable of supporting a wide range of unseen tasks &#8212; it leverages Generative AI methods to deliver several innovative features such as creative cooking, video navigation through voice, and the robust PlanLLM, a Large Language Model trained for dialoguing about complex manual tasks.<\/span><\/p>\n<p>[\/et_pb_toggle][\/et_pb_column][\/et_pb_row][\/et_pb_section][et_pb_section fb_built=&#8221;1&#8243; _builder_version=&#8221;4.16&#8243; _module_preset=&#8221;%22default%22&#8243; custom_padding=&#8221;%220px||2px|||%22&#8243; global_colors_info=&#8221;%22{}%22&#8243;][\/et_pb_section][et_pb_section fb_built=&#8221;1&#8243; _builder_version=&#8221;4.16&#8243; _module_preset=&#8221;%22default%22&#8243; custom_padding=&#8221;%220px||2px|||%22&#8243; global_colors_info=&#8221;%22{}%22&#8243;][\/et_pb_section]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>KEYNOTE SPEAKERS Jo\u00e3o Magalh\u00e3es holds a Ph.D. degree (2008) from Imperial College London, UK. His research aims to move vision and language AI closer to the way humans understand it and communicate. He has made scientific contributions to the fields of multimedia search and summarization, multimodal conversational AI, data mining and multimodal information representation. 
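The "zero-shot conversational flows" idea can be made concrete with a small sketch. The Python toy below is entirely hypothetical (TaskSession, respond, and llm_fallback are invented names, not TWIZ's actual API): it shows one common pattern for task-guidance assistants, where known navigation intents are handled by rules and any unseen request is grounded in the current step and deferred to an LLM responder, stubbed here in place of a plan-aware model such as the PlanLLM mentioned in the abstract.

```python
from __future__ import annotations
from dataclasses import dataclass

# Hypothetical illustration, not TWIZ's actual code: a minimal task-guidance
# dialogue loop. A task is a sequence of steps; known intents ("next",
# "repeat", "stop") are handled by rules, and anything else falls through to
# a generic LLM responder, mirroring the zero-shot conversational flow idea.

@dataclass
class TaskSession:
    steps: list[str]
    cursor: int = 0

    def current_step(self) -> str:
        return self.steps[self.cursor]

    def advance(self) -> bool:
        if self.cursor + 1 < len(self.steps):
            self.cursor += 1
            return True
        return False  # task finished

def llm_fallback(utterance: str, step: str) -> str:
    # Stand-in for a call to a plan-aware LLM; a real system would condition
    # on the full task plan and the dialogue history.
    return f"(LLM) Regarding '{utterance}' while you are at: {step}"

def respond(session: TaskSession, utterance: str) -> str:
    text = utterance.lower().strip()
    if text in {"next", "done", "next step"}:
        if session.advance():
            return f"Step {session.cursor + 1}: {session.current_step()}"
        return "That was the last step. Task complete!"
    if text in {"repeat", "again"}:
        return f"Step {session.cursor + 1}: {session.current_step()}"
    if text in {"stop", "cancel"}:
        return "Okay, stopping the task. You can resume any time."
    # Unseen request: ground it in the current step and let the LLM answer.
    return llm_fallback(utterance, session.current_step())

if __name__ == "__main__":
    session = TaskSession(steps=[
        "Preheat the oven to 180 C.",
        "Whisk the eggs and sugar.",
        "Bake for 25 minutes.",
    ])
    for turn in ["repeat", "can I use honey instead of sugar?", "next", "next", "next"]:
        print(f"user: {turn}")
        print(f"bot:  {respond(session, turn)}")
```

The design choice worth noting is the explicit fallback boundary: rule-based flows keep the frequent navigation turns fast and predictable, while the LLM path absorbs the open-ended, unseen requests without requiring the flow designer to anticipate them.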