Audio-Visual Understanding of Passenger Intents for In-Cabin Conversational Agents