In a multimodal conversation, the way users communicate with a system depends on the available interaction channels and the situated context (., conversation focus, visual feedback). These dependencies form a rich set of constraints from various perspectives such as temporal alignments between different modalities, coherence of conversation, and the domain semantics. There is strong evidence that competition and ranking of these constraints is important to achieve an optimal interpretation. .