CLaRK is an XML-based software system for corpora development. It incorporates several technologies: XML technology; Unicode; Regular Cascaded Grammars; and Constraints over XML Documents. The basic components of the system are a tagger, a concordancer, an extractor, a grammar processor, and a constraint engine.