A hybrid paragraph-level page segmentation