Document Type

Journal Article


Digital Information Research Foundation


Faculty of Health, Engineering and Science


School of Computer and Security Science




This article was originally published as: Huang Z., Leng J. (2014). Text extraction in natural scenes using region-based method. Journal of Digital Information Management, 12(4), 246-254.


Text in images is an important clue for image indexing and retrieval. Unfortunately, accurately and robustly extracting text from an image with a complex background is challenging. In this paper, a novel region-based text extraction method is proposed. First, candidate text regions are detected by an 8-connected object detection algorithm applied to the edge image. Then non-text regions are filtered out using shape, texture, and stroke width rules. Finally, the remaining regions are grouped into text lines. Because stroke width is an intrinsic and distinctive characteristic of text, the accuracy of the non-text filter is notably improved. The improved Stroke Width Transform presented in the paper has lower computational complexity and is more accurate. Experimental results on the sample ICDAR competition dataset and on our own dataset show that the proposed method outperforms five other methods.
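
The first two stages described above (8-connected candidate detection on an edge image, followed by rule-based filtering) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `connected_components_8` and `keep_plausible`, the tuple layout, and the thresholds `min_pixels` and `max_aspect` are hypothetical choices made for illustration, and only a simple shape rule is shown (the texture and stroke width rules are omitted).

```python
from collections import deque


def connected_components_8(edge):
    """Find 8-connected foreground regions in a binary grid.

    `edge` is a list of rows of 0/1 values (a stand-in for an edge image).
    Returns one tuple (top, left, bottom, right, pixels) per region,
    in raster-scan order of each region's first pixel.
    """
    rows = len(edge)
    cols = len(edge[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not edge[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited edge pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right, pixels = r, c, r, c, 0
            while queue:
                y, x = queue.popleft()
                pixels += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                # Visit all 8 neighbours, diagonals included.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (dy or dx) and 0 <= ny < rows and 0 <= nx < cols \
                                and edge[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions.append((top, left, bottom, right, pixels))
    return regions


def keep_plausible(regions, min_pixels=3, max_aspect=10.0):
    """Hypothetical shape rule: drop tiny or extremely elongated regions."""
    kept = []
    for top, left, bottom, right, pixels in regions:
        aspect = (right - left + 1) / (bottom - top + 1)  # width / height
        if pixels >= min_pixels and 1.0 / max_aspect <= aspect <= max_aspect:
            kept.append((top, left, bottom, right, pixels))
    return kept
```

Because diagonal neighbours count as connected, two edge pixels that touch only at a corner fall into the same candidate region, which keeps thin or slanted strokes from fragmenting as they would under 4-connectivity.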

Access Rights

Free to read