
Super-resolution of scene text images via probabilistically selective fusion and context enhancement

    Abstract: Scene text image super-resolution (STISR) aims to enhance the resolution of text images captured in natural scenes. Text in images taken from a distance, in motion, or under poor lighting is often difficult to read. However, most existing methods do not adequately account for the distinctive characteristics of scene text images; for instance, they often fail to combine text-specific prior knowledge with techniques tailored to scene text image super-resolution. To address these issues, this paper introduces a network based on probabilistically selective fusion and context enhancement, which consists of three key components: the Probabilistically Selective Fusion (PSF) module, the Text-Image Fusion (TIF) module, and the Context-Enhanced Residual (CER) module. The PSF module uses probabilistic selection to make the text priors obtained from a text recognizer more robust; the TIF module then fuses the textual and visual features; finally, the CER module refines fine text details and high-frequency information. Experiments on the TextZoom dataset show that the proposed method outperforms existing mainstream methods, validating its effectiveness for scene text image super-resolution.
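The abstract does not specify how the probabilistic selection in the PSF module works. As a purely illustrative reading, and not the authors' implementation, the sketch below assumes the text recognizer emits one probability distribution per character position, and that "selection" means keeping a distribution only when its peak confidence reaches a threshold, otherwise replacing it with an uninformative uniform prior. The function name `select_text_prior` and the `threshold` parameter are hypothetical.

```python
from typing import List


def select_text_prior(
    char_probs: List[List[float]], threshold: float = 0.5
) -> List[List[float]]:
    """Hypothetical confidence-gated selection of a text prior.

    char_probs holds one probability distribution per character position.
    A distribution is kept when its largest probability is at least
    `threshold`; otherwise it is replaced by a uniform distribution so
    that an unreliable recognizer guess carries no information downstream.
    """
    selected = []
    for dist in char_probs:
        if max(dist) >= threshold:
            # Confident prediction: pass the prior through unchanged.
            selected.append(list(dist))
        else:
            # Uncertain prediction: fall back to a uniform prior.
            uniform = 1.0 / len(dist)
            selected.append([uniform] * len(dist))
    return selected


# Two character positions over a toy three-symbol alphabet.
probs = [[0.9, 0.05, 0.05], [0.4, 0.3, 0.3]]
prior = select_text_prior(probs, threshold=0.5)
```

Under this assumption the first position keeps its confident distribution, while the second is flattened. The gating rule itself is a stand-in, since the abstract gives no detail on the actual mechanism.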

     
