Encoder-decoder network with RMP for tongue segmentation

Authors: Kusakunniran W.; Borwarnginn P.; Karnjanapreechakorn S.; Thongkanchorn K.; Ritthipravat P.; Tuakta P.; Benjapornlert P.
Affiliation: Mahidol University
Record dates: 2023-05-19; 2023-05-19; 2023-05-01
Citation: Medical and Biological Engineering and Computing Vol.61 No.5 (2023), 1193-1207
ISSN: 01400118
URI: https://repository.li.mahidol.ac.th/handle/123456789/81552

Abstract: The tongue and its movements can be used for several medical tasks, such as identifying a disease and tracking rehabilitation. To focus on the tongue region, tongue segmentation is needed to compute a region of interest for further analysis. This paper proposes an encoder-decoder CNN-based architecture for segmenting the tongue in an image. The encoder module is mainly used for tongue feature extraction, while the decoder module reconstructs a segmented tongue from the extracted features, based on training images. In addition, residual multi-kernel pooling (RMP) is applied in the proposed network to help encode features at multiple scales. The proposed method is evaluated on two publicly available datasets under a scenario of front view and one tongue posture. It is then tested on a newly collected dataset of five tongue postures. The reported performances show that the proposed method outperforms existing methods in the literature. In addition, re-training could improve the trained model's performance on an unseen dataset, which would be a necessary step in applying the trained model to real-world scenarios.

Subject: Engineering
Type: Article
Source: SCOPUS
DOI: 10.1007/s11517-022-02761-3
Scopus ID: 2-s2.0-85146809436
eISSN: 17410444
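
Illustrative note: the abstract says that RMP helps encode features at multiple scales. The sketch below is one possible reading of that idea in plain Python, not the authors' implementation. It pools a single-channel feature map with several kernel sizes, upsamples each result back to the input size, and stacks the results with the original map. The kernel sizes (2, 3, 5, 6), the nearest-neighbour upsampling, and the names max_pool, upsample_nearest and rmp_sketch are all assumptions made for illustration; this record does not specify any of them, and a real network would also apply learned convolutions to each branch.

```python
from typing import List, Sequence

Grid = List[List[float]]  # one single-channel feature map, indexed [row][col]


def max_pool(x: Grid, k: int) -> Grid:
    """Non-overlapping k x k max pooling (stride k); partial edge windows are kept."""
    h, w = len(x), len(x[0])
    return [
        [
            max(
                x[i][j]
                for i in range(r, min(r + k, h))
                for j in range(c, min(c + k, w))
            )
            for c in range(0, w, k)
        ]
        for r in range(0, h, k)
    ]


def upsample_nearest(pooled: Grid, h: int, w: int, k: int) -> Grid:
    """Nearest-neighbour upsampling: each pooled cell fills the window it came from."""
    return [[pooled[r // k][c // k] for c in range(w)] for r in range(h)]


def rmp_sketch(x: Grid, kernels: Sequence[int] = (2, 3, 5, 6)) -> List[Grid]:
    """Return the input map followed by one pooled-and-upsampled map per kernel size.

    Keeping the original map alongside the pooled branches is the 'residual' part
    of this sketch; the kernel sizes are an assumption, not taken from the record.
    """
    h, w = len(x), len(x[0])
    branches = [upsample_nearest(max_pool(x, k), h, w, k) for k in kernels]
    return [x] + branches


if __name__ == "__main__":
    # Toy 6 x 6 feature map whose value at (row, col) is row * 6 + col.
    feature_map = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
    stacked = rmp_sketch(feature_map)
    print(len(stacked), "maps of size", len(stacked[0]), "x", len(stacked[0][0]))
```

Larger kernels summarise wider neighbourhoods, so the stacked maps give each pixel both its own value and coarser context around it, which is the multi-scale effect the abstract describes.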