This study analyzes linguistic patterns in a medical question-answering (QA) corpus on skin diseases, with the goal of building linguistic resources for the automated extraction of diagnostic information, in particular disease and symptom expressions. Examination of the corpus revealed three main linguistic patterns in user utterances: varied descriptions of diseases and symptoms, linguistic structures specific to queries, and expressions that supply background information. Based on these patterns, we classified 12 query types and identified three core query expressions concerning skin diseases: WHAT, WHY, and HOW-CURE. We encoded these patterns in the Local Grammar Graph (LGG) schema so that training datasets for the Natural Language Understanding (NLU) modules of medical chatbots can be produced efficiently. To validate the approach, we trained a medical counseling chatbot, LIMA, on our dataset; it achieved an F1-score of 0.908, which supports the effectiveness and reliability of the proposed method.
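medical counseling chatbot, natural language understanding, language resources, disease and medical symptom expressions, query speech act expressions, Local Grammar Graph (LGG)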
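As a rough illustration of how a graph of alternative expressions can be expanded into labeled NLU training utterances, the following minimal sketch enumerates every path through a small slot sequence per query type. It is not the LGG formalism or the authors' resources: the grammar contents and the names `TOY_GRAMMAR` and `expand` are hypothetical, and only the three labels WHAT, WHY, and HOW-CURE come from the abstract.

```python
from itertools import product

# Hypothetical toy grammar (an assumption for illustration, not the paper's LGG):
# each query type maps to an ordered list of slots, and each slot lists
# its alternative surface forms. An empty string marks an optional slot.
TOY_GRAMMAR = {
    "WHAT": [["what is"], ["this rash", "this itchy spot"], ["on my arm", "on my neck"]],
    "WHY": [["why do I have"], ["a rash", "red bumps"], ["on my arm", "on my neck"]],
    "HOW-CURE": [["how can I treat"], ["this rash", "these red bumps"], [""]],
}


def expand(grammar):
    """Enumerate every path through each slot sequence as (utterance, label)."""
    examples = []
    for label, slots in grammar.items():
        for path in product(*slots):
            # Skip empty alternatives so optional slots leave no extra spaces.
            utterance = " ".join(token for token in path if token)
            examples.append((utterance, label))
    return examples


if __name__ == "__main__":
    for utterance, label in expand(TOY_GRAMMAR):
        print(f"{label}\t{utterance}")
```

Each added alternative in a slot multiplies the number of generated utterances for that query type, which is the sense in which a compact graph description can yield a comparatively large labeled dataset.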