Last modified: 2018-03-09
Abstract
Objectives. Over the last twenty years, various terminologies have been created to help domain experts document, store, and retrieve information. Most of them were developed for specific purposes and medical subdomains, and they target specialized medicine with a focus on clinical aspects. Only a few cover the General Practice / Family Medicine (GP/FM) domain (e.g., the International Classification of Primary Care - ICPC). Because ICPC fails to capture some non-clinical issues (e.g., organizational and managerial aspects of GP/FM), the Q-Codes taxonomy has been developed to extend ICPC and encompass those contextual professional issues. The aim of this work is to show the value of the Q-Codes and their usefulness for the semantic annotation of Second Opinion Requests (SOR) from rural Brazilian primary healthcare providers.
The Q-Codes taxonomy consists of 182 terms distributed among 8 domains, each containing between 2 and 4 levels of granularity. The 8 domains include: “QC: Patient’s Category” (age, gender issues); “QD: Family Doctor’s Issue” (communication, medico legal); “QE: Medical Ethics” (bioethics, professional ethics); “QH: Planetary Health” (environmental health, biological hazards); “QP: Patient Issue” (patient safety, quality of healthcare); “QR: Research” (research methods, tools); “QS: Structure of Practice” (primary care setting, provider); and “QT: Knowledge Management” (training).
In caring for their patients, rural healthcare teams in Brazil need second opinions from specialists. The Brazilian telehealth system was developed to meet these needs. When a rural healthcare team needs a second opinion, it sends its questions through the telehealth system to the specialists in the urban telehealth centers. The appropriate health professional then returns a second opinion or an answer to the rural healthcare team through the same system. The questions and corresponding answers (Q/A pairs) are collected for information sharing and reuse. Classifications such as the Q-Codes are needed to support these activities, and their mappings to other terminologies make the shared information interoperable.
Methods. A data set of 5,580 question-answer pairs from the years 2010-2012, written in Brazilian Portuguese, was obtained from an urban telehealth center. After webinar and tele-ECG Q/A pairs were excluded, 1,669 pairs remained. From these, 550 questions (approximately one-third from each of the three years) were randomly selected and de-identified to form the sample data set.
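The selection step can be sketched as a stratified random draw. This is a minimal illustration only, assuming each record carries a `year` field and that the sample is split as evenly as possible across years; the function name `stratified_sample`, the record layout, and the fixed seed are hypothetical and are not taken from the study. De-identification is not shown.

```python
import random
from collections import defaultdict


def stratified_sample(records, total, seed=0):
    """Draw `total` records, split as evenly as possible across years.

    `records` is a list of dicts with a "year" key (an assumed layout).
    Raises ValueError if a year holds fewer records than its quota.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group the records by year.
    by_year = defaultdict(list)
    for rec in records:
        by_year[rec["year"]].append(rec)

    years = sorted(by_year)
    base, extra = divmod(total, len(years))

    sample = []
    for i, year in enumerate(years):
        # The first `extra` years receive one additional record.
        quota = base + (1 if i < extra else 0)
        sample.extend(rng.sample(by_year[year], quota))
    return sample


if __name__ == "__main__":
    # Synthetic stand-in for the 1,669 remaining Q/A pairs.
    pairs = [{"id": i, "year": 2010 + i % 3} for i in range(1669)]
    chosen = stratified_sample(pairs, 550)
    print(len(chosen))
```

With three years and a target of 550, the quotas come out as 184, 183, and 183, which matches the "approximately one-third per year" description.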
Each selected question was read to determine its semantic meaning and was coded using both the ICPC and Q-Codes classification systems. Based on this meaning, a set of general guidelines, and the definitions of the individual Q-Codes, each question was manually assigned between 0 and 5 Q-Codes.
When the question provided an age of the patient, the appropriate age group was assigned from the "Patient's Category" domain (QC). When the question pertained to gender issues, such as pregnancy or birth control, it was assigned appropriate concepts from the "Patient's Category" domain (QC). When the question represented a need for information not referring to a specific patient, it was assigned a concept from the "Knowledge Management" domain (QT). Finally, if the question represented disease prevention and multimorbidity, it was assigned the appropriate concepts from the "Family Doctor's Issue" domain (QD).
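The guidelines above can be read as a small set of rules that map properties of a question to candidate Q-Code domains. The sketch below is illustrative only: in the study the annotation was manual, and the `QuestionFeatures` flags and the `candidate_domains` function are hypothetical names introduced here. It returns domains only, not individual Q-Codes.

```python
from dataclasses import dataclass


@dataclass
class QuestionFeatures:
    """Properties an annotator would judge by reading the question (assumed flags)."""
    patient_age_given: bool = False
    gender_issue: bool = False              # e.g., pregnancy or birth control
    not_patient_specific: bool = False      # general need for information
    prevention_and_multimorbidity: bool = False


def candidate_domains(features):
    """Return the Q-Code domains suggested by the general guidelines."""
    domains = []
    # Age group and gender issues both belong to "Patient's Category".
    if features.patient_age_given or features.gender_issue:
        domains.append("QC")
    # Information needs not tied to a specific patient: "Knowledge Management".
    if features.not_patient_specific:
        domains.append("QT")
    # Disease prevention and multimorbidity: "Family Doctor's Issue".
    if features.prevention_and_multimorbidity:
        domains.append("QD")
    return domains


if __name__ == "__main__":
    q = QuestionFeatures(patient_age_given=True, prevention_and_multimorbidity=True)
    print(candidate_domains(q))
```

A question that triggers none of the rules yields an empty list, which corresponds to the case where 0 Q-Codes are assigned.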
Main Results. At the time of writing, semantic annotation with Q-Codes had been attempted for 100 (18%) of the 550 questions in the sample data set. Of these 100 attempts, 98 (98%) were successful. The unsuccessful attempts (2%) were due to a lack of semantic meaning in the question.
For the successfully annotated questions, between 1 and 3 Q-Codes were assigned. More than half (56%) of the questions were assigned 2 Q-Codes; 41% were assigned 1 Q-Code; and 3% were assigned 3 Q-Codes. In seven instances, a question was assigned at least one Q-Code but could not be coded with any ICPC code.
In total, 159 Q-Codes were assigned to the 98 questions. Of these, 97.5% were assigned at the sub-subcategory level, and 1.25% each at the subcategory level and at the sub-sub-subcategory level.
Most Q-Code assignments fell in the QC (42%) and QD (37%) domains. Domain QT accounted for 21% of the assignments, and a single assignment was made to domain QP. Four domains were never assigned: QE, QH, QR, and QS.
Of the 10 most frequently assigned Q-Codes, 6 belong to the Patient's Category domain (QC), 3 to the Family Doctor's Issue domain (QD), and 1 to the Knowledge Management domain (QT).
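Domain shares and the most frequent codes of this kind can be tallied directly from the list of assigned Q-Codes. The sketch below assumes that each Q-Code is a string whose first two characters name its domain (e.g., "QC", "QD"); the example codes are placeholder strings for illustration, not data from the study, and the function names are hypothetical.

```python
from collections import Counter


def domain_shares(assigned_codes):
    """Percentage of assignments per domain, taken from the two-letter prefix."""
    counts = Counter(code[:2] for code in assigned_codes)
    total = sum(counts.values())
    return {domain: round(100 * n / total, 1) for domain, n in counts.items()}


def top_codes(assigned_codes, n=10):
    """The n most frequently assigned codes with their counts."""
    return Counter(assigned_codes).most_common(n)


if __name__ == "__main__":
    # Placeholder codes only; not taken from the annotated sample.
    codes = ["QC14", "QC14", "QD41", "QT33"]
    print(domain_shares(codes))
    print(top_codes(codes, n=1))
```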
In terms of interoperability, mapping the Q-Codes to indexing resources (e.g., MeSH, DeCS) allows direct access to the scientific literature in the domain of interest. This is the case, for example, for the resources of BIREME (Latin American and Caribbean Center on Health Sciences Information), which are accessed through the LILACS ("Literatura Latino-Americana e do Caribe em Ciências da Saúde") database. LILACS is in turn indexed with DeCS (“Descritores em Ciências da Saúde”, Health Sciences Descriptors), a vocabulary developed from MeSH and available in English, Spanish, French, and Brazilian Portuguese. By using the Q-Codes mappings to DeCS, information from LILACS can be made more accessible to rural healthcare teams, thus meeting their information needs.
Conclusions. The use of Q-Codes to annotate Second Opinion Requests from rural Brazilian primary healthcare providers was tested to assess its feasibility in facilitating communication and coding among rural healthcare providers when they express non-clinical and contextual issues. Preliminary results suggest that the Q-Codes add value by capturing information that would otherwise be lost if only clinical coding systems such as ICPC were used.
One limitation of this work is the manual annotation, which limits the ability to index large data sets as well as the quality and number of the resulting annotations; is time-consuming; and requires a number of additional steps, including checking inter-annotator agreement. To overcome these limitations, future work will explore semi-automated annotation methods to assign Q-Codes to large data sets more quickly and efficiently. Other improvements include involving a second annotator to validate the data set and using DeCS to index the SOR to test its feasibility.