We created an application that analyses search logs containing medical queries. The application uses the KConnect annotation service to annotate the queries, so the analysis can be based on medical concepts as well as on the actual terms typed in.
For the examples below, we analysed an extract of the Trip medical search engine query logs (https://www.tripdatabase.com). They contain logs from both anonymised registered users (380 000 query log entries from 2010 to 2015) and unregistered users (916 000 query log entries for a period of one year from January 2014 to February 2015). Below is an example of a single log entry of an unregistered user:
Session ID, Timestamp, Query, Document ID, URL clicked, URL title
P0pqhw45edag, 2014-03-09 07:35:07.443, pregnancy corticosteroids congenital malformations, 5008517, http://www.uktis.org//docs/Corticosteroids.pdf, Corticosteroids
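As a minimal sketch of how such an entry could be read into a structured record, the Python snippet below parses the example line. The `LogEntry` class and the `parse_log_line` function are hypothetical names introduced for illustration. The sketch assumes that fields are separated by a comma followed by a space and that no field contains an unquoted comma, which holds for the example but may not hold for every entry in the logs.

```python
import csv
from dataclasses import dataclass
from datetime import datetime
from io import StringIO


@dataclass
class LogEntry:
    # Field names mirror the header line of the log extract.
    session_id: str
    timestamp: datetime
    query: str
    document_id: str
    url_clicked: str
    url_title: str


def parse_log_line(line: str) -> LogEntry:
    """Parse one comma-separated log line (assumed layout, see lead-in)."""
    # skipinitialspace drops the blank that follows each comma.
    row = next(csv.reader(StringIO(line), skipinitialspace=True))
    # Raises ValueError if the line does not have exactly six fields.
    session_id, ts, query, doc_id, url, title = row
    return LogEntry(
        session_id=session_id,
        timestamp=datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f"),
        query=query,
        document_id=doc_id,
        url_clicked=url,
        url_title=title,
    )


entry = parse_log_line(
    "P0pqhw45edag, 2014-03-09 07:35:07.443, "
    "pregnancy corticosteroids congenital malformations, 5008517, "
    "http://www.uktis.org//docs/Corticosteroids.pdf, Corticosteroids"
)
print(entry.query)
```

The query field extracted this way is the text that would be passed on for annotation.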
We expand each query in the search log with semantic annotations of the query text using the KConnect annotation service. Concepts found in the text are annotated with their UMLS UID, semantic type, and one of the following semantic classes: Anatomy, Disease, Drug, and Investigation. The example above results in the following annotations:
Once the query logs are analysed, the user is provided with various interactive visualisations:
We now describe two of the available visualisations.
Most searched medical concept
We implemented visualisations based on medical concepts, which are useful because they group synonymous terms into a single concept. Moreover, we used the semantic classes of the medical concepts to organise the visualisations. The visualisation tool thus supports complex queries, such as identifying the Anatomy keywords most commonly searched by dentists (the profession comes from data stored for registered users):
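The aggregation behind such a view can be sketched roughly as follows. This is an illustrative Python example, not the application's actual code: the record layout, the `top_concepts` function, the profession labels, and the concept identifiers are all hypothetical placeholders (the identifiers are not real UMLS values). It assumes each annotated query carries the user's profession and a list of (concept identifier, semantic class) pairs.

```python
from collections import Counter

# Hypothetical annotated queries. Different surface forms such as "tooth"
# and "teeth" are assumed to have been mapped to the same placeholder ID.
annotated_queries = [
    {"profession": "dentist", "concepts": [("ID_TOOTH", "Anatomy")]},
    {"profession": "dentist", "concepts": [("ID_TOOTH", "Anatomy"),
                                           ("ID_CARIES", "Disease")]},
    {"profession": "dentist", "concepts": [("ID_JAW", "Anatomy")]},
    {"profession": "nurse", "concepts": [("ID_HEART", "Anatomy")]},
]


def top_concepts(queries, semantic_class, profession=None, n=10):
    """Return the n most frequent concepts of one semantic class.

    If profession is given, only queries from that profession are counted.
    """
    counts = Counter()
    for q in queries:
        if profession is not None and q["profession"] != profession:
            continue
        # A set ensures each concept is counted at most once per query.
        counts.update({cid for cid, cls in q["concepts"]
                       if cls == semantic_class})
    return counts.most_common(n)


print(top_concepts(annotated_queries, "Anatomy", profession="dentist"))
# [('ID_TOOTH', 2), ('ID_JAW', 1)]
```

Counting by concept identifier rather than by typed term is what lets synonyms contribute to the same bar in the resulting chart.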
We can also investigate how concepts interact with each other, for example, by visualising the most commonly searched treatments entered with a given keyword based on the query logs. Below we show the most common concepts of type Disease related to the keyword “pregnancy.”
To generate this visualisation, we take advantage of the semantic annotations of concepts. The following shows how, within a single query, the keyword “pregnancy” is related to the treatment “corticosteroids” and the disease “congenital malformations”.
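One plausible way to derive such relations is to count which concepts of a given semantic class co-occur with the keyword in the same query. The Python sketch below illustrates this under stated assumptions: the `related_concepts` function and the record layout are hypothetical, concept labels stand in for concept identifiers, and only the first query comes from the log example above, with the Drug class assumed for “corticosteroids”. The remaining queries are invented for illustration.

```python
from collections import Counter

annotated_queries = [
    # From the example log entry; semantic classes as described in the text.
    {"query": "pregnancy corticosteroids congenital malformations",
     "concepts": [("corticosteroids", "Drug"),
                  ("congenital malformations", "Disease")]},
    # Invented entries, for illustration only.
    {"query": "pregnancy hypertension",
     "concepts": [("hypertension", "Disease")]},
    {"query": "pregnancy congenital malformations folic acid",
     "concepts": [("congenital malformations", "Disease"),
                  ("folic acid", "Drug")]},
    {"query": "asthma corticosteroids",
     "concepts": [("asthma", "Disease"), ("corticosteroids", "Drug")]},
]


def related_concepts(queries, keyword, semantic_class, n=10):
    """Count concepts of semantic_class in queries that contain keyword."""
    counts = Counter()
    for q in queries:
        # Simple whitespace tokenisation; the keyword must match a whole token.
        if keyword.lower() not in q["query"].lower().split():
            continue
        counts.update({label for label, cls in q["concepts"]
                       if cls == semantic_class})
    return counts.most_common(n)


print(related_concepts(annotated_queries, "pregnancy", "Disease"))
# [('congenital malformations', 2), ('hypertension', 1)]
```

Passing a different semantic class, for instance `"Drug"`, would give the treatments searched together with the same keyword.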
We extended the query annotation to four further languages by simply replacing the KConnect English annotation pipeline used above with the KConnect annotation pipelines available for those languages: French, Hungarian, Swedish, and Czech.
For medical search log analysis, an interactive visualisation interface built on the KConnect annotation service gives decision makers a rapid overview of the queries submitted to a search engine and lets them drill deeper into the results for further insight.