
Simplifying drug package leaflets written in Spanish by using word embedding

Overview of attention for article published in Journal of Biomedical Semantics, September 2017

About this Attention Score

  • Good Attention Score compared to outputs of the same age (65th percentile)
  • Good Attention Score compared to outputs of the same age and source (75th percentile)

Mentioned by

5 X users

Citations

12 Dimensions

Readers on

32 Mendeley
Title: Simplifying drug package leaflets written in Spanish by using word embedding
Published in: Journal of Biomedical Semantics, September 2017
DOI: 10.1186/s13326-017-0156-7
Authors: Isabel Segura-Bedmar, Paloma Martínez

Abstract

Drug Package Leaflets (DPLs) provide information for patients on how to safely use medicines. Pharmaceutical companies are responsible for producing these documents. However, several studies have shown that patients usually have problems understanding the sections describing posology (dosage quantity and prescription), contraindications and adverse drug reactions. The ultimate goal of this work is to provide an automatic approach that helps these companies write drug package leaflets in easy-to-understand language. Natural language processing has become a powerful tool for improving patient care and advancing medicine because it makes it possible to automatically process the large amount of unstructured information needed for patient care. However, to the best of our knowledge, no research has been done on the automatic simplification of drug package leaflets. In previous work, we proposed using domain terminological resources to gather a set of synonyms for a given target term. A potential drawback of this approach is that it depends heavily on the existence of dictionaries, which are not available for every domain and language and, where they do exist, often have very limited coverage. To overcome this limitation, we propose the use of word embeddings to identify the simplest synonym for a given term.

Word embedding models represent each word in a corpus as a vector in a semantic space. Our approach is based on the assumption that synonyms should have close vectors because they occur in similar contexts. In our evaluation, we used the EasyDPL (Easy Drug Package Leaflets) corpus, a collection of 306 leaflets written in Spanish and manually annotated with 1400 adverse drug effects and their simplest synonyms. We focus on leaflets written in Spanish because it is the second most widely spoken language in the world, yet terminological resources are usually scarcer for Spanish than for English.

Our experiments show an accuracy of 38.5% using word embeddings. This work provides a promising approach to simplifying DPLs without using terminological resources or parallel corpora. Moreover, it could be easily adapted to different domains and languages. However, more research is needed to improve our word embedding approach because it does not yet outperform our previous dictionary-based work.
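
The idea in the abstract, that synonyms should have close vectors, can be illustrated with a small sketch. This is not the authors' implementation: the three-dimensional vectors, the frequency counts, and the function names below are all hypothetical, and using corpus frequency as a proxy for "simplest" is an assumption made only for this example (the abstract does not say how simplicity is decided). Multi-word expressions are treated as single tokens here, which is also an assumption.

```python
import math

# Hypothetical toy embeddings; a real model would be trained on a large corpus.
EMBEDDINGS = {
    "cefalea": [0.9, 0.1, 0.0],
    "dolor de cabeza": [0.85, 0.15, 0.05],
    "migraña": [0.8, 0.3, 0.1],
    "náuseas": [0.1, 0.9, 0.2],
    "fiebre": [0.0, 0.2, 0.9],
}

# Hypothetical corpus counts, used here as a stand-in for how "simple" a term is.
FREQUENCY = {
    "cefalea": 40,
    "dolor de cabeza": 500,
    "migraña": 120,
    "náuseas": 300,
    "fiebre": 450,
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbours(term, k=2):
    """Return the k vocabulary entries whose vectors are closest to the term's vector."""
    target = EMBEDDINGS[term]
    scored = [(cosine(target, vec), word)
              for word, vec in EMBEDDINGS.items() if word != term]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


def simplest_synonym(term, k=2):
    """Pick the most frequent of the k nearest neighbours (assumed simplicity proxy)."""
    candidates = nearest_neighbours(term, k)
    return max(candidates, key=lambda word: FREQUENCY.get(word, 0))


print(nearest_neighbours("cefalea"))
print(simplest_synonym("cefalea"))
```

With these toy values, the neighbours of "cefalea" are the two headache-related entries, and the more frequent of the two is returned as the candidate replacement.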

X Demographics

The data shown below were collected from the profiles of 5 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 32 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    32       100%

Demographic breakdown

Readers by professional status    Count    As %
Student > Ph. D. Student          5        16%
Student > Master                  5        16%
Student > Doctoral Student        3        9%
Professor                         2        6%
Researcher                        2        6%
Other                             6        19%
Unknown                           9        28%

Readers by discipline                                  Count    As %
Computer Science                                       10       31%
Linguistics                                            4        13%
Pharmacology, Toxicology and Pharmaceutical Science    2        6%
Medicine and Dentistry                                 2        6%
Social Sciences                                        1        3%
Other                                                  3        9%
Unknown                                                10       31%

Attention Score in Context

This research output has an Altmetric Attention Score of 4. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 10 October 2017.
  • All research outputs: #6,925,833 of 23,003,906 outputs
  • Outputs from Journal of Biomedical Semantics: #129 of 364 outputs
  • Outputs of similar age: #111,112 of 321,103 outputs
  • Outputs of similar age from Journal of Biomedical Semantics: #5 of 20 outputs
Altmetric has tracked 23,003,906 research outputs across all sources so far. This one has received more attention than most of these and is in the 69th percentile.
So far Altmetric has tracked 364 research outputs from this source. They receive a mean Attention Score of 4.6. This one has gotten more attention than average, scoring higher than 64% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 321,103 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 65% of its contemporaries.
We're also able to compare this research output to 20 others from the same source and published within six weeks on either side of this one. This one has done well, scoring higher than 75% of its contemporaries.
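
The percentages quoted above are consistent with the ranks listed on this page if a percentile is taken as the share of outputs ranked below this one, rounded down. The sketch below checks that arithmetic; the formula and the function name `percent_outscored` are assumptions made for illustration and are not a documented Altmetric calculation.

```python
def percent_outscored(rank, total):
    """Whole-number share of outputs ranked below the given rank (rounded down)."""
    return (total - rank) * 100 // total


# Rank and total for each comparison group, as listed on this page.
comparisons = {
    "All research outputs": (6_925_833, 23_003_906),
    "Outputs from Journal of Biomedical Semantics": (129, 364),
    "Outputs of similar age": (111_112, 321_103),
    "Outputs of similar age from Journal of Biomedical Semantics": (5, 20),
}

for label, (rank, total) in comparisons.items():
    print(f"{label}: {percent_outscored(rank, total)}%")
```

Under this assumption the four groups come out at 69%, 64%, 65% and 75%, matching the figures in the text.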