Word Sense Discrimination Using Statistic Analysis of Texts
For years, computer programs have been working to obtain information about certain entities such as persons, organizations or scientific concepts from the Web or from other sources. However, they have many challenges yet to overcome, for instance when texts refer to different entities that share the same name (e.g., a mouse can be an electronic device or a living creature). This article presents a method to solve this problem based on the frequency analysis of the words that are found in the vicinity of a target word. Each sense of the polysemous word or term is represented as a different group of other vocabulary units that tend to appear together with the target word in that sense. The appeal of the proposal is that it does not require prior knowledge about the language of the corpus or any other form of knowledge from the external world.
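The general idea of grouping occurrences of a target word by the words found around it can be illustrated with a minimal Python sketch. This is not the article's algorithm, only a toy version under stated assumptions: the function names, the window width, the document-frequency thresholds (`min_df`, `max_df`) and the greedy grouping rule are all hypothetical choices made for illustration. In keeping with the abstract, it uses no stop-word list or other language-specific resource; words that appear around nearly every occurrence are discarded on purely statistical grounds.

```python
import re
from collections import Counter


def context_windows(sentences, target, width=5):
    """Return one set of neighbouring words per occurrence of `target`."""
    windows = []
    for sentence in sentences:
        tokens = re.findall(r"\w+", sentence.lower())
        for i, tok in enumerate(tokens):
            if tok == target:
                left = tokens[max(0, i - width):i]
                right = tokens[i + 1:i + 1 + width]
                windows.append(set(left + right) - {target})
    return windows


def group_windows(windows, min_df=2, max_df=0.75):
    """Greedily group windows that share at least one informative word.

    A word counts as informative (an assumption of this sketch) when it
    occurs in at least `min_df` windows but in no more than a fraction
    `max_df` of all windows.
    """
    n = len(windows)
    df = Counter(word for win in windows for word in win)
    keep = {w for w, c in df.items() if c >= min_df and c / n <= max_df}

    groups = []
    for idx, win in enumerate(windows):
        cues = win & keep
        for group in groups:
            if cues & group["cues"]:
                # Shares a cue word with an existing group: join it.
                group["members"].append(idx)
                group["cues"] |= cues
                break
        else:
            # No overlap with any existing group: start a new one.
            groups.append({"members": [idx], "cues": set(cues)})
    return groups


sentences = [
    "click the mouse button on the computer screen",
    "the mouse ate cheese in the field",
    "plug the mouse into the computer usb port",
    "a cat chased the mouse across the field",
]
groups = group_windows(context_windows(sentences, "mouse"))
for group in groups:
    print(group["members"], sorted(group["cues"]))
```

On these four toy sentences, the occurrences of "mouse" near "computer" end up in one group and those near "field" in another, while "the" is dropped because it appears around every occurrence. A real corpus would need far more data and a more robust clustering step than this single greedy pass.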
The scientific journals of Hipatia Press, from October 5, 2013 to the present day, following the recommendations of the Budapest Open Access Initiative, publish:
Under a Creative Commons Attribution (CC BY) license, unless otherwise indicated. With this license the authors retain ownership of the rights to their article, but allow anyone to share (download, reprint, distribute and/or copy) and adapt (remix, transform, reuse, modify) it for any purpose, including commercial purposes, provided the original source is always cited. Thus no permission from the authors or publishers is required. Consult the informative version and the legal text of the license.
- Attribution: You must give proper credit, provide a link to the license, and indicate if changes have been made. You may do so in any reasonable way, but not in any way that suggests that you or your use is endorsed by the author.
THE VISUAL PAGES.
They are published under a Creative Commons Attribution-NonCommercial-NoDerivatives (CC BY-NC-ND) license, unless otherwise stated.
- Attribution: You must give proper credit, provide a link to the license, and indicate if changes have been made. You may do so in any reasonable way, but not in any way that suggests that you or your use is endorsed by the author.
- Non-Commercial: You may not use the material for commercial purposes.
- No Derivatives: If you remix, transform, or build on the material, the modified material may not be distributed.