Converting Content to a Chat-Bot

Asking questions is one of the best ways to learn or understand information. A question-and-answer (Q&A) format is often more effective than reading long-form content, especially when time is critical. Moreover, searching for information and reading multiple documents just to find one fact is inefficient. Instead, we can ask a question about the problem directly and let the system retrieve the answer for us, which saves time.

Question answering (QA) is a field of information retrieval concerned with systems that automatically answer questions asked by humans. Such a system accepts, analyses and decomposes fixed-domain factual questions posed in natural language. It is based on machine learning algorithms combined with natural language processing (NLP). It can be of significant value in many domains, such as health care remedies. Text cleaning, NLP, similarity matrices and latent semantic analysis are important parts required to build the system. The main challenges in question answering are the lexical gap, ambiguity and multilingualism: in natural language the same question can be expressed in different ways, the same phrase can have different meanings, and the system is expected to recognise the language of a question and still return results.

In this article we describe the results from two techniques: 1. Similarity distance using words and 2. Latent Semantic Analysis (LSA).

Similarity distance

In statistics, a similarity measure is a real-valued function that quantifies the similarity between two objects. Similarity is usually derived from a distance metric. The simple matching coefficient counts both mutual presences (an attribute is present in both objects) and mutual absences (an attribute is missing from both) as matches and compares that count to the total number of attributes. The Jaccard coefficient, which counts only mutual presences and divides by the number of attributes present in at least one of the two objects, is a good metric for similarity between texts.
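As a minimal sketch of the Jaccard coefficient applied to titles, the Python snippet below treats each text as a set of lowercase words and scores a question against a few candidate titles. The function names (tokenize, jaccard, best_title) and the third title are hypothetical and chosen only for illustration; this is not necessarily how the system described in this article was implemented.

```python
import re


def tokenize(text):
    """Lowercase the text and return the set of its alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard coefficient: shared words divided by all distinct words."""
    set_a, set_b = tokenize(a), tokenize(b)
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty texts
    return len(set_a & set_b) / len(set_a | set_b)


def best_title(question, titles):
    """Return the title with the highest Jaccard score for the question."""
    return max(titles, key=lambda title: jaccard(question, title))


# The first two titles appear later in this article; the third is made up.
titles = ["How to Reduce Stress", "Managing Stress", "Treating a Common Cold"]
question = "How to reduce stress?"

for title in titles:
    print(f"{jaccard(question, title):.2f}  {title}")
print("Best match:", best_title(question, titles))
```

Because the score depends only on shared surface words, a title that repeats the question word for word wins, while a title that expresses the same idea in different words scores low.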


Latent Semantic Analysis

LSA is a method for extracting and representing the contextual-usage meaning of words by statistical computations applied to a large corpus of text. It uses a bag-of-words model, which results in a term-document matrix (the occurrences of terms in each document). LSA learns latent topics by performing a matrix decomposition of the term-document matrix using singular value decomposition (SVD). LSA is typically used as a dimensionality reduction technique.
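The steps above can be sketched in Python with NumPy: build a term-document count matrix, take its SVD, keep the top k left singular vectors, and compare the question with each document by cosine similarity in that reduced space. The function names, the choice of k and the four sample documents are assumptions made for illustration, and raw counts are used where a real system would likely apply a weighting such as TF-IDF. Treat it as a rough sketch rather than the implementation behind the results in this article.

```python
import re

import numpy as np


def tokenize(text):
    """Lowercase the text and return its alphabetic words in order."""
    return re.findall(r"[a-z]+", text.lower())


def build_term_document_matrix(docs):
    """Return (vocab, A) where A[i, j] counts term i in document j."""
    vocab = sorted({word for doc in docs for word in tokenize(doc)})
    index = {word: i for i, word in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in tokenize(doc):
            A[index[word], j] += 1
    return vocab, A


def lsa_scores(question, docs, k=2):
    """Cosine similarity of the question to each document in a k-dim latent space."""
    vocab, A = build_term_document_matrix(docs)
    index = {word: i for i, word in enumerate(vocab)}

    # SVD of the term-document matrix; singular values come back sorted descending.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    U_k = U[:, :k]  # top-k left singular vectors (latent "topics")

    # Bag-of-words vector for the question; unknown words are ignored.
    q = np.zeros(len(vocab))
    for word in tokenize(question):
        if word in index:
            q[index[word]] += 1

    # Project the question and every document onto the latent space.
    q_vec = U_k.T @ q        # shape (k,)
    doc_vecs = U_k.T @ A     # shape (k, n_docs)

    # Cosine similarity, leaving 0 where a projection is (numerically) zero.
    norms = np.linalg.norm(doc_vecs, axis=0) * np.linalg.norm(q_vec)
    return np.divide(q_vec @ doc_vecs, norms,
                     out=np.zeros(len(docs)), where=norms > 1e-12)


# Hypothetical mini-corpus, loosely modelled on the example in this article.
docs = [
    "how to reduce stress with guided imagery",
    "managing stress and coping with stressful events",
    "coping with change and learning ways to reduce tension",
    "symptoms and treatment of a common cold",
]
scores = lsa_scores("How to reduce stress?", docs, k=2)
for score, doc in sorted(zip(scores, docs), reverse=True):
    print(f"{score:+.2f}  {doc}")
```

Because documents are compared in the reduced space rather than word by word, a document can score well even when it shares few exact words with the question, provided its words tend to co-occur with the question's words elsewhere in the corpus. On a corpus this small the ranking is only illustrative.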

An Example

Question: How to reduce stress?
We retrieved the answer to this question from a medical website using the two techniques above. The answers are shown below:

1. Similarity Distance Method: The similarity distance method gives the following answer. As you can see, the technique searched for the exact words in the title.

How to Reduce Stress
Patricia Farrell, PhD: Well, we’ve had experiments that have shown things like guided imagery certainly have a great effect on people. We have seen people who have cardiac conditions, people who have major physical illnesses to use these kinds of techniques and it has been very effective to really help them with the stress, bring down their blood pressure, bring down so many things, enable them to get restful sleep which is very important, and don’t forget, stress is directly related to your immune system. Patricia Farrell, PhD (cont.): If you allow the stress to build to a point where it is truly high, it is truly going to be damaging on your immune system. It is going to leave you vulnerable to illness. It actually affects the ability of the body to fight off, you’ve heard people say oh, you’ve got a cold, well remember last week what happened, you really, that was a terrible situation you were in.

2. Latent Semantic Analysis: The LSA method gives the following answer. As you can see, the technique searched for the underlying meaning (semantics) in the title.

Managing Stress (Searches for underlying meaning)
Everyone experiences stress. Our bodies are designed to feel stress and react to it. It keeps us alert and ready to avoid danger. But, when stress persists, the body begins to break down and illnesses, like headaches and migraines, can occur. The key to coping with stress is identifying stressors in your life, learning ways to reduce stress, and managing stress. Stress is your reaction to any change that requires you to adjust or respond. It’s important to remember that you can control stress, because stress comes from how you respond to stressful events. Stress can be caused by anything that requires you to adjust to a change in your environment. Your body reacts to these changes with physical, mental, and emotional responses. We all have our own ways of coping with change, so the causes of stress can be different for each person. When you are not sure of the exact cause of your stress, it may help to know the warning signs of stress. Once you can identify these signs, you can learn how your body responds to stress. Then you can take steps to reduce it.