The motivation behind this project is to help visually impaired people perceive their surroundings better. The user carries a blind stick or a pair of spectacles equipped with a camera, a microphone, and a speaker, which together act as a node. Whenever the user wants to perceive their surroundings, they press a button and the camera captures a picture. The captured image is sent to a remote server, which generates a description of what is going on in the image using a CLIP-based image-captioning model and sends it back to the node. The description is spoken out loud so the user can better understand their surroundings. The user can then ask follow-up questions about what they heard; their speech is converted to text and sent to the server, which answers using the ViLT Vision-and-Language (V&L) model and returns the answer to the node to be read out to the user.
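At a high level, the node-to-server exchange described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the message format and function names are assumptions, and the model inference calls are stubbed out with fixed strings.

```python
import base64
import json

def caption_image(image_bytes: bytes) -> str:
    """Stub for the server's CLIP-based image-captioning step."""
    return "a person crossing the street"

def answer_question(image_bytes: bytes, question: str) -> str:
    """Stub for the server's ViLT visual-question-answering step."""
    return "two people"

def handle_request(raw: str) -> str:
    """Server-side dispatch: first-press caption requests vs. follow-up questions."""
    msg = json.loads(raw)
    image = base64.b64decode(msg["image"])
    if msg["type"] == "caption":
        reply = caption_image(image)
    else:  # msg["type"] == "question"
        reply = answer_question(image, msg["question"])
    return json.dumps({"text": reply})

# Node side: button press -> capture -> request a caption of the scene.
payload = json.dumps({
    "type": "caption",
    "image": base64.b64encode(b"<jpeg bytes>").decode(),
})
response = json.loads(handle_request(payload))
print(response["text"])  # this text is spoken aloud to the user on the node
```

A follow-up question would reuse the same flow with `"type": "question"` and a `"question"` field transcribed from the user's speech.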
This repository requires Python 3.6.8.
Follow the steps below to set up this repository on your local machine:
- Clone the repository using
git clone https://github.com/Scientist90s/smart_blindstick
- Install Java, which is required to install one of the dependencies
- Navigate to the node folder and install the node dependencies using
pip3 install -r requirements_node.txt
- Navigate to the server folder and install the server dependencies using
pip3 install -r requirements_server.txt
- Open server.py and set your machine's IP address on line 25
self.host = "xxx.xxx.xxx.xxx"
- Run
python3 server.py
- Open node.py and, if you are not going to use a camera, set read_image to True on line 35
self.read_image = True
- Open another terminal and run
python3 node.py
- Make sure the server.py file is running
- The HTML page uses bulma.io, so if any Bulma-related errors occur, make sure it is installed.
- In main.js, change the URL to your host machine (the image_endpoint variable, in two functions)
- In speech.js, change the URL to your host machine (only one occurrence)
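The host address you set in server.py (and that the node connects to) follows the standard Python TCP socket pattern. The sketch below is a simplified stand-in for the real server, not its actual code: the class, the echo behavior, and the use of port 0 (letting the OS pick a free port) are illustrative assumptions.

```python
import socket
import threading

class Server:
    def __init__(self, host="127.0.0.1", port=0):
        # In server.py the host would be your machine's LAN IP so the
        # node can reach it; 127.0.0.1 and port 0 keep this sketch local.
        self.host = host
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.sock.bind((self.host, port))
        self.port = self.sock.getsockname()[1]  # actual bound port
        self.sock.listen(1)

    def serve_once(self):
        # Accept one connection and echo the message back with a prefix.
        conn, _ = self.sock.accept()
        with conn:
            data = conn.recv(4096)
            conn.sendall(b"echo: " + data)

server = Server()
threading.Thread(target=server.serve_once, daemon=True).start()

# Node side: connect to the server's host/port and exchange one message.
with socket.create_connection((server.host, server.port)) as c:
    c.sendall(b"hello")
    reply = c.recv(4096).decode()
print(reply)  # -> "echo: hello"
```

The key point for setup is that the node and the browser pages (main.js, speech.js) must all point at the same host address the server binds to; if they disagree, requests will simply time out.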