Fraud-ad-detection-using-Natural-language-processing

Millions of ads are posted on Craigslist worldwide every day, largely anonymously, and millions of housing listings are posted each month. People looking for a new home cannot realistically vet every listing themselves. As per trends, about 6% of housing ads are spam. However, Craigslist cannot police and prosecute scammers all around the world.

This project intends to solve the house-hunting problem by filtering spam out of housing listings and sending users updates on new listings that match their selection criteria. Classified ad sites routinely process hundreds of thousands to millions of posted ads, and only a small percentage of those may be fraudulent. Online scammers often go to great lengths to make their listings look legitimate. Examples include copying existing advertisements from other services, tunneling through local proxies, and even paying for extra services using stolen account information.
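The filter-then-notify idea above can be sketched in a few lines. This is only an illustrative sketch, not the project's actual code: the function names `matches_criteria` and `select_updates`, the sample listings, and the placeholder `is_spam` rule are all assumptions made for this example. A real system would plug in a trained spam classifier instead of the placeholder.

```python
import re


def matches_criteria(listing_text, keywords):
    """Return True if every search keyword appears as a word in the listing."""
    words = set(re.findall(r"[a-z0-9]+", listing_text.lower()))
    return all(keyword.lower() in words for keyword in keywords)


def select_updates(new_listings, keywords, is_spam):
    """Keep listings that are not flagged as spam and match the user's keywords."""
    return [
        text
        for text in new_listings
        if not is_spam(text) and matches_criteria(text, keywords)
    ]


# Hypothetical sample data, invented purely for illustration.
sample_listings = [
    "Sunny studio with parking near the park",
    "Studio with parking, wire deposit before viewing",
    "Two bedroom house with garage",
]


# Placeholder rule standing in for a real classifier (an assumption).
def placeholder_is_spam(text):
    return "wire" in text.lower()


updates = select_updates(sample_listings, ["studio", "parking"], placeholder_is_spam)
```

Passing the spam check in as a function keeps the keyword matching independent of whichever detection model is eventually used.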

This project aims to provide value to both its client (Craigslist) and its users by addressing some key issues. High volumes of rental scams damage Craigslist's reputation and increase its user drop-off rate. Users spend hours searching for legitimate ads, and picking a genuine listing out of thousands of existing listings takes considerable time and resources.

This project applies data analysis and text analysis techniques to detect fraud in online classified housing ads, and builds an automated notification system that sends new listings matching a user's search keywords. Traditional data mining is used to extract relevant attributes from a database of online classified advertisements, and machine learning algorithms are applied to discover patterns and relationships that indicate fraudulent activity. With our proposed approach, we will demonstrate the effectiveness of data mining techniques for detecting fraud in online classified housing ads in major cities.
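As a rough illustration of applying text analysis and machine learning to ad text, the sketch below trains a tiny multinomial Naive Bayes classifier using only the Python standard library. Naive Bayes is an assumption chosen for brevity; this README does not state which algorithms the project uses. The class name `NaiveBayesAdClassifier`, the labels, and the four training ads are invented for the example and are not project data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesAdClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        # Count documents per class and word occurrences per class.
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior for each class."""
        total_docs = sum(self.class_counts.values())
        # Words never seen in training carry no information, so skip them.
        tokens = [tok for tok in tokenize(text) if tok in self.vocab]
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training ads, invented for illustration only.
train_texts = [
    "Spacious 2 bedroom apartment near downtown, call to schedule a viewing",
    "Sunny studio with parking and laundry, one year lease, pets allowed",
    "Owner overseas, wire deposit via western union and keys will be mailed",
    "No viewing possible, send deposit by wire transfer today to secure the house",
]
train_labels = ["legit", "legit", "spam", "spam"]

clf = NaiveBayesAdClassifier().fit(train_texts, train_labels)
suspicious = clf.predict("Please wire the deposit, keys will be mailed")
ordinary = clf.predict("Sunny apartment with parking, one year lease")
```

With realistic data the same structure would need far more labeled ads, plus the non-text attributes (such as posting metadata) that the data mining step extracts.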