Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Put crawlers into `crawler` package & rename them. Adjust paths in README
1 parent b3e22ef, commit e645375
Showing 7 changed files with 40 additions and 37 deletions.
@@ -1,32 +1,37 @@
 # Collection of Java-based web crawlers
 
 [![Java CI with Maven](https://github.com/andrei-punko/java-crawlers/actions/workflows/maven.yml/badge.svg)](https://github.com/andrei-punko/java-crawlers/actions/workflows/maven.yml)
 
 ## Prerequisites
 
 - Maven 3
 - JDK 21
 
 ## How to build
 
 ```
 mvn clean install
 ```
 
 ## Common crawler functionality
-- Your crawler should extend [WebCrawler](crawler-engine/src/main/java/by/andd3dfx/crawler/engine/WebCrawler.java)
-base crawler class
-- DTO class which describes collected data should implement
-[CrawlerData](crawler-engine/src/main/java/by/andd3dfx/crawler/dto/CrawlerData.java) marker interface
+
+- Your crawler should extend [WebCrawler](crawler-engine/src/main/java/by/andd3dfx/crawler/engine/WebCrawler.java)
+base crawler class
+- DTO class which describes collected data should implement
+[CrawlerData](crawler-engine/src/main/java/by/andd3dfx/crawler/dto/CrawlerData.java) marker interface
 
 ## Crawler for Orthodox torrent tracker [pravtor.ru](http://pravtor.ru)
-Check [SearchUtil](pravtor.ru-crawler/src/main/java/by/andd3dfx/pravtor/util/SearchUtil.java)
+
+Check [PravtorWebCrawler](pravtor.ru-crawler/src/main/java/by/andd3dfx/pravtor/crawler/PravtorWebCrawler.java)
+in `pravtor.ru-crawler` module for details
 
-To make search - run [run-search.bat](pravtor.ru-crawler/run-search.bat) script.
+To make search - use [run-search.bat](pravtor.ru-crawler/run-search.bat) script.
 Collected data will be placed into [result.xls](pravtor.ru-crawler/sandbox/result.xls) file in `sandbox` folder
 
-## Crawler for vacancies aggregator [rabota.by / hh.ru](http://rabota.by)
-Check [SearchUtil](rabota.by-crawler/src/main/java/by/andd3dfx/sitesparsing/rabotaby/SearchUtil.java)
+## Crawler for vacancies aggregator [rabota.by](http://rabota.by) (it's localized version of [hh.ru](http://hh.ru) in Belarus)
+
+Check [RabotabyWebCrawler](rabota.by-crawler/src/main/java/by/andd3dfx/rabotaby/crawler/RabotabyWebCrawler.java)
+in `rabota.by-crawler` module for details
 
-To make search - run `main()` method of [MainApp](rabota.by-crawler/src/main/java/by/andd3dfx/sitesparsing/rabotaby/MainApp.java)
+To make search - run `main()` method of [MainApp](rabota.by-crawler/src/main/java/by/andd3dfx/rabotaby/MainApp.java)
 class with populated output path in command line param
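
The "Common crawler functionality" section of the README asks a crawler to extend the `WebCrawler` base class and its DTO to implement the `CrawlerData` marker interface. A minimal sketch of that arrangement might look like the following. The diff does not show the real `WebCrawler` API, so the types below are hypothetical stand-ins: the `parsePage` and `crawl` methods, the `TitleData` record, and the `TitleCrawler` class are all assumptions made for illustration, and pages are plain strings rather than fetched HTML.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the repository's CrawlerData marker interface:
// it declares no methods and only tags a type as crawler output.
interface CrawlerData {
}

// Hypothetical stand-in for the WebCrawler base class. The real class's
// methods and signatures are not shown in this diff.
abstract class WebCrawler<T extends CrawlerData> {

    // Site-specific step (assumed): turn one page into one DTO.
    protected abstract T parsePage(String pageContent);

    // Shared step (assumed): parse every page and collect the results.
    public List<T> crawl(List<String> pages) {
        List<T> result = new ArrayList<>();
        for (String page : pages) {
            result.add(parsePage(page));
        }
        return result;
    }
}

// DTO describing the collected data; it implements the marker interface.
record TitleData(String title) implements CrawlerData {
}

// Concrete crawler: only the site-specific parsing lives here.
class TitleCrawler extends WebCrawler<TitleData> {
    @Override
    protected TitleData parsePage(String pageContent) {
        return new TitleData(pageContent.trim());
    }
}

class CrawlerSketch {
    public static void main(String[] args) {
        List<TitleData> items = new TitleCrawler().crawl(List.of(" first ", "second"));
        System.out.println(items);
    }
}
```

Bounding the type parameter with the marker interface lets the shared loop stay generic while the compiler rejects DTOs that were not tagged as crawler output.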