- A service that scraps the news article containing the answer when a user submits a question
- Scrap feature for saving articles you want to read again
- AI scrap feature that finds answer articles for real-time questions
- AI scrap feature that refreshes the article list at set intervals and then scraps answer articles for the question (see the sketch after this list)
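At its core, the AI scrap flow is sparse retrieval (rank candidate articles for the question) followed by an extractive reader (pull the answer span from the top article). Below is a minimal sketch of that flow, assuming `rank_bm25` for retrieval and a Hugging Face `question-answering` pipeline as a stand-in reader; the function name, article fields, and checkpoint are illustrative, not the project's actual code.

```python
# Minimal sketch of the AI-scrap flow: BM25 retrieval + extractive reader.
# ai_scrap, the article fields, and the checkpoint are illustrative assumptions.
from typing import Dict, List

from rank_bm25 import BM25Okapi
from transformers import pipeline


def ai_scrap(question: str, articles: List[Dict]) -> Dict:
    # 1. Retrieve: score every article body against the question with BM25.
    bm25 = BM25Okapi([a["body"].split() for a in articles])
    scores = bm25.get_scores(question.split())
    best = articles[int(scores.argmax())]

    # 2. Read: extract the answer span from the best-matching article.
    #    A public QA checkpoint is used here as a stand-in for the project's trained reader.
    reader = pipeline("question-answering", model="deepset/roberta-base-squad2")
    result = reader(question=question, context=best["body"])

    # 3. Scrap: return what would be stored for the user's AI scrap list.
    return {"title": best["title"], "answer": result["answer"], "score": result["score"]}
```

In the real service, the periodic batch job would refresh the article list from the news DB before running a step like this and store the result for the AI scrap page.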
| 박별이 | 이준수 | 최웅준 | 추창한 |
|---|---|---|---|
| github | github | github | github |
| | 박별이 | 이준수 | 최웅준 | 추창한 |
|---|---|---|---|---|
| Data collection<br>(make test dataset and analysis) | common | common | common | common |
| Code refactoring | Retrieval | post_processing<br>train | extraction_pre_process<br>generation_pre_process<br>generation_compute_metrics<br>configuration<br>building tiny dataset | Retrieval |
| User flow / Data flow | User Flow<br>Data Flow | training pipeline | User Flow<br>Data Flow | |
| Modeling | Apply BM25 | build train dataset<br>model training | train with tiny dataset<br>training reader model<br>error analysis on generation model | Apply BM25 |
| Prototyping | reader model demo | ODQA model / Batch Serving | | |
| Frontend | web design<br>sign in<br>sign up<br>news scrap | article_form<br>performance improvement with UI policy | homepage_news title list<br>ai scrap news title list<br>my scrap news title list | performance improvement with UI policy |
| Backend | build sqlite schema<br>sign in<br>sign up<br>news scrap | user_input | homepage_news title list with wiki_news_db<br>ai scrap news title list with ai_scrap_db<br>my scrap news title list with user_scrap_db | build layered architecture design<br>get article page and user_input with real time service<br>batch serving |
- Recommended Python version: 3.8.5
$ conda create -n venv python=3.8.5 pip
$ conda activate venv
$ cd $ROOT/final-project-level3-nlp-19/code
$ poetry install
$ poetry shell
code
├── routers/
├── schema/
├── services/
├── templates/
├── AIPaperboy.py
└── model training files (.py)
Four folders are used for serving (see the sketch after this list):
- routers: Controller
- schema: Model
- services: the project's functions
- templates: HTML & CSS files
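A minimal sketch of how these folders typically fit together, assuming a FastAPI app (suggested by the `routers`/`schema`/`templates` layout); every identifier below is illustrative rather than taken from the project:

```python
# Minimal sketch of the layered serving structure; assumes FastAPI.
# All names below are illustrative, not the project's actual identifiers.
from fastapi import APIRouter, FastAPI
from pydantic import BaseModel


# schema/ -> Model: request/response shapes
class QuestionInput(BaseModel):
    question: str


# services/ -> project functions: retrieval, reading, and DB access would live here
def find_answer_article(question: str) -> dict:
    return {"title": "placeholder title", "answer": "placeholder answer"}


# routers/ -> Controller: HTTP endpoints that delegate to services
router = APIRouter()


@router.post("/ai-scrap")
def ai_scrap_endpoint(payload: QuestionInput) -> dict:
    return find_answer_article(payload.question)


# AIPaperboy.py -> entry point: wires routers together and serves pages from templates/
app = FastAPI()
app.include_router(router)
```

With this split, pages in `templates/` would be rendered with something like `Jinja2Templates`, so the controllers stay thin and the ODQA logic lives entirely in `services/`.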
$ cd $ROOT/final-project-level3-nlp-19/code
$ python train_copy.py --output_dir ./outputs --run_extraction True --run_generation False --do_train --do_eval \
--evaluation_strategy 'steps' --eval_steps 60 --logging_steps 60 --per_device_eval_batch_size 16 \
--per_device_train_batch_size 16 --save_strategy "no" --fp16 True --fp16_full_eval True --num_train_epochs 9 --report_to "wandb" \
--overwrite_output_dir
$ python inference_copy.py --output_dir ./outputs/test_dataset/ --dataset_name ../data/test_dataset/ --model_name_or_path ./models/train_dataset/ --do_predict --overwrite_cache --overwrite_output_dir
$ cd $ROOT/final-project-level3-nlp-19/code
$ python AIPaperboy.py --output_dir ./outputs/test_dataset/ --model_name_or_path ./models/train_dataset/ --dataset_name ../data/test_dataset/ --do_predict