Scraping, downloading, extracting and embedding.
- Scraping: Try to find a PDF link for a given query.
- Downloading: Download the PDF file from the scraped URL.
- Extracting: Extract text content from the PDF file as markdown.
- Embedding: Embed the extracted text and upsert the results into the vector database.
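
The four stages above can be sketched as a simple chain, where each stage's output feeds the next. This is only an illustrative outline, not the actual implementation: the class name `PdfPipeline`, the stage signatures, and the `fake_*` stand-ins are all hypothetical, and no real scraper, PDF parser, or vector database is called. It assumes scraping may fail to find a link, in which case the later stages are skipped.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class PdfPipeline:
    """Hypothetical wrapper that chains the four stages as injected callables."""

    scrape: Callable[[str], Optional[str]]       # query -> PDF URL, or None if not found
    download: Callable[[str], bytes]             # URL -> raw PDF bytes
    extract: Callable[[bytes], str]              # PDF bytes -> markdown text
    embed_and_upsert: Callable[[str, str], int]  # (doc id, markdown) -> records written

    def run(self, query: str) -> int:
        """Run all stages for one query and return the number of records upserted."""
        url = self.scrape(query)
        if url is None:
            # Scraping is best-effort: with no link, there is nothing to download.
            return 0
        pdf_bytes = self.download(url)
        markdown = self.extract(pdf_bytes)
        return self.embed_and_upsert(url, markdown)


# Stand-in stages so the sketch runs without network access or third-party packages.
store: Dict[str, List[float]] = {}  # assumed in-memory substitute for the vector db


def fake_scrape(query: str) -> Optional[str]:
    if not query:
        return None
    return "https://example.invalid/" + query.replace(" ", "-") + ".pdf"


def fake_download(url: str) -> bytes:
    return b"%PDF-1.4 placeholder"


def fake_extract(pdf_bytes: bytes) -> str:
    return "# Title\n\nBody text."


def fake_embed_and_upsert(doc_id: str, markdown: str) -> int:
    # A real embedding model would go here; the text length is a placeholder vector.
    store[doc_id] = [float(len(markdown))]
    return 1


pipeline = PdfPipeline(fake_scrape, fake_download, fake_extract, fake_embed_and_upsert)
```

Calling `pipeline.run("some query")` walks through all four stages. Passing the stages in as callables keeps each one replaceable and testable on its own, which is one reasonable design choice among several.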