
Update downloader.sh #18

Open · fabriziosalmi wants to merge 1 commit into main

Conversation

fabriziosalmi

Hello my Lord.... I'm trying to contribute to this gem, so I want to check whether PRs are welcome by proposing a simple update for a single downloader. If appreciated, I can cover all the downloaders quickly :)

List of improvements:

  • Dependency check: Ensures curl, dig, jq, and mktemp are installed before proceeding.
  • Error handling in downloads: Added retries and error messages for failed downloads.
  • Background download: Parallelized downloads for faster execution using & and wait.
  • Error handling in dig calls: Added fallback || true to prevent script failure if DNS resolution fails.
  • Recursive SPF record resolution: Improved robustness with additional error handling.
  • Temp directory with trap: Ensured that the temporary directory is always cleaned up upon script exit.
  • Improved sorting: Used sort -u for efficient sorting and deduplication in one step.
  • Output directory creation: Ensured the output directory is created before saving the results.
  • File size check: Added validation to check if the output files are empty or failed to generate.
  • Logging: Enhanced logging for error messages and successful file creations. (The last three items are sketched right after this list.)

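The last three items look roughly like this in the updated script (a sketch only: the paths and log messages below are illustrative, and the exact code is in the diff):

    # Ensure the output directory exists, then sort and deduplicate in one step
    output_dir="google"
    mkdir -p "$output_dir"
    sort -u /tmp/google-ipv4.txt > "$output_dir/ipv4.txt"

    # -s is true only if the file exists and is non-empty
    if [ ! -s "$output_dir/ipv4.txt" ]; then
        echo "Error: $output_dir/ipv4.txt is empty or was not generated" >&2
        exit 1
    fi
    echo "Successfully created $output_dir/ipv4.txt"
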
Owner @lord-alfred left a comment

Thanks for the great MR, but I don't understand why there are so many changes: this script runs in GitHub workers and everything works fine there. After these optimizations it looks like the script could run on any machine, but that is not the goal.

set -euo pipefail
set -x

# Check for required dependencies
Owner

The dependency check isn't needed for this script, because it runs only in GitHub workers.
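
For reference, the guard under discussion looks roughly like this (a sketch reconstructed from the tools named in the PR description and the fi/done context below; the exact loop body may differ):

    for cmd in curl dig jq mktemp; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "Error: required command '$cmd' is not installed" >&2
            exit 1
        fi
    done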

fi
done

# Create a temporary directory and ensure cleanup on exit
Owner

I think it's not needed:

  • /tmp is available every time
  • the worker lives less than 1 minute
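
The pattern being debated is, in sketch form (variable names are illustrative, not necessarily the PR's):

    temp_dir="$(mktemp -d)"
    # remove the temporary directory even if the script exits early
    trap 'rm -rf "$temp_dir"' EXIT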


# Function to download files with retries and error handling
Owner

Retries aren't needed either: this repo has existed for ~3 years and I've never had this problem.
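
For reference, a retry wrapper of the kind described in the PR; this sketch leans on curl's built-in --retry flag, so treat it as illustrative rather than the PR's exact code:

    download_file() {
        local url="$1" dest="$2"
        # -f fails on HTTP errors; --retry repeats transient failures
        if ! curl -sf --retry 3 --retry-delay 2 "$url" -o "$dest"; then
            echo "Error: failed to download $url after retries" >&2
            exit 1
        fi
    }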

# get from public ranges
curl -s https://www.gstatic.com/ipranges/goog.txt > /tmp/goog.txt
curl -s https://www.gstatic.com/ipranges/cloud.json > /tmp/cloud.json
# Parallel downloads with retries
Owner

How many processor cores are available in GitHub workers? Why would this be necessary when downloading files smaller than 1 MB?

Author

On the free plan, if I'm not wrong, you may use up to 2 cores :)
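
Applied to the two downloads quoted above, the proposed pattern is simply:

    curl -s https://www.gstatic.com/ipranges/goog.txt > /tmp/goog.txt &
    curl -s https://www.gstatic.com/ipranges/cloud.json > /tmp/cloud.json &
    wait   # block until both background jobs finish

One caveat: a bare wait returns 0 even if a background curl failed, so errors from parallel downloads need their own checks.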

# Public GoogleBot IP ranges
# From: https://developers.google.com/search/docs/advanced/crawling/verifying-googlebot
curl -s https://developers.google.com/search/apis/ipranges/googlebot.json > /tmp/googlebot.json
# Fetch Google netblocks using dig command
Owner

This seems like it could be useful

fetch_netblocks() {
local idx=2
local txt
txt="$(dig TXT _netblocks.google.com +short @8.8.8.8 || true)"
Owner

|| true will hide network problems that we might otherwise have noticed; in that case we just won't know anything happened?

Author @fabriziosalmi Oct 28, 2024

Hmm... maybe the best option would be to collect errors as artifacts for further investigation... but as you already said, it works, and I completely agree with you: this PR is not needed :) Keep it as my thank-you message for your immensely useful repo. I've been using it every day for years :)
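
A hypothetical middle ground along those lines would log the failure instead of discarding it with || true (the log path here is made up for illustration):

    if ! txt="$(dig TXT _netblocks.google.com +short @8.8.8.8)"; then
        # record the failure for later inspection instead of hiding it
        echo "dig failed for _netblocks.google.com" >> /tmp/dns-errors.log
    fi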

get_dns_spf() {
  dig @8.8.8.8 +short txt "$1" |
  tr ' ' '\n' |
-  while read entry; do
+  while read -r entry; do
Owner

   -r     do not allow backslashes to escape any characters

What's that for?
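
The practical difference: without -r, read treats a backslash as an escape character and silently drops it, for example:

    printf '%s\n' 'a\b' | while read entry;    do echo "$entry"; done   # prints: ab
    printf '%s\n' 'a\b' | while read -r entry; do echo "$entry"; done   # prints: a\b

SPF tokens shouldn't contain backslashes, so here -r is mostly a defensive habit (it is also what ShellCheck recommends, SC2162).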

grep ':' /tmp/netblocks.txt >> /tmp/google-ipv6.txt
# Sort and deduplicate results, and ensure target directory exists
output_dir="google"
mkdir -p "$output_dir"
Owner

The directory already exists; this file is already in it.

# Sort and deduplicate results, and ensure target directory exists
output_dir="google"
mkdir -p "$output_dir"
sort -u "$temp_dir/google-ipv4.txt" > "$output_dir/ipv4.txt"
Owner

sort -V:

   -V, --version-sort
         natural sort of (version) numbers within text

Without this sorting, it's gonna look bad, see #5 - Q3
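
For example, lexicographic sort misorders CIDR blocks that version sort handles (and -V combines with deduplication as sort -Vu):

    $ printf '10.2.0.0/16\n10.10.0.0/16\n' | sort
    10.10.0.0/16
    10.2.0.0/16
    $ printf '10.2.0.0/16\n10.10.0.0/16\n' | sort -V
    10.2.0.0/16
    10.10.0.0/16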
