Commit

fix: jekyll-offline works again, no issue with URI:Module and the code is simplified
Nawretard committed May 3, 2024
1 parent 961e1c2 commit e231ebb
Showing 8 changed files with 161 additions and 188 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
demo_offline
72 changes: 44 additions & 28 deletions README.md
@@ -1,64 +1,75 @@
# Jekyll Offline - Turn any Jekyll site into an offline application with relative links

[Jekyll Offline](https://dohliam.github.io/jekyll-offline) creates a copy of a specified Jekyll-based website and rewrites all of the internal links as relative URLs so that the site can be viewed from a local machine without requiring access to the Internet.
### Personal note
This repository is a fork that may eventually be merged into the original [Jekyll Offline](https://dohliam.github.io/jekyll-offline) repository.

Usually, Jekyll sites can be viewed either by uploading the generated site files to a remote server or locally using the `jekyll serve` command, which hosts the site temporarily on a local server. However, it is not always possible or practical to do this (for example, on a phone or other mobile device). As static sites, Jekyll sites are well-suited to running offline and usually present no special challenges (with the exception of some [known issues](#issues)) other than that all of the resources and links on a typical web page are relative to the root of the server, rather than the user's computer file structure.
It fixes the "undefined method `encode' for URI:Module" error raised by the original repository (`URI.encode` was deprecated and then removed in Ruby 3.0).

So long as the Jekyll-generated site code is available, the Jekyll executable itself is not even required for this script to work.
The code is a bit simplified, so it should be easier to maintain, but I may have broken things. Feel free to open issues and pull requests if you want to correct problems.

Also, I didn't understand every argument of the original config file, so there may be fewer features.

## Jekyll Offline
Creates a copy of a specified Jekyll-based website and rewrites all internal links as relative URLs so that the site can be viewed from a local machine without the need for Internet access.

All we need is the `_site` folder generated by Jekyll during a build for this script to work.

Typically, Jekyll sites can be viewed either by uploading the generated site files to a remote server or locally using the `jekyll serve` command, which temporarily hosts the site on a local server.

As Jekyll is a static site generator, page references can easily be replaced by local references. This ensures that local pages are always linked offline.

Jekyll Offline rewrites only the resources and links that point to the URL of the original Jekyll-based website, so external links (HTTP links to other websites, or e-mail addresses) keep working.
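
The distinction between internal and external links can be sketched as follows. This is only an illustration, not the code of this repository: the helper `internal_link?`, the constant `EXTERNAL_SCHEMES` and the sample URLs are assumptions made up for the example.

```ruby
# Hypothetical helper (not part of the repository): decides whether a link
# should be rewritten (internal) or left untouched (external).
EXTERNAL_SCHEMES = /\A(https?|ftp|mailto):/i

def internal_link?(address, site_url)
  # Absolute links that point to the original site are internal
  return true if address.start_with?(site_url)

  # Anything else with a web or mail scheme is external
  !address.match?(EXTERNAL_SCHEMES)
end

site_url = "http://yourdomain.com/"
links = [
  "http://yourdomain.com/about/", # absolute link to the site itself
  "/assets/main.css",             # root-relative resource
  "https://jekyllrb.com/",        # another website
  "mailto:someone@example.com"    # e-mail address
]

links.each do |link|
  action = internal_link?(link, site_url) ? "rewrite" : "keep"
  puts "#{link} -> #{action}"
end
```

Only the first two links would be rewritten; the last two are left exactly as they are in the page.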

## Requirements

The script (`jekyll_offline.rb`) can be used directly and does not need to be installed. There are no prerequisites other than [Ruby](https://www.ruby-lang.org/).
The script (`main.rb`) can be used directly and does not need to be installed. There are no prerequisites other than [Ruby](https://www.ruby-lang.org/).

If you have the source code for a Jekyll site that has not been generated yet, you will need to [install Jekyll](https://jekyllrb.com/) first and then build the site:

```
cd my_jekyll_site/
jekyll build
```

You can then point your Jekyll Offline configuration file at the resulting `_site` folder to convert it to a fully functioning offline site (for details, see the [Usage](#usage) section below).
That's it!

## Usage

### Demo
Clone or download the repository and enter the following command in a terminal from within the repo main directory:

./jekyll_offline.rb demo.yml
`./main.rb demo.yml`

This creates a new folder, `demo_offline`, in the same directory.

This will create a new folder named `demo_offline` in the same directory. This folder contains the offline version of the default Jekyll demo site. You can visit it by opening the file `START_HERE.html` within the `demo_offline` folder in a web browser.
This folder has the same structure as your source directory and is an offline version of the default Jekyll demo site.

To create your own new offline site, simply adjust the variables in the `config.yml` file and run the script again:
###

./jekyll_offline
To create your own new offline site, set these variables in `config.yml`:
- `source`: the path to the generated `_site` folder
- `target`: the directory in which the offline version of the site is created
- `site_url`: URL of the original site (it doesn't have to be up, it is only stripped from the `href` and `src` attributes of the HTML elements)

The configuration file is assumed to be `config.yml` by default, so it does not need to be specified unless you are using a different file.
Now you can run the script:
`./main.rb`

Note that the `config.yml` file should point to the generated `_site` folder of a Jekyll website, and not the unprocessed Jekyll source code.
The configuration file is `config.yml` by default, so you don't need to specify it unless you use another file.

**Warning:** the `source` attribute path has to be the generated `_site` folder and not the unprocessed Jekyll source code.
If you don't have it yet, simply use `jekyll build` at the root of your directory.
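
As a rough illustration of how such a configuration could be read, here is a minimal sketch. It assumes Ruby 2.6 or later; the function `load_offline_config` and the sample values are hypothetical, and the actual loading code in `main.rb` may differ.

```ruby
require 'yaml'

# Sample configuration using the same keys as config.yml (values are illustrative)
sample = <<~CONFIG
  :source: "demo/_site/"
  :target: "demo_offline"
  :site_url: "http://yourdomain.com/"
CONFIG

# Hypothetical loader: the keys are written as Ruby symbols (":source:"),
# so Symbol has to be explicitly permitted when using safe_load.
def load_offline_config(yaml_text)
  config = YAML.safe_load(yaml_text, permitted_classes: [Symbol])

  missing = [:source, :target, :site_url].reject { |key| config.key?(key) }
  raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?

  # Expand "~" and relative paths so that "~/Downloads" resolves to a real directory
  config[:source] = File.expand_path(config[:source])
  config[:target] = File.expand_path(config[:target])
  config
end

config = load_offline_config(sample)
puts "source:   #{config[:source]}"
puts "target:   #{config[:target]}"
puts "site_url: #{config[:site_url]}"
```

Checking for missing keys up front gives a clearer error than a failure later in the conversion.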

## Library

The methods in the `lib_rellinks.rb` library may be useful for relativizing links more generally in HTML pages other than Jekyll sites.
The methods in the `html_to_offline.rb` library may be useful for relativizing links more generally in HTML pages other than Jekyll sites.

It is extremely useful to have an offline version of a website that can work without an Internet connection or a local server, so it was quite surprising to find that a library to do this did not already exist.

This script has been used to create fully-functional offline versions of the [Global Storybooks](https://globalstorybooks.net) websites.

## Issues

There are some unexpected challenges with `file://` URIs that make them different from URIs loaded from a server (even one running on localhost).

* So-called "clean URLs" (where e.g., pages named `index.html` can be accessed by following a link to the parent directory without needing to add `index.html` at the end) are a feature of the webserver, and thus do not work with `file://` URIs.
* For example, if you have a file located at `/blog/index.html` and you link to it using `/blog`, it will work fine on a webserver, but on a local filesystem you will be taken to an index page listing all the files in the directory instead.
* Jekyll Offline handles this by rewriting these links so that they point to the actual file (e.g., `index.html`) instead of the parent directory.
* To be truly portable, all local links must be relative to some arbitrary "root" level that represents the top level of the site
* This means that the distance between each link on the site and the root directory must be calculated independently, which is what Jekyll Offline does.
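
The two points above can be sketched with Ruby's standard `Pathname` class. The helper `relativize` is a hypothetical name used only for this example, and it assumes that both the page and the link are given as paths from the site root.

```ruby
require 'pathname'

# Hypothetical helper: turns a root-relative link into a link that is
# relative to the page containing it.
def relativize(page, link)
  # Clean URLs: point at the actual file instead of the directory
  link += "index.html" if link.end_with?("/")
  link += "/index.html" if File.extname(link).empty?

  # Distance between the page's directory and the link target
  Pathname.new(link).relative_path_from(Pathname.new(page).dirname).to_s
end

puts relativize("/index.html", "/blog")                      # blog/index.html
puts relativize("/blog/index.html", "/blog/")                # index.html
puts relativize("/blog/2024/post.html", "/assets/main.css")  # ../../assets/main.css
```

Because the result depends on the directory of the page, the same link target yields a different relative path on each page.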

When crawling an online website, some of these issues can be resolved using a tool like Wget with option `--convert-links` for example. However, Jekyll Offline has been designed for cases where the entire website is already available locally, and simply needs to be adjusted slightly to run offline.

There are also some known issues remaining to be resolved:

* Due to [this 7 year-old unresolved bug](https://bugzilla.mozilla.org/show_bug.cgi?id=760436) in Firefox, local fonts will not load on a page unless they are placed in the same directory as the page. This can be quite problematic for sites with multiple pages, however as noted in the linked bug report, these sites should still work fine in other browsers.
* Note that this also means that Font Awesome and similar icon fonts will not work properly in Firefox. One way to work around this is to extract the icons you need and embed them into each page. This is of course impractical for large fonts.
* As suggested in the bug report, it may be possible to resolve this by setting `security.fileuri.strict_origin_policy` to `false` in `about:config`.
* There is currently an issue with converting sites in different locations than the script itself. While this is being resolved, it is recommended to place the `_site` folder and corresponding YAML configuration file in the same folder as (or a subfolder of) the `jekyll_offline.rb` script itself.
A lot of code has changed, so the previous issues may not apply anymore. If you encounter a problem, open an issue and maybe someone will resolve it!

## Contributing

@@ -67,3 +78,8 @@ If you encounter any problems while converting a Jekyll site, please open an iss
## License

MIT.

### Personal note:
I'm a novice, so I don't know what is allowed.

Based on [This article](https://www.gnu.org/philosophy/open-source-misses-the-point.en.html) I would like to license this fork under GPLv3 if it's possible.
13 changes: 3 additions & 10 deletions config.yml
@@ -1,12 +1,5 @@
# Default configuration for offline jekyll site generation

:absolute_base: "https://address.of.site.com/" # The full URL of the site as it might appear in any absolute links on the website. (these will be converted to relative links)
:relative_base: "/blog" # Optional - use this if the original site is located in a subdirectory
:source_dir: "~/Downloads/spiffy_website/_site/" # This should point to the _site folder of the generated Jekyll website
:out_dir: "~/my_website_offline/" # The desired output directory where you would like to generate the offline version of the site
# :custom_filter: "fn:|fnref:" # Optional: specify a custom string to filter -- URLS containing this string will not be rewritten

# boilerplate variables
:site_title: "My Website Title"
:site_url: "https://address.of.site.com/" # URL of the original site
:site_logo: "" # Image to use for site logo on intro page
:source: "~/workspace/root/_site" # This should point to the _site folder of the generated Jekyll website
:target: "~/Downloads" # The desired output directory where you would like to generate the offline version of the site
:site_url: "http://localhost:4000" # URL of the original site (http://localhost:4000/)
12 changes: 3 additions & 9 deletions demo.yml
@@ -1,11 +1,5 @@
# Demo configuration for offline jekyll site

:absolute_base: "https://address.of.site.com/"
:relative_base: "/demo"
:source_dir: "demo/_site/"
:out_dir: "demo_offline/"

# boilerplate variables
:site_title: "Demo Site"
:site_url: "https://dohliam.github.io/offline-jekyll/"
:site_logo: ""
:source: "demo/_site/"
:target: ""
:site_url: "http://yourdomain.com/"
66 changes: 66 additions & 0 deletions html_to_offline.rb
@@ -0,0 +1,66 @@
# Converts an HTML file to an offline HTML file
# Input :
# - page_path: html file path
# - main_folder_path: (kinda useless, but now we have the page relative path)
# - site_url: remove this from every link
# Output :
# - page_content, hopefully our html file adapted for local use
require 'pathname'

def convert_to_offline(page_path, main_folder_path, site_url) # Too lazy to improve the signature, sorry
  # Path relative to the working directory (the folder path is escaped so that
  # characters such as "." are not interpreted as regex syntax)
  page_relative_path = page_path.sub(/\A#{Regexp.escape(main_folder_path)}/, "")

  page_content = File.read(page_path)

  page_content = page_content.gsub(/(href|src)=["'](.*?)["']/) do |link|
    href = $1
    address = $2

    unless is_custom_filter?(address) || (is_url?(address) && !address.start_with?(site_url))
      # add index.html to folder links that end with "/"
      address += "index.html" if address.end_with?("/")

      # add /index.html to folder links that end like "/something" with no extension
      address += "/index.html" unless has_extension?(address)

      # remove the site_url from paths
      address.sub!(/^#{Regexp.escape(site_url)}/, "")

      # remove the multiple "/"
      address.gsub!(/\/+/, '/')

      # create the relative path from the page to the link address
      address = construct_relative_path(page_relative_path, "/") + address

      # remove the first "/"
      address = address.sub(/^\//, "")
    end
    href + "=" + "'#{address}'"
  end
  page_content
end

# True when a custom filter is configured and the address matches it.
# Assumes @config is set by the calling script; without it, nothing is filtered.
def is_custom_filter?(address)
  filter = @config && @config[:custom_filter]
  !filter.nil? && address.match?(/#{filter}/)
end

def is_url?(address)
  # Check if the address starts with common web references
  address.match?(/\A(http|https|ftp|mailto)/)
end

def has_extension?(file_path)
  !File.extname(file_path).empty?
end

def construct_relative_path(from_address, to_address)
  from_path = Pathname.new(from_address)
  to_path = Pathname.new(to_address)

  relative_path = to_path.relative_path_from(from_path.dirname).to_s

  # Fall back to the current directory when the relative path is empty
  relative_path.empty? ? '.' : relative_path
end
75 changes: 0 additions & 75 deletions jekyll_offline.rb

This file was deleted.

66 changes: 0 additions & 66 deletions lib_rellinks.rb

This file was deleted.
