Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Observable change of execution order was introduced in #946 if two matches write same file #1000

Open
BinderDavid opened this issue Sep 1, 2023 · 7 comments

Comments

@BinderDavid
Copy link
Contributor

BinderDavid commented Sep 1, 2023

It is a bit difficult to summarize the problem in the description, but I have provided a reproducer below. The problem is that hakyll does not produce the correct build results upon the first invocation, but if we force the rebuild of the problematic files (by using site watch and changing something in the file), then hakyll produces the correct result.

The bug seems to have been introduced between 4.15.1.1 which does not exhibit the bug, and 4.16.1.0 which exhibits the bug. I will try to do a git bisect to identify the commit which is responsible.

Related Issue: haskellfoundation/error-message-index#459

How to reproduce

  • Checkout the error-message-index at commit fa5adac344b3c6659ed2af89488ef5bdf404a9d9 which uses hakyll ^>= 4.16.1.0 in its cabal file.
  • Build the site by going into the message-index subdirectory and executing cabal run -- site watch
  • Open the site in the browser and look at the error message for GHCup-00110: The generated HTML page is missing the header, title and footer sections.
  • Change the preamble of the file messages/GHCup-00110/index.md, for example by changing the title or removing one of the other fields.
  • Hakyll should now rebuild the page: The generated error page for the error GHCup-00110 now has the correct header, title and footer.

When comparing two versions it is important to clean the _cache directory between runs. It is otherwise possible that the faulty version reuses the cache from a run of the correct version, and you can no longer observe the error.

@BinderDavid
Copy link
Contributor Author

BinderDavid commented Sep 1, 2023

Result of git bisect:

david@anaxes ~/GitRepos/hakyll (master)$ git bisect start
david@anaxes ~/GitRepos/hakyll (master)$ git bisect bad
david@anaxes ~/GitRepos/hakyll (master)$ git bisect good v4.15.1.1
Bisecting: 30 revisions left to test after this (roughly 5 steps)
[5f00056269a636b26b2199d319f913a10169b07c] Examples: add blog.galowicz.de (#963)
david@anaxes ~/GitRepos/hakyll ((no branch, bisect started on master))$ git bisect good
Bisecting: 15 revisions left to test after this (roughly 4 steps)
[13b4697b7aa4e7715a88d7912799703de094ac5b] fix: Twitter Card use `name` instead `property` (#971)
david@anaxes ~/GitRepos/hakyll ((no branch, bisect started on master))$ git bisect good
Bisecting: 7 revisions left to test after this (roughly 3 steps)
[388565c6aa9a5c5241e94304f559290a784a0b65] Add example website and tutorials (#990)
david@anaxes ~/GitRepos/hakyll ((no branch, bisect started on master))$ git bisect good
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[9367a88b5be69e16048240893890282b8e70b759] CI: re-enable GHC 9.6.1 on Windows (#997)
david@anaxes ~/GitRepos/hakyll ((no branch, bisect started on master))$ git bisect good
Bisecting: 1 revision left to test after this (roughly 1 step)
[ec3365aac9ab6d6c6d5e1bb8f90ae6e4bb84ee4a] Allow aeson 2.2 (#999)
david@anaxes ~/GitRepos/hakyll ((no branch, bisect started on master))$ git bisect good
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[9696a859da5d5227edb79f08a85739a6f2d4d92a] Improve async runtime scaling (#946)
david@anaxes ~/GitRepos/hakyll ((no branch, bisect started on master))$ git bisect bad
9696a859da5d5227edb79f08a85739a6f2d4d92a is the first bad commit
commit 9696a859da5d5227edb79f08a85739a6f2d4d92a
Author: Jasper Van der Jeugt <[email protected]>
Date:   Wed Aug 23 19:24:20 2023 +0200

    Improve async runtime scaling (#946)

 hakyll.cabal                       |   1 -
 lib/Hakyll/Core/Logger.hs          |  78 +++---
 lib/Hakyll/Core/Runtime.hs         | 512 ++++++++++++++++++++++++-------------
 stack.yaml.lock                    |   2 +-
 tests/Hakyll/Core/Runtime/Tests.hs |  38 ++-
 tests/TestSuite/Util.hs            |   2 +-
 6 files changed, 398 insertions(+), 235 deletions(-)
david@anaxes ~/GitRepos/hakyll ((no branch, bisect started on master))$ 

The bug was therefore introduced in #946

@BinderDavid
Copy link
Contributor Author

BinderDavid commented Sep 1, 2023

The bug is independent on whether the threaded runtime is used or not. If I remove the ghc-options: -threaded -rtsopts -with-rtsopts=-N line in message-index.cabal, then the problem persists.

@BinderDavid BinderDavid changed the title Reproducible bug: hakyll needs to rebuild file for correct result Reproducible bug: hakyll needs to rebuild file for correct result (Introduced in #946) Sep 1, 2023
BinderDavid added a commit to haskellfoundation/error-message-index that referenced this issue Sep 1, 2023
We discovered a bug in the new scheduler for hakyll:
jaspervdj/hakyll#1000
For this reason, we are reverting back to hakyll
version 4.16.0.0 and disable the threaded runtime.
@jaspervdj
Copy link
Owner

I was able to reproduce this easily, thanks for the excellent instructions.

The issue seems in this snippet in message-index/site.hs:

  match "messages/*/*/index.md" $
    version "nav" $ do
      route $ setExtension "html"
      compile getResourceBody

  match "messages/*/*/index.md" $ do
    route $ setExtension "html"
    compile $ do
      files <- getExampleFiles
      thisMessage <-

For both versions, we have a route set that writes the final item to messages/*/*/index.html. So depending on which one triggers last, we get either a raw getResourceBody or an HTML file.

Deleting it from the first one fixed the issue for me -- we don't actually want the nav version written anywhere since that's just for navigation purposes. I think there are possibly other places in this file where we should fix this.

--- a/message-index/site.hs
+++ b/message-index/site.hs
@@ -57,7 +57,6 @@ main = hakyll $ do

   match "messages/*/*/index.md" $
     version "nav" $ do
-      route $ setExtension "html"
       compile getResourceBody

   match "messages/*/*/index.md" $ do

I think it's confusing behaviour and I would prefer Hakyll to throw an error if you write to the same file twice; so I will leave this open to track that.

@BinderDavid
Copy link
Contributor Author

BinderDavid commented Sep 1, 2023

Thanks, I have haskellfoundation/error-message-index#461 up. So the error is on our side, and we just could not observe the erroneous result before since we relied on a specific execution order that we shouldn't have depended on.

Edit: And I changed the title to reflect the fact that it is not a bug in hakyll :)

@BinderDavid BinderDavid changed the title Reproducible bug: hakyll needs to rebuild file for correct result (Introduced in #946) Observable change of execution order was introduced in #946 if two matches write same file Sep 1, 2023
@frasertweedale
Copy link
Contributor

In my sites I create a "recent" version of the files, which I iterate to produce "recent posts" lists.

match "posts/*" $ version "recent" $ do
  route $ setExtension "html"          
  compile ...

I use the $url$ field for hyperlinks in the "recent posts" template. urlField reads from the route. Removing the route declaration makes this empty, and the links are broken.

Is there a way to set the nominal route without writing the file and triggering this new warning? Suggestions or alternative approaches welcome!

@jaspervdj
Copy link
Owner

@frasertweedale It depends on what you're trying to do.

If it's about selecting a subset of the posts, you shouldn't need a different version for this. You can do something like recentFirst and then take the N first items from the list. Or you can filter on some metadata field, e.g.: https://github.com/jaspervdj/jaspervdj/blob/4d519d5d21c38c8896d3ddf6067aad006d82a0ec/src/Main.hs#L316

If the content of the file needs to be different (e.g because it's included in a list), you can do this by saving a snapshot at that time: https://github.com/jaspervdj/jaspervdj/blob/4d519d5d21c38c8896d3ddf6067aad006d82a0ec/src/Main.hs#L66

Otherwise... maybe I would try adding a new Context field to the "recent" version, which loads the "normal" version and retrieves the route for that?

Sorry to have broken your website!

@frasertweedale
Copy link
Contributor

frasertweedale commented Oct 10, 2023

@jaspervdj thanks for the pointers.

I resolved it by creating a new Context for retrieving the route of the associated "no version" identifier:

urlFieldNoVersion :: String -> Context a
urlFieldNoVersion key = field key $ \i -> do
  let ident = setVersion Nothing (itemIdentifier i)
      empty' = fail $ "No route url found for item " <> show ident
  fmap (maybe empty' toUrl) $ getRoute ident

I add it to the context:

  <> urlFieldNoVersion "url0"

and updated the template:

     $for(posts)$
         <li>
-            <a href="$url$">$fancyTitle$</a>
+            <a href="$url0$">$fancyTitle$</a>
         </li>
     $endfor$

edit: ah, but the atom feed links are still broken, because the shipped templates use $url$

edit 2: this too I have now resolved, by re-snapshotting the content from the "recent" item in the compiler for the no-version, then updating the atom Rules to load those snapshots, which have routes and therefore the $url$ in the template works again.

edit 3: here's the full diff, in case it is useful to someone: frasertweedale/blog-fp@8eb6866

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants