Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automate the generation changelogs in text format #480

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

soham30rane
Copy link

Fixes #471 as discussed on sage-devel 2024-11-13

Necessary steps to take before achieving proper functionality:

  • Create Personal Access Token and add it to repo > Settings > Secrets and Variables > Actions > Repository Secrets
    with name PERSONAL_ACCESS_TOKEN
  • Add another secret with name CHANGELOG_TRIGGER_SECRET.
  • Also add both the secrets to sage repo with the exact same names and values.

I am also making a pr to sage repo to define a workflow which will in turn trigger generate_changelog.yml on every release.

Sample results of the script. (Executed it for release tag 10.5, which considered all pre-releases uptil now)
sage-10.5.txt

file.write(f"Sage {ver} was released on {date_of_release}. It is available from:\n\n")
file.write(f" * https://www.sagemath.org/download-source.html\n\n")
file.write(f"Sage (http://www.sagemath.org/) is developed by volunteers and\n")
file.write(f"combines hundreds of open source packages.\n\n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a slight deviation from the tradition, but it may be better to use (http://www.sagemath.org) with no trailing slash for a new start.

Copy link
Author

@soham30rane soham30rane Nov 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done here!

file.write(f"Of those, {len(first_contribs)} made their first contribution to Sage:\n\n")
for c in all_contribs:
file.write(f" - {c}{' [First Contribution]' if c in first_contribs else ''}\n")
pr_count = sum([len(all_info[tag]) for tag in all_info])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[First contribution] to align with other bracketed phrases.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done here!

@kwankyu
Copy link
Contributor

kwankyu commented Nov 24, 2024

Can I see an example run of the workflow?

@dimpase
Copy link
Member

dimpase commented Nov 24, 2024

I wonder about the need for PAT here. Can it be explained?

@soham30rane
Copy link
Author

soham30rane commented Nov 24, 2024

Can I see an example run of the workflow?

This commit is result of an example run of the workflow.
I triggered it mannually with the release tag of 10.0.
You could have a look here. The previous trigger failed because I forgot to add PERSONAL_ACCESS_TOKEN to the repo secrets.

Actually the workflow in this repo is expected to be triggered by this workflow to be located in main sage repo

@soham30rane
Copy link
Author

I wonder about the need for PAT here. Can it be explained?

PAT is required for using Github REST API

Need for Github REST API: Everytime a release occurs on main sage repo, it triggers the generation of changelog in this repo. To generate changelog, the script uses info from the release notes (which is fetched via Github API).
But even if we somehow manage to get the release note info available to the script without using the API, it is not enough, because it doesn't contain the reviewer info and multiple authors info for PRs.
Hence Github API is needed to individually fetch info about each of the PRs.

@dimpase
Copy link
Member

dimpase commented Nov 24, 2024

But isn't PAT only needed for the user running this particular Actions workflow?
This user need not be the user who submitted their PR. In our workflow, it can be a maintainer.

This action can be run by someone who has rights to approve the PR changing contributor's info. Other users don't seem to need a PAT, right?

@soham30rane
Copy link
Author

Other users don't seem to need a PAT, right?

Yes, you are right, only a single PAT needs to be created (by a maintainer) and stored in repository secrets.

This action can be run by someone who has rights to approve the PR changing contributor's info

Yes! And I just realized that we don't need CHANGELOG_TRIGGER_SECRET to verify the source of trigger, because the PAT is already ensuring that only those people who have rights to approve the PR can trigger the workflow via API call. Should I go ahead and do the related cleanup?

@dimpase
Copy link
Member

dimpase commented Nov 24, 2024

I admit don't have a complete understanding of Actions permissions model - please proceed as you see fit.

git config --global user.email "41898282+github-actions[bot]@users.noreply.github.com"
git add .
git commit -m "Added changelog for release: ${{ inputs.release_tag }}"
git push
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the user.name and user.email be simpler?

We may use a specific file name (src/changelogs/sage-......txt?) instead of . for safety?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the user.name and user.email be simpler?

Yes we could do that, but if we change the email then we won't see the black github-actions icon github-actions in the commits. Instead we would see it in gray disabled mode.

We may use a specific file name (src/changelogs/sage-......txt?) instead of . for safety?

Yes, will fix that


It fetches release data, extracts relevant pull request (PR) information, and generates a detailed changelog
and add its to src/changelogs.
The script uses the GitHub REST API to collect information about contributors, PR authors, and reviewers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add it to


def get_latest_tags():
url = f"{BASE_URL}/tags?per_page=1000" # If per_page is not specified then very few tags are fetched
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get recent tags in the first page, I think 100 would be enough.

#35637: Dima Pasechnik: Remark that WSL needs a lot of RAM [Reviewed by Tobias Diez]
#35638: Matthias Koppe: `build/pkgs/python3/spkg-configure.m4`: Add depcheck for zlib [Reviewed by Dima Pasechnik]
#35460: Kwankyu Lee: Update developer guide for workflows on github [Reviewed by Matthias Koppe, Sebastian Oehms]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous changelogs do not have an empty line at the end.

global first_contribs
all_contribs = set([git_to_name.get(c, f"@{c}") for c in all_contribs])
first_contribs = set([git_to_name.get(c, f"@{c}") for c in first_contribs])

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that we can retrieve the github user's real name via github api from his/her github account name. If this is the case, then we may create a new entry in conf/contributors.xml and then use the real name instead of the github account name in the created changelog. For example, we may create this entry

<contributor
 name="Marie Bonboire"
 github="marizee"/>

and use "Marie Bonboire" instead of @marizee. The entries in contributors.xml are listed in alphabetical order in the last name. We may use the fact to find the right place to put the new entry in.

But this is out of the scope of this PR. You may decide to do this extra work or not. Just let me know.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is doable, but some users might not have their real name on their Github Profile, in such case we seem to have no other option than to user their github username in the format @github-username.
I will work on it

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, we can retrieve the location of the user (if available on their GitHub profile). Then, for all contributors who have their GitHub username in conf/contributors.xml but don't have location information, we can add it.

I think it would be better to tackle this in a separate script, so I may put together another PR specifically for adding the location data

@kwankyu
Copy link
Contributor

kwankyu commented Nov 25, 2024

... only a single PAT needs to be created (by a maintainer) and stored in repository secrets.

It seems that the sagemath organization owner (not the admin of sagemath/sage repo) can create a PAT that allows (read)-access to sagemath/sage repo, and then the admin of sagemath/website should store the PAT as a secret of sagemath/sage repo.

Do you confirm this?

@soham30rane
Copy link
Author

Yes, I confirm that the sagemath organization owner can create a PAT, that allows read-access to sagemath/sage (since sagemath/sage is a public repo, giving 'read access to public repos' is also enough). BUT the PAT needs to be stored as a secret for both repos sagemath/website and sagemath/sage.

I will try to elaborate on the use of PAT here: (Its used in two parts)

  1. In create_changelog.py which will be located in sagemath/website. Here its used to retrieve PR related info using Github API

  2. In this workflow which will be located in sagemath/sage. Here its used to trigger generate_changelog.yml workflow after each release via Github API which will generate the changelog.

So for the 1st use case any valid PAT with 'read-access to public repos' will work. (And it should be stored in sagemath/website)
For 2nd use case, the PAT used has to belong to person who has rights to trigger a workflow run in sagemath/website (Obviously the org owner will have those rights). (And it should be stored in sagemath/sage)

This is the reason we need to store PAT in secrets of both repos sagemath/website and sagemath/sage. So we can store the PAT created by org owner (that allows read access to public repos) in secrets of both repos.

@kwankyu
Copy link
Contributor

kwankyu commented Nov 26, 2024

For public repos, no PAT may be necessary (or so my AI says :-) We could experiment soon.

@soham30rane
Copy link
Author

I tried running the workflow without a PAT, and it worked perfectly! Sorry for the confusion earlier, also thank you for your patience.
That said, for the second use case I mentioned, a PAT is still necessary. If you'd prefer to avoid creating a PAT, you could just sideline it and manually trigger generate_changelog.yml after every release. Whatever works best for the org!

@soham30rane
Copy link
Author

Okay, so the last time I triggered the workflow on my testing repo I had put a limit on the number of PRs fetched (just for the sake of saving time) and it seemed to work perfectly fine.
Today I ran the workflow after removing that limit and it gave me 'rate limit reached'.
I skimmed through their official documentation, and here they say The primary rate limit for unauthenticated requests is 60 requests per hour. This is not enough as there are more than 60 PRs in any stable release

@kwankyu
Copy link
Contributor

kwankyu commented Nov 27, 2024

So with a PAT, the rate limit is 5000 PRs? This would be big enough.

@soham30rane
Copy link
Author

Yeah, 5000 would be more than enough.

@haraldschilly
Copy link
Member

I'm not sure what's going on and how this is supposed to work, but I just opened the settings and saw there are also "Organization secrets" (and there is one for PYPI).

When I open the management form for that secret, I can add one there and tell it exactly which repositories should have access. So, I'm only guessing, but maybe you want the same secret (with the same value) to appear in two repositories? Then we could make this such an org secret for two repositories.

@kwankyu
Copy link
Contributor

kwankyu commented Nov 27, 2024

We can proceed in this way: Harald, as you are one of the organization owners, you can designate me as an admin of sagemath/website. Then I can work with the author to put his code in the repo, and then we can test how it works. And then if we need a PAT (perhaps but I am not sure yet), then you, as an organization owner, could make one for us. Then I can put that PAT as a secret to either repo sagemath/website or sagemath/sage as necessary.

@dimpase
Copy link
Member

dimpase commented Nov 27, 2024

I think it's better to have different PATs for different repos, just as with the usual passwords

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update https://www.sagemath.org/development-map.html using GitHub API
4 participants