Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: repository cloner #659

Merged
merged 9 commits into from
Aug 20, 2024
Merged

feat: repository cloner #659

merged 9 commits into from
Aug 20, 2024

Conversation

sgaist
Copy link
Collaborator

@sgaist sgaist commented Jul 31, 2024

Describe your changes

This PR implements support for the CodeRepository management.

The added functionalities:

  • Cloner helper program that supports both username/password and ssh key based authentication
  • Creation of init containers to clone the repositories.

The current implementation creates one init container per CodeRepository entry.

Known limitations:

  • While the cloner helper supports ssh remotes, it will fail when run in a container as the remote will not be part of the known hosts. One solution would be to provide the known_hosts file through a ConfigMap mounted in the init containers. This would allow users to add private registries in a secure fashion.
  • When restarting the pod, clone will fail as the target already exists.

For that last point two possible things can be done:

  • Test if the folder exists and if so don't clone and call it a success
  • Add an annotation to the pod that can be checked and not re-create the clone containers.

Tests are currently a skeleton and must be improved.

Issue ticket number and link

Fixes: #612
Part of SwissDataScienceCenter/renku#3679

Changes this field means that the content of the storage
should be modified but how to know what is the right thing
to do between update, removal, etc. Recreating the full
session is the cleaner option.
The cloner currently only needs authentication information.
As the same suggest it will be used to clone repositories
associate to a session.
Current implementation uses one init container per repository.
@sgaist sgaist requested review from olevski and a team as code owners July 31, 2024 15:38
Copy link
Member

@olevski olevski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sgaist, this looks really good.

A few things:

  • Please check if cloning with username and password saves the credentials in the repository's git configuration in plain text. If I am not mistaken this is what happens when you clone with the regular git cli and use credentials. If the credentials are stored you can modify the repo's git config with functions from the go-git package.
  • I think we should rename AuthConfig to Config or CloneConfig because I think we will need to add more stuff there in the future. For example we may want to support cloning via a proxy.
  • Can you please check if the PlainClone function actually respects/loads the global/user git configs at all. My impression is that it does not. I looked through the code for it and it seems like there are no calls to load the user or global git configs. It just loads the config inside the repository after the repository is initialized. If the global/user configs are not loaded at all then we should change the CRD spec and simply not ask for the git config to be used when cloning.

If you want just open issues for all the three things above here. And we can tackle them after this is merged.

Comment on lines +164 to +167
// For 'git' this should contain either:
// The username and password
// The private key and its corresponding password
// An empty value can be used when cloning from public repositories using the http protocol
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this means that we would expect specific keys to have specific values in the secret. Or that the secret will have a specific format. If yes, then we should document these values here.

cloner/cmd/clone.go Outdated Show resolved Hide resolved
cloner/cmd/clone.go Outdated Show resolved Hide resolved
cloner/cmd/clone.go Show resolved Hide resolved
cloner/cmd/clone.go Outdated Show resolved Hide resolved
@olevski
Copy link
Member

olevski commented Aug 9, 2024

One more thing... We can actually clone a repository by using the git config. You just have to use the lower level functions from go-git. See here for example: https://gist.github.com/olevski/606c6168b6072cabadaca09e89744414

But lets leave this as an optional change for another time. I just wanted to have a reference to it somewhere.

sgaist and others added 3 commits August 20, 2024 10:46
Better explanation for string.FieldsFunc use

Co-authored-by: Tasko Olevski <[email protected]>
go-git already provides the tools required.
@sgaist
Copy link
Collaborator Author

sgaist commented Aug 20, 2024

Thanks @sgaist, this looks really good.

A few things:

* Please check if cloning with username and password saves the credentials in the repository's git configuration in plain text. If I am not mistaken this is what happens when you clone with the regular git cli and use credentials. If the credentials are stored you can modify the repo's git config with functions from the `go-git` package.

From tests, the auth information are not stored in the cloned repository.

* I think we should rename `AuthConfig` to `Config` or `CloneConfig` because I think we will need to add more stuff there in the future. For example we may want to support cloning via a proxy.

Good point and done

* Can you please check if the `PlainClone` function actually respects/loads the global/user git configs at all. My impression is that it does not. I looked through the code for it and it seems like there are no calls to load the user or global git configs. It just loads the config inside the repository after the repository is initialized. If the global/user configs are not loaded at all then we should change the CRD spec and simply not ask for the git config to be used when cloning.

You are correct. There's a long talk about it here: src-d/go-git#760

@olevski olevski self-requested a review August 20, 2024 11:45
@sgaist sgaist merged commit e86a23d into main Aug 20, 2024
18 checks passed
@sgaist sgaist deleted the feature/repository-cloner branch August 20, 2024 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Code cloning
2 participants