Thoughts on using a configuration management framework? #37

jmahlik · 2023-08-25T15:57:52Z

It's pretty hard to get this up and running in an account that has restricted internet access.

I had to fork and refactor almost all of the bash scripts. This was quite a challenge, as they are a little unwieldy (it is bash, after all). So I had a thought based on how I handle setting up dev environments on Linux boxes.

Moving the install/run functionality to a declarative configuration management system would make maintaining, extending and using the project easier.

What would your thoughts be on managing the installs and configuration via something like Ansible? I suggest Ansible since it's lightweight and easy to work with. It's a Python package, so it only needs Python, which we already have. But it could be any config system.

The user experience could remain the same: the bash scripts would become shims around the config manager. It could likely be simplified, too. Instead of having to nohup a bunch of bash scripts, you would run a single command that gets the system into the desired state.
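To make the shim idea concrete, here is a minimal sketch of what such a wrapper might look like. The playbook name (site.yml), the tag names and the ANSIBLE_PLAYBOOK_BIN and PLAYBOOK variables are all hypothetical; nothing like this exists in the repository today.

```shell
#!/usr/bin/env bash
# Sketch only. The playbook name (site.yml), the tag names and the
# ANSIBLE_PLAYBOOK_BIN / PLAYBOOK variables are hypothetical.

# Run the playbook against the local machine, limited to one tag such as
# "configure" or "start". The trailing comma in "localhost," makes Ansible
# treat the value as a host list rather than an inventory file.
run_playbook() {
  local action="$1"
  "${ANSIBLE_PLAYBOOK_BIN:-ansible-playbook}" \
    --inventory localhost, \
    --connection local \
    --tags "${action}" \
    "${PLAYBOOK:-site.yml}"
}
```

An existing entry point could then call something like `run_playbook configure`, so the commands users type would not have to change.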

It'd be easier to:

Allow options like install URLs for the dependencies
Not rely on the working directory to source bash files
Avoid multiple re-installs to make it easier to run in a lifecycle config
Extend it by modifying or including additional config
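
As a rough illustration of the first point in the list above, an install URL can be made overridable even in plain bash. This is only a sketch: SSM_AGENT_URL, DRY_RUN and the default URL are hypothetical placeholders, not names or locations used by the project.

```shell
#!/usr/bin/env bash
# Sketch only. SSM_AGENT_URL, DRY_RUN and the default URL below are
# hypothetical placeholders, not names or locations used by the project.

# Print the download URL, preferring an override from the environment
# (for example an internal mirror) over the built-in default.
artifact_url() {
  echo "${SSM_AGENT_URL:-https://example.com/placeholder/ssm-agent.deb}"
}

# Download the artifact to the given path. With DRY_RUN=1 the curl command
# is printed instead of executed, which helps when checking the URL.
fetch_artifact() {
  local dest="$1"
  local url
  url="$(artifact_url)"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "curl -fsSL -o ${dest} ${url}"
  else
    curl -fsSL -o "${dest}" "${url}"
  fi
}
```

Exporting SSM_AGENT_URL before running the script would then switch the download source without patching any file.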

I'd be willing to contribute work towards this, since maintaining a copy of the bash scripts is quite painful. I'm already exploring a playbook for starting the SSH helper.

ivan-khvostishkov · 2023-08-29T16:59:20Z

Hi, Justin, great to see your interest and willingness to contribute!

Would you elaborate a little bit more on the problem that you're trying to solve?

Also, have you already checked the FAQ section "I'm running SageMaker in a VPC. Do I need to make extra configuration?"? It shows example Dockerfiles in which everything comes pre-installed, so you don't need to "configure" anything extra, change URLs, etc.

jmahlik · 2023-08-30T21:04:35Z

The particular use case is connecting SageMaker Studio's Jupyter server app to the kernel gateway apps to enable interactive plotting libraries that need a web server running. It's similar to the web VNC example.

I did see the Dockerfiles. Building them in an environment without direct internet access isn't possible (same issue as running the scripts directly).

A couple specific things I thought a config manager could help address:

If one has to patch the bash files, e.g. to change a download/install location, it has to be done in place, or all of them have to be moved to a different directory, since the bash scripts source each other based on the directory of the script. Say one wanted to keep the artifacts on S3 so as not to rely on GitHub to download a binary. That's hard to pull off currently.
The other thing I ran into was repeated apt/yum installs even though things were already installed, which made it hard to finish within the timeout of a lifecycle script.
The scripts don't error on failure; they continue execution. So you're not really sure whether parts completed successfully until it hits the end with hard-to-debug errors from prior failed scripts. I had to add set -euo pipefail to all of them to debug through the failing parts.
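
For the last two points, a minimal sketch of the pattern, assuming bash: strict mode at the top, plus a guard that skips the package manager when the command is already on PATH. The function names and the INSTALL_CMD hook are hypothetical and do not come from the project's scripts.

```shell
#!/usr/bin/env bash
# Strict mode: exit on any error (-e), on use of an unset variable (-u),
# and when any command in a pipeline fails (-o pipefail).
set -euo pipefail

# Return success if the given command is already on PATH.
is_installed() {
  command -v "$1" >/dev/null 2>&1
}

# Install a package only when its command is missing.
# INSTALL_CMD is a hypothetical hook; it defaults to apt-get here and is
# expanded unquoted on purpose so that it splits into command and flags.
ensure_installed() {
  local cmd="$1"
  local pkg="${2:-$1}"
  if is_installed "${cmd}"; then
    echo "skip: ${cmd} already installed"
  else
    echo "install: ${pkg}"
    ${INSTALL_CMD:-apt-get install -y} "${pkg}"
  fi
}
```

Calling `ensure_installed curl` repeatedly would only reach the package manager the first time, which should keep repeated runs short.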

ivan-khvostishkov · 2023-08-31T09:29:05Z

Thank you, @jmahlik, I will look into your concerns. As to the last point, which version of the library do you use on the client and on the remote? The pipefail option has been added to some scripts in the latest version. Which ones do you think still need this option turned on?

DrJeckyl · 2023-08-31T20:16:59Z

I ended up doing the same thing, @jmahlik.
I forked the code, refactored it to my needs and built all the prerequisites into a custom image for SageMaker Studio. Then a lifecycle config simply registers the instance, sets the SSHOwner tag, etc.

On the local side, I also refactored some of the code into a Python install to integrate with VSCode for our Windows users.

ivan-khvostishkov · 2023-09-01T10:44:01Z

Hi, @DrJeckyl, do you also have no Internet access during the build of the custom image, and need to download tools like AWS CLI and SSM Agent from internal locations?

DrJeckyl · 2023-09-18T14:40:18Z

No, @ivan-khvostishkov. We use a code pipeline with internet access when building the custom images.
However, a lifecycle config is needed to set the Owner tags when a kernel is launched. We had to modify sm-ssh-ide, sm-init-ssm and sm-start-ssh.

Admittedly, we are a few versions behind and should update to see what's different now.

ivan-khvostishkov · 2024-07-24T16:17:16Z

Hi, @DrJeckyl, did you have a chance to try the latest version 2.2.0 of SSH Helper to see if the pipefail option helps you to debug the scripts?

If you have Internet access in the build pipeline, then you can just add this command to your Dockerfile:

RUN sm-ssh-ide configure

It will download and install all libraries, so later, when you run the lifecycle config script, it will detect that everything is already configured and won't try to install anything from the Internet. In this case you don't need to patch the locations of the libraries.

I understand that you want to patch the lifecycle script with the specific value for LOCAL_USER_ID, but I don't yet understand how Ansible can help you in this case. The better option, in my opinion, would be to fetch the values from Systems Manager Parameter Store.

Of course, you need to modify the scripts a little bit to call the Systems Manager API, and you are encouraged to do so, because this repository is sample code.
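
A minimal sketch of such a modification, using the AWS CLI. The parameter naming scheme (/ssh-helper/local-user-id/<profile>) and the function name are assumptions made for illustration, and the role running the lifecycle script would need permission for ssm:GetParameter.

```shell
#!/usr/bin/env bash
# Sketch only. The parameter path and the function name are hypothetical;
# choose whatever naming scheme fits your account.

# Print the value stored in Parameter Store for the given user profile.
get_local_user_id() {
  local profile="$1"
  aws ssm get-parameter \
    --name "/ssh-helper/local-user-id/${profile}" \
    --query 'Parameter.Value' \
    --output text
}
```

In a lifecycle script this might be used as `LOCAL_USER_ID="$(get_local_user_id "$profile")"`, where `$profile` stands for however the script identifies the current user.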

But is there any logic that you propose to make part of the main branch? If we add a new lifecycle configuration script that fetches the user IDs from Parameter Store, will it help to resolve your issue?

Let me know your thoughts.

ivan-khvostishkov · 2024-07-24T16:21:59Z

@jmahlik Following up on your original post, could you please help me understand where you propose to run Ansible? As part of the lifecycle configuration script, as part of the sm-ssh-ide script, etc.?

You mentioned that you're already in the process of creating the playbook. Have you succeeded? It would be great if you could share your learnings.
