-
Notifications
You must be signed in to change notification settings - Fork 582
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AWS IMDS Token is not being reused until it expires #1990
Comments
This new change introduces security concerns as it's writing the session token to a file under /var/run and care must be taken to keep this safe and ensure permissions set correctly for read/write i.e. umask, etc. Ideally the session token ttl time should be much lower i.e. ~10 min or less and not using the max ttl of 6 hours. I have a fork of the aws-vpc-move-ip, which introduces some changes and attempts to get the instance ID from local file and falls back to the IMDS if that file is not accessible e.g. non-nitro based EC2 instance, etc. |
We can just chmod 600 the file after saving the token. |
We can also retrieve the instance ID from the The sleep and wait time should also be bumped from the current 3 and 1 to something a bit higher in event the IMDS is temporarily unavailable e.g. ~10s. |
Nice. I'll make a patch with the suggestions next week. |
I've made a patch based on your suggestions: #1995 |
What's the actual difference between reading it from a file and making a request to the IMDS metadata? Before merging this I would like to hear from the requester @harshkiprofile what's the use-case and what's the level of optimization achieved from this change. Were any issues observed by fetching a token every time? Writing a plain text token (in fact, any kind of secret) to a file is generally a bad idea, even if you set exclusive permissions. |
The decision to reuse the token by storing it locally is based on the fact that during AWS EC2 hypervisor maintenance events, the IMDS metadata API call to fetch token can sometimes become temporarily unavailable, even if the EC2 instance itself remains operational. If a new token were fetched with every request during such events, metadata token retrieval could fail. By storing the metadata token locally (temporarily), we ensure uninterrupted access to metadata, regardless of IMDS availability during maintenance windows, thus enhancing system reliability. This approach prioritizes fault tolerance over pure performance optimization. In the event of AWS maintenance, local token storage allows continued metadata access, reducing downtime and mitigating the impact of transient IMDS issues. While storing tokens or secrets in files does introduce potential risks, @oalbrigt has implemented restricted file permissions to minimize these concerns. Performance-wise, no significant issues have been observed with fetching a token for each request. However, in environments where transient IMDS issues occur frequently, the local token storage method provides increased robustness, eliminating the need for constant retries to acquire a token. That was the rationale behind this change. I would love to hear your feedback on this one! I will explore on what can we done to strengthen the security as I believe caching the token improves stability. |
@gguifelixamz The requests adds up when you have multiple resources. Say you have one or more fence agents (e.g. for different zones), then one or more of awseip, awsvip, aws-vpc-move-ip, and aws-vpc-route53. We can revert it if we find a reason to undo it. If not we can do improvements based on your suggestions or pull requests. Next release should be in january/february, so we got time to fix any issues before then. |
The session token is used to make subsequent calls to the IMDS service. Having it stored locally would still result in calls to the IMDS failing if the service was unavailable...
The IMDS != AWS EC2 API. The other functions in the script that make use of the aws-cli, will still result in the aws-cli requesting a session token from IMDS, retrieving credentials from the IMDS and then making the actual call to the AWS EC2 API.
See above, do not confuse the IMDS and AWS EC2 API as they are not the same thing. |
PPS limit != EC2 API As a user of resource agents, I encountered several challenges in my environment that led me to propose this suggestion. Implementing the recommendation has significantly enhanced the stability of the environment. |
@gguifelixamz I've added a commit to be able to refresh the token e.g. if the file is corrupt for some reason: |
The script currently generates an IMDS token with a 6-hour (21,600 seconds) validity, but it lacks logic to reuse the token until it expires. Instead, it fetches a new token on every call, which could be optimized.
The text was updated successfully, but these errors were encountered: