
Per cluster member resource reservations #14247

Closed
wants to merge 32 commits

Conversation


@MggMuggins (Contributor) commented Oct 8, 2024

As described in LX089.

The general approach (as with the other limits validation) is to query the current state of the cluster from the database, apply the delta (create an instance, update the cluster config or update a cluster member config), and then validate the resulting configuration.

This is performed in at most two round-trips to the DB (result shapes sketched below):

  • GetMemberConfigWithGlobalDefault returns, for each cluster member, both the global and the member-specific value of each config key
  • GetClusterMemberInstanceConfig returns the value of each limit key for every instance, including null values when the config key is not set
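
For context, here is a rough sketch of what the two specialized result types might look like. OverridableConfig and InstanceKey are named in the diff further down; the exact fields shown here are assumptions for illustration only, loosely based on the ClusterValue/MemberValue fields mentioned later in review.

// Illustrative sketch only; the real types in this PR may differ.
package db

import "database/sql"

// OverridableConfig pairs the cluster-wide value of a config key with a
// member-specific override; either may be unset.
type OverridableConfig struct {
    ClusterValue sql.NullString // value from the cluster-wide config table, if any
    MemberValue  sql.NullString // value from the member-specific config table, if any
}

// InstanceKey is the value of one limit key for one instance.
type InstanceKey struct {
    Instance string         // instance name
    Key      string         // config key, e.g. "limits.cpu"
    Value    sql.NullString // null when the key is not set on the instance
}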

I don't love the specialized types for each query, but without them the results are hard to work with. I also feel like the permissions-checking code could be cleaner.

I'd like to hear thoughts on organization:

  • These limits aren't enforced at the project level the same way others are, but I've included them in the lxd/project/limits package because other code there is similar.
  • The DB query functions are specific to cluster members but could fit in any of a few files.

TODO:

  • Refactor GetClusterInstanceConfig (this was the bug I noted in our standup on Tuesday Oct 8)
  • Only request GET /cluster/members/{name}/state when cluster update includes reservation keys
  • Reservations are not yet taken into account during instance placement
  • More tests
  • Pass ClusterMemberState to scriptlet so that the scriptlet doesn't need to re-request from the rest of the cluster
  • Docs? Not sure if we should call this feature out in a place other than the config keys themselves.

@github-actions bot added the Documentation ("Documentation needs updating") and API ("Changes to the REST API") labels on Oct 8, 2024

github-actions bot commented Oct 8, 2024

Heads up @mionaalex - the "Documentation" label was applied to this issue.

@@ -1111,6 +1111,85 @@ func whereClause(conditions []inCondition, args *[]any) string {
return b.String()
}

// OverridableConfig is a config key that could be set on either a specific
// cluster member or for the whole cluster.
type OverridableConfig struct {
Contributor Author

I don't love this name; ClusterConfigValue? MemberConfigValue?

memberName implies that this function will work for any cluster member,
not just the local one

Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
This reduces a significant amount of unsightly code duplication in the
config and limits query functions

Signed-off-by: Wesley Hershberger <[email protected]>
@MggMuggins force-pushed the resource-reservation branch 4 times, most recently from 5de16c6 to fd0ad60, on October 11, 2024 16:41
@MggMuggins
Contributor Author

@tomponline This should be ready for a first round. I'm not clear on the cause of the test failure but initially it looks unrelated to my changes... I'll investigate further when I'm back to work on Wednesday.

@MggMuggins marked this pull request as ready for review on October 11, 2024 17:43
// ---
// type: integer
// shortdesc: Number of CPUs to reserve for the LXD server
"limits.reserve.cpu": validate.IsAny,
Member

Shouldn't this be an optional integer?
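
For illustration, one way that might look using the shared validate helpers (a sketch only; whether Optional/IsUint32 is the right combination for this key isn't settled in this thread, and the surrounding map and package names are assumptions):

// Illustrative sketch only.
package config

import "github.com/canonical/lxd/shared/validate"

// memberConfigValidators is a hypothetical stand-in for wherever the key is registered.
var memberConfigValidators = map[string]func(string) error{
    // Accept an unset value or a non-negative integer rather than any string.
    "limits.reserve.cpu": validate.Optional(validate.IsUint32),
}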

@@ -356,7 +356,7 @@ func GetImageSpaceBudget(globalConfig *clusterConfig.Config, tx *db.ClusterTx, p

// Check that we would not violate the project limits or restrictions if we
// were to commit the given instances and profiles.
func checkRestrictionsAndAggregateLimits(globalConfig *clusterConfig.Config, tx *db.ClusterTx, info *projectInfo) error {
func checkRestrictionsAndAggregateLimits(globalConfig *clusterConfig.Config, info *projectInfo) error {
Member

Does this function, or any of the functions it calls, use the DB? If so, I'd prefer to keep the tx arg and plumb it down so that it uses a single tx.

// GetMemberConfigWithGlobalDefault returns a mapping of
// [clusterMemberName][key]cfg, with filters clusterMemberNames and keys. If
// either slice is nil/empty, all clusterMembers/keys are returned.
func (c *ClusterTx) GetMemberConfigWithGlobalDefault(ctx context.Context, clusterMemberNames []string, keys []string) (map[string]map[string]OverridableConfig, error) {
Member

Is this function just used for accessing the effective config, or are you using it to display the config to the end user?

If the former, have you considered returning a map of effective config values for each cluster member, rather than requiring the caller to consult the ClusterValue and the MemberValue?

Also, we don't use sql.NullString for config values; unset keys are just not present in the map.
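
A minimal sketch of that suggestion, building on the hypothetical OverridableConfig fields sketched earlier (names and shapes are assumptions): collapse each member's rows into a plain map where the member-specific value wins and unset keys are simply absent.

// effectiveMemberConfig flattens per-member query results into plain maps,
// preferring the member-specific value over the cluster-wide one and
// omitting unset keys entirely. Illustrative only.
func effectiveMemberConfig(in map[string]map[string]OverridableConfig) map[string]map[string]string {
    out := make(map[string]map[string]string, len(in))

    for member, keys := range in {
        cfg := make(map[string]string, len(keys))

        for key, val := range keys {
            if val.MemberValue.Valid {
                cfg[key] = val.MemberValue.String
            } else if val.ClusterValue.Valid {
                cfg[key] = val.ClusterValue.String
            }
            // Keys that are set neither globally nor per-member are left out.
        }

        out[member] = cfg
    }

    return out
}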

lxd/db/node.go (outdated)
// GetClusterMemberInstanceConfig returns a map from clusterMemberName -> InstanceKey
// with optional filters clusterMemberNames and keys. If either slice is nil/empty,
// the results will include all cluster members/config keys.
func (c *ClusterTx) GetClusterMemberInstanceConfig(ctx context.Context, clusterMemberNames []string, keys []string) (map[string][]InstanceKey, error) {
Member

This doesn't look workable to me because you're not expanding the effective instance config from profile(s).

I don't think you can do this in a single query; you might need to use InstanceList() instead with a custom handler func.

Contributor Author

Focusing on this as it's the largest functional deficiency; the current commits are a bit of a mess but my feeling is that InstanceList won't quite work here. In order to add an instance (with expanded config) we need a list of all profiles that will be applied to that instance, which we can't get from InstanceList. Still thinking on this...

Member

You can use InstanceList for this; we do it commonly. It doesn't expand the config automatically, but it does provide profile info so you can expand it if needed. See existing examples.

Member

@tomponline commented Oct 17, 2024

e.g.

var globalConfigDump map[string]any
if s.GlobalConfig != nil {
    globalConfigDump = s.GlobalConfig.Dump()
}

err = s.DB.Cluster.Transaction(context.TODO(), func(ctx context.Context, tx *db.ClusterTx) error {
    return tx.InstanceList(ctx, func(dbInst db.InstanceArgs, p api.Project) error {
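        // Expand the instance's own config with the config it inherits from its profiles.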
        dbInst.Config = instancetype.ExpandInstanceConfig(globalConfigDump, dbInst.Config, dbInst.Profiles)
        return nil
    }, filter)
})
if err != nil {
    return nil, err
}

Contributor Author

Getting the existing instances & expanding their config isn't the problem; when we're validating a new/changed instance, InstanceList won't call our hook (for new instances), and isn't guaranteed to return the correct set of profiles (for instance update). Right now I'm thinking of just doing another query to collect the correct set of profiles, even though InstanceList has likely already done so 🫤

Member

I'm confused; can you book a quick meeting to discuss? I don't see what the issue is.

This doesn't include a way to patch existing instances or add new ones
to the aggregate.

Signed-off-by: Wesley Hershberger <[email protected]>
...and a few helpers.

This function is used when checking instance creation/update and during
instance placement.

Signed-off-by: Wesley Hershberger <[email protected]>
...during createFromBackup

Signed-off-by: Wesley Hershberger <[email protected]>
Requiring sysinfo for AllowInstanceCreation means the caller must either:
- Make an HTTP request to get the sysinfo from the target member (which
  may not yet be known during the first tx)
- Use a separate tx on the target member to grab the local sysinfo.

Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
Forwarding the update request to the cluster member that's being updated
allows me to only use LocalSysInfo instead of requiring a
GET /1.0/cluster/members/{name}/state

Signed-off-by: Wesley Hershberger <[email protected]>
A call to this function is similar to `NewNotifier(..., notify.NotifyAll)`
with two key differences:
 - The hook function knows the name of the cluster member it's making a
   request against. This is important because we need the member name to
   construct a `GET /1.0/cluster/members/{name}/state` query
 - It makes requests to a customizable set of cluster members instead of
   all the cluster members; this is relevant for instance placement where
   the list of allowed cluster members may already be shorter due to
   cluster group constraints.

Signed-off-by: Wesley Hershberger <[email protected]>
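
To make the shape of that helper concrete, here is a rough sketch (not the PR's actual code; the hook signature and helper names are assumptions) of how a hook that knows which member it is talking to can fetch that member's state directly:

// Illustrative sketch only; the real notifier changes live in the commits above.
package cluster

import (
    "sync"

    lxd "github.com/canonical/lxd/client"
    "github.com/canonical/lxd/lxd/db"
    "github.com/canonical/lxd/shared/api"
)

// memberHook is a hypothetical per-member hook: it receives the member it is
// connected to, so the member name is available without an extra lookup.
type memberHook func(member db.NodeInfo, client lxd.InstanceServer) error

// collectMemberState gathers GET /1.0/cluster/members/{name}/state from each
// targeted member via a notifier-style helper supplied by the caller.
func collectMemberState(notify func(hook memberHook) error) (map[string]api.ClusterMemberState, error) {
    var mu sync.Mutex
    states := make(map[string]api.ClusterMemberState)

    err := notify(func(member db.NodeInfo, client lxd.InstanceServer) error {
        // The member name is needed to build the per-member state URL.
        state, _, err := client.GetClusterMemberState(member.Name)
        if err != nil {
            return err
        }

        mu.Lock()
        states[member.Name] = *state
        mu.Unlock()

        return nil
    })
    if err != nil {
        return nil, err
    }

    return states, nil
}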
config.Dump() creates and passes back a new map, so there's no need to
do it a second time.

Signed-off-by: Wesley Hershberger <[email protected]>
Requires `GET /cluster/members/{name}/state` in order to check if resource
reservations are violated by the new config.

Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
This doesn't cover instance creation since there won't be an instance
in the DB to patch. Still working here...

Signed-off-by: Wesley Hershberger <[email protected]>
@tomponline changed the title from "Resource reservations" to "Per cluster member resource reservations" on Oct 17, 2024
// ClusterSysInfo returns a map from clusterMemberName -> sysinfo for every member
// of the cluster. This requires an HTTP API call to the rest of the cluster.
// Fails with ClusterUnavailableError if any requested member is offline.
func ClusterSysInfo(s *state.State, members []db.NodeInfo) (map[string]api.ClusterMemberSysInfo, error) {
Member

Rather than duplicating the NewNotifier logic, the changes you've needed here sound quite useful to add to NewNotifier and the returned Notifier function definition anyway.

So I'm thinking we should update NewNotifier to:

type Notifier func(member db.NodeInfo, hook func(lxd.InstanceServer) error) error

func NewNotifier(state *state.State, networkCert *shared.CertInfo, serverCert *shared.CertInfo, policy NotifierPolicy, members ...[]db.NodeInfo) (Notifier, error) {

This would make members optional, and if not supplied then all members would be loaded as they are today.

Also, there look to be some optimisations we can make in NewNotifier that you've used here:

  1. Avoid the call to tx.GetNodeOfflineThreshold(ctx) and use s.GlobalConfig.OfflineThreshold() instead.
  2. Change `if localClusterAddress == "" {` to `if s.ServerClustered {`.

That way we get to use the more resilient down detection in NewNotifier here too.

Member

We probably still need the ClusterSysInfo function for convenience, but it can use the updated NewNotifier under the hood.

Contributor Author

I had considered it but decided against it while drafting this since updating all the usages was going to take a while. Agree though, I'll put together a PR.

@@ -1677,24 +1677,24 @@ func AllowClusterUpdate(ctx context.Context, tx *db.ClusterTx, sysinfo map[strin
return nil
}

aggregates, err := getClusterMemberAggregateLimits(ctx, tx, newConfig, requiredLimits)
clusterMemberNames := []string{}
Member

When you're appending to a slice there's no need to initialise it first, so you can just do `var clusterMemberNames []string`.

Member

@tomponline left a comment

I like that you've identified and made some cleanups that aren't directly related to this feature.

But I think it would be clearer to include them in a separate, earlier PR so we can lay the groundwork for the feature change itself separately.

So let's include in a separate PR:

  • Any cleanup not related to the feature itself.
  • The NewNotifier changes.
  • The SysInfo change to add logical CPUs.
  • The change to make cluster member config updates redirect to the member in question.

Contributor Author

@MggMuggins commented Oct 18, 2024

Going to close this as the code that isn't in #14294 will get largely rewritten

@MggMuggins closed this on Oct 18, 2024
tomponline added a commit that referenced this pull request Oct 21, 2024
This includes preparatory changes as described in #14247:

- Notifier improvements
  - Use `state.ServerClustered`
  - Don't query for `OfflineThreshold` unless needed (see commit msg)
  - Allow passing `db.NodeInfo` to `NewNotifier`
- Pass `db.NodeInfo` to the notifier hook to reduce the need for extra
http calls for basic info (like the target server's name)
- Member state updates
  - Add `CPUThreads` to `ClusterMemberSysInfo` (name tbd)
  - Refactor `MemberState` to expose local `ClusterMemberSysinfo`
- Implement helper to collect member state from all cluster members,
using the notifier improvements above
- Other odds and ends

Performance improvements for project limits will be in a separate PR.