
Per cluster member resource reservations #14247

Closed
wants to merge 32 commits

Conversation


@MggMuggins (Contributor) commented Oct 8, 2024

As described in LX089.

The general approach (as with the other limits validation) is to query the current state of the cluster from the database, apply the delta (create an instance, update the cluster config or update a cluster member config), and then validate the resulting configuration.

This is performed in at most two round-trips to the DB (result shapes sketched below):

  • GetMemberConfigWithGlobalDefault returns, for each cluster member, both the global and the member-specific value of each config key
  • GetClusterMemberInstanceConfig returns the value of each limit key for every instance, including null values when the config key is not set
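
For context, here is a rough sketch of what the two specialized result types might look like. OverridableConfig and InstanceKey are named in the diff further down; the exact fields shown here are assumptions for illustration only, loosely based on the ClusterValue/MemberValue fields mentioned later in review.

// Illustrative sketch only; the real types in this PR may differ.
package db

import "database/sql"

// OverridableConfig pairs the cluster-wide value of a config key with a
// member-specific override; either may be unset.
type OverridableConfig struct {
    ClusterValue sql.NullString // value from the cluster-wide config table, if any
    MemberValue  sql.NullString // value from the member-specific config table, if any
}

// InstanceKey is the value of one limit key for one instance.
type InstanceKey struct {
    Instance string         // instance name
    Key      string         // config key, e.g. "limits.cpu"
    Value    sql.NullString // null when the key is not set on the instance
}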

I don't love the specialized types for each query, but without them the results are hard to work with. I also feel like the permissions-checking code could be cleaner.

I'd like to hear thoughts on organization:

  • These limits aren't enforced at the project level the same way others are, but I've included them in the lxd/project/limits package because other code there is similar.
  • The DB query functions are specific to cluster members but could fit in any of a few files.

TODO:

  • Refactor GetClusterInstanceConfig (this was the bug I noted in our standup on Tuesday Oct 8)
  • Only request GET /cluster/members/{name}/state when cluster update includes reservation keys
  • Reservations are not yet taken into account during instance placement
  • More tests
  • Pass ClusterMemberState to scriptlet so that the scriptlet doesn't need to re-request from the rest of the cluster
  • Docs? Not sure if we should call this feature out in a place other than the config keys themselves.

@github-actions bot added the Documentation ("Documentation needs updating") and API ("Changes to the REST API") labels on Oct 8, 2024

github-actions bot commented Oct 8, 2024

Heads up @mionaalex - the "Documentation" label was applied to this issue.

@@ -1111,6 +1111,85 @@ func whereClause(conditions []inCondition, args *[]any) string {
return b.String()
}

// OverridableConfig is a config key that could be set on either a specific
// cluster member or for the whole cluster.
type OverridableConfig struct {
Contributor Author

I don't love this name; ClusterConfigValue? MemberConfigValue?

memberName implies that this function will work for any cluster member,
not just the local one

Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
This reduces a significant amount of unsightly code duplication in the
config and limits query functions

Signed-off-by: Wesley Hershberger <[email protected]>
@MggMuggins force-pushed the resource-reservation branch 4 times, most recently from 5de16c6 to fd0ad60, on October 11, 2024 16:41
@MggMuggins
Contributor Author

@tomponline This should be ready for a first round. I'm not clear on the cause of the test failure but initially it looks unrelated to my changes... I'll investigate further when I'm back to work on Wednesday.

@MggMuggins marked this pull request as ready for review on October 11, 2024 17:43
// ---
// type: integer
// shortdesc: Number of CPUs to reserve for the LXD server
"limits.reserve.cpu": validate.IsAny,
Member

Shouldn't this be an optional integer?
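
For illustration, one way that might look using the shared validate helpers (a sketch only; whether Optional/IsUint32 is the right combination for this key isn't settled in this thread, and the surrounding map and package names are assumptions):

// Illustrative sketch only.
package config

import "github.com/canonical/lxd/shared/validate"

// memberConfigValidators is a hypothetical stand-in for wherever the key is registered.
var memberConfigValidators = map[string]func(string) error{
    // Accept an unset value or a non-negative integer rather than any string.
    "limits.reserve.cpu": validate.Optional(validate.IsUint32),
}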

@@ -356,7 +356,7 @@ func GetImageSpaceBudget(globalConfig *clusterConfig.Config, tx *db.ClusterTx, p

// Check that we would not violate the project limits or restrictions if we
// were to commit the given instances and profiles.
func checkRestrictionsAndAggregateLimits(globalConfig *clusterConfig.Config, tx *db.ClusterTx, info *projectInfo) error {
func checkRestrictionsAndAggregateLimits(globalConfig *clusterConfig.Config, info *projectInfo) error {
Member

Does this function, or any of the functions it calls, use the DB? If so, I'd prefer to keep the tx arg and plumb it down so that it uses a single tx.

// GetMemberConfigWithGlobalDefault returns a mapping of
// [clusterMemberName][key]cfg, with filters clusterMemberNames and keys. If
// either slice is nil/empty, all clusterMembers/keys are returned.
func (c *ClusterTx) GetMemberConfigWithGlobalDefault(ctx context.Context, clusterMemberNames []string, keys []string) (map[string]map[string]OverridableConfig, error) {
Member

Is this function just used for accessing the effective config, or are you using it to display the config to the end user?

If the former, have you considered returning a map of effective config values for each cluster member, rather than requiring the caller to consult the ClusterValue and the MemberValue?

Also, we don't use sql.NullString for config values; unset keys are just not present in the map.
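
A minimal sketch of that suggestion, building on the hypothetical OverridableConfig fields sketched earlier (names and shapes are assumptions): collapse each member's rows into a plain map where the member-specific value wins and unset keys are simply absent.

// effectiveMemberConfig flattens per-member query results into plain maps,
// preferring the member-specific value over the cluster-wide one and
// omitting unset keys entirely. Illustrative only.
func effectiveMemberConfig(in map[string]map[string]OverridableConfig) map[string]map[string]string {
    out := make(map[string]map[string]string, len(in))

    for member, keys := range in {
        cfg := make(map[string]string, len(keys))

        for key, val := range keys {
            if val.MemberValue.Valid {
                cfg[key] = val.MemberValue.String
            } else if val.ClusterValue.Valid {
                cfg[key] = val.ClusterValue.String
            }
            // Keys that are set neither globally nor per-member are left out.
        }

        out[member] = cfg
    }

    return out
}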

lxd/db/node.go (outdated)
// GetClusterMemberInstanceConfig returns a map from clusterMemberName -> InstanceKey
// with optional filters clusterMemberNames and keys. If either slice is nil/empty,
// the results will include all cluster members/config keys.
func (c *ClusterTx) GetClusterMemberInstanceConfig(ctx context.Context, clusterMemberNames []string, keys []string) (map[string][]InstanceKey, error) {
Member

This doesn't look workable to me because you're not expanding the effective instance config from profile(s).

I don't think you can do this in a single query; you might need to use InstanceList() instead with a custom handler func.

Contributor Author

Focusing on this as it's the largest functional deficiency; the current commits are a bit of a mess but my feeling is that InstanceList won't quite work here. In order to add an instance (with expanded config) we need a list of all profiles that will be applied to that instance, which we can't get from InstanceList. Still thinking on this...

Member

You can use InstanceList for this; we do it commonly. It doesn't expand the config automatically, but it does provide profile info so you can expand it if needed. See existing examples.

Member

@tomponline commented Oct 17, 2024

e.g.

var globalConfigDump map[string]any
if s.GlobalConfig != nil {
    globalConfigDump = s.GlobalConfig.Dump()
}

err = s.DB.Cluster.Transaction(context.TODO(), func(ctx context.Context, tx *db.ClusterTx) error {
    return tx.InstanceList(ctx, func(dbInst db.InstanceArgs, p api.Project) error {
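        // Expand the instance's own config with the config it inherits from its profiles.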
        dbInst.Config = instancetype.ExpandInstanceConfig(globalConfigDump, dbInst.Config, dbInst.Profiles)
        return nil
    }, filter)
})
if err != nil {
    return nil, err
}

Contributor Author

Getting the existing instances & expanding their config isn't the problem; when we're validating a new/changed instance, InstanceList won't call our hook (for new instances), and isn't guaranteed to return the correct set of profiles (for instance update). Right now I'm thinking of just doing another query to collect the correct set of profiles, even though InstanceList has likely already done so 🫤

Member

I'm confused; can you book a quick meeting to discuss? I don't see what the issue is.

This doesn't include a way to patch existing instances or add new ones
to the aggregate.

Signed-off-by: Wesley Hershberger <[email protected]>
...and a few helpers.

This function is used when checking instance creation/update and during
instance placement.

Signed-off-by: Wesley Hershberger <[email protected]>
...during createFromBackup

Signed-off-by: Wesley Hershberger <[email protected]>
Requiring sysinfo for AllowInstanceCreation means the caller must either:
- Make an HTTP request to get the sysinfo from the target member (which
  may not yet be known during the first tx)
- Use a separate tx on the target member to grab the local sysinfo.

Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
Forwarding the update request to the cluster member that's being updated
allows me to only use LocalSysInfo instead of requiring a
GET /1.0/cluster/members/{name}/state

Signed-off-by: Wesley Hershberger <[email protected]>
A call to this function is similar to `NewNotifier(..., notify.NotifyAll)`
with two key differences:
 - The hook function knows the name of the cluster member it's making a
   request against. This is important because we need the member name to
   construct a `GET /1.0/cluster/members/{name}/state` query
 - It makes requests to a customizable set of cluster members instead of
   all the cluster members; this is relevant for instance placement where
   the list of allowed cluster members may already be shorter due to
   cluster group constraints.

Signed-off-by: Wesley Hershberger <[email protected]>
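
To make the shape of that helper concrete, here is a rough sketch (not the PR's actual code; the hook signature and helper names are assumptions) of how a hook that knows which member it is talking to can fetch that member's state directly:

// Illustrative sketch only; the real notifier changes live in the commits above.
package cluster

import (
    "sync"

    lxd "github.com/canonical/lxd/client"
    "github.com/canonical/lxd/lxd/db"
    "github.com/canonical/lxd/shared/api"
)

// memberHook is a hypothetical per-member hook: it receives the member it is
// connected to, so the member name is available without an extra lookup.
type memberHook func(member db.NodeInfo, client lxd.InstanceServer) error

// collectMemberState gathers GET /1.0/cluster/members/{name}/state from each
// targeted member via a notifier-style helper supplied by the caller.
func collectMemberState(notify func(hook memberHook) error) (map[string]api.ClusterMemberState, error) {
    var mu sync.Mutex
    states := make(map[string]api.ClusterMemberState)

    err := notify(func(member db.NodeInfo, client lxd.InstanceServer) error {
        // The member name is needed to build the per-member state URL.
        state, _, err := client.GetClusterMemberState(member.Name)
        if err != nil {
            return err
        }

        mu.Lock()
        states[member.Name] = *state
        mu.Unlock()

        return nil
    })
    if err != nil {
        return nil, err
    }

    return states, nil
}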
config.Dump() creates and passes back a new map, so there's no need to
do it a second time.

Signed-off-by: Wesley Hershberger <[email protected]>
Requires `GET /cluster/members/{name}/state` in order to check if resource
reservations are violated by the new config.

Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
This doesn't cover instance creation since there won't be an instance
in the DB to patch. Still working here...

Signed-off-by: Wesley Hershberger <[email protected]>
@tomponline changed the title from "Resource reservations" to "Per cluster member resource reservations" on Oct 17, 2024
// ClusterSysInfo returns a map from clusterMemberName -> sysinfo for every member
// of the cluster. This requires an HTTP API call to the rest of the cluster.
// Fails with ClusterUnavailableError if any requested member is offline.
func ClusterSysInfo(s *state.State, members []db.NodeInfo) (map[string]api.ClusterMemberSysInfo, error) {
Member

Rather than duplicating the NewNotifier logic, the changes you've needed here sound quite useful to add to NewNotifier and the returned Notifier function definition anyway.

So I'm thinking we should update NewNotifier to:

type Notifier func(member db.NodeInfo, hook func(lxd.InstanceServer) error) error

func NewNotifier(state *state.State, networkCert *shared.CertInfo, serverCert *shared.CertInfo, policy NotifierPolicy, members ...[]db.NodeInfo) (Notifier, error) {

This would make members optional, and if not supplied then all members would be loaded as they are today.

Also, there look to be some optimisations we can make in NewNotifier that you've used here:

  1. Avoid the call to tx.GetNodeOfflineThreshold(ctx) and use s.GlobalConfig.OfflineThreshold() instead.
  2. Change `if localClusterAddress == "" {` to `if s.ServerClustered {`.

That way we get to use the more resilient down detection in NewNotifier here too.

Member

We probably still need the ClusterSysInfo function for convenience, but it can use the updated NewNotifier under the hood.

Contributor Author

I had considered it but decided against it while drafting this since updating all the usages was going to take a while. Agree though, I'll put together a PR.

@@ -1677,24 +1677,24 @@ func AllowClusterUpdate(ctx context.Context, tx *db.ClusterTx, sysinfo map[strin
return nil
}

aggregates, err := getClusterMemberAggregateLimits(ctx, tx, newConfig, requiredLimits)
clusterMemberNames := []string{}
Member

When you're appending to a slice there's no need to initialise it first, so you can just do `var clusterMemberNames []string`.

Member

@tomponline left a comment

I like that you've identified and made some cleanups that aren't directly related to this feature.

But I think it would be clearer to include them in a separate, earlier PR so we can lay the groundwork for the feature change itself separately.

So let's include in a separate PR:

  • Any cleanup not related to the feature itself.
  • The NewNotifier changes.
  • The SysInfo change to add logical CPUs.
  • The change to make cluster member config updates redirect to the member in question.

Contributor Author

@MggMuggins commented Oct 18, 2024

Going to close this as the code that isn't in #14294 will get largely rewritten

@MggMuggins closed this on Oct 18, 2024
tomponline added a commit that referenced this pull request Oct 21, 2024
This includes preparatory changes as described in #14247:

- Notifier improvements
  - Use `state.ServerClustered`
  - Don't query for `OfflineThreshold` unless needed (see commit msg)
  - Allow passing `db.NodeInfo` to `NewNotifier`
- Pass `db.NodeInfo` to the notifier hook to reduce the need for extra
http calls for basic info (like the target server's name)
- Member state updates
  - Add `CPUThreads` to `ClusterMemberSysInfo` (name tbd)
  - Refactor `MemberState` to expose local `ClusterMemberSysinfo`
- Implement helper to collect member state from all cluster members,
using the notifier improvements above
- Other odds and ends

Performance improvements for project limits will be in a separate PR.