Issue when creating a bunch of Ceph monitors #265

Open
willifehler opened this issue Sep 12, 2024 · 6 comments
@willifehler

Hey there,

I'm running into issues while creating more than 3 monitors.

TASK [lae.proxmox : Create additional Ceph monitors] ***************************
fatal: [...]: FAILED! => {"changed": true, "cmd": ["pveceph", "mon", "create"], "delta": "0:00:09.344072", "end": "2024-09-12 07:18:36.763082", "msg": "non-zero return code", "rc": 255, "start": "2024-09-12 07:18:27.419010", "stderr": "cfs-lock 'file-ceph_conf' error: got lock request timeout", "stderr_lines": ["cfs-lock 'file-ceph_conf' error: got lock request timeout"], "stdout": "trying to acquire cfs lock 'file-ceph_conf' ...\ntrying to acquire cfs lock 'file-ceph_conf' ...\ntrying to acquire cfs lock 'file-ceph_conf' ...\ntrying to acquire cfs lock 'file-ceph_conf' ...\ntrying to acquire cfs lock 'file-ceph_conf' ...\ntrying to acquire cfs lock 'file-ceph_conf' ...\ntrying to acquire cfs lock 'file-ceph_conf' ...\ntrying to acquire cfs lock 'file-ceph_conf' ...\ntrying to acquire cfs lock 'file-ceph_conf' ...", "stdout_lines": ["trying to acquire cfs lock 'file-ceph_conf' ...", "trying to acquire cfs lock 'file-ceph_conf' ...", "trying to acquire cfs lock 'file-ceph_conf' ...", "trying to acquire cfs lock 'file-ceph_conf' ...", "trying to acquire cfs lock 'file-ceph_conf' ...", "trying to acquire cfs lock 'file-ceph_conf' ...", "trying to acquire cfs lock 'file-ceph_conf' ...", "trying to acquire cfs lock 'file-ceph_conf' ...", "trying to acquire cfs lock 'file-ceph_conf' ..."]}
changed: [...]

Is this already a known issue?

Cheers - Willi

@lae lae added the bug label Sep 12, 2024
@lae
Owner

lae commented Sep 12, 2024

It is not. However, it's not something I can easily reproduce, since I don't have access to (or funds for) the hardware.

Given the excerpt, I don't know how much more info it'll provide, but can you rerun your playbook with ANSIBLE_STDOUT_CALLBACK=debug ansible-playbook -v? Please also check the systemd journal on the affected hosts for any other relevant information.

@lae
Owner

lae commented Sep 12, 2024

Also double check that your cluster is healthy. This looks like it might be an issue with Proxmox quorum instead of an issue with Ceph.
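
One way to make that check part of the run is a small pre-flight step. The following is a minimal sketch and is not taken from the lae.proxmox role: the register name pvecm_status is hypothetical, and the tasks only print what pvecm status reports so the quorum information can be read before any monitors are created.

```yaml
# Sketch only: a pre-flight check, not part of the lae.proxmox role.
- name: Query Proxmox cluster status before creating monitors
  ansible.builtin.command: pvecm status
  register: pvecm_status   # hypothetical variable name
  changed_when: false      # read-only command, so never report a change

- name: Show cluster status so quorum can be verified by eye
  ansible.builtin.debug:
    var: pvecm_status.stdout_lines
```

If the output shows the cluster is not quorate on some hosts, that would point at Proxmox clustering rather than Ceph.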

@willifehler
Author

Hey @lae,

I'm not exactly sure about the code yet, but from my point of view a pause between creating each monitor might help.

If I rerun the playbook, all monitors are created.
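
Since a rerun succeeds, the lock timeout looks transient, so one option would be to let Ansible retry the command itself. This is an assumption, not something tested in this thread. Below is a minimal sketch using the task name and command from the log above. The register name mon_create and the retry values are hypothetical, and the role's real task has further arguments and conditions that are omitted here.

```yaml
# Sketch only: not the role's actual task definition.
- name: Create additional Ceph monitors
  ansible.builtin.command: pveceph mon create
  register: mon_create          # hypothetical variable name
  until: mon_create.rc == 0     # retry while the command exits non-zero
  retries: 3                    # illustrative value, not tuned
  delay: 15                     # seconds to wait between attempts
```

This only hides the contention rather than removing it, so it would be worth confirming that a failed attempt leaves nothing half-created before relying on it.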

Can you try to test this with nested VMs?

Cheers - Willi

@lae
Owner

lae commented Sep 12, 2024

If you think timing is the issue, can you try modifying the role installed locally to add throttle: 1 to the creation task in ceph.yml? https://github.com/lae/ansible-role-proxmox/blob/develop/tasks/ceph.yml#L25
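
For reference, the change would look roughly like the sketch below. It is not a copy of the task in tasks/ceph.yml: the name and command are taken from the log in this issue, and the task's existing arguments and conditions are assumed to stay exactly as they are in the role.

```yaml
# Sketch only: keep the task's existing arguments and conditions as they are
# in tasks/ceph.yml; the single added line is `throttle: 1`.
- name: Create additional Ceph monitors
  ansible.builtin.command: pveceph mon create
  throttle: 1   # run this task on one host at a time
```

With throttle: 1, the hosts run the task one after another instead of in parallel, so only one pveceph mon create at a time should be requesting the file-ceph_conf lock.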

As for VMs, I don't exactly have the RAM/disk space to test a PVE cluster with 25 nodes, much less 5....

@zenntrix
Collaborator

I have the resources to test this. Could you please provide more details first, though? For example:

How many servers do you have, and is this a fresh install?

@willifehler
Author

willifehler commented Sep 13, 2024

Hey there,

I have 12 servers, but I'm afraid I can't test this on bare metal again anytime soon. I will try to use VMs.

Cheers - Willi
