IPI BaremetalIronicAPITimeout #5500

Closed
SunnyGu74 opened this issue Dec 19, 2021 · 5 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. platform/baremetal IPI bare metal hosts platform

Comments

@SunnyGu74 commented Dec 19, 2021

Version

$ ./openshift-baremetal-install version
./openshift-baremetal-install 4.8.24
built from commit 7123680
release image quay.io/openshift-release-dev/ocp-release@sha256:0708475f51e969dd9e6902d958f8ffed668b1b9c8d63b6241e7c9e40d9548eee

Platform:

baremetal

  • IPI

What happened?

After running the following command:
./openshift-baremetal-install --dir ./cfg --log-level debug create cluster

I can't create the cluster; I get the following error.

DEBUG module.masters.ironic_node_v1.openshift-master-host[0]: Still creating... [59m51s elapsed]
ERROR
ERROR Error: could not contact Ironic API: timeout reached
ERROR
ERROR on ../../../tmp/openshift-install-860785756/masters/main.tf line 1, in resource "ironic_node_v1" "openshift-master-host":
ERROR 1: resource "ironic_node_v1" "openshift-master-host" {
ERROR
ERROR
ERROR
ERROR Error: could not contact Ironic API: timeout reached
ERROR
ERROR on ../../../tmp/openshift-install-860785756/masters/main.tf line 1, in resource "ironic_node_v1" "openshift-master-host":
ERROR 1: resource "ironic_node_v1" "openshift-master-host" {
ERROR
ERROR
ERROR
ERROR Error: could not contact Ironic API: context deadline exceeded
ERROR
ERROR on ../../../tmp/openshift-install-860785756/masters/main.tf line 1, in resource "ironic_node_v1" "openshift-master-host":
ERROR 1: resource "ironic_node_v1" "openshift-master-host" {
ERROR
ERROR
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: error(BaremetalIronicAPITimeout) from Infrastructure Provider: Unable to the reach provisioning service. This failure can be caused by incorrect network/proxy settings, inability to download the machine operating system images, or other misconfiguration. Please check access to the bootstrap host, and for any failing services.
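
Since the installer points at the bootstrap host, a minimal check (a sketch; the bootstrap IP is a placeholder and has to be looked up, for example from the DHCP leases on the baremetal network) is to SSH in and look at the services it runs:

# log in to the bootstrap VM as the core user, with the sshKey from install-config.yaml
$ ssh core@<bootstrap-ip>

# list the bootstrap containers; the Ironic-related ones should be up
$ sudo podman ps

# follow the bootstrap services for errors
$ journalctl -b -f -u release-image.service -u bootkube.service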

What you expected to happen?

Complete IPI Installation

How to reproduce it (as minimally and precisely as possible)?

I tried many times and always got the same error.

Anything else we need to know?

I use VMware VMs to simulate a bare-metal IPI installation, with vbmc4vsphere as the BMC.
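
As a quick sanity check of the virtual BMCs (a sketch, reusing the BMC address, port, and credentials from the install-config below), an IPMI power status query should answer from the provisioning host:

# query one vbmc4vsphere endpoint over IPMI-over-LAN
$ ipmitool -I lanplus -H 10.13.101.5 -p 6231 -U admin -P password power status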

Here is my install-config.yaml:
apiVersion: v1
baseDomain: sunnylab.com
metadata:
  name: ocpi01
networking:
  machineCIDR: 10.0.0.0/8
  clusterNetwork:
  - cidr: 11.128.0.0/14
    hostPrefix: 23
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
compute:
- name: worker
  replicas: 3
controlPlane:
  name: master
  replicas: 3
  platform:
    baremetal: {}
platform:
  baremetal:
    apiVIP: 10.13.121.30
    ingressVIP: 10.13.121.30
    provisioningNetworkInterface: ens192
    provisioningNetworkCIDR: 172.22.101.39/24
    hosts:
    - name: ocpi01-m01
      role: master
      bmc:
        address: ipmi://10.13.101.5:6231
        username: admin
        password: password
      bootMACAddress: 00:50:56:a5:e4:cb
      rootDeviceHints:
        deviceName: "/dev/sda"
    - name: ocpi01-m02
      role: master
      bmc:
        address: ipmi://10.13.101.5:6232
        username: admin
        password: password
      bootMACAddress: 00:50:56:a5:dd:05
      rootDeviceHints:
        deviceName: "/dev/sda"
    - name: ocpi01-m03
      role: master
      bmc:
        address: ipmi://10.13.101.5:6233
        username: admin
        password: password
      bootMACAddress: 00:50:56:a5:9a:c3
      rootDeviceHints:
        deviceName: "/dev/sda"
    - name: ocpi01-w01
      role: worker
      bmc:
        address: ipmi://10.13.101.5:6234
        username: admin
        password: password
      bootMACAddress: 00:50:56:a5:23:ad
      rootDeviceHints:
        deviceName: "/dev/sda"
    - name: ocpi01-w02
      role: worker
      bmc:
        address: ipmi://10.13.101.5:6235
        username: admin
        password: password
      bootMACAddress: 00:50:56:a5:d2:2d
      rootDeviceHints:
        deviceName: "/dev/sda"
    - name: ocpi01-w03
      role: worker
      bmc:
        address: ipmi://10.13.101.5:6236
        username: admin
        password: password
      bootMACAddress: 00:50:56:a5:48:e4
      rootDeviceHints:
        deviceName: "/dev/sda"
    bootstrapOSImage: http://10.13.121.39:8080/rhcos-48.84.202109241901-0-qemu.x86_64.qcow2.gz?sha256=50377ba9c5cb92c649c7d9e31b508185241a3c204b34dd991fcb3cf0adc53983
    clusterOSImage: http://10.13.121.39:8080/rhcos-48.84.202109241901-0-openstack.x86_64.qcow2.gz?sha256=e0a1d8a99c5869150a56b8de475ea7952ca2fa3aacad7ca48533d1176df503ab
pullSecret: ''
sshKey: ''
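
Since one of the causes listed in the error is an inability to download the machine OS images, it may also be worth confirming that the local image cache referenced by bootstrapOSImage/clusterOSImage answers (this only checks the HTTP server, not the sha256):

# check that the cached RHCOS images are being served
$ curl -I "http://10.13.121.39:8080/rhcos-48.84.202109241901-0-qemu.x86_64.qcow2.gz"
$ curl -I "http://10.13.121.39:8080/rhcos-48.84.202109241901-0-openstack.x86_64.qcow2.gz"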

$ ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master provisioning state UP group default qlen 1000
link/ether 00:50:56:a5:d4:ed brd ff:ff:ff:ff:ff:ff
3: ens224: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master baremetal state UP group default qlen 1000
link/ether 00:50:56:a5:4b:13 brd ff:ff:ff:ff:ff:ff
4: baremetal: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 00:50:56:a5:4b:13 brd ff:ff:ff:ff:ff:ff
inet 10.13.121.39/8 brd 10.255.255.255 scope global dynamic noprefixroute baremetal
valid_lft 31702sec preferred_lft 31702sec
inet6 fe80::6224:ba4b:1d84:f154/64 scope link noprefixroute
valid_lft forever preferred_lft forever
5: provisioning: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 00:50:56:a5:d4:ed brd ff:ff:ff:ff:ff:ff
inet 172.22.101.39/24 brd 172.22.101.255 scope global noprefixroute provisioning
valid_lft forever preferred_lft forever
inet6 fd00:1101::1/64 scope global noprefixroute
valid_lft forever preferred_lft forever
inet6 fe80::ec9e:28b5:315:89d8/64 scope link noprefixroute
valid_lft forever preferred_lft forever
6: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 52:54:00:33:86:22 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
7: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
link/ether 52:54:00:33:86:22 brd ff:ff:ff:ff:ff:ff
8: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master baremetal state UNKNOWN group default qlen 1000
link/ether fe:54:00:4a:8d:b0 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:fe4a:8db0/64 scope link
valid_lft forever preferred_lft forever
9: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master provisioning state UNKNOWN group default qlen 1000
link/ether fe:54:00:76:f4:58 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:fe76:f458/64 scope link
valid_lft forever preferred_lft forever

$ sudo virsh list
Id Name State

1 ocpi01-pf8pb-bootstrap running

$ sudo virsh dominfo ocpi01-pf8pb-bootstrap
Id: 1
Name: ocpi01-pf8pb-bootstrap
UUID: b4bfb6f6-3322-4cd6-8717-a6ea0270eb75
OS Type: hvm
State: running
CPU(s): 4
CPU time: 43939.2s
Max memory: 6291456 KiB
Used memory: 6291456 KiB
Persistent: yes
Autostart: disable
Managed save: no
Security model: none
Security DOI: 0
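
To see which addresses the bootstrap VM actually picked up, and whether Ironic answers on the provisioning network, something like the following could be run on the provisioning host (a sketch: 6385 is Ironic's default API port, and the bootstrap's provisioning-network IP is a placeholder; whether the API is bound there in this setup is an assumption):

# list the bootstrap VM's interfaces and addresses
# (--source arp or --source agent may be needed for bridged NICs, depending on the libvirt version)
$ sudo virsh domifaddr ocpi01-pf8pb-bootstrap --source arp

# probe the Ironic API on the bootstrap's provisioning-network address
$ curl -s http://<bootstrap-provisioning-ip>:6385/v1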


@staebler added the platform/baremetal label Dec 19, 2021
@openshift-bot (Contributor) commented:

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-ci bot added the lifecycle/stale label Mar 21, 2022
@openshift-bot (Contributor) commented:

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-ci bot added the lifecycle/rotten label and removed the lifecycle/stale label Apr 20, 2022
@openshift-bot (Contributor) commented:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci bot closed this as completed May 20, 2022
openshift-ci bot (Contributor) commented May 20, 2022:

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rakeshk121 commented:

Facing the same issue, raised it here: openshift-metal3/dev-scripts#1586
