Infiniband boot problem #100
Comments
Two things:
|
I remember I noticed that Right now I switched back to 3.7 and it works. |
One more thing: I remember I did the following: I added |
Sounds like it was indeed chainloading then, which is problematic. I would need a tcpdump -s0 from the master during the boot of the node to figure out what's going on. Also need to know which FlexBoot version is in use, and what dhcpd-template.conf looks like.
Yes init specifically expects the 64-bit HWADDR when it finds an IB interface. As an aside, looking at ConnectX-3 FlexBoot releases, I'm hopeful that 3.4.752 (http://www.mellanox.com/related-docs/prod_software/FlexBoot-3_4_752_for_ConnectX3_release_notes.pdf) includes support for the new ${hwaddr} variable. Curiously, it appears this FlexBoot version and ConnectX-3 firmware are now only available via mymellanox instead of the normal website. |
I will try my best to investigate this. However, I will not be able to look at this within two weeks, and I am not sure if our cluster will be free for such tests after that time (the people who paid for the hardware are getting more and more impatient to actually see it working). So, if I find a time window, I will do the tests; however, maybe Irek Porebski, who apparently has a similar issue, will be able to help sooner. |
Hi,
The iPXE version is 1.0.0+. I am using a ConnectX-5. mlx5_ib: Mellanox Connect-IB Infiniband Driver v2.2-1 (Feb 2014) There is no variable $hwaddr but there is $wwhwaddr, which is the MAC. Thanks, |
Are you using RHEL's included OFED or Mellanox's OFED? |
Hi, here is the tcpdump from the session. |
I am using CentOS, so it is Red Hat's. |
So you've not for example installed: http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers ? |
I was trying to use Mellanox OFED but the result was the same. |
No, I didn't install this OFED. |
Rebuild the bootstrap with: Find the ID of your bootstrap: Look at the contents of your bootstrap grep for mlx modules: |
Here is the result of the zcat: [root@headnode 15]# zcat initfs.gz | cpio -tdv | grep mlx |
Please try using the updated bootstrap.conf in PR #103. Uncomment the modprobe line for ib_ipoib. Rebuild the bootstrap via Then try running |
This helps: the node starts booting but stops at the VNFS.
Jan 26 15:24:30 headnode dhcpd: DHCPOFFER on 192.168.0.100 to ec:0d:9a:9c:81:e2 via ib0
[root@headnode log]# wwsh provision print gpunode-0-0
gpunode-0-0
I think I have some node name problem now. |
But this only works if I disable the onboard NIC, so that only IB is available. Checking for network device: eth0 (ib0) Now is: So something is causing Warewulf to use the wrong network interface. |
You can see it in my e-mails here: https://groups.google.com/a/lbl.gov/forum/#!msg/warewulf/cxeQ6KiAx6g/xPOYG7jeAgAJ |
Hi Ben, I still have a problem. I thought it was all right now, but it is not. Thanks, |
@Irekporeb From the debug shell can you show us the contents of Then on the Warewulf server, Also, from your screenshot above it looks like the bootstrap got past IP addressing but failed to fetch the VNFS. Is this the current state? Or is this some intermediate workaround? |
I have run "wwsh pxe update" and it looks like this changed something.
I had to modify DHCP config to this:
During boot the node uses both the MAC and the client-identifier.
However, I am not able to register the node with the full client-identifier, as it will then use the GUID instead of the MAC, the links to the kernel will be incorrect, and the node will not boot at all. I have configured only IB for this node.
My network file looks like this, so it should have a static IP after bootstrap.
Sorry for the late response, but I am in a different timezone. |
Please set the hwaddr of the ib0 interface to Warewulf works with the full 64-bit GUID of IB interfaces, not the chopped 48-bit version (e.g. two middle bytes 03:00 dropped). This change will fix your The reason for the symlink is we're still waiting on a firmware fix from Mellanox in FlexBoot that will let us change that line in dhcpd-template.conf to:
Where ${hwaddr} is an iPXE shell variable that represents the 64-bit GUID on IB and the 48-bit MAC on Ethernet. There shouldn't be any other workarounds needed right now. |
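To make the 64-bit versus 48-bit distinction concrete, here is a minimal sketch in plain shell. It assumes the 20-byte client identifier from the dhcpd log quoted in this thread as sample input; the variable names are only illustrative and are not taken from Warewulf.

```shell
#!/bin/sh
# Sample value: the 20-byte DHCP client identifier seen in the dhcpd log
# quoted in this thread (an assumption for illustration, not read from hardware).
client_id="ff:00:00:00:00:00:02:00:00:02:c9:00:ec:0d:9a:03:00:9c:81:e2"

# The trailing 8 bytes (23 characters including colons) are the 64-bit GUID.
guid=$(printf '%s' "$client_id" | cut -c37-59)

# Dropping the two middle bytes (fields 4 and 5) gives the chopped 48-bit form.
mac=$(printf '%s' "$guid" | cut -d: -f1-3,6-8)

echo "GUID: $guid"
echo "MAC:  $mac"
```

With this sample value the 48-bit result matches the address in the DHCPOFFER log line earlier in the thread, which is why registering the node with the chopped form and with the full GUID behave differently.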
Unfortunately this is not working. It creates a dhcpd.conf like this: host gpunode-0-0-ib0 {
Jan 30 10:27:43 headnode dhcpd: DHCPREQUEST for 192.168.0.100 (192.168.0.1) from ff:00:00:00:00:00:02:00:00:02:c9:00:ec:0d:9a:03:00:9c:81:e2 via ib0
If I add I can boot but the network is not recognised. I will try to create a new bootstrap as you suggested before and see if this helps. |
I'm trying to figure out why it's grabbing /warewulf/ipxe/bin-i386-pcbios/undionly.kpxe, which is the version of iPXE for a legacy BIOS (before UEFI or a UEFI in legacy mode). The ConnectX-5 card should be running FlexBoot, its own version of iPXE. Upstream iPXE (undionly.kpxe in this case) doesn't send the dhcp-client-identifier with the prefix and GUID. Hence why the chainloaded undionly.kpxe cannot get a DHCP lease without the added This looks like FlexBoot:
This is the iPXE compiled for a legacy BIOS: Jan 30 10:27:44 headnode in.tftpd[87431]: Client 192.168.0.100 finished /warewulf/ipxe/bin-i386-pcbios/undionly.kpxe Questions:
It's quite possible that FlexBoot no longer identifies itself as "iPXE" via its DHCP request, so this line, https://github.com/warewulf/warewulf3/blob/master/provision/etc/dhcpd-template.conf#L14, is no longer evaluating true for FlexBoot. |
Actually that gives me another idea to test my very last point. With the previous symlink workaround in place, eg Try a dhcpd-template.conf like:
|
Headnode with FlexBoot UEFI
|
If you want to take this conversation offline I am happy to do it. |
I just got the confirmation that Dell will not support UEFI PXE for ConnectX-5 cards. So I have only the BIOS option now. |
Yes, I think this is still a useful test. Please take a |
The GitHub issue is fine with me for this discussion. It's nice and searchable. |
Here are the screenshots and tcpdump. This was done with no modification to dhcpd-template.conf, as we already established that I am using BIOS PXE. |
Screen shots are helpful.
It's not so much that we care whether it's in BIOS or UEFI mode; we care why it's chainloading undionly.kpxe instead of directly loading the ipxe/cfg/... file over http at this point. Please try a boot with the modified dhcpd-template.conf above, or alternatively adding The idea is to ignore/replace the logic around loading the iPXE config only if user-class = iPXE, thus providing FlexBoot directly with the ipxe/cfg/${mac} filename option. |
I have modified the dhcpd-template. This time it didn't need the MAC to boot.
warewulf-3.pcap.001.zip |
Great. That solves the first mystery at least. Did the bootstrap fully provision the node? |
No, this gets me to the debug shell with the same message "Network hardware not recognized". |
Could you step through this shell loop from the debug shell: https://github.com/warewulf/warewulf3/blob/master/provision/initramfs/init#L235 Double check some input variables before starting:
|
Just my 3 cents: The unnecessary loading of the (I am on vacation right now and I have no access to the server, so I cannot do any tests before I return; until then I can only comment on the tests I have already done). |
@Irekporeb Yep looks good. ifup is a shell function within /init, not Busybox's ifup, so the last failed line when you run ifup in your screenshot is expected. https://github.com/warewulf/warewulf3/blob/master/provision/initramfs/init#L73.
This will tell us if
|
There's been some cards I've needed to wait to initialize (add We should see this from the Edit: |
@macdems The unnecessary loading of undionly.kpxe is certainly a problem; I've opened #104 to track it. We know why it's happening at this point (lack of the user-class option 77 in the DHCP discover/request from FlexBoot), and it's unrelated to the network interface not being configured once in the bootstrap. |
@Irekporeb Do you have a screenshot of the I'm interested in if /sys/class/net/ib0 exists at the point when this loop runs. If so you should see HWADDR being populated. If not then we have an annoying race condition on the interface bring-up like Jason mentioned. |
Since we can't see the higher loop output, can you set `wwdebug=3` on the
node kargs? This should drop it into a debug shell right before it
checks for /tmp/wwdev ... and exits in your case.
At this point check what's in:
/sys/class/net/
Also check the output of the values here...
THWADDR=`cat /sys/class/net/ib0/address`
echo ${#THWADDR}
echo `expr substr $THWADDR 37 23`
I have a feeling it's that the card isn't being initialized, or a driver is
taking a long time to load, by the time the bootstrap init gets to that
point.
…-J
On Tue, Jan 30, 2018 at 5:03 PM, Irekporeb ***@***.***> wrote:
1. No, there is no message for checking network
2. the screen still looks like in the mentioned comment. "Network
Hardware not recognized"
3. I have set up wwdebug mode for gpunode-0-0. I can't see "ifup"
being called at all.
I recorded the whole booting session as it goes by fast, but I still can't
see it.
[image: boot-problem9]
<https://user-images.githubusercontent.com/35822015/35596427-998e2c42-0665-11e8-81cc-81bc21cc3fdf.PNG>
[image: boot-problem8]
<https://user-images.githubusercontent.com/35822015/35596433-a0f2022e-0665-11e8-8729-f24d2feff89b.PNG>
[image: boot-problem-sys_class_net]
<https://user-images.githubusercontent.com/35822015/35596447-a6cf799c-0665-11e8-89a1-2abd77d56258.PNG>
|
@jmstover The debug shell that you get dumped into with I'm thinking we might need to extend the
|
This is the message which I was able to catch just before the error message. I have also attached the booting video below. |
@jmstover Here is the output of your command |
@Irekporeb Thanks for the video, it's super helpful. Yep, the Bit of a long shot, but in bootstrap.conf add mlx5_core and mlx5_ib back into a modprobe line, e.g.:
Rebuild the bootstrap, via I doubt this will work, and if it doesn't then we'll need to add some code into provision/initramfs/init to allow for some retries before giving up. |
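For reference, the kind of retry mentioned above could look roughly like the sketch below. This is not code from provision/initramfs/init; `wait_for_iface` and all of its parameters are hypothetical names chosen for illustration, and the sysfs base directory is passed as an argument only so the function can be exercised outside a real node.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of Warewulf's init: poll until an interface
# directory appears under a sysfs-style base directory, or give up.
#   $1 = interface name, $2 = number of tries (default 10),
#   $3 = base directory (default /sys/class/net), $4 = delay in seconds (default 1)
wait_for_iface() {
    iface=$1
    tries=${2:-10}
    base=${3:-/sys/class/net}
    delay=${4:-1}
    while [ "$tries" -gt 0 ]; do
        # Success as soon as the interface directory exists.
        if [ -e "$base/$iface" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    # All tries used up without seeing the interface.
    return 1
}
```

A call such as `wait_for_iface ib0 30` before the hardware-address loop would give a slow driver up to about 30 seconds to create the interface, assuming the delay is caused by driver initialization as suspected in this thread.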
Double checking, the line you added to bootstrap.conf looks exactly like: |
Oh yes, there was a missing "," between the modules. After fixing that I was able to boot into the node OS. |
Closing. I don't believe we have anything outstanding in this issue. #104 is still open to figure out how to distinguish FlexBoot's iPXE in dhcpd.conf. |
Warewulf 3.8 does not boot nodes over Infiniband. It does not even bootstrap!
3.7 works fine.
After some investigation I found out that the problem is with DHCP and the new
iPXE boot. This is what happens:
Mellanox FlexBoot is requesting DHCP, with a fixed dhcp-client-identifier.
This is properly handled if the correct HWPREFIX and HWADDR are set for the
node.
After assigning the IP address, Warewulf 3.7 simply loads bootstrap and
continues. However, in 3.8 there is an intermediate iPXE (it's iPXE, right?)
which once again asks for the DHCP address -- this time without providing
the dhcp-client-identifier. And this is the point where the boot fails!
The only solution I see is to somehow force the second stage iPXE to provide
the same dhcp-client-identifier as the Mellanox FlexBoot does.