WeeklyTelcon_20180220
Geoffrey Paulsen edited this page Jan 15, 2019 · 1 revision
- Dialup Info: (Do not post to public mailing list or public wiki)
- Geoff Paulsen
- Jeff Squyres
- Brian
- David Bernholdt
- Geoffroy Vallee
- George
- Josh Ladd
- Artem
- Matthew Dosanjh (Sandia)
- Nathan Hjelm
- Todd Kordenbrock
- Thomas Naughton
--- A number of usuals not here today:
- Howard
- Edgar Gabriel
- akvenkatesh
- Josh Hursey
- Mohan
Review All Open Blockers
Review v2.x Milestones v2.1.3
- Merged the last item last night; Howard will build v2.1.3 rc2 tomorrow.
- There have been some patches, so Howard will start an RC build of v2.1.3 rc1
- Issue 4349
- Support for PowerBE was removed in v2.x (it should only be removed in a major version).
- We can't find notes as to why it was removed for v2.x
- Nathan is testing this now.
- One person said they'd stay at v2.1.1 and go no further.
- Not getting overwhelming "yes we want it back".
- No one has yet volunteered to support Power BE, for v3.x and later.
Review v3.0.x Milestones v3.0.1
- Issue 4338 - SLURM integration broken on v3.0.x and v3.1.x
- Not regression. Marked as blocker.
- Schedule
- Will build RC4 tonight.
Review v3.1.x Milestones v3.1.0
- Issue 4338 - SLURM integration broken on v3.0.x and v3.1.x
- Not regression. Marked as blocker.
- Issue: missing library versioning for a new common component.
- Will get into a new RC.
- SCHEDULE:
- Will build RC tomorrow.
- Issue 4829
- George and Giles - iovec > 2GB single message.
- process_vm_readv and process_vm_writev - glibc generates a syscall.
- iovec not behaving as man page indicates.
- Simple fix: when we use CMA, set Vader's put and get limit to 2GB.
- George posted PR 4832
- CMA only (CMA fails, and then copies in and copies out).
- Also yells at the user; slow, but not a silent data corruption.
- Not a regression, and we shouldn't rush a fix.
- Affects VADER
- Paul Hargrove ran testing, and didn't find any issues.
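The 2GB cap discussed for Issue 4829 amounts to never handing a single copy call more than a fixed number of bytes. A minimal sketch of that splitting logic follows. It is not the Vader or PR 4832 code; the names `split_transfer` and `MAX_SINGLE_COPY` are hypothetical, and the exact byte value of the cap is an assumption, since the notes only say "2GB".

```python
from typing import Iterator, Tuple

# Assumed cap in bytes for one copy call; the notes only say "2GB".
MAX_SINGLE_COPY = 2**31 - 1


def split_transfer(offset: int, length: int,
                   limit: int = MAX_SINGLE_COPY) -> Iterator[Tuple[int, int]]:
    """Yield (offset, size) pieces covering [offset, offset + length).

    Every piece has size <= limit, so no single copy call is asked to
    move more than the cap. Hypothetical helper, for illustration only.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if length < 0:
        raise ValueError("length must be non-negative")
    done = 0
    while done < length:
        size = min(limit, length - done)
        yield (offset + done, size)
        done += size
```

A caller would issue one copy per yielded piece, for example `for off, size in split_transfer(0, 5 * 2**30): ...`, which keeps every request under the cap at the cost of extra calls.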
Review Master Master Pull Requests
- Out of order issue (BLOCKER)
- Issue 4795
- Went into master, but no PR to v3.1.x
- Old bug, not a regression.
- TCP and usNIC - multilink issue
- numlinks parameter is clearly broken / untested.
- Issue 4799
- Beginning in v2.1, not binding to core.
- Because binding to socket (by default), the pmix thread migrates to different core (on same socket)
- Ralph suggested revising the default binding based on this.
- Don't remember much dissent when we moved the default from core to socket.
- The idea was that we're embracing threads, and bind-to-socket is much more thread friendly,
- but comes with cost.
- Performance from 30ms down to 6ms for single PMIx Get.
- Open MPI right now has its own service thread. Could that ALSO be impacted?
- PROBLEM is that this is the SAME issue for Open MPI progression threads.
- v4.0 at earliest.
- Should put something into the README about this effect of binding to socket by default.
- Jeff will create an issue to add a blurb to README.
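The migration effect in Issue 4799 comes down to how wide a thread's CPU affinity mask is: with a socket-wide mask the scheduler may move the thread between cores, while a one-core mask keeps it in place. The sketch below only illustrates that difference using Python's Linux-only `os.sched_getaffinity` and `os.sched_setaffinity`. It is not how Open MPI or PMIx applies binding, and the helper names are hypothetical.

```python
import os


def pin_current_thread_to_one_cpu():
    """Restrict the calling thread to a single CPU.

    Returns (allowed, pinned) affinity sets, or None on platforms
    without the sched_* calls. Hypothetical helper for illustration.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    # CPUs the scheduler may currently use for this thread
    # (comparable to a socket-wide binding when it holds many cores).
    allowed = os.sched_getaffinity(0)
    # Narrow the mask to one CPU so the thread can no longer migrate.
    target = min(allowed)
    os.sched_setaffinity(0, {target})
    pinned = os.sched_getaffinity(0)
    return allowed, pinned


def restore_affinity(cpus):
    """Put back a previously saved affinity set."""
    os.sched_setaffinity(0, cpus)
```

Calling `restore_affinity(allowed)` afterwards widens the mask again, which is the state in which a service thread is free to hop between cores.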
- Issue 4423
- When your PR has been accepted into a release branch, please go to the issue, and remove the target of the release branch that it was just merged into. Attempting to automate this in the future.
- In July we missed the review of who has commit access. We forgot, and will do it again this summer.
- Is it possible to call a web API to see whether tests have run on a given git hash?
- New python database status
- It's ready to use, but no one is working on reporter piece.
- The Python client will run and work locally, but can't report back to the server.
- There is a REST API that Josh H. implemented.
- The current reporter has a bunch of hard-coded structure in PHP.
- MTT Issue 614
- How does the Cherry PI server get started on AWS?
- Howard is going to update the Cherry PI server at AWS next week.
- Josh H had a better solution, but doesn't have cycles right now.
- Autogenerate AUTHORS list script for v3.0.x
- Brian has scripted the create tarball process: https://jenkins.open-mpi.org/jenkins/job/open-mpi.dist.create-tarball/
- Tagging is another area to script.
- Only tagging versions that are tested with MTT.
- Want the tag script to check MTT to see whether MTT tests have been run on that commit.
- Copied tarballs from git into S3. They're in both locations now.
- Brian has to update some scripts before he removes them from git.
Review Master MTT testing
- OLD Discuss abandoning openib btl.
- Nathan has a UCX BTL
- ETA on GPU in UCX - basic minus CUDA IPC is in test now.
- Any warning message if on iWarp?
- What's the roadmap for this? 3.x or 4.x?
- Mellanox, Sandia, Intel
- LANL, Houston, IBM, Fujitsu
- Amazon,
- Cisco, ORNL, UTK, NVIDIA