Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

panedrlite should allow import panedr #48

Closed
orbeckst opened this issue Jul 7, 2022 · 8 comments · Fixed by #50
Closed

panedrlite should allow import panedr #48

orbeckst opened this issue Jul 7, 2022 · 8 comments · Fixed by #50

Comments

@orbeckst
Copy link
Member

orbeckst commented Jul 7, 2022

The current dev version of panedrlite is used as

import panedrlite as panedr

but both @jbarnoud (in PR #42) and @IAlibay (in PR #47 ) stated that the following should actually work for panedrlite

import panedr
@IAlibay
Copy link
Member

IAlibay commented Jul 7, 2022

So I've spent a large amount of time testing things out since yesterday, unfortunately we're hitting a lot of limitations when it comes to what setuptools & the PEPs of packaging allow.

As far as I can tell these are the feasible options we can have:

1. Keep things as they are:

  • This means that we have to have different import patterns for both packages, in my opinion that's problematic if we want both packages to be interchangeable.
  • Having panedrlite be the base package isn't wrong though, we just have to re-evaluate things under the context that it's the base package. It's kind of overkill though.

2. Only have one package and make pandas an extra_requires

  • This is the best option in my opinion. I know @jbarnoud doesn't agree, but this means the one package can be put in an install_requires, so if users have the package already they don't re-install because it wasn't the right flavour.
  • We can split things up on conda-forge and make two packages here without problems.

3. Have two setup.py in different directories copying files across

  • This is messy, but we could create a panedrlite directory with it's own setup.py/setup.cfg/pyproject.toml that copies over the source files on deployment. It works, but it's going to need extra attention.

4. two setup.py in one package directory

  • Technically you can do this right now (if you delete the contents of setup.cfg), but it's dangerous and you can't upgrade to pyproject.toml.

@IAlibay
Copy link
Member

IAlibay commented Jul 7, 2022

Option 2 is the one that is fully supported & will make downstream life easier imho. I know it breaks user expectations, but it doesn't affect the science just the need to install pandas

Option 3 can work but it'll be messy on paper, 4 is a terrible idea, 1 works but seems overkill for such a small package, if we do this we'll need to make things very clear to users and it'll be sure to confuse users.

Option 5 is to make panedr optional for MDA. Then we don't care about the pandas dependency in most cases.

@jbarnoud
Copy link
Collaborator

jbarnoud commented Jul 7, 2022

I gave it a try and it indeed looks more complex than I thought it was. This seems to be one of the rare case where conda is simpler than pip.

I do not like option 2 because it means that one downstream use case breaks the experience for all the others.

One solution I discarded earlier is to have 2 distinct packages with distinct code: one with a name like pyedr is the equivalent of what is now panedr-lite, and panedr. pyedr has all the code except for edr_to_df. panedr only has something like:

import pyedr
import pandas as pd

def edr_to_df(...):
    ...
    all_energies, all_names, times = read_edr(path, verbose=verbose)
    df = pd.DataFrame(all_energies, columns=all_names, index=times)
    return df

I discarded this option because it means having code in two different places (even though these places are in the same repo), compared to the metapackaged it looked more cumbersome. The metapackage not being trivial as I thought it was, the drawback of having code in two places loses importance.

@IAlibay
Copy link
Member

IAlibay commented Jul 7, 2022

It makes for a package that's all of ~ 5 lines of code, which is a bit excessive, but sure that way you could have panedr completely decoupled from the base package, panedr would keep its current version number and never really change and pyedr would be the thing that gets actively developed?

It does mean that bug fixes & changes because more opaque to users of panedr though.

@jbarnoud
Copy link
Collaborator

jbarnoud commented Jul 8, 2022

Do we expect much fixes and changes, though? EDR is a surprisingly stable format and the feature set of panedr is limited on purpose.

It is also possible to bump the version number of panedr if big changes happened in pyedr, just to force the update of pyedr. The visibility of the changes, at the end of the day, is mostly a matter of publicity.

@IAlibay
Copy link
Member

IAlibay commented Jul 12, 2022

Do we expect much fixes and changes, though? EDR is a surprisingly stable format and the feature set of panedr is limited on purpose.

It is also possible to bump the version number of panedr if big changes happened in pyedr, just to force the update of pyedr. The visibility of the changes, at the end of the day, is mostly a matter of publicity.

I expect at ~ two more versions of this code before a stable 1.0.

Anyways we should quickly come up with an agreement on the way forward. In the nicest way possible, as long as I don't have to take responsibility for working out the versioning / changelog-ing the two packages, I'm happy to go ahead and implement the option to split it into pyedr and panedr.

@BFedder @orbeckst @hmacdope any views here? We probably need to get this moving for MDA stuff to move on.

@orbeckst
Copy link
Member Author

I like the dichotomy of panedr and pyedr — it's pretty clear from the name what they do. It also means minimal disruption for anyone using panedr already.

If there a clear notes for panedr users to check the pyedr CHANGELOG and the release procedure for pyedr contains "bump panedr version" then it should be ok.

Do you want to maintain them in the same repo or make two separate ones? (If separate ones then I'd actually suggest to rename the current one to pyedr to keep the issue history together with the code and create a fresh panedr one.)

@IAlibay
Copy link
Member

IAlibay commented Jul 12, 2022

my first instinct would be to suggest keeping both packages in the same repo (similar to the work @BFedder has already done)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants