-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite topology tools with Construct to improve the readability and maintainability #755
Conversation
@marc-hb @mengdonglin are you good with the added dependency on a package? |
I agree one should never re-invent the wheel and implement a binary parser without relying on some library like Construct. On the other hand:
|
There's a decoder in alsa-utils (-d switch with alsatplg IIRC). We should look into using it instead of depending on an external package |
Is it in C? This PR proposes to "upgrade" our manual Python implementation to a higher level approach in order to reduce code size and maintenance. Downgrading to a low-level, memory management/corruption language would go the opposite direction. Python is slow but I don't think parsing topologies is in any critical path for us. Is alsa-utils a C-only project? If yes then that would justify keeping an implementation in a higher level language here for the lack of a better place. |
5a7125b
to
5d11bc7
Compare
Hi @plbossart, I'm trying to use pip to install the dependencies in GitHub Actions because apt seems to install the packages for Python whose path is different than setup-python use. You can see my modification in abf226d. |
|
Naming magic numbers does not require a new library. BTW I see the existing parser already uses https://docs.python.org/3/library/struct.html extensively (which will never "go down")
How much more maintainable? Can you please share some expected numbers like lines of code and examples of repetitive code not needed any more, number of classes and functions, cyclomatic complexity, ease of unit testing, error message clarity and any other metric of your choice. No precise number, just expected orders of magnitude (if there's no order of magnitude difference then it's probably not worth it) |
Hi @marc-hb, thanks for your feedback, and sorry for my late reply because I spent some time collecting information.
I think Kaitai could be a good choice if we need to reuse the parser in different languages. But what we need here is just a topology parser in Python and Kaitai have more heavy dependency and complex workflow (we describe the binary format in a
Sorry for my misleading words, what we actually make is a topology tool for testing and debuging, and parsing topology is just the first step. With this tool, we can not only dump some topology information but also render a topology graph. As #471 (comment) said, some parameters can only be extracted from topology, and what I want to do is just a replacement for the origin topology tool. Moreover, writing the parser ourselves allow us to parse some private data like schedule core, so it's a more flexible way than an official tool. Also, |
Thanks for the detailed explanations, very useful and convincing. I think what could finish convincing me is a before/after comparison on a small sample bit of code.
OK but parsing text is much lower maintenance than binaries and there is probably a wider choice of libraries to help with that. |
I'm not familiar with those metrics, so I'd like to give an example in an intuitive way: For parsing PCM struct, the legacy code: Lines 335 to 379 in 5bb6679
it contains many magic numbers as field offsets. If there is any change, we need to reevaluate those offsets and hope we won't miscalculate them in some steps. My new code with Construct: sof-test/tools/tplgtool-ng/tplgtool.py Lines 313 to 328 in 5d11bc7
it is simply corresponding to struct snd_soc_tplg_pcm in alsa-lib. If something changed, just keep the mapping rule and everything will work fine.
Well, I don't think parsing binary is much higher maintenance than text if we accept Construct. And I find the decoded configuration file seems lost some information or may need another step to retrieve them. So I think it could be a choice but I do not appreciate it. |
Thanks for the example. I don't think it's a very fair example though: the reason why there are hardcoded offsets everywhere right now is not really because of the difference between However I agree that construct code looks more readable in general. Also, if someone (you) makes the effort to fix the currently pretty bad code then he might as well go the extra mile and switch to a different and more readable solution. So +1 from me for the Construct dependency. It looks like a good project. (I briefly looked at but haven't reviewed the code in this PR) |
47d5d76
to
531371a
Compare
3abf82b
to
6ff8401
Compare
082457a
to
a6e20cf
Compare
All of the pylint warnings for sof-tplgreader.py aren't introduced by me. I'm focusing on tplgtool.py, so I only make minimal changes to sof-tplgreader.py for compatibility. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, the new code looks much more readable! Great effort. I'm surprised that it is not that much shorter but that's probably because of all the pydocs (thanks!) and also a lot more whitespace (now I'm a bit curious about the characters count)
My main issue is with the almost complete rewrite in place and "Big Bang" migration. Instead of replacing the existing files, can you please create two new files sof-tplgreader2.py
and tplgtool2.py
? Or any other better name you can think of. This will provide a smooth transition period where anyone or anything stuck with a bug or small difference in the new tool can report the bug or incompatibility and then very quickly, locally and temporarily revert back to the old one to unblock themselves.
I checked and there are fewer than 15 references to the old names in sof-test. They don't need to be switched all at the same time: Continuous Integration.
The old scripts can be deleted later. This will break any references outside sof-test that we failed to spot but by that time the new scripts will have been well tested and all new bugs and most incompatibilities squashed.
To not lose the git history of sof-tplgreader.py
you can have a very first commit that merely copies to sof-tplgreader2.py
without changing it yet, this helps git follow the history.
@marc-hb Creating new files for smooth migration is a good idea. Because I'm focusing on new tplgtool in this PR and @aiChaoSONG encourage me to rewrite sof-tplgreader.py too in the future, I think we can leave sof-tplgreader.py unchanged this time and postpone the migration for sof-tplgreader.py until rewriting it. For a smooth transition, we can start to draw topology graphs with the new tplgtool2.py in our CI. |
3c0d446
to
a33d8de
Compare
* python3-graphviz for drawing topology graph * python3-construct for parsing topology file Signed-off-by: Yongan Lu <[email protected]>
contruct is a new dependency for topology parsing. Signed-off-by: Yongan Lu <[email protected]>
6a856e4
to
7034e53
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor changes or explanations required, let's use duplex instead of both
Thanks, @YongAnLu00 for the tool, let's merge the tool, let's do bug fixes and features in further PR |
1. write a declarative topology file parser with Construct 2. TplgFormatter in legacy code is confusing, split its functions into more explicit parts: GroupedTplg and TplgGraph 2.1 GroupedTplg works like a simple wrapper to access different types of blocks more conveniently 2.2 TplgGraph build components graph for drawing and searching components through graph Signed-off-by: Yongan Lu <[email protected]>
deprecate legacy tplgtool.py and recommend user migrate to tplgtool2.py Signed-off-by: Yongan Lu <[email protected]>
Add new dependency info and tplgtool2.py description to README Signed-off-by: Yongan Lu <[email protected]>
I'd like to improve the topology parser to get more info like the widget core, but the legacy code contains too many magic number which is hard to understand and maintain. Here I will rewrite the topology tools with Construct which is a powerful declarative and symmetrical parser and builder for binary data to make the code more maintainable.
Signed-off-by: Yongan Lu [email protected]