SoundWire: fixes before changes to bind/unbind #3645
21c126a to 8b527e9
@bardliao @shumingfan can you please take a look at the latest commit for rt711*.c? I don't get what the component remove was trying to do with the regcache status.
@plbossart Please check PR #2640. That is why the .remove callback function was added.
ok, thanks for the pointer. I think the 'right' solution is rather to use pm_runtime_resume_and_get(), that way there's no risk of initiating transactions on a suspended bus.
8b527e9 to 5608c41
@shumingfan @bardliao please check the update and specifically the last commit, I squashed the two commits related to .set_jack_detect() and added references to previous solutions in the commit message.
5608c41 to f4e8191
dev_dbg(&rt711->slave->dev,
	"%s hw_init not ready yet\n", __func__);
return 0;
ret = pm_runtime_resume_and_get(component->dev);
I am thinking maybe we only need the pm_runtime get when hs_jack != NULL. The reason we need to access the register is that we want to get the jack's status. But when hs_jack == NULL, we disable the jack detection feature and can write the register to the cache. The codec driver will then sync it in the device resume callback.
We should not have multiple places where the state of the cache is modified; this should only happen on suspend and resume.
We can disable jack detection when the card is removed though, that's different from changing the cache settings.
For tests, it's rather common to disable the HDaudio links and codecs in the build. Since we already get a codec_mask parameter indicating that there are no codecs detected, it's straightforward to skip the HDMI dailink creation and create a card. Note that when disabling HDMI, a modified topology without HDMI pipelines needs to be provided as well.
Signed-off-by: Pierre-Louis Bossart <[email protected]>
The bus sdw_drv_remove() and sdw_drv_shutdown() helpers are used conditionally, if the driver provides these routines. These helpers already test if the driver provides a .remove or .shutdown callback, so there's no harm in invoking the sdw_drv_remove() and sdw_drv_shutdown() unconditionally.
In addition, the current code is imbalanced with dev_pm_domain_attach() called from sdw_drv_probe(), but dev_pm_domain_detach() called from sdw_drv_remove() only if the driver provides a .remove callback.
Fixes: 9251345 ("soundwire: Add SoundWire bus type")
Signed-off-by: Pierre-Louis Bossart <[email protected]>
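The imbalance described in this commit message can be illustrated with a small userspace model. This is only a sketch: the `model_*` names are hypothetical stand-ins, not the kernel's types or the actual SoundWire bus code, and the counter merely represents the pairing of dev_pm_domain_attach() and dev_pm_domain_detach().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's structures. */
struct model_driver {
	int (*remove)(void);            /* optional callback, may be NULL */
};

struct model_device {
	const struct model_driver *drv;
	int pm_domain_attached;         /* attach count minus detach count */
};

/* Example driver callback used to exercise the non-NULL path. */
static int model_codec_remove(void)
{
	return 0;
}

static int model_bus_probe(struct model_device *dev)
{
	dev->pm_domain_attached++;      /* models dev_pm_domain_attach() */
	return 0;
}

/* Always invoked by the bus; the NULL test lives inside the helper. */
static int model_bus_remove(struct model_device *dev)
{
	int ret = 0;

	if (dev->drv->remove)
		ret = dev->drv->remove();

	dev->pm_domain_attached--;      /* models dev_pm_domain_detach() */
	return ret;
}

/* Probe then remove, and report the leftover attach count. */
static int model_bind_unbind(const struct model_driver *drv)
{
	struct model_device dev = { .drv = drv, .pm_domain_attached = 0 };

	model_bus_probe(&dev);
	model_bus_remove(&dev);
	return dev.pm_domain_attached;
}
```

With the helper called unconditionally, the attach count returns to zero whether or not the driver provides a remove callback.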
f4e8191 to 917a1d8
Still the same problem with USB Audio.
Test https://sof-ci.01.org/linuxpr/PR3645/build155/devicetest failed on jf-cml-rvp-sdw-3
Test https://sof-ci.01.org/linuxpr/PR3645/build158/devicetest/ worked fine on jf-cml-rvp-sdw-3
Test https://sof-ci.01.org/linuxpr/PR3645/build166/devicetest/ failed on jf-cml-rvp-sdw-1
I cannot explain such differences by software changes due to this PR. Three whole days lost on this. Gah.
USB audio problem is not the problem in https://sof-ci.01.org/linuxpr/PR3645/build155/devicetest/?model=CML_RVP_SDW_ZEPHYR&testcase=check-capture-10sec. It's only distracting your attention from the real problem:
Not quite @marc-hb, why on earth would sof-test attempt to record from a non-sof card? Even if the SOF card disappeared, something is very very misleading in the script actions and reports.
@marc-hb take for example https://sof-ci.01.org/linuxpr/PR3645/build166/devicetest/?model=CML_RVP_SDW_ZEPHYR&testcase=check-playback-10sec. This is reported as PASS but plays on USB Audio. Does this seem right to you?
I think the problem is in an earlier test that's shown as PASS:
If you look at the logs there's a number of stack traces and I suspect the kmod load/unload causes all kinds of issues here. @marc-hb
EDIT: I think you meant https://sof-ci.01.org/linuxpr/PR3645/build166/devicetest/?model=CML_RVP_SDW_ZEPHYR&testcase=check-sof-logger
Because of an sof-test bug that I started describing in thesofproject/sof-test#471 (comment) and that should probably be filed independently. Still, the lack of any SOF audio device is not an sof-test problem and IMHO the more pressing problem. sof-test is just being a very bad messenger here (what's new?)
Green failure tentatively fixed in one-line sof-test PR
SOFCI TEST
https://sof-ci.01.org/linuxpr/PR3645/build168/devicetest/ worked fine on jf-cml-rvp-sdw-3. We must have a random bug somewhere...
SOFCI TEST
https://sof-ci.01.org/linuxpr/PR3645/build169/devicetest worked fine on jf-cml-rvp-sdw-1. Difficult to figure out what is going on, I am not comfortable with 2/5 failures.
Internal tests show this warning:
static void __queue_delayed_work(int cpu, struct workqueue_struct *wq,
				struct delayed_work *dwork, unsigned long delay)
{
	struct timer_list *timer = &dwork->timer;
	struct work_struct *work = &dwork->work;

	WARN_ON_ONCE(!wq);
	WARN_ON_FUNCTION_MISMATCH(timer->function, delayed_work_timer_fn); <<<< HERE???
	WARN_ON_ONCE(timer_pending(timer));
	WARN_ON_ONCE(!list_empty(&work->entry));
That hints at a bad initialization of the workqueue. Gah.
When binding/unbinding codec drivers, the following warnings are thrown:
[ 107.266879] rt715-sdca sdw:3:025d:0714:01: Unbalanced pm_runtime_enable!
[ 306.879700] rt711-sdca sdw:0:025d:0711:01: Unbalanced pm_runtime_enable!
Add a remove callback for all Realtek/Maxim SoundWire codecs and remove this warning.
Signed-off-by: Pierre-Louis Bossart <[email protected]>
In codec driver bind/unbind test, the following warning is thrown:
DEBUG_LOCKS_WARN_ON(lock->magic != lock)
...
[ 699.182495] rt711_sdca_jack_init+0x1b/0x1d0 [snd_soc_rt711_sdca]
[ 699.182498] rt711_sdca_set_jack_detect+0x3b/0x90 [snd_soc_rt711_sdca]
[ 699.182500] snd_soc_component_set_jack+0x24/0x50 [snd_soc_core]
A quick check in the code shows that the 'calibrate_mutex' used by this driver is not initialized at probe time. Moving the initialization to the probe removes the issue.
BugLink: thesofproject#3644
Signed-off-by: Pierre-Louis Bossart <[email protected]>
If the card registration fails, typically because of deferred probes, the device properties added for headset codecs are not removed, which leads to kernel oopses in driver bind/unbind tests.
We already clean up the device properties when the card is removed; this code can be moved into a helper and called upon card registration errors.
Signed-off-by: Pierre-Louis Bossart <[email protected]>
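The cleanup flow this commit message describes can be sketched in plain C. All names below are hypothetical, the flag only stands in for the device properties, and `MODEL_EPROBE_DEFER` mirrors the kernel-internal deferred-probe error value, which userspace `errno.h` does not define.

```c
#include <assert.h>

/* Assumption: mirrors the kernel-internal EPROBE_DEFER value. */
#define MODEL_EPROBE_DEFER 517

/* Hypothetical stand-in for the machine driver's card state. */
struct model_card {
	int props_added;        /* 1 while headset codec properties exist */
};

static void model_add_props(struct model_card *card)
{
	card->props_added = 1;
}

/* Shared helper: used by both the error path and the remove path. */
static void model_remove_props(struct model_card *card)
{
	card->props_added = 0;
}

/* register_ret models the return value of the card registration. */
static int model_card_probe(struct model_card *card, int register_ret)
{
	model_add_props(card);

	if (register_ret < 0) {
		/* Undo the properties before reporting the failure. */
		model_remove_props(card);
		return register_ret;
	}
	return 0;
}

static void model_card_remove(struct model_card *card)
{
	model_remove_props(card);
}
```

Sharing one helper keeps the error path and the remove path from drifting apart, so a deferred probe leaves nothing behind for the next bind attempt.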
Follow the same flow as rt711-sdca and initialize all mutexes at probe time.
Signed-off-by: Pierre-Louis Bossart <[email protected]>
Realtek headset codec drivers typically check if the card is instantiated before proceeding with the jack detection. The rt700, rt711 and rt711-sdca are however missing a check on the card pointer, which can lead to NULL dereferences encountered in driver bind/unbind tests.
Signed-off-by: Pierre-Louis Bossart <[email protected]>
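A minimal sketch of the missing check, assuming a simplified component/card layout. The structures and the `model_jack_ready()` helper are hypothetical and only model the order of the tests: the card pointer has to be checked before the instantiated flag is read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified versions of the card and component. */
struct model_card {
	int instantiated;
};

struct model_component {
	struct model_card *card;    /* NULL until the card is bound */
};

/*
 * Returns 1 when jack detection may proceed. The card pointer is
 * tested first, so a component without a card never dereferences NULL.
 */
static int model_jack_ready(const struct model_component *comp)
{
	if (!comp->card || !comp->card->instantiated)
		return 0;
	return 1;
}
```

In a bind/unbind cycle the component can exist while its card pointer is still NULL, which is the case the first test covers.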
The workqueues are initialized in the io_init functions, which isn't quite right. In some tests, this leads to warnings thrown from __queue_delayed_work():
WARN_ON_FUNCTION_MISMATCH(timer->function, delayed_work_timer_fn);
Move all the initializations to the probe functions.
Signed-off-by: Pierre-Louis Bossart <[email protected]>
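The ordering problem behind the mutex and workqueue fixes can be modelled in userspace as follows. The names are hypothetical and the flags only stand in for mutex_init() and INIT_DELAYED_WORK(); the assumption is that io_init runs only once the device has attached, while the jack callback may run earlier.

```c
#include <assert.h>

/* Hypothetical codec state; flags stand in for kernel objects. */
struct model_codec {
	int lock_ready;     /* models a mutex that has been initialized */
	int work_ready;     /* models a delayed work that has been initialized */
	int hw_init;        /* set once the hardware has been configured */
};

/* Probe always runs first, so one-time initializations belong here. */
static void model_codec_probe(struct model_codec *codec)
{
	codec->lock_ready = 1;
	codec->work_ready = 1;
	codec->hw_init = 0;
}

/* Assumed to run later, only after the device attaches on the bus. */
static void model_codec_io_init(struct model_codec *codec)
{
	codec->hw_init = 1;
}

/*
 * Models a jack callback that takes the lock and queues the work.
 * Returns -1 where the kernel would warn about an uninitialized object.
 */
static int model_codec_set_jack(const struct model_codec *codec)
{
	if (!codec->lock_ready || !codec->work_ready)
		return -1;
	return 0;
}
```

Because the initializations no longer depend on io_init having run, the jack callback is safe in either ordering.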
…etect
The .set_jack_detect() codec component callback is invoked during card registration, which happens when the machine driver is probed.
The issue is that this callback can race with the bus suspend/resume, and IO timeouts can happen. This can be reproduced very easily if the machine driver is 'blacklisted' and manually probed after the bus suspends. The bus and codec need to be re-initialized using pm_runtime helpers.
Previous contributions tried to make sure accesses to the bus during the .set_jack_detect() component callback only happen when the bus is active. This was done by changing the regcache status on a component remove. This is however a layering violation, the regcache status should only be modified on device probe, suspend and resume. The component probe/remove should not modify how the device regcache is handled. This solution also didn't handle all the possible race conditions, and the RT700 headset codec was not handled.
This patch tries to resume the codec device before handling the jack initializations. In case the codec has not yet been initialized, pm_runtime may not be enabled yet, so we don't squelch the -EACCES error code and only stop the jack information. When the codec reports as attached, the jack initialization will proceed as usual.
BugLink: thesofproject#3643
Fixes: 7ad4d23 ('ASoC: rt711-sdca: Add RT711 SDCA vendor-specific driver')
Fixes: 899b125 ('ASoC: rt711: add snd_soc_component remove callback')
Signed-off-by: Pierre-Louis Bossart <[email protected]>
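The resume-before-access flow described in this commit message can be sketched as a userspace model. Everything here is a hypothetical stand-in: `model_resume_and_get()` only imitates the behaviour assumed for pm_runtime_resume_and_get(), namely returning -EACCES while runtime PM is not enabled, and the counters replace the real jack initialization.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device state for the model. */
struct model_dev {
	int rpm_enabled;    /* runtime PM enabled for this device */
	int usage_count;    /* runtime PM usage counter */
	int jack_inits;     /* number of jack initializations performed */
};

/* Imitates the assumed behaviour of pm_runtime_resume_and_get(). */
static int model_resume_and_get(struct model_dev *dev)
{
	if (!dev->rpm_enabled)
		return -EACCES;
	dev->usage_count++;
	return 0;
}

/* Imitates dropping the reference once the bus access is done. */
static void model_put(struct model_dev *dev)
{
	dev->usage_count--;
}

static int model_set_jack_detect(struct model_dev *dev)
{
	int ret = model_resume_and_get(dev);

	if (ret < 0) {
		if (ret != -EACCES)
			return ret;     /* real failure: report it */
		return 0;               /* not ready yet: skip jack init for now */
	}

	dev->jack_inits++;              /* bus is active, safe to touch registers */
	model_put(dev);
	return 0;
}
```

The -EACCES case is treated as "not ready yet" rather than as a failure, so card registration can complete and the jack initialization happens later, once the codec reports as attached.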
917a1d8 to 2c68ce3
Running test 12997 to make sure there are no surprises.
Nothing bad detected in this test. @bardliao can you please re-check the changes?
The patches in this PR correct obvious issues that appeared while introducing a locking mechanism and bind/unbind cycles. The fixes are rather straightforward and can be reviewed/merged before the validation of PR #3642 is complete.