-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support remote log level #4413
base: master
Are you sure you want to change the base?
Support remote log level #4413
Conversation
the build is gonna break of course because the dependencies are missing in pillar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This patch in general looks good to me, I have some comments
- in collect-info.sh for 'newlog' directory, we should skip the 'devUpload' directory, since those not yet uploaded files will be the duplicates for the keep directory
- similar in the edgeview in walkLogDirs() function, we need to skip the 'devUpload' directory to avoid duplicates
- in loguploader.go, there is 'failSendDir' and logic to move the failed to send to controller log files and keep them there. With this change, i don't think it is needed anymore, since everything will be in the keep directory. this can simplify some logic
- when doing kibana search, i do search for 'Alpine' string for locate when the device has rebooted, now we may not have this. maybe to always log at least the first 5 device log entries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update the description in https://github.com/lf-edge/eve/blob/master/docs/LOGGING.md with the intended new state of things?
580384e
to
c894bec
Compare
@eriknordmark I made the changes to the documentation, please have a look. I'm not sure if it's an overkill with the |
c894bec
to
5b2cfcf
Compare
5b2cfcf
to
8e0fafa
Compare
@naiming-zededa
but I would like to address the following in the next PR to not bloat the scope of this one.
|
@eriknordmark I think the architecture should be right from our discussions. Please give this PR another look now that I adapted the implementation accordingly. |
Btw before merging this I'll need to change the EVE API and pillar first, so that I can update the dependencies here. Please let me know if you are satisfied with the changes, then I'll prepare the respective PRs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach looks fine to me so please proceeed generating the other PRs/commits.
@naiming-zededa can you take a look at the newlogd code changes?
pkg/edgeview/src/edgeview_test.go
Outdated
"testing" | ||
"time" | ||
|
||
"github.com/stretchr/testify/assert" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we want testify? Can't we use gomega for tests like we do in other golang tests in EVE repository?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#3435 (comment) ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's because I know how to use testify
and I don't know how to use gomega
😄
but if you want I'll learn how to use gomega
and switch to it. is it better than testify
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Brining go-chi was useful, because they have route parameter parser built-in, so we didn't have to build our own. The only reason why we left parts of Gorilla then was because we used ws communication (happy to restart the conversation of refactoring it). Rest was completely replaced by go-chi. Basically I'm asking the same question here: what's the benefit of testify over gomega and why do we want to have two testing frameworks in the repo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/gorilla/mux/blob/main/README.md?plain=1#L303 - isn't that the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/gorilla/mux/blob/main/README.md?plain=1#L303 - isn't that the same?
We didn't go for gorilla-mux, because at the time it didn't have maintainers, now they do have them. We could have just used standard http mux from golang, but then we would have to write our own route parameter parser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@uncleDecart rewrote the tests with gomega
- it's not much of a difference, so it makes sense to me to stick with one testing framework
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw
grep -r "github.com/stretchr/testify/assert" pkg/ | grep "_test.go" | wc -l
26
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for change! Missed the fact that you bring testify to edge_view, not pillar, didn't know we were using testify, I think it's a good raise on LF-Edge call
8e0fafa
to
846b26b
Compare
the changes to EVE API are here |
the dependencies on pillar types are here |
docs/CONFIG-PROPERTIES.md
Outdated
| debug.kernel.loglevel | string | info | level of the produced kernel log messages. System default loglevel string representation should be used as described here ["https://man7.org/linux/man-pages/man3/syslog.3.html"]: emerg, alert, crit, err, warning, notice, info, debug. | | ||
| debug.kernel.remote.loglevel | string | info | level of the kernel log messages sent to the controller. System default loglevel string representation should be used as described here ["https://man7.org/linux/man-pages/man3/syslog.3.html"]: emerg, alert, crit, err, warning, notice, info, debug. | | ||
|
||
In addition, there can be per-agent settings to overwrite the default log level set for zedbox. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/zedbox/EVE microservices/ (even though this only applies to some microservices like downloader and zedagent and not to others like wwan and wlan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, I meant to replace all of the occurrences and forgot that one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually this applies to wwan as well
https://github.com/lf-edge/eve/blob/08d0583f68c9ce35747d2ea6566a9b30ad7404c6/pkg/wwan/mmagent/agent.go#L486
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here is the full list of all "agents" that support log level settings (call agentlog.HandleGlobalConfig
) right now:
newlogd
wwan
nodeagent
downloader
tpmmgr
client
vcomlink
executor
vaultmgr
baseosmgr
zedagent
verifier
wstunnelclient
zfsmanager
zedkube
ledmanager
faultinjection
zedmanager
nim
loguploader
watcher
volumemgr
zedrouter
msrv
domainmgr
diag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it also to CONFIG-PROPERTIES.md
docs/CONFIG-PROPERTIES.md
Outdated
|
||
Additionally all log levels can be set to "none" to disable logging for the corresponding component or to "all" to enable all log levels. | ||
|
||
Furthermore, the "remote" log levels control, which subset of the produced logs will be sent to the controller. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I don't think the comma after "control" makes any sense. But I'm not a native english speaker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right. copilot suggested it to me and I didn't dare to contradict it 🤖
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, FWIW you can tick "include copy of your diagram" in draw.io and then you'll have just one file my_diagram.drawio.png and if you can edit that one directly in draw.io
As opposed to the default log level, which was setting which logs are produced by EVE's microservices, the remote log level will set which device logs are uploaded to the cloud. So it's assumed that the remote log level is always higher than the default log level. There are no changes in how newlogd collects the logs, only in how it handles the log files. For the logs that are uploaded: - we create a separate file with prefix dev.log.upload in collect - that file is gzipped when it reaches a certain size or by timer - the same as before - once gzipped the file is moved to devUpload - the same as before - once the file is uploaded successfully, it's deleted instead of being moved to keepSentQueue For the logs that stay on the device: - we create a separate file with prefix dev.log.keep in collect - that file is gzipped when it reaches a certain size - no timer - once gzipped the file is moved to keepSentQueue - the name of the folder is preserved The commit also adds log levels "all" and "none" as well as remote log levels for kernel and syslog logs: as the names suggest, "all" will log everything and "none" will suppress all logs. The levels can be set in the global configuration for - debug.default.loglevel - debug.default.remote.loglevel - agent.*agentname*.debug.loglevel - agent.*agentname*.debug.remote.loglevel - debug.syslog.loglevel - debug.syslog.remote.loglevel - debug.kernel.loglevel - debug.kernel.remote.loglevel Signed-off-by: Paul Gaiduk <[email protected]>
Add new metrics for newlogd: - OldestSavedLog - the timestamp of the oldest log available on the device - TotalSizeLogs - the total size of all logs on the device Signed-off-by: Paul Gaiduk <[email protected]>
In case the total size of gzipped logs exceeds the limit they should be removed in the following order by directory: - `keepSentQueue` - `failedUpload` - `devUpload` - `appUpload` Signed-off-by: Paul Gaiduk <[email protected]>
846b26b
to
8bf6b25
Compare
8bf6b25
to
4eeccfc
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #4413 +/- ##
=======================================
Coverage 20.93% 20.93%
=======================================
Files 13 13
Lines 2895 2895
=======================================
Hits 606 606
Misses 2163 2163
Partials 126 126 ☔ View full report in Codecov by Sentry. |
Since keepSentQueue contains the same or more logs than devUpload, it's enough to only use those logs when searching with edge-view. Signed-off-by: Paul Gaiduk <[email protected]>
Add a chapter to LOGGING.md explaining how different log levels are handled in the system. Updated CONFIG-PROPERTIES.md to clarify remote log level settings and provide full list of log level parameters Signed-off-by: Paul Gaiduk <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
4eeccfc
to
95c5b13
Compare
the last push just fixed some spelling found by local yetus run, since the one in CI doesn't work |
thank you for your patience! in the final version I
|
could it be that yetus is failing because it doesn't exclude the changes to |
This PR is a work in progress, but I appreciate if somebody could take a look at the code and the general approach and give me feedback.
Please see the commit messages (especially the second commit) for the overview of the changes.
Before merging this I need to update the pillar first to include the dependency in newlogd. I also need to update eve-api to include the new metrics.
I'm also planning to add more metrics as well as proper tests for newlogd to at least ensure that there is no regression.