-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/3959 bg monitoring fixed #2423
Conversation
… also resolve merge conflict
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test cases failing from local. Can you please check. Are there any configuration changes required to test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Failed tests on my local. Not sure if these are specific to my local. Fixed import issue on my local in portality/bll/services/background_task_status.py before testing.
doajtest/unit/test_background_task_status.py:178 (TestBackgroundTaskStatus.test_create_background_status__error_in_period_not_found)
self = <doajtest.unit.test_background_task_status.TestBackgroundTaskStatus testMethod=test_create_background_status__error_in_period_not_found>
@apply_test_case_config(bg_monitor_errors_config__a)
def test_create_background_status__error_in_period_not_found(self):
save_mock_bgjob(JournalCSVBackgroundTask.__action__,
status=constants.BGJOB_STATUS_ERROR,
created_before_sec=1000000000)
status_dict = background_task_status.create_background_status()
journal_csv_dict = status_dict['queues']['scheduled_short']['errors'].get(JournalCSVBackgroundTask.__action__, {})
> assert is_stable(status_dict['status'])
E AssertionError: assert False
E + where False = is_stable('unstable')
test_background_task_status.py:189: AssertionError
doajtest/unit/test_background_task_status.py:106 (TestBackgroundTaskStatus.test_create_background_status__invalid_last_completed__events_queue)
self = <doajtest.unit.test_background_task_status.TestBackgroundTaskStatus testMethod=test_create_background_status__invalid_last_completed__events_queue>
@apply_test_case_config(bg_monitor_last_completed__now)
def test_create_background_status__invalid_last_completed__events_queue(self):
save_mock_bgjob(JournalCSVBackgroundTask.__action__,
queue_id=constants.BGJOB_QUEUE_ID_EVENTS,
status=constants.BGJOB_STATUS_COMPLETE, )
status_dict = background_task_status.create_background_status()
assert not is_stable(status_dict['status'])
> self.assert_unstable_dict(status_dict['queues'].get('events', {}))
test_background_task_status.py:116:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
val = {'err_msgs': [], 'errors': {'application_autochecks': {'err_msgs': [], 'in_monitoring_period': 0, 'status': 'stable', ...table'}, 'ingest_articles': {'err_msgs': [], 'last_run': None, 'last_run_status': None, 'status': 'stable'}, ...}, ...}
@staticmethod
def assert_unstable_dict(val):
> assert not is_stable(val.get('status'))
E AssertionError: assert not True
E + where True = is_stable('stable')
E + where 'stable' = <built-in method get of dict object at 0x165c907c0>('status')
E + where <built-in method get of dict object at 0x165c907c0> = {'err_msgs': [], 'errors': {'application_autochecks': {'err_msgs': [], 'in_monitoring_period': 0, 'status': 'stable', 'total': 0}, 'article_bulk_create': {'err_msgs': [], 'in_monitoring_period': 0, 'status': 'stable', 'total': 0}, 'article_bulk_delete': {'err_msgs': [], 'in_monitoring_period': 0, 'status': 'stable', 'total': 0}, 'ingest_articles': {'err_msgs': [], 'in_monitoring_period': 0, 'status': 'stable', 'total': 0}, ...}, 'last_completed_job': '2024-11-04T04:46:47Z', 'last_run_successful': {'application_autochecks': {'err_msgs': [], 'last_run': None, 'last_run_status': None, 'status': 'stable'}, 'article_bulk_create': {'err_msgs': [], 'last_run': None, 'last_run_status': None, 'status': 'stable'}, 'article_bulk_delete': {'err_msgs': [], 'last_run': None, 'last_run_status': None, 'status': 'stable'}, 'ingest_articles': {'err_msgs': [], 'last_run': None, 'last_run_status': None, 'status': 'stable'}, ...}, ...}.get
test_background_task_status.py:104: AssertionError
doajtest/unit/test_background_task_status.py:232 (TestBackgroundTaskStatus.test_create_background_status__queued_valid_oldest)
self = <doajtest.unit.test_background_task_status.TestBackgroundTaskStatus testMethod=test_create_background_status__queued_valid_oldest>
@apply_test_case_config(bg_monitor_queued_config__a)
def test_create_background_status__queued_valid_oldest(self):
save_mock_bgjob(JournalCSVBackgroundTask.__action__,
status=constants.BGJOB_STATUS_QUEUED, )
status_dict = background_task_status.create_background_status()
print(json.dumps(status_dict, indent=4))
journal_csv_dict = status_dict['queues']['scheduled_short']['queued'].get(JournalCSVBackgroundTask.__action__, {})
> assert is_stable(status_dict['status'])
E AssertionError: assert False
E + where False = is_stable('unstable')
test_background_task_status.py:243: AssertionError
doajtest/unit/test_background_task_status.py:205 (TestBackgroundTaskStatus.test_create_background_status__queued_valid_total)
self = <doajtest.unit.test_background_task_status.TestBackgroundTaskStatus testMethod=test_create_background_status__queued_valid_total>
@apply_test_case_config(bg_monitor_queued_config__zero_total)
def test_create_background_status__queued_valid_total(self):
save_mock_bgjob(JournalCSVBackgroundTask.__action__,
status=constants.BGJOB_STATUS_COMPLETE, )
status_dict = background_task_status.create_background_status()
journal_csv_dict = status_dict['queues']['scheduled_short']['queued'].get(JournalCSVBackgroundTask.__action__, {})
> assert is_stable(status_dict['status'])
E AssertionError: assert False
E + where False = is_stable('unstable')
test_background_task_status.py:215: AssertionError
doajtest/unit/test_background_task_status.py:130 (TestBackgroundTaskStatus.test_create_background_status__valid_last_completed)
self = <doajtest.unit.test_background_task_status.TestBackgroundTaskStatus testMethod=test_create_background_status__valid_last_completed>
@apply_test_case_config(bg_monitor_last_completed__a)
def test_create_background_status__valid_last_completed(self):
save_mock_bgjob(JournalCSVBackgroundTask.__action__,
queue_id=constants.BGJOB_QUEUE_ID_EVENTS,
status=constants.BGJOB_STATUS_COMPLETE, )
save_mock_bgjob(AnonExportBackgroundTask.__action__,
queue_id=constants.BGJOB_QUEUE_ID_SCHEDULED_LONG,
status=constants.BGJOB_STATUS_COMPLETE, )
status_dict = background_task_status.create_background_status()
> assert is_stable(status_dict['status'])
E AssertionError: assert False
E + where False = is_stable('unstable')
test_background_task_status.py:142: AssertionError
doajtest/unit/test_background_task_status.py:145 (TestBackgroundTaskStatus.test_create_background_status__valid_last_completed__no_record)
self = <doajtest.unit.test_background_task_status.TestBackgroundTaskStatus testMethod=test_create_background_status__valid_last_completed__no_record>
@apply_test_case_config(bg_monitor_last_completed__now)
def test_create_background_status__valid_last_completed__no_record(self):
status_dict = background_task_status.create_background_status()
> assert is_stable(status_dict['status'])
E AssertionError: assert False
E + where False = is_stable('unstable')
test_background_task_status.py:149: AssertionError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RK206 The testing was a bit confused now there is an additional monitoring stream, so I have replaced it with a parameterised test that gets all the variations and that appears to be working fine.
@@ -4,7 +4,9 @@ | |||
import itertools | |||
from typing import Iterable | |||
|
|||
from portality.constants import BGJOB_QUEUE_ID_LONG, BGJOB_QUEUE_ID_MAIN, BGJOB_STATUS_ERROR, BGJOB_STATUS_QUEUED, \ | |||
import constants |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it import from portality?
@@ -4,7 +4,9 @@ | |||
import itertools | |||
from typing import Iterable | |||
|
|||
from portality.constants import BGJOB_QUEUE_ID_LONG, BGJOB_QUEUE_ID_MAIN, BGJOB_STATUS_ERROR, BGJOB_STATUS_QUEUED, \ | |||
import constants | |||
from constants import BGJOB_STATUS_COMPLETE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above. import from portality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's a pycharm thing, it sometimes does the imports for you, and it does them relative, which seems to work locally but not anywhere else. Very annoying!
…he background status variations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few changes for @RK206 due to recent deployment changes.
Note the queues have been added in addition to the old ones (main and long_running) so we don't need to touch the old tasks, and we only need to add the symlinks for deployment for the new consumers. The release will therefore be fairly straightforward.
@@ -0,0 +1,9 @@ | |||
[program:huey-events] | |||
command=/home/cloo/doaj/bin/python /home/cloo/doaj/bin/huey_consumer.py -v portality.tasks.consumer_events_queue.events_queue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated deployment - changed binary paths /home/cloo/doaj/**venv/**bin/python and /home/cloo/doaj/**venv/**bin/huey_consumer.py
@@ -0,0 +1,9 @@ | |||
[program:huey-scheduled-long] | |||
command=/home/cloo/doaj/bin/python /home/cloo/doaj/bin/huey_consumer.py -v portality.tasks.consumer_scheduled_long_queue.scheduled_long_queue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
venv path again
@@ -0,0 +1,9 @@ | |||
[program:huey-scheduled-short] | |||
command=/home/cloo/doaj/bin/python /home/cloo/doaj/bin/huey_consumer.py -v portality.tasks.consumer_scheduled_short_queue.scheduled_short_queue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
venv path
long_running.always_eager = True | ||
main_queue.immediate = True | ||
long_running.immediate = True | ||
events_queue.always_eager = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
syntax change for new huey - always_eager
is now immediate
I retract my suggestion for @RK206 to make the above changes for new deployment - I'll take care of that and fix the conflicts |
Replaces PR: #2419
update bg monitoring settings for mission critical monitoring changes
Applies tighter monitoring constraints on the background jobs, as per this spreadsheet: https://docs.google.com/spreadsheets/d/1XWq-7R9hZmZsBA27eOgs9cO-SfJnLEE0dW50DVtKhpI/edit?gid=543591434#gid=543591434
This PR...
Developer Checklist
Developers should review and confirm each of these items before requesting review
constants
ormessages
filesdates
)url_for
not hard-codeddevelop
Reviewer Checklist
Reviewers should review and confirm each of these items before approval
If there are multiple reviewers, this section should be duplicated for each reviewer
constants
ormessages
filesdates
)url_for
not hard-codeddevelop
Testing
This tightens the monitoring on a number of bg jobs, so would be good to validate that this won't cause any major surge in unnecessary reporting.
Probably need to send it live and then adjust.
Deployment
Configuration changes
Adds new values in:
Monitoring
This may trigger new types of "unstable" alerts in the background jobs reporting, so we should ensure that those alerts come through