robustness of regression testcases needs to be enhanced #4432

Open
chen1195585098 opened this issue Nov 11, 2024 · 0 comments · May be fixed by #4433

Description of problem:
The results of some regression test cases vary across environments: they fail in environment A but succeed in environment B.

After further analysis, I found two main causes of failure:

  1. gluster volume heal fails in some test cases with the following log:
[2024-11-11 07:52:26.863275]:++++++++++ G_LOG:./tests/bugs/replicate/bug-1325792.t: TEST: 14 1 afr_child_up_status patchy 0 ++++++++++
[2024-11-11 07:52:26.912089]:++++++++++ G_LOG:./tests/bugs/replicate/bug-1325792.t: TEST: 15 1 afr_child_up_status patchy 1 ++++++++++
[2024-11-11 07:52:26.960697]:++++++++++ G_LOG:./tests/bugs/replicate/bug-1325792.t: TEST: 16 1 afr_child_up_status patchy 2 ++++++++++
[2024-11-11 07:52:27.007547]:++++++++++ G_LOG:./tests/bugs/replicate/bug-1325792.t: TEST: 17 1 afr_child_up_status patchy 3 ++++++++++
[2024-11-11 07:52:27.114742] E [MSGID: 100038] [glusterfsd-mgmt.c:784:glusterfs_handle_translator_op] 0-glusterfs: Not processing brick-op since volume graph is not yet active [{brick-op_no.=3}, {errno=11}, {error=Resource temporarily unavailable}]

The corresponding testcase is:

[root@192 glusterfs]# cat ./tests/bugs/replicate/bug-1325792.t
#!/bin/bash
. $(dirname $0)/../../include.rc
. $(dirname $0)/../../volume.rc
cleanup;


TEST glusterd
TEST pidof glusterd
TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}{0,1,2,3}
TEST $CLI volume start $V0

TEST glusterfs --volfile-id=$V0 --volfile-server=$H0 --entry-timeout=0 $M0;

EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" afr_child_up_status_in_shd $V0 0
EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" afr_child_up_status_in_shd $V0 1
EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" afr_child_up_status_in_shd $V0 2
EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" afr_child_up_status_in_shd $V0 3


EXPECT 1 echo `$CLI volume heal $V0 statistics heal-count replica $H0:$B0/${V0}0 | grep -A 1 ${V0}0 | grep "entries" | wc -l`
EXPECT 1 echo `$CLI volume heal $V0 statistics heal-count replica $H0:$B0/${V0}1 | grep -A 1 ${V0}1 | grep "entries" | wc -l`
EXPECT 1 echo `$CLI volume heal $V0 statistics heal-count replica $H0:$B0/${V0}2 | grep -A 1 ${V0}2 | grep "entries" | wc -l`
EXPECT 1 echo `$CLI volume heal $V0 statistics heal-count replica $H0:$B0/${V0}3 | grep -A 1 ${V0}3 | grep "entries" | wc -l`

cleanup

It seems the heal command is run before the shd process is ready.
  2. Some test cases rely on the output of a command, such as touch in ./tests/bugs/replicate/issue-1254-prioritize-enospc.t, as the pass/fail criterion of a sub-test. However, the output of touch may be printed in a different language depending on the environment's locale, and the mismatch then causes the sub-test to fail.
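To illustrate the second cause, here is a minimal sketch of pinning the locale for a single command so that its diagnostics stay in one language. The helper name `run_in_c_locale` and the demo path are assumptions made for this example; they are not taken from the GlusterFS test framework or from the proposed fix.

```shell
#!/bin/bash
# Hypothetical helper: run a command with a fixed locale so that its error
# messages do not depend on the language configured on the test machine.
run_in_c_locale() {
    LC_ALL=C LANGUAGE= "$@"
}

# Capture the error text of a touch that cannot succeed because the parent
# directory does not exist. "|| true" keeps the script going under "set -e".
msg=$(run_in_c_locale touch /nonexistent-dir-for-demo/file 2>&1) || true
echo "$msg"
```

A test that greps this message for an English phrase would then behave the same way regardless of the `LANG` setting of the host, assuming the test matches on the C-locale text.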

Meanwhile, a sub-test in ./tests/bugs/fuse/bug-858215.t fails while obtaining the glustershd pid. It runs during shd startup and therefore collects the pids of both the parent and the child process, which makes TEST [ $glustershd_pid != 0 ] fail.

not ok  15 [     85/      4] <  45> '[ 40232 40233 != 0 ]' -> ''
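The failure above can be reproduced outside the test framework: when the variable holds two pids, the unquoted `[ ... ]` receives four arguments instead of three and bash rejects it with "too many arguments". A minimal sketch of a stricter check follows; the helper name `is_single_pid` is hypothetical and is not part of the GlusterFS test suite.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the argument holds exactly one
# non-zero numeric pid.
is_single_pid() {
    local pids=$1
    # Unquoted on purpose: let the shell split the string into words.
    set -- $pids
    [ $# -eq 1 ] && [[ $1 =~ ^[0-9]+$ ]] && [ "$1" -ne 0 ]
}

is_single_pid "40232" && echo "single pid"
is_single_pid "40232 40233" || echo "ambiguous: more than one pid"
```

A check like this reports the ambiguous case explicitly instead of failing with a shell syntax error, which makes the startup race easier to recognize in the test log.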

In bug-858215.t, deriving mount_pid by taking all glusterfs pids and excluding those of shd and nfs is too complex. Would it be OK to get mount_pid directly from ps auxwww?

The exact command to reproduce the issue:

The full output of the command that failed:

Expected results:

Regression test cases should return the same result in different environments.
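For the first cause described above (a heal issued before shd is ready), one general way to get a deterministic result is to poll a readiness probe until it reports the expected value or a timeout expires. The sketch below is a generic illustration, not the framework's EXPECT_WITHIN implementation; `wait_for_output` and `probe_ready` are hypothetical names, and the marker file merely stands in for a real readiness check.

```shell
#!/bin/bash
# Hypothetical helper: poll a command until it prints the expected value
# or the timeout (in seconds) expires. Returns 0 on match, 1 on timeout.
wait_for_output() {
    local timeout=$1 expected=$2
    shift 2
    local deadline=$(( $(date +%s) + timeout ))
    while :; do
        if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
            return 0
        fi
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1
        fi
        sleep 1
    done
}

# Stand-in readiness probe: prints 1 once the marker file exists, else 0.
probe_ready() {
    if [ -e "$1" ]; then echo 1; else echo 0; fi
}
```

Usage would look like `wait_for_output 5 1 probe_ready /path/to/marker && run_dependent_step`, where the dependent step only starts after the probe has reported readiness.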
chen1195585098 pushed a commit to chen1195585098/glusterfs that referenced this issue Nov 11, 2024
Results of some regression tests may vary across environments,
due to the shd process not being ready or an unexpected
output language of key commands in the test case.

So, make sure that the shd process is ready before
a `gluster v heal` cmd is triggered in a test case. And if the output
of a command is the pass/fail criterion of a sub-test, we should also
ensure it is shown in the expected language.

Fixes: gluster#4432
Signed-off-by: chenjinhao <[email protected]>