Participant Experience Survey Analysis #1147

Open
2 of 4 tasks
lorszag opened this issue Nov 1, 2024 · 35 comments
@lorszag
Collaborator

lorszag commented Nov 1, 2024

To Do

  • Basic Analyses
  • Stratified Analyses
  • Removal of Open Text PII
  • Open Text Response Analysis
@lorszag
Collaborator Author

lorszag commented Nov 1, 2024

@m-j-horner

@lorszag
Collaborator Author

lorszag commented Nov 1, 2024

Current testing plan for the open text response analysis:

  1. Obtain test data pertaining to the questions
  2. Test https://chat.webllm.ai/ to see if the model is sufficient for the questions we want to answer
  3. @jacobmpeters to send an email to confirm this model is acceptable for our use
  4. If approved, use WebLLM for the real data.

@lorszag
Collaborator Author

lorszag commented Nov 4, 2024

Currently working on stratified analyses. @brotzmanmj I think each group will take 2-3 days. I am currently working by site but let me know if you'd like me to prioritize sex, race, or age next!

@brotzmanmj
Collaborator

Thanks @lorszag I would suggest: site, age, sex, race, in that order.

@lorszag
Collaborator Author

lorszag commented Nov 4, 2024

The by-site strata are done and posted here. Some of the tables are a bit hard to read just because of the number of categories present. Also, it seems as though there may have been one participant who was able to access questions they weren't supposed to see (per the skip logic), because some of the row totals add up to larger numbers than they should (see question: Please choose the reason that fits best to describe why you have not completed the sample survey). Let me know if anyone has any comments.

@brotzmanmj
Collaborator

Hi @lorszag, thanks for running this! I will review more carefully, but one request: can you add a 'Total' row at the bottom of each table so we can get a sense of each site relative to the whole? Thanks
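
For reference, a minimal sketch of appending a 'Total' row to a site-by-response table. The site names, response categories, and counts are made-up placeholders, not survey data, and `add_total_row` is a hypothetical helper rather than the code used for the report.

```python
def add_total_row(table):
    """Return a copy of `table` (row label -> {column -> count}) with a 'Total' row appended."""
    # Collect column names in first-seen order so the Total row matches the table layout.
    columns = []
    for counts in table.values():
        for col in counts:
            if col not in columns:
                columns.append(col)
    total = {col: sum(counts.get(col, 0) for counts in table.values()) for col in columns}
    result = dict(table)
    result["Total"] = total
    return result


# Placeholder counts for illustration only.
site_table = {
    "Site A": {"Yes": 10, "No": 5, "Unsure": 1, "Skipped": 2},
    "Site B": {"Yes": 7, "No": 3, "Unsure": 0, "Skipped": 4},
}
with_total = add_total_row(site_table)
```

Percentages for each site can then be computed against the matching 'Total' cell.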

@brotzmanmj
Collaborator

brotzmanmj commented Nov 4, 2024

Also @lorszag Overall this looks really good. Here are a few other things to look into:

  1. Can you look into how there is a '-1' response in some of the sample survey tables (for KPGA)? See page 7
  2. The last table on Page 6 seems to have the wrong title
  3. The first two tables on Page 7, can you put the columns in the typical order Yes, No, Unsure, Skipped
  4. Page 5 second table from the bottom, please add percentages
  5. Last table, bottom of page 8, please add the Site names. If you need to decrease the width of the columns to make the site names fit on the table, you could remove the word 'Selected' from each column
  6. Page 7 second table from the bottom seems to have the wrong title

Adding @cunnaneaq to take a look at this draft because it does look to me like there may be an issue with the skip patterns in the survey, samples, and sample survey sections, where too many people are getting asked these questions. Can you please check with IMS and see if they think there is an issue? Of course, we're not planning to send out this exact survey again, but it would be good to know so that if we do re-issue it in some form later on, we have identified any issues that exist.

@lorszag
Collaborator Author

lorszag commented Nov 5, 2024

Made these edits @brotzmanmj. New version posted here. Please bear with me on tables being split between pages--I have to remake them manually to make this not happen and don't want to do so until we are on the final iterations to avoid redoing it.

@lorszag
Collaborator Author

lorszag commented Nov 6, 2024

@brotzmanmj newer version posted in the Box folder. Not feeling very well, so hoping to be able to complete the age stratifications later this week or early next week!

@cunnaneaq
Collaborator

@brotzmanmj Leila and I met this morning to talk about the (-1) in the table "Did you complete the survey at your Connect visit?". Leila identified the Connect ID and I tested this skip logic in MyConnect stage. The markdown looks correct and I was not able to re-create the issue. Happy to meet to discuss this further.

@brotzmanmj
Copy link
Collaborator

Hi @cunnaneaq, is the -1 gone from the table in the most recent run? And did you check the other places where too many respondents appear to have received and responded to questions?

@lorszag
Collaborator Author

lorszag commented Nov 7, 2024

@brotzmanmj we only saw that one table with a negative number in it. I can manually remove the respondent if you'd like, but it looks like they were able to access the question about completing the sample survey at their Connect visit but skipped the question about completing the sample survey before it.

@lorszag
Collaborator Author

lorszag commented Nov 7, 2024

@brotzmanmj @cunnaneaq I have found another skip logic issue. I have a Connect ID for a participant who was able to answer question 21 but did not answer question 20. This is resulting in another -1 in one of the age-stratified tables. There is another -1 in the table for question 27, which seems to be the same one Aileen and I investigated yesterday. The age-stratified tables will be done by EOD today.

@brotzmanmj
Collaborator

Hi Leila, Thanks for pointing this out. Can you explain the -1 again, why that is happening? Are some people seeing a question they are not supposed to see?

@lorszag
Collaborator Author

lorszag commented Nov 7, 2024

Yes, it seems like there are participants (I say participants, but I only see that this is true for one person for each of the questions we are discussing) who do not answer the stem question but still receive the sub-question. Also, the tables stratified by age group have been posted to the Box folder. I used age groups of 5 years but can easily change this if the team thinks something else would be more effective!

@brotzmanmj
Collaborator

Thanks Leila.

@cunnaneaq please make a note of this if you haven't already so if we use the survey again in the future we fix the skip pattern on question 20.

Leila, looking at the skip issue, for the reports, we should limit the inclusion for the tables with the experience about sample donation and sample survey to people who said yes to having donated blood, urine, or saliva. I know the skip patterns did not exclude people who said 'no' (did not donate), but the data for those who didn't donate are obscuring the actual experience of those who did.

@lorszag
Collaborator Author

lorszag commented Nov 7, 2024

Will do, Michelle. Would you want the tables just to include those that said yes to donating, or to include anyone who did not say no (those who skipped the question or said unsure)?

@brotzmanmj
Collaborator

Just those who said yes to donating, thanks

@lorszag
Collaborator Author

lorszag commented Nov 7, 2024

Done, file reposted.

@brotzmanmj
Collaborator

Thanks. If you could go back and re-run the univariate stats and the site-stratified stats with that same change to follow the skip logic, that would be great.

For the age groups, two questions/requests:

  1. What variable was used for age? I'm wondering why we have so many 'skipped' for age and perhaps we can rectify that.
  2. For the row headers for age, can you make them mutually exclusive... 30-34, 35-39, 40-44, 45-49, etc?

Thanks,
Michelle

@lorszag
Collaborator Author

lorszag commented Nov 7, 2024

The groups are mutually exclusive but the labels are wrong; rerunning that now (and will fix the Box file). I used this variable, as indicated by @jacobmpeters to derive age: RcrtUP_DOB_v1r0. I can re-run it with a different age variable if that is preferable. The site report is reposted. I will redo the univariate report tomorrow to reflect these changes!
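
A minimal sketch of generating mutually exclusive 5-year labels (30-34, 35-39, ...) directly from age, so that the labels cannot drift from the bins. The bounds `lowest=30` and `highest=74` and the helper name `age_group_label` are assumptions for illustration; ages are taken to be whole completed years.

```python
def age_group_label(age, lowest=30, highest=74, width=5):
    """Return a label such as '30-34' for an integer age, or None if missing or out of range."""
    if age is None or age < lowest or age > highest:
        return None
    # Lower bound of the bin that contains `age`.
    lower = lowest + ((age - lowest) // width) * width
    return f"{lower}-{lower + width - 1}"


labels = [age_group_label(a) for a in (30, 34, 35, 69, 70, 74)]
```

Ages outside the configured range return `None`, so they would show up as missing unless the range is widened.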

@brotzmanmj
Collaborator

Hi, RcrtUP_DOB_v1r0 is the date of birth. And you're using that with the date the survey was submitted to calculate age?

@lorszag
Collaborator Author

lorszag commented Nov 12, 2024

Posted the stratification by gender here.

I used this variable for gender: SrvBOH_Gender_v1r1, so I combined Male with Man and Female with Woman as categories since the variable was pulled from both version 1 and version 2 of the survey. Both CIDs 983318667 and 654207589 were coded as "man" and both CIDs 218837028 and 536341288 were coded as "woman."
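
A minimal sketch of that recoding, assuming the raw values appear as the strings "Male", "Man", "Female", and "Woman". The mapping and the helper `harmonize_gender` are illustrative; which label belongs to which survey version is not assumed here, and any other value is passed through unchanged.

```python
# Map the labels from both survey versions onto one set of categories.
GENDER_RECODE = {
    "Male": "Man",
    "Man": "Man",
    "Female": "Woman",
    "Woman": "Woman",
}


def harmonize_gender(value):
    """Collapse equivalent labels; leave any other value (including None) as-is."""
    return GENDER_RECODE.get(value, value)


recoded = [harmonize_gender(v) for v in ("Male", "Man", "Female", "Woman", None)]
```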

@brotzmanmj
Collaborator

Hi Leila, Thank you, I'll review. For the Age run, can you check on those ~600 that are missing age?

@lorszag
Collaborator Author

lorszag commented Nov 12, 2024

Hi Michelle. It is because many of the participants are older than 70 when completing the survey. Would you like me to add a 70-75 row? That will fix the issue.

@brotzmanmj
Collaborator

Hi Leila, Yes, that would be great, thank you!

@lorszag
Collaborator Author

lorszag commented Nov 12, 2024

Re-uploaded for your review:) Now # missing = 3 of the 7000.

@brotzmanmj
Collaborator

Great! So now that we're down to just 3, we can look those up individually and see why they are missing. Can you check on those and see what you find?

@lorszag
Collaborator Author

lorszag commented Nov 12, 2024

They are missing Autogenerated date/time stamp for Start of 2024 Connect Experience Survey which I am using with DOB to calculate age. In other words, variable SrvCoE_ConExpTmStart_v1r0 is NA.
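
A minimal sketch of that age calculation, assuming the date of birth and the survey start timestamp have already been parsed into `date` objects. The helper `age_in_years` is hypothetical; it returns `None` when either input is missing, which is how a missing start timestamp would lead to a missing age.

```python
from datetime import date


def age_in_years(dob, on_date):
    """Completed years between `dob` and `on_date`; None if either is missing."""
    if dob is None or on_date is None:
        return None
    # Subtract one year if the birthday has not yet occurred in the year of `on_date`.
    had_birthday = (on_date.month, on_date.day) >= (dob.month, dob.day)
    return on_date.year - dob.year - (0 if had_birthday else 1)


# Placeholder dates for illustration only.
age_before_birthday = age_in_years(date(1960, 6, 15), date(2024, 6, 14))
age_on_birthday = age_in_years(date(1960, 6, 15), date(2024, 6, 15))
age_missing_start = age_in_years(date(1960, 6, 15), None)
```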

@brotzmanmj
Collaborator

Interesting. That's odd. Do they have a submit time? SrvCoE_ConExpTmCompl_v1r0

@brotzmanmj
Collaborator

Also, I have a theory on what might have happened... can you send the three Connect IDs?

@lorszag
Collaborator Author

lorszag commented Nov 12, 2024

They do not -- sent the IDs to you on Teams :)

@brotzmanmj
Collaborator

Glad to know these ppts do have dates after all!

So for the analysis, if you get a chance tomorrow to re-run, one thing we need to do is limit the analysis only to those who have CES survey status = Submitted, for the univariate and all of the stratified tables.

You should include withdrawals in the analysis, no reason to leave them out, as long as survey status = submitted.
And then I think we'll be good for now.

Thanks,
Michelle
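
A minimal sketch of that inclusion rule, assuming each participant record carries a survey status and a withdrawal flag. The field names `ces_status` and `withdrawn`, the status values other than "Submitted", and the helper `submitted_only` are placeholders, not the actual variable names.

```python
def submitted_only(records):
    """Keep records whose survey status is 'Submitted', regardless of withdrawal."""
    return [r for r in records if r.get("ces_status") == "Submitted"]


# Placeholder records for illustration only.
records = [
    {"id": "P1", "ces_status": "Submitted", "withdrawn": False},
    {"id": "P2", "ces_status": "Submitted", "withdrawn": True},  # withdrawals stay in
    {"id": "P3", "ces_status": "Started", "withdrawn": False},
    {"id": "P4", "ces_status": None, "withdrawn": False},
]
kept_ids = [r["id"] for r in submitted_only(records)]
```

Applying the same filter once, before building the univariate and stratified tables, keeps the denominators consistent across reports.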

@brotzmanmj
Collaborator

Hi Leila, Here are the additional requests:

  1. Among participants who were eligible for the experience survey, prepare a table comparing participants who submitted the experience survey vs didn’t submit the survey on what study activities they completed (baseline survey, baseline samples). You can use the same format and code that Kelsey uses for Table 4.21 in the Operations Weekly Report, but instead of 'by site' you can replace that with 'by Experience Survey Completion'.

  2. Among participants who were eligible for the experience survey, prepare a table comparing participants who submitted the experience survey vs didn’t submit the survey on demographic factors. You can do this in a format similar to how you would show 'Table 1' in a typical publication. You can include the 5-year age brackets, race/ethnicity, and sex.

  3. Deanna's request - for the reasons why people joined Connect, she'd like to know the number of selections people chose, by site. Her email had additional background on this request.
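
A minimal sketch of the selection count in item 3, assuming each record holds a site and the list of reasons the participant selected. The field names `site` and `reasons`, the reason labels, and the helper `mean_selections` are placeholders; the sketch returns both the overall mean and the per-site means so either reading of the request can be served.

```python
from collections import defaultdict
from statistics import mean


def mean_selections(records):
    """Return (overall mean, {site: mean}) of the number of reasons selected per participant."""
    by_site = defaultdict(list)
    for r in records:
        by_site[r["site"]].append(len(r["reasons"]))
    per_site = {site: mean(counts) for site, counts in by_site.items()}
    overall = mean(n for counts in by_site.values() for n in counts)
    return overall, per_site


# Placeholder records for illustration only.
records = [
    {"site": "Site A", "reasons": ["reason 1", "reason 2"]},
    {"site": "Site A", "reasons": []},
    {"site": "Site B", "reasons": ["reason 1", "reason 2", "reason 3"]},
]
overall, per_site = mean_selections(records)
```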

@lorszag
Collaborator Author

lorszag commented Nov 22, 2024

I ran the report as requested. A few comments:

  1. For table 2, age cannot be calculated the same way it was for the previous analyses, which required using the timestamp a participant finished the survey and their DOB. Instead, here I used "Site reported age at recruitment".
  2. I used Kelsey's code for the letter experiment table 1 as a base for my table 2, which involves not separating by all racial / ethnic groups and instead segmenting based on white vs. other vs. unknown. This can be changed!
  3. From Deanna's email, it seemed like she was interested in the overall average, so that is what I calculated but again can change and calculate by site if that is what is of interest.

Linked here
