Proposal for tagging mechanism #79

jiegillet · 2023-11-23T10:28:57Z

Dear @mpizenberg, @ceddlyburge and @ErikSchierboom,

Following our (fun and productive) discussion about tags, I came up with a prototype for the tagging mechanism, using elm-review's data extractors, which seems like a perfect fit, I have been able to add some tags very quickly.

Some comments:

All of the tagging logic is in src/Tags.elm. If you are not familiar with elm-review, you can ignore the boilerplate, the core function is matchExpression and is easy to understand.
For now, I have added random tags, some because they were easy, and because they were a bit more subtle (LetExpression, multiline strings).
There is one rule commonTagsRule that always returns the same tags commonTags for any project.
For more complex traversals, for things like recursion, I would recommend adding a separate rule.
I don't know if we want to have more structure around the rules, for now one rule for each type of elm-syntax type (project, module, expressions, function declarations...) seems the most natural fit.
I have been testing every case so far, and I have found bugs thanks to it
I haven't added smoke tests for now, but it will be simple, we can do it once we have more tags in, they will be easy to add.- @ErikSchierboom I'm not sure if it can go live or not in this incomplete state. Is there a mechanism for going live? The tags.json files will be created from this PR on anyway.

I'd be happy to hear feedback about any of it.

src/Tags.elm

ErikSchierboom

I haven't added smoke tests for now, but it will be simple, we can do it once we have more tags in, they will be easy to add.- @ErikSchierboom I'm not sure if it can go live or not in this incomplete state. Is there a mechanism for going live? The tags.json files will be created from this PR on anyway.

The mechanism for going live is: merge this PR and then it will deploy. There is no on/off switch, which is also why I would like to see some smoke tests. An alternative would be to don't yet output the tags.json file and put it behind a feature flag. Ideally we would not have to re-run the analyzer on all solutions multiple times.

mpizenberg · 2023-11-24T00:26:31Z

This sounds like a solid starting point for all things that do not need more-than-one-node analysis!
I was wondering if uses:prefix-operator could be a tag? seems like something we might want to know (and forbid ahah, at least for operators that are not symmetric).

jiegillet · 2023-11-24T00:32:29Z

An alternative would be to don't yet output the tags.json file and put it behind a feature flag. Ideally we would not have to re-run the analyzer on all solutions multiple times.

@ErikSchierboom How would a feature flag work?

This sounds like a solid starting point for all things that do not need more-than-one-node analysis! I was wondering if uses:prefix-operator could be a tag? seems like something we might want to know (and forbid ahah, at least for operators that are not symmetric).

I actually added that very tag on the spreadsheet, but I forgot to include it in the code :)

ceddlyburge

This looks great. Its a stunningly small amount of code to create the tags :)

It seems like we need to think about what we need to do to release this.

Here is my first pass at it:

Make sure that all the tags we want are created
Write at least one smoke test, but probably a few more, probably for some of the more common exercises / solutions (definitely twofer!)
Ideally do a canary release, just using it for a small subset of solutions (maybe just solutions done by the maintainters or something like that). Not sure how easy / possible this is from an exercism point of view though.
Maybe have some monitoring / logging for a while at the start, so we can keep an eye on it. Not sure what this would entail, maybe it could use the AI tags as well and flag if anything is very different or something like that. Not totally sure this is worth it, but would provide more confidence.

ceddlyburge · 2023-11-24T07:47:50Z

src/Tags.elm

+
+type alias ModuleContext =
+    { lookupTable : ModuleNameLookupTable
+    , tags : Set Tag


Should this be ProjectContext?

Mmh that's an interesting question. I would say no, because conceptually I'm not trying to hold the ProjectContext inside of the ModuleContext, the ProjectContext happens to only have tags in it, but it could potentially need more things.
I think a satisfying solution would be to define type alias ProjectContext = { tags: Set Tag }.

I didn't want to derail the conv so I said nothing before, but now that's on the table ... I tend to forbid all type alias that are trivial renames in my code bases. I always prefer having something like { context : Set Tag } instead of type alias Context = Set Tag.

What would you do for type alias Tag = String?
It really does make the code easier to read for me, and wrapping would be quite noisy. Would you wrap it anyway? Or give up and use String?

-- I'd change the following type alias Tag = String type alias ProjectContext = Set Tag type alias ModuleContext = { ..., tags : Set Tag } commonTags : Set Tag fromProjectToModule : Rule.ContextCreator ProjectContext ModuleContext fromModuleToProject : Rule.ContextCreator ModuleContext ProjectContext dataExtractor : ProjectContext -> Value matchExpression : Node Expression -> Set Tag -- Into the following (no Tag alias, no ProjectContext alias) type alias ModuleContext = { ..., tags : Set String } commonTags : Set String fromProjectToModule : Rule.ContextCreator { tags : Set String } ModuleContext fromModuleToProject : Rule.ContextCreator ModuleContext { tags : Set String } dataExtractor : { tags : Set String } -> Value -- also changed the name here tagMatchExpression : Node Expression -> Set String

ceddlyburge · 2023-11-24T07:48:44Z

src/Tags.elm

+
+ruleConfig : RuleConfig
+ruleConfig =
+    { restrictToFiles = Nothing


Will this include any template type files that are provided along with the solution?

Yes, it will, that's a good point, the data extractor will run on the whole project (it can even look at the elm.json if we want).
Do we want to skip the support files? If we do, we should enforce a convention on their names (I think we already do that, but it should be in the CI).

@ErikSchierboom what does the C# tagger do when it comes to solutions with editor files in it? Skip it? Include it?

You just made me realize that this will probably tag the tests as well, which I guess we don't want, so I'll try to avoid that.

Adding file names in the field restrictToFiles won't work though, because it's eventually fed Rule.filterErrorsForFiles which I think only works for errors, not for data extractions. I should rename it to reflect that.

@ErikSchierboom what does the C# tagger do when it comes to solutions with editor files in it? Skip it? Include it?

I skip editor files (and example, exemplar, test and invalidator files)

ceddlyburge · 2023-11-24T07:51:26Z

src/Tags.elm

+        FunctionOrValue [ "Regex" ] _ ->
+            Set.singleton "technique:regular-expression"
+
+        FunctionOrValue [ "Debug" ] _ ->


I think we forbid this in the Analyser elsewhere, as part of the common rules? Might be worth having here just in case though.

That rule is Actionable, so I think it's not blocking students from publishing a solution with a debug anyway. Also solutions published before the rule would have them.

BTW, that just made me think of adding a Debug concept, that might be worth doing.

jiegillet · 2023-11-25T14:29:29Z

Thank you for the feedback so far. I made some changes and I added a bunch of tags, but it's still WIP.
Could I ask for your help on some of the decisions? On the spreadsheet I marked some questions in orange wherever I couldn't make a clear decision, please add some decisive comments on there so we can make progress :)

ErikSchierboom · 2023-11-28T08:24:10Z

Ideally do a canary release, just using it for a small subset of solutions (maybe just solutions done by the maintainters or something like that). Not sure how easy / possible this is from an exercism point of view though.

We don't yet have anything like this and I wouldnt' really want to start adding exceptions, sorry.

Maybe have some monitoring / logging for a while at the start, so we can keep an eye on it. Not sure what this would entail, maybe it could use the AI tags as well and flag if anything is very different or something like that. Not totally sure this is worth it, but would provide more confidence.

This sounds somewhat complicated. Personally, having a nice set of smoke tests should give us enough confidence that this is working.

ErikSchierboom · 2023-11-28T08:26:03Z

@ErikSchierboom How would a feature flag work?

Well, the thing I would like to enable is for you all to continue working on adding more tags whilst still having CI verify the tags output. So maybe a docker build arg that is off by default?

jiegillet · 2023-12-09T02:29:32Z

I know I won't have a lot of free time in the next few weeks, so I'll stop working on this PR further and submit it as it is, it's already very big anyway. Some notes:

I have added a feature flag, bin/run.sh slug input output --tags. If the 4th argument is missing or anything other than --tags then the tags.json won't be written. run-in-docker.sh doesn't have it and run-tests-in-docker.sh has it.
I added smoke tests for tags for the existing test data we have, but it would be nice to add more specific ones later.
There is still a number of tags not implemented, that can be done later in smaller chunks
If you want to add a tag or something, add it on the spreadsheet
I will open an issue with a roadmap for deploying the tags feature
@ErikSchierboom I have asked for a review from you, obviously you don't need to check the Elm code, but please check the scripts

ceddlyburge

Hi Jeremie, these latest changes look good to me.

ErikSchierboom

LGTM! Awesome work

jiegillet added 3 commits November 23, 2023 19:09

Add destructuring helper

5cc72c3

generate tags

395f0dd

compare sorted expected values

9daf799

jiegillet requested a review from a team as a code owner November 23, 2023 10:28

remove unused things

8f6c54e

ErikSchierboom reviewed Nov 23, 2023

View reviewed changes

src/Tags.elm Outdated Show resolved Hide resolved

ErikSchierboom reviewed Nov 23, 2023

View reviewed changes

ceddlyburge approved these changes Nov 24, 2023

View reviewed changes

jiegillet added 10 commits November 25, 2023 09:12

Merge branch 'main' into jie-tags

5ddf8f0

ProjectContext become record type

aa906dc

Add smoke test mechanism

bd4af55

Bug fix: don't overwrite extracts

6a64b85

tags ignore files that errors ignore

b9afae7

Remove Tag alias

ad9738c

Separate type expressions

2d620b1

Adding some tagd

beea31b

tagging comments

ca1a013

type declaration tags

2be7eec

jiegillet marked this pull request as draft November 25, 2023 14:30

destructuring

2c21bcf

jiegillet added 3 commits December 9, 2023 10:34

smoke tests pass

7efeab8

Add feature flag

27dffad

fix elm-review errors

335aa1b

jiegillet marked this pull request as ready for review December 9, 2023 02:17

jiegillet added the x:action/create Work on something from scratch label Dec 9, 2023

jiegillet added x:type/coding Write code that is not student-facing content (e.g. test-runners, generators, but not exercises) x:size/massive Massive amount of work labels Dec 9, 2023

jiegillet requested review from mpizenberg, ceddlyburge and ErikSchierboom December 9, 2023 02:28

jiegillet mentioned this pull request Dec 9, 2023

Implement the tags feature #83

Open

5 tasks

ceddlyburge approved these changes Dec 11, 2023

View reviewed changes

ErikSchierboom approved these changes Dec 12, 2023

View reviewed changes

jiegillet merged commit 7db1b85 into main Dec 13, 2023
4 checks passed

jiegillet deleted the jie-tags branch December 13, 2023 00:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal for tagging mechanism #79

Proposal for tagging mechanism #79

jiegillet commented Nov 23, 2023

ErikSchierboom left a comment

mpizenberg commented Nov 24, 2023

jiegillet commented Nov 24, 2023 •

edited

Loading

ceddlyburge left a comment

ceddlyburge Nov 24, 2023

jiegillet Nov 24, 2023

mpizenberg Nov 24, 2023

jiegillet Nov 25, 2023

mpizenberg Nov 25, 2023

ceddlyburge Nov 24, 2023

jiegillet Nov 24, 2023

ErikSchierboom Nov 28, 2023

ceddlyburge Nov 24, 2023

jiegillet Nov 24, 2023 •

edited

Loading

jiegillet commented Nov 25, 2023 •

edited

Loading

ErikSchierboom commented Nov 28, 2023

ErikSchierboom commented Nov 28, 2023

jiegillet commented Dec 9, 2023 •

edited

Loading

ceddlyburge left a comment

ErikSchierboom left a comment

Proposal for tagging mechanism #79

Proposal for tagging mechanism #79

Conversation

jiegillet commented Nov 23, 2023

ErikSchierboom left a comment

Choose a reason for hiding this comment

mpizenberg commented Nov 24, 2023

jiegillet commented Nov 24, 2023 • edited Loading

ceddlyburge left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jiegillet Nov 24, 2023 • edited Loading

Choose a reason for hiding this comment

jiegillet commented Nov 25, 2023 • edited Loading

ErikSchierboom commented Nov 28, 2023

ErikSchierboom commented Nov 28, 2023

jiegillet commented Dec 9, 2023 • edited Loading

ceddlyburge left a comment

Choose a reason for hiding this comment

ErikSchierboom left a comment

Choose a reason for hiding this comment

jiegillet commented Nov 24, 2023 •

edited

Loading

jiegillet Nov 24, 2023 •

edited

Loading

jiegillet commented Nov 25, 2023 •

edited

Loading

jiegillet commented Dec 9, 2023 •

edited

Loading