
Commit

update docs link
ivyONS committed Jul 9, 2024
1 parent a56c7aa commit d3b7589
Showing 5 changed files with 3 additions and 3 deletions.
Binary file added docs/_static/app-ui.png
Binary file added docs/_static/sic-soc-llm.png
2 changes: 1 addition & 1 deletion docs/method.qmd
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@ format: html

## Summary

- A proof-of-concept large language model (LLM) application was created to assess whether an LLM could improve SIC autocoding performance for survey data. This was applied to a sample of anonymised survey data and evaluated by comparing the results to clerical coding and to a logistic regression model. The LLM showed marginal improvement over the logistic regression in the level of agreement with clerical coding at the 5-digit SIC level. It is likely that refinement of the method would improve performance further. Note that the evaluation scripts are out of scope for this repository. The methodology of the main SIC autocoding module is described below. For more information see the Data Science Campus [blog](https://datasciencecampus.ons.gov.uk/category/projects/).
+ A proof-of-concept large language model (LLM) application was created to assess whether an LLM could improve SIC autocoding performance for survey data. This was applied to a sample of anonymised survey data and evaluated by comparing the results to clerical coding and to a logistic regression model. The LLM showed marginal improvement over the logistic regression in the level of agreement with clerical coding at the 5-digit SIC level. It is likely that refinement of the method would improve performance further. Note that the evaluation scripts are out of scope for this repository. The methodology of the main SIC autocoding module is described below. For more information see the Data Science Campus [blog](https://datasciencecampus.ons.gov.uk/classifai-exploring-the-use-of-large-language-models-llms-to-assign-free-text-to-commonly-used-classifications/).

## RAG based classification

3 changes: 1 addition & 2 deletions docs/tutorials/3_soc_classifier.qmd
@@ -11,8 +11,6 @@ Demonstration notebook for the `ClassificationLLM` with Standard Occupational Classification (SOC)

```{python}
#| code-summary: "Code: Import methods and initialise"
- from langchain.llms.fake import FakeListLLM
from sic_soc_llm import setup_logging
from sic_soc_llm.llm import ClassificationLLM
@@ -22,6 +20,7 @@ logger = setup_logging("soc_classifier")
```{python}
#| echo: false
#| code-summary: "Code: Create a fake Large Language Model (LLM) for demonstration purposes"
+ from langchain.llms.fake import FakeListLLM
soc_demo_llm = FakeListLLM(responses=[
'''
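For context on the tutorial diff above: `FakeListLLM` replays a fixed list of scripted responses instead of calling a real model, which keeps the demonstration notebook deterministic and offline. A minimal self-contained sketch of the same idea, without depending on `langchain` (the class name and responses here are hypothetical, not the library's API):

```python
# Minimal stand-in for langchain's FakeListLLM: replays scripted responses
# in order, ignoring the prompt. Illustrative sketch only, not the library
# class used in the tutorial.
class ScriptedLLM:
    def __init__(self, responses):
        self.responses = list(responses)
        self._next = 0

    def invoke(self, prompt: str) -> str:
        # Return the next scripted response, cycling if the list is exhausted.
        response = self.responses[self._next % len(self.responses)]
        self._next += 1
        return response


demo_llm = ScriptedLLM(responses=["Example scripted SOC answer"])
print(demo_llm.invoke("Classify: 'secondary school teacher'"))
# prints "Example scripted SOC answer"
```

Importing `FakeListLLM` inside the hidden (`echo: false`) cell that uses it, as this commit does, keeps the visible import cell limited to what readers actually need.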
1 change: 1 addition & 0 deletions pyproject.toml
@@ -54,6 +54,7 @@ app = [
test = [
"pytest==6.2.5",
"pytest-pythonpath==0.7.4",
+ "coverage==7.5.4",
]

docs = ["quartodoc>=0.6.6",
