This is really great, thanks @em-baggie. I was recently sent this R package and R Shiny app from Bennett that does something similar: https://github.com/bennettoxford/opencodecounts/tree/main We should have a look at this as well and find out whether we want to use any of their approaches or even collaborate with them.
NHSD Usage Data
The NHSD SNOMED code usage data is a dataset published by NHS Digital which shows how often each code is used in English primary care (GP) records.
Why should we incorporate the usage data into the Codelist Builder?
This would allow the builder to take into account which codes are actually used in practice, and how frequently. For example, users could filter, prioritise, or exclude codes based on their real-world usage.
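As a rough illustration of the filtering and prioritising idea, here is a minimal sketch. The function name `prioritise_by_usage`, the `min_usage` threshold, and the dict-based lookup are all hypothetical and not part of the builder; codes absent from the usage data are assumed to count as zero usage.

```python
def prioritise_by_usage(
    codes: list[str], usage: dict[str, int], min_usage: int = 0
) -> list[str]:
    """Drop codes below a usage threshold and order the rest by usage.

    Codes that do not appear in the usage data are treated as having
    zero usage (an assumption for this sketch).
    """
    kept = [code for code in codes if usage.get(code, 0) >= min_usage]
    # Most frequently used codes first; ties keep their original order.
    return sorted(kept, key=lambda code: usage.get(code, 0), reverse=True)


# Made-up identifiers and counts, purely for illustration.
example_usage = {"111": 50, "222": 1200}
example_codes = ["111", "222", "333"]
common_codes = prioritise_by_usage(example_codes, example_usage, min_usage=10)
```

With the made-up values above, `"333"` is dropped because it has no recorded usage, and `"222"` is listed before `"111"`.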
Available years and reporting periods
At the time of writing, data from 2012-2024 is available. Each file represents the usage over one year from 1st August to 31st July.
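Because each file covers 1st August to 31st July rather than a calendar year, it can help to map a date onto its reporting period. A small sketch of that mapping follows; the function name `reporting_period` is hypothetical.

```python
from datetime import date


def reporting_period(day: date) -> tuple[date, date]:
    """Return the (start, end) dates of the reporting period containing `day`.

    Periods run from 1 August to 31 July of the following year.
    """
    start_year = day.year if day.month >= 8 else day.year - 1
    return date(start_year, 8, 1), date(start_year + 1, 7, 31)


period_start, period_end = reporting_period(date(2023, 7, 31))
```

Here 31 July 2023 falls in the period that started on 1 August 2022, whereas 1 August 2023 would start the next period.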
File formats
Data is available in .txt and .xlsx formats. The structure is consistent across both data formats.
File contents
The column structure is consistent across all years. Two metadata files are available, which describe the contents of the files:
Below are the descriptions of the column data, summarised from the information included in the metadata files. Most of the descriptions were identical between the two files and I've highlighted where these differ.
SNOMED_Concept_ID
"SNOMED concepts which have been added to a patient record in a general practice system during the reporting period. Text string of digits up to 18 characters long."
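Since the IDs are digit strings of up to 18 characters, it is probably safest to keep them as text when loading, because an 18-digit number does not survive a round trip through a float. The sketch below assumes the .txt files are tab-delimited (not confirmed here) and uses a made-up sample row; `read_rows` is a hypothetical helper.

```python
import csv
import io

# Made-up sample. The tab delimiter is an assumption about the .txt format;
# the column names are the ones described in this post.
SAMPLE = (
    "SNOMED_Concept_ID\tDescription\tUsage\n"
    "123456789012345678\tExample concept (finding)\t120\n"
)


def read_rows(handle) -> list[dict[str, str]]:
    """Read delimited usage rows, leaving every field as a string."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_rows(io.StringIO(SAMPLE))
concept_id = rows[0]["SNOMED_Concept_ID"]

# Converting via float silently changes long IDs, so avoid numeric dtypes.
survives_float = int(float(concept_id)) == int(concept_id)
```

For this sample ID, `survives_float` is `False`, which is why the ID is kept as a string throughout.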
Description
"The fully specified name associated with the SNOMED_Concept_ID on the final day of the reporting period (31 July)."
Usage
"The number of times that the SNOMED_Concept_ID was added into any patient record within the reporting period, rounded to the nearest 10. Usage of 1 to 4 is displayed as *. SNOMED concepts with no code usage are not included within the dataset."
IMPORTANT TO NOTE:
Data prior to 2019 was originally submitted mostly in READ V2 or CTV3, but in the usage files these codes have been mapped to the corresponding SNOMED codes using the final 2020 version of the mapping tables published by NHS England. Therefore all of the available files, even those from 2019 or earlier, only include valid SNOMED codes.
The usage does not show how many patients had each code added to their record. Each addition increments the count by 1, regardless of whether it is for the same patient. Therefore it is not possible to infer the number of individual patients with a particular code.
For the 2011-12 to 2017-18 data, it is stated that "Current maximum value is approximately 250,000,000" - no such maximum is stated for the 2018-19 onwards data.
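Because low counts are shown as `*` rather than a number, the Usage column cannot be read as plain integers. One possible way to handle this is sketched below; `parse_usage` is a hypothetical helper, and returning `None` for suppressed values is a design choice for this sketch, not something the metadata prescribes.

```python
from typing import Optional

SUPPRESSED = "*"  # the metadata says usage of 1 to 4 is displayed as *


def parse_usage(raw: str) -> Optional[int]:
    """Convert a raw Usage field to an int, or None if it is suppressed.

    None means "between 1 and 4", not zero: codes with no usage are
    absent from the dataset altogether.
    """
    value = raw.strip()
    if value == SUPPRESSED:
        return None
    return int(value)


parsed = [parse_usage(field) for field in ["120", "*", " 30 "]]
```

Keeping suppressed values distinct from zero lets the builder decide later whether to show them as "1 to 4" or substitute a placeholder.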
Active_at_Start
"Active status of the SNOMED_Concept_ID on the first day of the reporting period. This is taken from the most recent UK clinical extension, or associated International extension, which was published up to the start of the reporting year (1 August).
1 = SNOMED concept was published and was active (active = 1).
0 = SNOMED concept was either not yet available or was inactive (active = 0)."
Active_at_End
"Active status of the SNOMED_Concept_ID on the last day of the reporting period. This is taken from the most recent UK clinical extension, or associated International extension, which was published up to the end of the reporting year (31 July).
1 = SNOMED concept was published and was active (active = 1).
0 = SNOMED concept was either not yet available or was inactive (active = 0)."
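Taken together, the two flags show whether a concept changed status during the reporting period. A small sketch of how the combinations could be labelled is below; the function name `status_change` and the label wording are hypothetical, and note that a 0 can mean either "not yet available" or "inactive", which the flags alone cannot distinguish.

```python
def status_change(active_at_start: str, active_at_end: str) -> str:
    """Label a concept by its Active_at_Start / Active_at_End flags."""
    labels = {
        ("1", "1"): "active throughout",
        ("0", "1"): "became active during period",
        ("1", "0"): "became inactive during period",
        ("0", "0"): "inactive or unavailable at both ends",
    }
    flags = (active_at_start.strip(), active_at_end.strip())
    if flags not in labels:
        raise ValueError(f"unexpected active flags: {flags!r}")
    return labels[flags]


retired = status_change("1", "0")
```

This could be used, for example, to warn users when a code in their codelist became inactive in the most recent period.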
Limitations
Loading data into the builder
See PR for proposed method of loading the usage data into the codelist builder
(Relates to #99)