feat(medcat-service): Medcat Demo and AnonCAT demo uplift in Gradio #279
Merged
26 commits
a96e2b5 feat(medcat-service): Update gradio version (alhendrickson)
476ebff build(medcat-service): Update fastapi dependency (alhendrickson)
7a9e413 feat(medcat-servie): Fix gradio version root path bug (alhendrickson)
3f6e38c build(medcat-service): Add hot module reloader. Update gradio demo (alhendrickson)
e452614 docs(medcat-service): Add dev readme. refactor gradio to extract the … (alhendrickson)
9e436f2 feat(medcat-service): Add anoncat demo text (alhendrickson)
e7c3a11 feat(medcat-service): Move out of main.py. Configure overflow scrollb… (alhendrickson)
eecf7b9 test(medcat-service): Create gradio logic tests. Split into its own file (alhendrickson)
c055092 refactor(medcat-service): Update start_service_debug.sh for clarity a… (alhendrickson)
d78f2e3 feat(medcat-service): Enhance Gradio demo layout and add logging for … (alhendrickson)
fea70a6 Merge branch 'main' into feat/medcat-service/gradio-uplift (alhendrickson)
5c443f1 fix(medct-service): fix syntax (alhendrickson)
760abd1 build(medct-service): Update gradio version (alhendrickson)
8e179fe refactor(medcat-service): Move example files to txt files (alhendrickson)
f1daebf feat(medcat-service): In demo Click on annotation to view details (alhendrickson)
a9e9227 feat(medcat-service): In demo Click on annotation to view details - text (alhendrickson)
6d0f7c7 feat(medcat-service): In demo Click on annotation to view details - text (alhendrickson)
3ebfdc1 feat(medcat-service): In demo Click on annotation to view details - ruff (alhendrickson)
5da1b9d feat(medcat-service): In demo move resource txt files to subfolder (alhendrickson)
ec2ff6c feat(medcat-service): In demo move resource txt files to subfolder - … (alhendrickson)
93d8095 feat(medcat-service): In demo move resource txt files to subfolder - … (alhendrickson)
cdc4a6a feat(medcat-service): Support boolean redact flag in deid processor a… (alhendrickson)
b708c7c feat(medcat-service): Fix mypy errors (alhendrickson)
2e9e5c9 feat(medcat-service): Fix mypy errors (alhendrickson)
bca7d60 feat(medcat-service): Fix unit tests (alhendrickson)
aeadb78 feat(medcat-service): Fix unit tests (alhendrickson)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| import importlib.resources | ||
| from functools import cache | ||
| | ||
| | ||
| @cache | ||
| def _read_file(filename: str) -> str: | ||
|     package = importlib.resources.files(__package__ or 'medcat_service.demo') | ||
|     file_path = package / 'resources' / filename | ||
|     return file_path.read_text(encoding='utf-8') | ||
| | ||
| | ||
| short_example = _read_file('short_example.txt') | ||
| long_example = _read_file('long_example.txt') | ||
| anoncat_example = _read_file('anoncat_example.txt') | ||
| article_footer = _read_file('article_footer.txt') | ||
| anoncat_help_content = _read_file('anoncat_help_content.txt') | ||
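
The loader in this file resolves text files relative to the installed package rather than the working directory, and caches each read. A minimal, self-contained sketch of the same pattern is shown below. The package name `demo_pkg`, the file `greeting.txt`, and the function name `read_resource` are hypothetical stand-ins (the package is created in a temporary directory) so that the snippet runs without `medcat_service` installed.

```python
import importlib
import importlib.resources
import sys
import tempfile
from functools import cache
from pathlib import Path

# Build a throwaway package on disk (hypothetical names, for illustration only).
_tmp = tempfile.mkdtemp()
_pkg_dir = Path(_tmp) / "demo_pkg"
(_pkg_dir / "resources").mkdir(parents=True)
(_pkg_dir / "__init__.py").write_text("", encoding="utf-8")
(_pkg_dir / "resources" / "greeting.txt").write_text("hello", encoding="utf-8")
sys.path.insert(0, _tmp)
importlib.invalidate_caches()


@cache
def read_resource(filename: str) -> str:
    """Read a text file from the package's resources folder, once per filename."""
    package = importlib.resources.files("demo_pkg")
    return (package / "resources" / filename).read_text(encoding="utf-8")


text = read_resource("greeting.txt")
# A second call with the same argument is served from the cache, not the disk.
read_resource("greeting.txt")
hits = read_resource.cache_info().hits
```

Because the result is cached per filename, edits to a resource file are not picked up until the process restarts, which is usually acceptable for static demo text.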
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,176 @@ | ||
| """ | ||
| This module provides conversion utilities between the MedCAT output format | ||
| and the exact format expected by Gradio components, specifically aligning | ||
| with the output schema of Hugging Face Transformers pipelines (e.g., for | ||
| NER highlighting). Use these definitions and helper functions to bridge | ||
| MedCAT's annotation results and Gradio's interactive demo expectations. | ||
| """ | ||
| | ||
| import logging | ||
| | ||
| from pydantic import BaseModel | ||
| | ||
| from medcat_service.dependencies import get_medcat_processor, get_settings | ||
| from medcat_service.types import ProcessAPIInputContent, ProcessErrorsResult, ProcessResult | ||
| from medcat_service.types_entities import Entity | ||
| | ||
| logger = logging.getLogger(__name__) | ||
| | ||
| | ||
| class EntityAnnotation(BaseModel): | ||
|     """ | ||
|     Expected data format for NER in gradio | ||
|     """ | ||
| | ||
|     entity: str | ||
|     score: float | ||
|     index: int | ||
|     word: str | ||
|     start: int | ||
|     end: int | ||
| | ||
| | ||
| headers = ["Pretty Name", "Identifier", "Confidence Score", "Start Index", "End Index", "ID"] | ||
| | ||
| | ||
| class EntityAnnotationDisplay(BaseModel): | ||
|     """ | ||
|     Display data format for use in a datatable | ||
|     """ | ||
| | ||
|     pretty_name: str | ||
|     identifier: str | ||
|     score: float | ||
|     start: int | ||
|     end: int | ||
|     id: int | ||
|     # Missing Meta Anns | ||
| | ||
| | ||
| class EntityResponse(BaseModel): | ||
|     """ | ||
|     Expected data format of gradio highlightedtext component | ||
|     """ | ||
| | ||
|     entities: list[EntityAnnotation] | ||
|     text: str | ||
| | ||
| | ||
| def convert_annotation_to_ner_model(entity: Entity, index: int) -> EntityAnnotation: | ||
|     return EntityAnnotation( | ||
|         entity=entity.get("cui", "UNKNOWN"), | ||
|         score=entity.get("acc", 0.0), | ||
|         index=index, | ||
|         word=entity.get("detected_name", ""), | ||
|         start=entity.get("start", -1), | ||
|         end=entity.get("end", -1), | ||
|     ) | ||
| | ||
| | ||
| def convert_annotation_to_display_model(entity: Entity) -> EntityAnnotationDisplay: | ||
|     return EntityAnnotationDisplay( | ||
|         pretty_name=entity.get("pretty_name", ""), | ||
|         identifier=entity.get("cui", "UNKNOWN"), | ||
|         score=entity.get("acc", 0.0), | ||
|         start=entity.get("start", -1), | ||
|         end=entity.get("end", -1), | ||
|         id=entity.get("id", -1), | ||
|         # medcat-demo-app/webapp/demo/views.py | ||
|         # if key == 'meta_anns': | ||
|         # meta_anns=ent.get("meta_anns", {}) | ||
|         # if meta_anns: | ||
|         # for meta_ann in meta_anns.keys(): | ||
|         # new_ent[meta_ann]=meta_anns[meta_ann]['value'] | ||
|     ) | ||
| | ||
| | ||
| def convert_entity_dict_to_annotations(entity_dict_list: list[dict[str, Entity]]) -> list[EntityAnnotation]: | ||
|     annotations: list[EntityAnnotation] = [] | ||
|     for entity_dict in entity_dict_list: | ||
|         for key, entity in entity_dict.items(): | ||
|             annotations.append(convert_annotation_to_ner_model(entity, index=int(key))) | ||
|     return annotations | ||
| | ||
| | ||
| def convert_entity_dict_to_display_model(entity_dict_list: list[dict[str, Entity]]) -> list[EntityAnnotationDisplay]: | ||
|     logger.debug("Converting entity dict to display model") | ||
|     annotations: list[EntityAnnotationDisplay] = [] | ||
|     for entity_dict in entity_dict_list: | ||
|         for key, entity in entity_dict.items(): | ||
|             annotations.append(convert_annotation_to_display_model(entity)) | ||
|     return annotations | ||
| | ||
| | ||
| def convert_display_model_to_list_of_lists(entity_display_model: list[EntityAnnotationDisplay]) -> list[list[str]]: | ||
|     return [ | ||
|         [str(getattr(entity, field)) for field in EntityAnnotationDisplay.model_fields] | ||
|         for entity in entity_display_model | ||
|     ] | ||
| | ||
| | ||
| def perform_named_entity_resolution(input_text: str, redact: bool | None = None): | ||
|     """ | ||
|     Performs clinical coding by processing the input text with MedCAT to extract and | ||
|     annotate medical concepts (entities). | ||
| | ||
|     Returns: | ||
|         1. A dictionary following the NER response model (EntityResponse), containing the original text | ||
|             and the list of detected entities. | ||
|         2. A datatable-compatible list of lists, where each sublist represents an entity annotation and | ||
|             its attributes for display purposes. | ||
| | ||
|     This method is used as the main function for the Gradio MedCAT demo and MCP server, | ||
|     enabling users to input free text and receive automatic annotation and coding of clinical entities. | ||
| | ||
|     Args: | ||
|         input_text (str): The input text to be processed and annotated for medical entities by MedCAT. | ||
| | ||
|     Returns: | ||
|         Tuple: | ||
|             - dict: A dictionary following the NER response model (EntityResponse), containing the | ||
|               original text and the list of detected entities. | ||
|             - list[list[str]]: A datatable-compatible list of lists, where each sublist represents an | ||
|               entity annotation and its attributes for display purposes. | ||
| | ||
|     """ | ||
|     logger.debug("Performing named entity resolution") | ||
|     if not input_text or not input_text.strip(): | ||
|         return None, None, None | ||
| | ||
|     processor = get_medcat_processor(get_settings()) | ||
|     input = ProcessAPIInputContent(text=input_text) | ||
| | ||
|     process_result = processor.process_content(input.model_dump(), redact=redact) | ||
| | ||
|     if isinstance(process_result, ProcessErrorsResult): | ||
|         error_msg = ( | ||
|             "; ".join(process_result.errors) if process_result.errors else "Unknown error occurred during processing" | ||
|         ) | ||
|         raise ValueError(f"Processing failed: {error_msg}") | ||
|     result: ProcessResult = process_result | ||
| | ||
|     entity_ner_format: list[EntityAnnotation] = convert_entity_dict_to_annotations(result.annotations) | ||
| | ||
|     logger.debug("Converting entity dict to display model") | ||
|     annotations_as_display_format = convert_entity_dict_to_display_model(result.annotations) | ||
|     response_datatable_format = convert_display_model_to_list_of_lists(annotations_as_display_format) | ||
| | ||
|     response: EntityResponse = EntityResponse(entities=entity_ner_format, text=input_text) | ||
|     response_tuple = response.model_dump(), response_datatable_format, result.text | ||
|     return response_tuple | ||
| | ||
| | ||
| def medcat_demo_perform_named_entity_resolution(input_text: str): | ||
| Review comment (Collaborator): Again, including the return type ( | ||
|     """ | ||
|     Performs named entity resolution for the MedCAT demo. | ||
|     """ | ||
|     result = perform_named_entity_resolution(input_text) | ||
|     return result[0], result[1] | ||
| | ||
| | ||
| def anoncat_demo_perform_deidentification(input_text: str, redact: bool): | ||
| Review comment (Collaborator): Again, return type might be useful. | ||
|     """ | ||
|     Performs deidentification for the AnonCAT demo. | ||
|     """ | ||
|     result = perform_named_entity_resolution(input_text, redact=redact) | ||
|     return result | ||
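
The conversion helpers in this file turn MedCAT's annotation dictionaries into the two shapes the demo needs: an `entities`/`text` dictionary for the highlighted-text view and a list of string rows for the data table. The sketch below reproduces that flow with standard-library dataclasses in place of pydantic so that it runs on its own. The sample annotation values (CUI, score, offsets) are made up for illustration, and `to_highlight_payload` and `to_table_rows` are hypothetical names, not functions from this PR.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class EntityAnnotation:
    entity: str
    score: float
    index: int
    word: str
    start: int
    end: int


@dataclass
class EntityAnnotationDisplay:
    pretty_name: str
    identifier: str
    score: float
    start: int
    end: int
    id: int


def to_highlight_payload(text, annotations):
    """Build the {"entities": [...], "text": ...} dict from a list of {key: entity} dicts."""
    entities = []
    for entity_dict in annotations:
        for key, ent in entity_dict.items():
            entities.append(asdict(EntityAnnotation(
                entity=ent.get("cui", "UNKNOWN"),
                score=ent.get("acc", 0.0),
                index=int(key),  # the dict key doubles as the entity index
                word=ent.get("detected_name", ""),
                start=ent.get("start", -1),
                end=ent.get("end", -1),
            )))
    return {"entities": entities, "text": text}


def to_table_rows(annotations):
    """Flatten each entity into a row of strings, in dataclass field order."""
    rows = []
    for entity_dict in annotations:
        for ent in entity_dict.values():
            display = EntityAnnotationDisplay(
                pretty_name=ent.get("pretty_name", ""),
                identifier=ent.get("cui", "UNKNOWN"),
                score=ent.get("acc", 0.0),
                start=ent.get("start", -1),
                end=ent.get("end", -1),
                id=ent.get("id", -1),
            )
            rows.append([str(getattr(display, f.name)) for f in fields(display)])
    return rows


# Made-up sample data; the CUI is a placeholder, not a real concept identifier.
sample_text = "Patient has diabetes."
sample_annotations = [{"0": {
    "pretty_name": "Diabetes", "cui": "C0000001", "acc": 0.9,
    "detected_name": "diabetes", "start": 12, "end": 20, "id": 0,
}}]
payload = to_highlight_payload(sample_text, sample_annotations)
rows = to_table_rows(sample_annotations)
```

The row order follows the field order of the display class, which is why it needs to stay aligned with the `headers` list used for the table.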
Perhaps adding the return type as a type hint here would be useful? You've described it in the docs anyway (though it's missing the last `str`), i.e. `-> tuple[dict, list[list[str]], str]`.
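
One way to apply this suggestion is sketched below. The function in the diff also returns `None, None, None` for blank input, so the annotation probably needs to allow for that case as well. The alias `NerResult`, the chosen return annotations, and the stub body are assumptions for illustration; the stub does not call MedCAT.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical alias: (NER dict, datatable rows, processed text), each None for blank input.
NerResult = tuple[Optional[dict], Optional[list[list[str]]], Optional[str]]


def perform_named_entity_resolution(input_text: str, redact: bool | None = None) -> NerResult:
    if not input_text or not input_text.strip():
        return None, None, None
    # Stub standing in for the MedCAT call: no entities, text passed through unchanged.
    return {"entities": [], "text": input_text}, [], input_text


def medcat_demo_perform_named_entity_resolution(
    input_text: str,
) -> tuple[Optional[dict], Optional[list[list[str]]]]:
    # The MedCAT demo only needs the first two elements of the result.
    result = perform_named_entity_resolution(input_text)
    return result[0], result[1]


def anoncat_demo_perform_deidentification(input_text: str, redact: bool) -> NerResult:
    return perform_named_entity_resolution(input_text, redact=redact)
```

Naming the tuple once keeps the three function signatures consistent and gives mypy a single place to check the shape.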