
Text nailing

Figure: Supervised learning versus Text Nailing
Figure: An example of an alphabetical-only converted note ("nailed note")

Text Nailing (TN) is an information extraction method used in the fields of natural language processing (NLP), computational linguistics, and health informatics. It was developed to address challenges in extracting structured, analyzable data from large volumes of unstructured text such as clinical notes, research articles, and other narrative documents.[1][2]

Unlike fully automated machine learning systems, Text Nailing emphasizes a "human-in-the-loop" approach: a person interactively reviews small portions of text to identify expressions that are common, non-negated, and semantically informative. These "nailed expressions" are then converted into simplified alphabetical-only forms, creating consistent and homogeneous representations of the text. This hybrid strategy allows TN to combine the precision of expert-guided annotation with the scalability of computational techniques.[3]
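The conversion to an alphabetical-only form can be illustrated with a minimal sketch. It is not taken from the published implementation: the function name `to_alphabetical_only`, the decision to drop whitespace along with digits and punctuation, and the lowercasing step are all assumptions made for illustration, and the sample note is invented.

```python
import re


def to_alphabetical_only(note: str) -> str:
    """Return the note with every non-letter character removed.

    Assumptions (not confirmed by the source): whitespace is dropped
    together with digits and punctuation, and the result is lowercased.
    """
    return re.sub(r"[^A-Za-z]", "", note).lower()


# Invented example note, for illustration only.
note = "Pt. is a 54 y/o male; never smoker, denies EtOH."
nailed_note = to_alphabetical_only(note)
print(nailed_note)  # ptisayomaleneversmokerdeniesetoh
```

Under these assumptions, spelling variants such as "Never smoker." and "never-smoker" collapse to the same string, which is one way to read the "consistent and homogeneous representations" described above.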

The method has been applied to improve predictive modeling in medicine, enhance the performance of text classifiers, and reduce reliance on large manually labeled datasets.[4][5]

History and development

Text Nailing was first developed in the mid-2010s at Massachusetts General Hospital as part of efforts to improve information extraction from electronic health records (EHRs).[6] The method was initially tested in several clinical scenarios, including the extraction of smoking status from narrative notes, identification of patients with physician-documented insomnia, and detection of family history of coronary artery disease.[7]

Further applications demonstrated its potential in refining widely used clinical risk models. For example, Text Nailing was employed to improve the accuracy of the Framingham risk score in patients with non-alcoholic fatty liver disease,[4] and to classify patient non-adherence in type 2 diabetes care. Its emphasis on identifying non-negated, recurrent expressions yielded higher accuracy than traditional machine learning approaches that required large, manually labeled datasets.[5]

The method also sparked discussion about the broader role of human-in-the-loop approaches in health informatics. Commentators noted that machine learning in medicine often relied on assumptions about infinite linguistic variation, while clinical text in practice tends to reuse a limited set of expressions.[8] A subsequent letter in Communications of the ACM emphasized that using non-negated expressions could increase the accuracy of text-based classifiers.[9]

In July 2018, researchers at Virginia Tech and the University of Illinois at Urbana–Champaign cited Text Nailing as an example of "progressive cyber-human intelligence" (PCHI), recognizing its hybrid model of combining expert human annotation with computational scalability.[10]

At the same time, critiques of machine learning in health care highlighted the risks of inflated expectations. Chen and Asch (2017) argued that more thoughtful approaches were needed to avoid disillusionment with predictive modeling in medicine.[11] In this context, Text Nailing was described by its co-creator, Uri Kartoun, as a method that initially faced skepticism for relying on "simple tricks" and human annotation, but ultimately gained acceptance as a more robust approach to clinical text analysis.[12]

Source code

Sample code for extracting smoking status from narrative notes using "nailed expressions" is available on GitHub.[13]
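The general idea of matching nailed expressions against a nailed note might be sketched as follows. This is not the code from the repository cited above: the expression list, the labels, and the function names are hypothetical, and in practice the expressions would be chosen by a human reviewer rather than hard-coded.

```python
import re

# Hypothetical nailed expressions mapped to hypothetical labels.
# A reviewer would normally pick these after reading a sample of notes.
NAILED_EXPRESSIONS = {
    "neversmoker": "never",
    "formersmoker": "former",
    "quitsmoking": "former",
    "currentsmoker": "current",
}


def to_alphabetical_only(note: str) -> str:
    """Strip everything except letters and lowercase (an assumption)."""
    return re.sub(r"[^A-Za-z]", "", note).lower()


def smoking_status(note: str) -> str:
    """Return the label of the first nailed expression found in the note."""
    nailed_note = to_alphabetical_only(note)
    for expression, label in NAILED_EXPRESSIONS.items():
        if expression in nailed_note:
            return label
    return "unknown"


# Invented notes, for illustration only.
print(smoking_status("Social hx: never smoker, no EtOH."))  # never
print(smoking_status("Patient quit smoking in 2009."))      # former
print(smoking_status("BP 120/80, afebrile."))               # unknown
```

The sketch performs plain substring matching and does no negation handling of its own; a note such as "not a current smoker" would still match `currentsmoker`. It therefore depends on the reviewer selecting expressions that appear in non-negated form in the notes, as the method description requires.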

Criticism of machine learning in health care

Chen and Asch (2017) wrote: "With machine learning situated at the peak of inflated expectations, we can soften a subsequent crash into a 'trough of disillusionment' by fostering a stronger appreciation of the technology's capabilities and limitations."[11]

A letter published in Communications of the ACM, "Beyond brute force", emphasized that a brute-force approach may perform better than traditional machine learning algorithms when applied to text. The letter stated: "... machine learning algorithms, when applied to text, rely on the assumption that any language includes an infinite number of possible expressions. In contrast, across a variety of medical conditions, we observed that clinicians tend to use the same expressions to describe patients' conditions."[14]

In a viewpoint published in June 2018 on the slow adoption of data-driven findings in medicine, Uri Kartoun, co-creator of Text Nailing, stated that "... Text Nailing raised skepticism in reviewers of medical informatics journals who claimed that it relies on simple tricks to simplify the text, and leans heavily on human annotation. TN indeed may seem just like a trick of the light at first glance, but it is actually a fairly sophisticated method that finally caught the attention of more adventurous reviewers and editors who ultimately accepted it for publication."[15]

References

  1. Kartoun, Uri (2017). "Text nailing". Interactions. 24 (6): 44–9. doi:10.1145/3139488. S2CID 29010232.
  2. Barbosa, Simone; Cockton, Gilbert (2017). "Avoiding agenda bias with design thoughtfulness". Interactions. 24 (6): 5. doi:10.1145/3151556. S2CID 657561.
  3. Beam, Andrew L; Kartoun, Uri; Pai, Jennifer K; Chatterjee, Arnaub K; Fitzgerald, Timothy P; Shaw, Stanley Y; Kohane, Isaac S (2017). "Predictive Modeling of Physician-Patient Dynamics That Influence Sleep Medication Prescriptions and Clinical Decision-Making". Scientific Reports. 7: 42282. Bibcode:2017NatSR...742282B. doi:10.1038/srep42282. PMC 5299453. PMID 28181568.
  4. Simon, Tracey G; Kartoun, Uri; Zheng, Hui; Chan, Andrew T; Chung, Raymond T; Shaw, Stanley Y; Corey, Kathleen E (2017). "Model for end-stage liver disease Na Score predicts incident major cardiovascular events in patients with nonalcoholic fatty liver disease". Hepatology Communications. 1 (5): 429–438. doi:10.1002/hep4.1051. PMC 5659323. PMID 29085919.
  5. Corey, Kathleen E; Kartoun, Uri; Zheng, Hui; Chung, Raymond T; Shaw, Stanley Y (2016). "Using an Electronic Medical Records Database to Identify Non-Traditional Cardiovascular Risk Factors in Nonalcoholic Fatty Liver Disease". The American Journal of Gastroenterology. 111 (5): 671–6. doi:10.1038/ajg.2016.44. PMC 4864030. PMID 26925881.
  6. Kartoun, Uri (2017). "Text nailing". Interactions. 24 (6): 44–9. doi:10.1145/3139488. S2CID 29010232.
  7. Kartoun, Uri; et al. (2018). "Development of an Algorithm to Identify Patients with Physician-Documented Insomnia". Scientific Reports. 8 (1): 7862. Bibcode:2018NatSR...8.7862K. doi:10.1038/s41598-018-25312-z. PMC 5959894. PMID 29777125.
  8. CACM Staff (2017). "Beyond brute force". Communications of the ACM. 60 (10): 8–9. doi:10.1145/3135241.
  9. CACM Staff (2018). "More accurate text analysis for better patient outcomes". Communications of the ACM. 61 (10): 6–7. doi:10.1145/3273019. S2CID 52901757.
  10. Rikakis, Thanassis; Kelliher, Aisling; Huang, Jia-Bin; Sundaram, Hari (2018). "Progressive cyber-human intelligence for social good". Interactions. 25 (4): 52–56. doi:10.1145/3231559. S2CID 49563432.
  11. Chen, Jonathan H; Asch, Steven M (2017). "Machine Learning and Prediction in Medicine — Beyond the Peak of Inflated Expectations". New England Journal of Medicine. 376 (26): 2507–9. doi:10.1056/NEJMp1702071. PMC 5953825. PMID 28657867.
  12. Kartoun, Uri (2018). "Toward an accelerated adoption of data-driven findings in medicine". Medicine, Health Care and Philosophy. 22 (1): 153–157. doi:10.1007/s11019-018-9845-y. PMID 29882052. S2CID 46973857.
  13. "GitHub - kartoun/text-nailing". GitHub. 2018-01-07.
  14. CACM Staff (2017). "Beyond brute force". Communications of the ACM. 60 (10): 8–9. doi:10.1145/3135241.
  15. Kartoun, Uri (2018). "Toward an accelerated adoption of data-driven findings in medicine". Medicine, Health Care and Philosophy. 22 (1): 153–157. doi:10.1007/s11019-018-9845-y. PMID 29882052. S2CID 46973857.