{"id":126,"date":"2022-03-03T08:59:00","date_gmt":"2022-03-03T08:59:00","guid":{"rendered":"https:\/\/interactivehpc.dk\/?p=126"},"modified":"2025-06-13T10:05:49","modified_gmt":"2025-06-13T10:05:49","slug":"detecting-text-reuse-in-h-c-andersens-work","status":"publish","type":"post","link":"https:\/\/interactivehpc.dk\/?p=126","title":{"rendered":"Detecting text reuse in H.C. Andersen\u2019s work"},"content":{"rendered":"\n<p class=\"gp-gutenbergpro-767f9\">(&#8230;)In&nbsp;2019,&nbsp;senior researcher&nbsp;Ejnar Stig Askgaard from Odense City Museums began comparing&nbsp;Hans&nbsp;Christian Andersen\u2019s notes, written between approximately 1833 and 1875, with the 162 fairy tales, novels and autobiographies. This&nbsp;had&nbsp;led to the discovery that Hans&nbsp;Christian&nbsp;Andersen liked to use symbols such as cross marks or deletions in his notes to indicate that the note had been reused in his fairy tales.&nbsp;<\/p>\n\n\n\n<p class=\"gp-gutenbergpro-a3852\">For&nbsp;<em>Detecting text reuse in H.C. Andersen\u2019s&nbsp;work<\/em>,&nbsp;Berg&nbsp;wanted to find out where each note had been reused.&nbsp;Earlier research had managed&nbsp;to manually identify where 278 notes had been reused in Hans&nbsp;Christian&nbsp;Andersen\u2019s published work, but this had been a time-consuming effort, taking many months of work.<\/p>\n\n\n\n<p class=\"gp-gutenbergpro-7124e\">As 861 of the notes had been digitized in addition to Hans&nbsp;Christian&nbsp;Andersen\u2019s published work<em>,<\/em>&nbsp;Berg was able to apply digital methods to solve his problem. 
He contacted&nbsp;Zhiru Sun, Assistant Professor at the Department of Design and Communication at SDU,&nbsp;who used a&nbsp;method called Natural Language Processing to find similarities between the notes and Hans&nbsp;Christian&nbsp;Andersen\u2019s work.&nbsp;Using&nbsp;the Python application on&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/cloud.sdu.dk\/app\/dashboard\" target=\"_blank\">UCloud<\/a>, this method&nbsp;generated&nbsp;a number of tables, which indicated how similar a specific note is to a specific fairy tale.<\/p>\n\n\n\n<p class=\"gp-gutenbergpro-368ba\">This is an excerpt. <a href=\"https:\/\/escience.sdu.dk\/index.php\/2022\/03\/02\/detecting-text-reuse-in-h-c-andersens-work\/\" target=\"_blank\" rel=\"noreferrer noopener\">Click here for the full story.<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>(&#8230;)In&nbsp;2019,&nbsp;senior researcher&nbsp;Ejnar Stig Askgaard from Odense City Museums began comparing&nbsp;Hans&nbsp;Christian Andersen\u2019s notes, written between approximately 1833 and 1875, with the 162 fairy tales, novels and autobiographies. 
This&nbsp;had&nbsp;led to the discovery that Hans&nbsp;Christian&nbsp;Andersen liked to use symbols such as cross marks or deletions in his notes to indicate that the note had been reused in his [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":84,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"gtp_columnspro_styling":"{}","gtp_paragraph_styling":"{\"368ba185-31ed-465b-85d8-a3d8b97dad3a\":\" .gp-gutenbergpro-368ba { background-position-x: 50%;\\nbackground-position-y: 50%;\\nbackground-size: cover;\\nheight: px; }\"}","gtp_heading_styling":"{}","gtp_spacer_styling":"{}","gtp_video_styling":"{}","gtp_group_styling":"{}","gtp_cover_styling":"{}","footnotes":""},"categories":[8,258],"tags":[],"class_list":["post-126","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research","category-use-case___en"],"lang":"en","translations":{"en":126,"da":1754},"pll_sync_post":[],"_links":{"self":[{"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/posts\/126","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=126"}],"version-history":[{"count":4,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/posts\/126\/revisions"}],"predecessor-version":[{"id":2091,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/posts\/126\/revisions\/2091"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/media\/84"}],"wp:attachment":[{"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=%2F
wp%2Fv2%2Fmedia&parent=126"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=126"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=126"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}