{"id":925,"date":"2021-04-09T19:02:00","date_gmt":"2021-04-09T16:02:00","guid":{"rendered":"https:\/\/dialexity.com\/blog\/?p=925"},"modified":"2023-12-07T09:28:51","modified_gmt":"2023-12-07T07:28:51","slug":"development-of-semantic-sentimental-similarity-algorithms","status":"publish","type":"post","link":"https:\/\/dialexity.com\/blog\/development-of-semantic-sentimental-similarity-algorithms\/","title":{"rendered":"Development of Semantic \/ Sentimental Similarity Algorithms"},"content":{"rendered":"\n<p>Semantic similarity is often estimated using <a href=\"https:\/\/en.wikipedia.org\/wiki\/Word2vec\">word2vec<\/a>, but the freely available algorithms give unreliable results, for example scoring some antonym pairs higher than synonym pairs [<a href=\"https:\/\/datascience.stackexchange.com\/questions\/12872\/how-can-i-get-a-measure-of-the-semantic-similarity-of-words\">1<\/a>, <a href=\"http:\/\/lcl.uniroma1.it\/adw\/\">2<\/a>, <a href=\"http:\/\/ws4jdemo.appspot.com\/?mode=w&amp;s1=&amp;w1=prudent&amp;s2=&amp;w2=careless%23\">3<\/a>]. Other word embeddings seem to be more successful [<a href=\"https:\/\/datascience.stackexchange.com\/questions\/12872\/how-can-i-get-a-measure-of-the-semantic-similarity-of-words\">1<\/a>, <a href=\"https:\/\/www.semanticscholar.org\/paper\/Integrating-Distributional-Lexical-Contrast-into-Nguyen-Walde\/2a0d8a3b1dfb06f584931fbbcf872a804b30c360?p2df\">4<\/a>, <a href=\"https:\/\/github.com\/commonsense\/conceptnet-numberbatch\">5<\/a>]. Another (more rigorous) approach is to enrich WordNet with semantic relations from other thesauri and then cluster the result [<a href=\"http:\/\/laimeskelias.lt\/wp-content\/uploads\/2018\/07\/Synonym-Graph.jpg\">6<\/a>]. 
A third approach is to measure sentimental similarity [<a href=\"http:\/\/laimeskelias.lt\/wp-content\/uploads\/2020\/12\/Sense-Sentiment-Similarity-2012.pdf\">7<\/a>], possibly enhanced by various word-emotion associations [<a href=\"http:\/\/saifmohammad.com\/WebPages\/lexicons.html\">8<\/a>, <a href=\"https:\/\/saifmohammad.com\/WebPages\/nrc-vad.html\">9<\/a>, <a href=\"https:\/\/sentic.net\/downloads\/\">10<\/a>, <a href=\"https:\/\/www.researchgate.net\/publication\/337075055_Toward_Dimensional_Emotion_Detection_from_Categorical_Emotion_Annotations\">11<\/a>]. Our goal is to find a winning combination of these approaches that could be used to identify our moods, opinions, values, and cause-effect sequences, and to support automated decision making in <a href=\"https:\/\/dialexity.com\/blog\/global-wisdom-network\/\">Global Wisdom Network<\/a>. Note that similarity may differ from association (e.g., coffee and cup are associated but not similar) [<a href=\"https:\/\/watermark.silverchair.com\/coli_a_00237.pdf?token=AQECAHi208BE49Ooan9kkhW_Ercy7Dm3ZL_9Cf3qfKAc485ysgAAAnwwggJ4BgkqhkiG9w0BBwagggJpMIICZQIBADCCAl4GCSqGSIb3DQEHATAeBglghkgBZQMEAS4wEQQMNJfRlJF7Of5zM5bGAgEQgIICLwXfxOIJWMKFpi4pui03X0w4KpLRBSufRx-FWhnjAYp9rZ8oQqDe33zkB9vPFypYac1jBkIDqZnW870VIG6qea66KvOBuTcDbH_96ONVzsCMfTA_9AoWF_g1i61paSJ2SRPkRxPkwhiU6k_DKkRiXwXeq5iKI2rywCzOKO2YfD6SZxKJlaRRviRoB8ySjEfNj_desb_BRXKpoSR2Oc6plsGzTci5KAdg8RkIAvCr1j7u8Ld3gTColXf0w9BFmDGRBXOtID8Wf128kncYaV6N7nYM9hkVYboxmX89qqKMV_3iuvTs4Ut19qcqRWLZtIifgFkQzNQgNr8QzEPbChUbThInupeacEPTPbb4d2DyKCklSDlcYAkTOlL1y_mHS7jYHxXB3tlSEL3eSmqv2pMacctJe4RQmv1NyQebLEsa5nyEx9e5yTrjG5pEteYE7vhkpe9UPdKF2r0iFpMdO3QjEfkhY0K9kJFV4EiL-fG8bjVlQcGaLFQu7v_lUOM6PHPVucQVP04LrU-wBQxFvurkkeSrYb1__aPMr7RghOEX4PQrnOFAv4lkhTMTjTHXKkU_HIZ22Vwa_ZGr-ETlNqh-H2IKJexJerMpGeRbb83Ev6X_OyHb2ClBTR3rCogH8QO_aa3vqL0T7KEaFQ5XTT-0bLRCtTTL-wClnbSUklyyZLApnTUE6Ptxn2UoPvkeBmdP6DUb2lNm3XsfVZd5wiWmHxSRC_1RT6XJjgiA4hFyCEg\">12<\/a>], and we may need both of 
them.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>[1] <a href=\"https:\/\/datascience.stackexchange.com\/questions\/12872\/how-can-i-get-a-measure-of-the-semantic-similarity-of-words\">How can I get a measure of the semantic similarity of words?<\/a><\/p>\n\n\n\n<p>[2] <a href=\"http:\/\/lcl.uniroma1.it\/adw\/\">Free algorithm(s) for semantic similarity<\/a>, but the results are disappointing, e.g., prudent + thoughtful (synonyms!) = 0.193, but prudent + careless (antonyms!) = 0.43; eccentric + odd (synonyms!) = 0.35, but eccentric + normal (antonyms!) = 0.275.<\/p>\n\n\n\n<p>[3] Another compilation of free algorithms is <a href=\"http:\/\/ws4jdemo.appspot.com\/?mode=w&amp;s1=&amp;w1=prudent&amp;s2=&amp;w2=careless%23\">here<\/a>.<\/p>\n\n\n\n<p>[4] <a href=\"https:\/\/www.semanticscholar.org\/paper\/Integrating-Distributional-Lexical-Contrast-into-Nguyen-Walde\/2a0d8a3b1dfb06f584931fbbcf872a804b30c360?p2df\">Integrating Distributional Lexical Contrast into Word Embeddings for Antonym-Synonym Distinction<\/a><\/p>\n\n\n\n<p>[5] <a href=\"https:\/\/github.com\/commonsense\/conceptnet-numberbatch\">ConceptNet is used to create word embeddings<\/a> &#8212; representations of word meanings as vectors, similar to word2vec, GloVe, or fastText, but better<\/p>\n\n\n\n<figure class=\"wp-block-image alignright size-full is-resized\"><a href=\"https:\/\/dialexity.com\/blog\/wp-content\/uploads\/2023\/12\/Synonym-Graph.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"668\" height=\"971\" src=\"https:\/\/dialexity.com\/blog\/wp-content\/uploads\/2023\/12\/Synonym-Graph.jpg\" alt=\"\" class=\"wp-image-927\" style=\"width:200px\" srcset=\"https:\/\/dialexity.com\/blog\/wp-content\/uploads\/2023\/12\/Synonym-Graph.jpg 668w, https:\/\/dialexity.com\/blog\/wp-content\/uploads\/2023\/12\/Synonym-Graph-206x300.jpg 206w, https:\/\/dialexity.com\/blog\/wp-content\/uploads\/2023\/12\/Synonym-Graph-526x765.jpg 526w\" sizes=\"auto, (max-width: 
668px) 100vw, 668px\" \/><\/a><\/figure>\n\n\n\n<p>[6] Enriching WordNet with synonyms, near-synonyms, antonyms, and near-antonyms from <a href=\"https:\/\/www.merriam-webster.com\/thesaurus\/wisdom\">Merriam Webster<\/a>, <a href=\"https:\/\/www.wordhippo.com\/what-is\/about-us.html\">Word Hippo<\/a>, <a href=\"https:\/\/thesaurus.plus\/\">Thesaurus Plus<\/a>, <a href=\"https:\/\/www.thesaurus.com\/\">Thesaurus.com<\/a>, <a href=\"https:\/\/www.thefreedictionary.com\/\">The Free Dictionary<\/a>, and other thesauri, followed by <a href=\"https:\/\/dialexity.com\/blog\/wp-content\/uploads\/2023\/12\/Synonym-Graph.jpg\">semantic clustering and counting of chain lengths<\/a><\/p>\n\n\n\n<p>[7] <a href=\"https:\/\/dialexity.com\/blog\/wp-content\/uploads\/2023\/12\/Sense-Sentiment-Similarity-2012.pdf\">Mohtarami et al. (2012) Sense Sentiment Similarity<\/a><\/p>\n\n\n\n<p>[8] <a href=\"http:\/\/saifmohammad.com\/WebPages\/lexicons.html\">Mohammad, Sentiment and Emotion Lexicons<\/a><\/p>\n\n\n\n<p>[9] <a href=\"https:\/\/saifmohammad.com\/WebPages\/nrc-vad.html\">Mohammad, Valence, Arousal, and Dominance for 20,000 English Words<\/a><\/p>\n\n\n\n<p>[10] <a href=\"https:\/\/sentic.net\/downloads\/\">SenticNet: 200,000 natural language concepts (in 40 languages, indexed by 8 continuous sentiment scales)<\/a><\/p>\n\n\n\n<p>[11] <a href=\"https:\/\/www.researchgate.net\/publication\/337075055_Toward_Dimensional_Emotion_Detection_from_Categorical_Emotion_Annotations\">Park et al. (2019) Toward Dimensional Emotion Detection from Categorical Emotion Annotations<\/a><\/p>\n\n\n\n<p>[12] <a 
href=\"https:\/\/watermark.silverchair.com\/coli_a_00237.pdf?token=AQECAHi208BE49Ooan9kkhW_Ercy7Dm3ZL_9Cf3qfKAc485ysgAAAnwwggJ4BgkqhkiG9w0BBwagggJpMIICZQIBADCCAl4GCSqGSIb3DQEHATAeBglghkgBZQMEAS4wEQQMNJfRlJF7Of5zM5bGAgEQgIICLwXfxOIJWMKFpi4pui03X0w4KpLRBSufRx-FWhnjAYp9rZ8oQqDe33zkB9vPFypYac1jBkIDqZnW870VIG6qea66KvOBuTcDbH_96ONVzsCMfTA_9AoWF_g1i61paSJ2SRPkRxPkwhiU6k_DKkRiXwXeq5iKI2rywCzOKO2YfD6SZxKJlaRRviRoB8ySjEfNj_desb_BRXKpoSR2Oc6plsGzTci5KAdg8RkIAvCr1j7u8Ld3gTColXf0w9BFmDGRBXOtID8Wf128kncYaV6N7nYM9hkVYboxmX89qqKMV_3iuvTs4Ut19qcqRWLZtIifgFkQzNQgNr8QzEPbChUbThInupeacEPTPbb4d2DyKCklSDlcYAkTOlL1y_mHS7jYHxXB3tlSEL3eSmqv2pMacctJe4RQmv1NyQebLEsa5nyEx9e5yTrjG5pEteYE7vhkpe9UPdKF2r0iFpMdO3QjEfkhY0K9kJFV4EiL-fG8bjVlQcGaLFQu7v_lUOM6PHPVucQVP04LrU-wBQxFvurkkeSrYb1__aPMr7RghOEX4PQrnOFAv4lkhTMTjTHXKkU_HIZ22Vwa_ZGr-ETlNqh-H2IKJexJerMpGeRbb83Ev6X_OyHb2ClBTR3rCogH8QO_aa3vqL0T7KEaFQ5XTT-0bLRCtTTL-wClnbSUklyyZLApnTUE6Ptxn2UoPvkeBmdP6DUb2lNm3XsfVZd5wiWmHxSRC_1RT6XJjgiA4hFyCEg\">SimLex-999: Evaluating Semantic Models With (Genuine) Similarity Estimation<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Semantic similarity is often estimated using word2vec, but the freely available algorithms are not good enough [1, 2, 3]. 
Other [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[27],"class_list":["post-925","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-analysis"],"_links":{"self":[{"href":"https:\/\/dialexity.com\/blog\/wp-json\/wp\/v2\/posts\/925","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dialexity.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dialexity.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dialexity.com\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/dialexity.com\/blog\/wp-json\/wp\/v2\/comments?post=925"}],"version-history":[{"count":5,"href":"https:\/\/dialexity.com\/blog\/wp-json\/wp\/v2\/posts\/925\/revisions"}],"predecessor-version":[{"id":932,"href":"https:\/\/dialexity.com\/blog\/wp-json\/wp\/v2\/posts\/925\/revisions\/932"}],"wp:attachment":[{"href":"https:\/\/dialexity.com\/blog\/wp-json\/wp\/v2\/media?parent=925"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dialexity.com\/blog\/wp-json\/wp\/v2\/categories?post=925"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dialexity.com\/blog\/wp-json\/wp\/v2\/tags?post=925"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}