Many sentiment analysis tasks require extracting triplets, i.e. Subject - Verb - Object, from a sentence. While there are many approaches to the problem, I recently stumbled upon a fairly easy-to-implement algorithm in a research paper (http://ailab.ijs.si/dunja/SiKDD2007/Papers/Rusu_Trippels.pdf).
The algorithm
function TRIPLET-EXTRACTION(sentence) returns a solution, or failure
    result ← EXTRACT-SUBJECT(NP_subtree) ∪ EXTRACT-PREDICATE(VP_subtree) ∪ EXTRACT-OBJECT(VP_siblings)
    if result ≠ failure then return result
    else return failure

function EXTRACT-ATTRIBUTES(word) returns a solution, or failure
    // search among the word’s siblings
    if adjective(word)
        result ← all RB siblings
    else if noun(word)
        result ← all DT, PRP$, POS, JJ, CD, ADJP, QP, NP siblings
    else if verb(word)
        result ← all ADVP siblings
    // search among the word’s uncles
    if noun(word) or adjective(word)
        if uncle = PP
            result ← uncle subtree
    else if verb(word) and (uncle = verb)
        result ← uncle subtree
    if result ≠ failure then return result
    else return failure

function EXTRACT-SUBJECT(NP_subtree) returns a solution, or failure
    subject ← first noun found in NP_subtree
    subjectAttributes ← EXTRACT-ATTRIBUTES(subject)
    result ← subject ∪ subjectAttributes
    if result ≠ failure then return result
    else return failure

function EXTRACT-PREDICATE(VP_subtree) returns a solution, or failure
    predicate ← deepest verb found in VP_subtree
    predicateAttributes ← EXTRACT-ATTRIBUTES(predicate)
    result ← predicate ∪ predicateAttributes
    if result ≠ failure then return result
    else return failure

function EXTRACT-OBJECT(VP_subtree) returns a solution, or failure
    siblings ← find NP, PP and ADJP siblings of VP_subtree
    for each value in siblings do
        if value = NP or PP
            object ← first noun in value
        else
            object ← first adjective in value
        objectAttributes ← EXTRACT-ATTRIBUTES(object)
    result ← object ∪ objectAttributes
    if result ≠ failure then return result
    else return failure
Implementation
The above algorithm works on the parse tree generated by a parser such as the Stanford Parser or the OpenNLP Parser. I used the Stanford Parser and supplied the parse tree it generated to my triplet extractor. For my work, I needed the sentence triplets along with only those attributes that support sentiment, not all of them. So my implementation skips the extraction of attributes from the word's "uncles" described in the algorithm.
I have implemented the algorithm in Java. You can find my work at this link: (https://github.com/SushantKafle/TripletExtraction).
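To get a feel for the core steps without cloning the repository, here is a minimal sketch in Java. It is not the code from the repository above and it does not call the Stanford Parser. It assumes the parse tree is already available as a Penn Treebank style bracketed string, and every name in it (TripletSketch, Node, parse, extract and the helpers) is made up for illustration. It skips attribute extraction entirely, reads "first noun" as the first noun in breadth-first order, and looks for the object among the siblings of the deepest verb, which is my reading of the pseudocode rather than something the paper guarantees.

```java
import java.util.ArrayList;
import java.util.List;

class TripletSketch {
    /** Minimal tree node (hypothetical): tag or word as label, plus parent and children. */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
        // A pre-terminal is a POS tag node whose only child is the word itself.
        boolean isPreTerminal() { return children.size() == 1 && children.get(0).children.isEmpty(); }
        String word() { return children.get(0).label; }
    }

    /** Parses a bracketed tree such as "(S (NP (NN dog)) (VP (VBD ran)))". */
    static Node parse(String s) {
        String[] tokens = s.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
        return parse(tokens, new int[] {0});
    }

    private static Node parse(String[] t, int[] pos) {
        if (!t[pos[0]].equals("(")) return new Node(t[pos[0]++]); // leaf word
        pos[0]++;                                                  // consume "("
        Node n = new Node(t[pos[0]++]);                            // constituent or POS label
        while (!t[pos[0]].equals(")")) {
            Node c = parse(t, pos);
            c.parent = n;
            n.children.add(c);
        }
        pos[0]++;                                                  // consume ")"
        return n;
    }

    /** All nodes under root (inclusive) in breadth-first order. */
    static List<Node> bfs(Node root) {
        List<Node> order = new ArrayList<>();
        order.add(root);
        for (int i = 0; i < order.size(); i++) order.addAll(order.get(i).children);
        return order;
    }

    /** First pre-terminal, breadth-first, whose POS tag starts with the prefix (e.g. "NN", "JJ"). */
    static Node firstTagged(Node root, String prefix) {
        for (Node n : bfs(root))
            if (n.isPreTerminal() && n.label.startsWith(prefix)) return n;
        return null;
    }

    static int depth(Node n) {
        int d = 0;
        while (n.parent != null) { n = n.parent; d++; }
        return d;
    }

    /** Deepest verb (tag starting with "VB") in the VP subtree; ties go to the first one found. */
    static Node deepestVerb(Node vp) {
        Node best = null;
        for (Node n : bfs(vp))
            if (n.isPreTerminal() && n.label.startsWith("VB") && (best == null || depth(n) > depth(best)))
                best = n;
        return best;
    }

    static Node child(Node n, String label) {
        for (Node c : n.children) if (c.label.equals(label)) return c;
        return null;
    }

    /** Returns {subject, predicate, object}, or null to signal failure. */
    static String[] extract(Node s) {
        if (s.label.equals("ROOT") && !s.children.isEmpty()) s = s.children.get(0);
        Node np = child(s, "NP"), vp = child(s, "VP");
        if (np == null || vp == null) return null;
        Node subject = firstTagged(np, "NN");
        Node predicate = deepestVerb(vp);
        if (subject == null || predicate == null) return null;
        Node object = null;
        for (Node sib : predicate.parent.children) {   // siblings of the deepest verb
            if (sib.label.equals("NP") || sib.label.equals("PP")) object = firstTagged(sib, "NN");
            else if (sib.label.equals("ADJP")) object = firstTagged(sib, "JJ");
            if (object != null) break;
        }
        if (object == null) return null;
        return new String[] {subject.word(), predicate.word(), object.word()};
    }

    public static void main(String[] args) {
        // Hand-written bracketing, not real parser output.
        Node tree = parse("(S (NP (DT A) (JJ rare) (JJ black) (NN squirrel)) "
                + "(VP (VBZ has) (VP (VBN become) (NP (DT a) (JJ regular) (NN visitor)) "
                + "(PP (TO to) (NP (DT a) (JJ suburban) (NN garden))))) (. .))");
        System.out.println(String.join(" - ", extract(tree))); // squirrel - become - visitor
    }
}
```

Returning null stands in for the "failure" result of the pseudocode. Adding attributes would mean collecting the relevant siblings of each chosen word (for example the DT and JJ siblings of a noun) after the triplet has been found.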

