1. Introduction
This paper introduces STRUDEL (Structured Dialogue Summarization), a new task and framework designed to improve the dialogue comprehension abilities of pre-trained language models (PLMs). Unlike traditional holistic summarization, STRUDEL decomposes dialogue comprehension into a structured, multi-perspective format that mimics how humans analyze a conversation. The core hypothesis is that this structured summary can act as an effective "meta-model" or upstream task that improves performance on downstream dialogue comprehension tasks such as Question Answering (QA) and Response Prediction.
The authors argue that, although dialogue summarization is an established standalone task, its potential as a tool for boosting performance on other NLP tasks remains underexplored. STRUDEL aims to fill this gap by giving models a focused, instructive learning signal.
2. Related Work
2.1 Abstractive Text Summarization
The paper situates STRUDEL within abstractive text summarization, which generates a concise paraphrase of the source content rather than extracting sentences from it. It cites foundational work such as the pointer-generator network of See et al. (2017) and the sequence-to-sequence framework of Rush et al. (2015), tracing the shift from extractive to generative methods. What distinguishes STRUDEL is its structured, multi-aspect format designed specifically for dialogue: instead of a single summary, it produces a decomposed analysis.
3. The STRUDEL Framework
STRUDEL is presented as a structured summarization task in which a dialogue is summarized from several predefined aspects or perspectives relevant to comprehension (e.g., key decisions, emotional shifts, action plans, conflicting viewpoints). This format forces the model to analyze the dialogue hierarchically and systematically.
The authors collected human-annotated STRUDEL summaries for 400 dialogues drawn from the MuTual and DREAM datasets, providing a key resource for training and evaluation.
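One way to picture a single STRUDEL annotation is as a record that maps each predefined aspect to a short summary string. The sketch below is illustrative only: the aspect names come from the examples in the text, while the `StructuredSummary` class, its methods, and the sample dialogue ID are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field

# Aspect names taken from the examples in the text; the actual STRUDEL
# schema may differ (assumption).
ASPECTS = (
    "key_decisions",
    "emotional_shifts",
    "action_plans",
    "conflicting_viewpoints",
)


@dataclass
class StructuredSummary:
    """Hypothetical container: one short summary per predefined aspect."""
    dialogue_id: str
    entries: dict = field(default_factory=dict)

    def add(self, aspect: str, text: str) -> None:
        # Reject aspects outside the fixed schema so annotations stay comparable.
        if aspect not in ASPECTS:
            raise ValueError(f"unknown aspect: {aspect}")
        self.entries[aspect] = text

    def missing(self) -> list:
        # Aspects that have no summary yet, in schema order.
        return [a for a in ASPECTS if a not in self.entries]


# Invented example annotation for a made-up dialogue.
summary = StructuredSummary("dlg-001")
summary.add("key_decisions", "They agree to meet on Friday.")
summary.add("emotional_shifts", "Speaker B moves from annoyed to relieved.")
```

Keeping the aspect list fixed makes it easy to check annotation completeness with `summary.missing()` before an example is used for training.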
Key Insight
STRUDEL reframes summarization not as an end goal but as a structured reasoning scaffold. It serves as an intermediate representation that explicitly directs the model's attention to the key elements of a dialogue, much as human analysts draft an outline or bullet notes before answering complex questions about a text.
4. Methodology & Model Architecture
The proposed model integrates the STRUDEL task into a dialogue comprehension pipeline. It builds on a transformer encoder (e.g., BERT, RoBERTa) for the initial encoding of the dialogue.
Key Technical Detail: A Graph Neural Network (GNN)-based Dialogue Reasoning Module is placed on top of the transformer encoder. The structured summaries (or their latent representations) are integrated into this graph to enrich the relations among dialogue utterances. Graph nodes represent utterances or summary aspects, and edges represent relational dependencies (e.g., follow-up, rebuttal, support). The GNN propagates information across this graph, enabling deeper reasoning. The combined representation from the transformer and the GNN is then used for the downstream tasks.
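The propagation step can be illustrated with a toy graph. This is a minimal sketch and not the paper's module: it uses plain mean aggregation over neighbors, whereas a real GNN would apply learned weights, relation-specific transforms, and nonlinearities. The node features, edge list, and function names are all made up for illustration.

```python
# Nodes 0-2 are utterances, node 3 is a summary-aspect node (toy example).
# Feature vectors stand in for encoder outputs; the values are made up.
features = {
    0: [1.0, 0.0],
    1: [0.0, 1.0],
    2: [1.0, 1.0],
    3: [0.0, 0.0],
}

# Undirected edges. Relation labels are illustrative and are ignored by
# this simplified, relation-agnostic update.
edges = [
    (0, 1, "follow-up"),
    (1, 2, "rebuttal"),
    (0, 3, "summarized-by"),
    (2, 3, "summarized-by"),
]


def neighbors(node, edges):
    """Return the nodes that share an edge with `node`."""
    out = []
    for src, dst, _relation in edges:
        if src == node:
            out.append(dst)
        elif dst == node:
            out.append(src)
    return out


def propagate(features, edges):
    """One round of mean aggregation: each node averages itself and its neighbors."""
    updated = {}
    for node, vec in features.items():
        group = [vec] + [features[n] for n in neighbors(node, edges)]
        updated[node] = [sum(col) / len(group) for col in zip(*group)]
    return updated


hidden = propagate(features, edges)
```

After one round, the summary node (node 3) has absorbed information from the two utterances it is linked to, and those utterances have in turn mixed in the summary node's features. This is the sense in which summary aspects can enrich utterance representations.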
Training likely involves a multi-task objective: $L = L_{\text{downstream}} + \lambda L_{\text{STRUDEL}}$, where $L_{\text{downstream}}$ is the loss for QA or response prediction, $L_{\text{STRUDEL}}$ is the loss for generating the structured summary, and $\lambda$ is a balancing hyperparameter.
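To make the weighted sum concrete, here is a small numeric sketch. It assumes both losses are cross-entropy terms, which the text does not confirm; the probabilities, the helper names, and the value `lam=0.5` are arbitrary illustrations.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target class under `probs`."""
    return -math.log(probs[target_index])


def multitask_loss(downstream_loss, strudel_loss, lam=0.5):
    """L = L_downstream + lam * L_STRUDEL (lam=0.5 is an arbitrary default)."""
    return downstream_loss + lam * strudel_loss


# Toy numbers: the model assigns probability 0.5 to the correct answer option,
# and the summary decoder assigns 0.25 and 0.5 to two reference tokens.
l_downstream = cross_entropy([0.5, 0.3, 0.2], 0)
token_losses = [-math.log(0.25), -math.log(0.5)]
l_strudel = sum(token_losses) / len(token_losses)

total = multitask_loss(l_downstream, l_strudel, lam=0.5)
```

Setting `lam` to 0 recovers plain fine-tuning on the downstream task, so sweeping it is a simple way to probe how much the auxiliary summarization signal contributes.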
5. Experimental Results
The paper reports empirical evaluation on two downstream tasks:
- Dialogue Question Answering: models must answer questions about a multi-turn dialogue.
- Dialogue Response Prediction: models must select the most appropriate next response from several options.
Results: The STRUDEL-enhanced model shows significant performance improvements over strong transformer encoder baselines on these tasks. The results support the hypothesis that structured summarization provides a stronger learning signal for comprehension than training on the downstream task alone or with an unstructured summarization objective. The paper likely includes tables comparing the accuracy/F1 scores of the proposed model against baselines such as vanilla BERT/RoBERTa and models trained with conventional summarization.
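For readers unfamiliar with the metrics mentioned, the sketch below shows how accuracy (natural for multiple-choice response prediction) and a token-overlap F1 (common in QA evaluation) are typically computed. Whether the paper uses exactly these definitions is an assumption; the predictions and answers are invented.

```python
from collections import Counter


def accuracy(predictions, gold):
    """Fraction of examples where the chosen option matches the gold option."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


def token_f1(prediction, reference):
    """Token-overlap F1 between a predicted and a reference answer string."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Invented predictions: three of four option choices are correct.
acc = accuracy(["B", "A", "D", "C"], ["B", "A", "C", "C"])
# Invented answers: the prediction contains one extra token.
f1 = token_f1("on Friday morning", "Friday morning")
```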
Chart Interpretation (Inferred from Text)
Figure 1 in the PDF conceptually depicts STRUDEL as a meta-model. A bar chart comparing performance would likely show: 1) the transformer baseline (lowest bar), 2) the same transformer fine-tuned with a conventional summarization task (moderate improvement), and 3) the transformer + STRUDEL + GNN framework (highest bar), clearly ahead of the others. Such a visual would underline the value of the structured format.
6. Technical Analysis & Key Insights
Analyst's Perspective: Deconstructing STRUDEL's Value Proposition
Core Insight: STRUDEL is not just another summarization model; it is an architectural strategy for injecting structured, human-like reasoning priors into black-box transformers. The paper's real contribution is recognizing that the bottleneck in dialogue comprehension is not raw linguistic knowledge, which PLMs have in abundance, but structured discourse reasoning. By forcing the model to produce a multi-aspect summary, the authors are in effect doing a form of semantic-level "feature engineering", creating interpretable intermediate variables that guide later computation. This is in line with the trend toward neuro-symbolic AI, where neural networks are combined with structured, rule-like representations, as discussed in work from researchers at MIT and Stanford.
Logical Flow & Comparison: The authors correctly identify a gap: prior work such as CNN/Daily Mail summarization models (See et al., 2017), and even dialogue-specific summarizers, treats the task as a single monolithic sequence-to-sequence problem. STRUDEL breaks with this paradigm. Its closest philosophical relative is arguably "Chain-of-Thought" prompting, in which models are guided to produce intermediate reasoning steps. However, STRUDEL bakes this structure into the model architecture and training objective, which should make it more robust and less prompt-dependent. Compared with simply applying a GNN to dialogue utterances (a technique seen in work such as DialogueGCN), STRUDEL gives the GNN semantically rich, pre-digested node features (the summary aspects), which leads to more meaningful graph propagation.
Strengths & Flaws: The strengths are its conceptual simplicity and strong empirical results. The multi-task setup combined with a GNN is a powerful pairing. The paper's main weakness is its reliance on a human-defined summary schema. What are the "right" aspects to summarize? Defining them requires costly annotation, and one schema may not suit every dialogue domain (e.g., customer service vs. psychotherapy). The model's performance is tied to the quality and relevance of this predefined schema. In addition, while the GNN adds relational reasoning, it also adds complexity. An ablation study (which the paper ought to include) would be essential to determine whether the gains come from the structure, the GNN, or their combination.
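The ablation suggested above amounts to toggling each component independently and training one model per combination. A minimal sketch of that experiment grid follows; the component names and the `ablation_grid` helper are illustrative labels, not configuration keys from the paper.

```python
from itertools import product

# Two components to toggle; the names are illustrative.
components = ("strudel_objective", "gnn_module")


def ablation_grid(components):
    """Enumerate every on/off combination of the given components."""
    grid = []
    for flags in product((False, True), repeat=len(components)):
        grid.append(dict(zip(components, flags)))
    return grid


configs = ablation_grid(components)
for config in configs:
    enabled = [name for name, on in config.items() if on]
    print(" + ".join(enabled) if enabled else "baseline encoder only")
```

Comparing the four resulting runs would separate the contribution of the structured objective from that of the graph module, and show whether the two interact.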
Actionable Insights: For practitioners, this work suggests that adding a structured intermediate task can be a more effective way to adapt PLMs to complex NLP problems than direct fine-tuning alone. When building conversational AI, consider what a "structured summary" would look like in your domain (e.g., for technical support: "problem stated," "troubleshooting steps," "resolution") and use it as an auxiliary training signal. For researchers, the next step is to automate or learn the summary schema itself, perhaps via unsupervised methods or reinforcement learning, moving beyond human annotation toward truly adaptive structured reasoning models.
7. Analysis Framework Example
Scenario: Analyzing a project meeting dialogue to predict the next action item.
STRUDEL-like Structured Analysis (Non-Code):
- Aspect 1 - Decisions Made: "The team decided to delay the launch of Feature X by two weeks."
- Aspect 2 - Action Items Assigned: "Alice to finalize the API documentation. Bob to run the security audit."
- Aspect 3 - Open Issues/Questions: "Budget for additional testing is unresolved. The dependency on Team Y is a major risk."
- Aspect 4 - Next Steps Discussed: "Schedule a follow-up meeting with Team Y. Draft a communication plan for the delay."
Comprehension Task (Response Prediction): Given the dialogue and the structured summary above, a model can more readily predict that the manager's next response will be: "I'll set up a meeting with Team Y's lead for tomorrow." The structure surfaces the relevant "Open Issue" and "Next Step" directly, reducing ambiguity.
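Although the analysis above is written without code, it may help to see how such a summary could be fed to a response-selection model. The sketch below simply concatenates the dialogue, the aspect summaries, and each candidate response into one input string per candidate. The `[SEP]` layout, the `build_input` helper, the dialogue turns, and the second candidate are all assumptions made for illustration; the paper's actual input format is not described here.

```python
# Aspect summaries copied from the example above.
structured_summary = {
    "Decisions Made": "The team decided to delay the launch of Feature X by two weeks.",
    "Action Items Assigned": "Alice to finalize the API documentation. Bob to run the security audit.",
    "Open Issues": "Budget for additional testing is unresolved. The dependency on Team Y is a major risk.",
    "Next Steps Discussed": "Schedule a follow-up meeting with Team Y. Draft a communication plan for the delay.",
}

# Hypothetical dialogue turns and candidate responses, invented for illustration.
dialogue_turns = [
    "Manager: Where are we on Feature X?",
    "Alice: We need two more weeks, and we still depend on Team Y.",
]
candidates = [
    "I'll set up a meeting with Team Y's lead for tomorrow.",
    "Great, let's launch Feature X today.",
]


def build_input(turns, summary, candidate, sep=" [SEP] "):
    """Join dialogue, aspect summaries, and one candidate into a single string."""
    dialogue = " ".join(turns)
    aspects = " ".join(f"{name}: {text}" for name, text in summary.items())
    return sep.join([dialogue, aspects, candidate])


# One encoder input per candidate; a trained model would score each of these.
inputs = [build_input(dialogue_turns, structured_summary, c) for c in candidates]
```

A scoring model would then rank the strings in `inputs`; with the "Open Issues" and "Next Steps Discussed" text placed directly next to each candidate, the link to the Team Y meeting is explicit rather than buried in the raw dialogue.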
8. Future Applications & Directions
- Domain-Specific Dialogue Assistants: In legal, medical, or customer service dialogues, the STRUDEL schema could be tailored to extract structured case notes, symptom summaries, or issue trees, directly improving decision support systems.
- Automated Meeting Minutes: Going beyond generic summaries to produce structured minutes with sections for Attendees, Goals, Decisions, Action Items (Owner/Deadline), and Key Discussion Points.
- Interactive Tutoring Systems: Structuring student-tutor dialogues to track concept understanding, misconceptions, and learning progress, enabling more personalized tutoring.
- Research Direction - Self-Structuring Models: A major future direction is moving from human-defined summary aspects to learned or emergent structures. Techniques from topic modeling, clustering of latent representations, or reinforcement learning could let a model discover the most useful summary perspectives for a given task on its own.
- Multimodal Dialogue Comprehension: Extending the STRUDEL idea to video meetings or embodied conversations, where structure must be derived from speech, text, and visual cues.
9. References
- Chen, J., et al. (2021). Recent Advances in Dialogue Summarization. arXiv preprint.
- Cui, L., et al. (2020). MuTual: A Dataset for Multi-Turn Dialogue Reasoning. Proceedings of ACL.
- Fabbri, A., et al. (2021). ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining. Proceedings of ACL-IJCNLP.
- Gliwa, B., et al. (2019). SAMSum Corpus: A Human-annotated Dialogue Dataset for Abstractive Summarization. Proceedings of the 2nd Workshop on New Frontiers in Summarization.
- Rush, A. M., et al. (2015). A Neural Attention Model for Abstractive Sentence Summarization. Proceedings of EMNLP.
- See, A., et al. (2017). Get To The Point: Summarization with Pointer-Generator Networks. Proceedings of ACL.
- Sun, K., et al. (2019). DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension. Transactions of the Association for Computational Linguistics.
- Zhang, J., et al. (2020). PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization. Proceedings of ICML.
- Zhang, Y., et al. (2020). DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation. Proceedings of ACL: System Demonstrations.
- Zhu, C., et al. (2021). Improving Dialogue Summarization with Topic-Aware Multi-View Understanding. Findings of ACL-IJCNLP.