1. Introduction
Language acquisition in children follows a remarkably consistent sequence: from discriminating speech sounds, to building a vocabulary, to mastering complex syntactic structures. This developmental trajectory, observed from infancy to about six years of age, raises fundamental questions about the underlying computational principles. Is this staged learning a property specific to human neurobiology, or can it also emerge in artificial systems? This study addresses the question directly by comparing the learning trajectories of 54 children (aged 18 months to 6 years) with those of 48 GPT-2 models trained from scratch. The central hypothesis is that if similar stages appear in both, this may point to shared, data-driven learning principles.
2. Methodology
The study uses a comparative design, examining both human and artificial learners at multiple points in their development.
2.1 Experimental Setup
Children: The language production of 54 children was analyzed. Their spontaneous speech and their ability to repeat sentences of varying syntactic complexity were assessed, following the methods established by Friedmann et al. (2021).
GPT-2 models: 48 instances of GPT-2 (the 124-million-parameter variant) were trained from random initialization with a standard language-modeling objective on a text corpus (e.g., WebText). Their internal states were probed at regular intervals throughout training.
2.2 Data Collection & Probes
A set of 96 diagnostic probes was assembled from established benchmarks:
- BLiMP: To evaluate grammatical knowledge across 67 syntactic phenomena.
- Zorro: To probe semantic and commonsense reasoning.
- BIG-Bench: To assess broader linguistic and cognitive abilities.
These probes were applied to the GPT-2 models at each training checkpoint and served as measures analogous to the children's production tasks.
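To illustrate how a minimal-pair probe of the BLiMP kind can be scored, here is a small sketch. It is not the study's pipeline: the bigram scorer is a toy stand-in for a GPT-2 checkpoint, and the names `minimal_pair_accuracy` and `make_bigram_scorer`, the tiny corpus, and the sentence pairs are all assumptions made for this example. The idea is only that a probe counts as passed on a pair when the model assigns a higher log-probability to the grammatical sentence.

```python
import math
from collections import Counter


def minimal_pair_accuracy(pairs, score):
    """Fraction of (grammatical, ungrammatical) pairs where the
    grammatical sentence gets a strictly higher score."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one sentence pair is required")
    correct = sum(1 for good, bad in pairs if score(good) > score(bad))
    return correct / len(pairs)


def make_bigram_scorer(corpus, alpha=1.0):
    """Toy stand-in for a language model (hypothetical helper):
    add-alpha smoothed bigram log-probability of a sentence."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    vocab_size = len(unigrams)

    def score(sentence):
        tokens = ["<s>"] + sentence.lower().split()
        return sum(
            math.log((bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * vocab_size))
            for a, b in zip(tokens, tokens[1:])
        )

    return score


# Illustrative data only; not taken from BLiMP, Zorro, or BIG-Bench.
corpus = ["the dogs bark", "the dog barks", "the cats sleep", "the cat sleeps"]
pairs = [
    ("the dogs bark", "the dogs barks"),
    ("the cat sleeps", "the cat sleep"),
]
score = make_bigram_scorer(corpus)
accuracy = minimal_pair_accuracy(pairs, score)
print(f"probe accuracy: {accuracy:.2f}")
```

With a real model, `score` would be replaced by the summed token log-probabilities from a given checkpoint, and the same accuracy would be recomputed at every checkpoint to trace a learning curve.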
3. Results & Analysis
3.1 Comparison of Learning Trajectories
The analysis shows that GPT-2 models, like children, acquire linguistic skills in a systematic order. Simpler tasks (e.g., basic grammatical agreement) are mastered early in training, while more complex tasks (e.g., nested syntactic structures such as relative clauses) require many more training steps, the analogue of developmental time.
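One way to make "acquired in a systematic order" concrete is to record, for each probe, the first checkpoint at which its accuracy reaches a fixed threshold, and then sort the probes by that step. The sketch below does this under stated assumptions: the 0.8 threshold, the probe names, the checkpoint steps, and the accuracy values are all illustrative, not results from the study.

```python
def acquisition_step(steps, accuracies, threshold=0.8):
    """Return the first training step whose accuracy reaches the
    threshold, or None if the probe is never acquired."""
    for step, acc in zip(steps, accuracies):
        if acc >= threshold:
            return step
    return None


def acquisition_order(steps, curves, threshold=0.8):
    """Return probe names sorted by acquisition step.
    Probes that never reach the threshold are left out."""
    acquired = {}
    for name, accuracies in curves.items():
        step = acquisition_step(steps, accuracies, threshold)
        if step is not None:
            acquired[name] = step
    return sorted(acquired, key=acquired.get)


# Hypothetical checkpoints and accuracy curves, for illustration only.
steps = [100, 1000, 10000, 100000]
curves = {
    "basic_agreement": [0.55, 0.85, 0.95, 0.97],
    "relative_clause": [0.50, 0.60, 0.72, 0.88],
    "long_distance": [0.50, 0.52, 0.60, 0.70],
}
order = acquisition_order(steps, curves)
print(order)
```

The same procedure applies to children if training steps are replaced by age and probe accuracy by a production or repetition score.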
3.2 Parallel Learning
A key finding is that learning proceeds in parallel. Even tasks that are fully acquired only late in training show measurable improvement from the earliest steps. This suggests that the model builds foundational representations that are refined continuously, rather than acquiring skills in a strict, isolated sequence.
3.3 Shared and Divergent Stages
The study identifies both substantial overlap and important differences:
- Shared: The broad progression from simpler to more complex syntactic forms.
- Divergent: The specific ordering of some sub-skills differs. For example, the models may acquire certain formal syntactic rules in a different order than children do, possibly because of differences between the training data distribution and the perceptual and social experience of humans.
This suggests that while data-driven pressure gives rise to staged learning, the details of the stage sequence are shaped by the learner's architecture and its input.
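The degree of overlap between two acquisition orders can be summarized with a rank correlation. The sketch below computes Spearman's rho in plain Python; it assumes there are no ties and uses made-up acquisition times for four skills, so neither the choice of statistic nor the numbers should be read as the study's own.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation using the no-ties formula
    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical acquisition times for the same four skills:
# training step for the models, age in months for the children.
model_steps = [1000, 5000, 20000, 80000]
child_months = [24, 36, 30, 60]
rho = spearman_rho(model_steps, child_months)
print(f"rank correlation: {rho:.2f}")
```

A value near 1 would indicate that models and children acquire the skills in almost the same order, while a value near 0 would indicate unrelated orders; the "similar but not identical" pattern described above corresponds to something in between.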
Key Experimental Figures
Models trained: 48 GPT-2 instances
Diagnostic probes: 96 tasks from BLiMP, Zorro, and BIG-Bench
Child participants: 54 (18 months to 6 years)
Core finding: The order of learning stages is significantly correlated between children and models, but it is not identical.
4. Technical Framework
4.1 Mathematical Formulation
The core training objective of GPT-2 is next-token prediction by maximum likelihood estimation. Given a token sequence $x_1, x_2, ..., x_t$, the model parameterized by $\theta$ is trained to minimize the negative log-likelihood:
$L(\theta) = -\sum_{t} \log P(x_t | x_{<t}; \theta)$
The probe accuracy $A_p(\theta, \tau)$ for a given linguistic probe $p$ at training step $\tau$ measures an emerging ability. The learning trajectory is the function $\tau \rightarrow \{A_{p_1}(\theta, \tau), A_{p_2}(\theta, \tau), ...\}$. The analysis compares the order in which different probes $p$ cross a performance threshold (e.g., 80% accuracy) over $\tau$ for the models and over age for the children.
4.2 Example Analysis Framework
Case: Tracking the Acquisition of Relative Clauses
Probe task: Distinguish a grammatical sentence ("The boy that I saw sang") from an ungrammatical counterpart.
This framework allows a quantitative comparison of developmental timelines across fundamentally different learning systems.
5. Visualization of Results
Conceptual figure: Comparison of Learning Trajectories
The results can be shown on a dual-axis plot. The plot would show an S-shaped learning curve for each skill in both trajectories, with the ordering of the curves (which skill rises first) similar but not identical.
A second important figure would be a heatmap of the correlation matrix between the acquisition order of all 96 probes in the model ensemble and the order observed in children, highlighting clusters of high and low correlation.
6. Core Insight & Analyst's Perspective
Core Insight: This paper delivers an important, data-rich finding: the stages of language acquisition are not a uniquely human mystery but an emergent property of incremental, data-driven optimization under constraints. The blueprint for these stages, however, is co-written by the learner's architecture. GPT-2 and children converge on a "simple-to-complex" curriculum because the data contains that curriculum. They diverge on the details because the "inductive biases" of the transformer (Vaswani et al., 2017) differ from a child's cognitive and perceptual priors.
Logical Flow: The argument is well constructed. It starts from a well-established empirical fact (ordered stages in children), poses a computational question (does this order emerge in AI?), and tests it with a robust multi-probe methodology. The move from showing that "an order exists" to analyzing its "parallel nature" and finally to separating "shared" from "divergent" elements is logically sound. It resembles the analytic progression of foundational works such as the CycleGAN paper (Zhu et al., 2017), which did not merely introduce a new model but decomposed the problem of unpaired image translation into cycle-consistency constraints.
Strengths & Flaws: The study's strength is its methodological rigor and direct comparison. Using multiple model instances and a large probe set reduces noise. The main flaw, acknowledged implicitly, is the mismatch in measurement: production in children versus internal probe accuracy in models. Is a model "knowing" a syntactic rule in a probe equivalent to a child "using" it in spontaneous speech? Not necessarily. This echoes critiques of benchmarks such as ImageNet, where models learn shortcuts (Geirhos et al., 2020). The probe set, although broad, may not capture the integrated, communicative nature of human language acquisition.
Actionable Insights: For AI researchers, this is a gold mine for curriculum learning and model diagnostics. If we want models to learn like humans, we need to engineer training data sequences or loss functions that better match the human developmental schedule. For cognitive scientists, the work offers a new, controllable testbed: change the model architecture (e.g., introduce recurrent connections as in an LSTM) or the training data (e.g., add multimodal input), and observe how the developmental trajectory changes. This could help isolate the contribution of specific human biases. The broader insight is that building better AI and understanding human cognition are now a single, intertwined endeavor.
7. Future Applications & Directions
8. References