Investigating the Worldview of Language Models for Fiction Generation

An analysis of LLMs' ability to maintain consistent fictional worlds, revealing their shortcomings in narrative consistency and state retention for creative writing.

1. Introduction

Large Language Models (LLMs) have become everyday tools in computational creativity, with growing applications in fiction generation. Fiction, however, requires more than linguistic competence: it requires creating and maintaining a coherent story world that departs from reality while remaining internally consistent. This paper investigates whether current LLMs possess the "worldview" or internal state needed to generate compelling fiction, moving beyond simple text completion toward genuine narrative construction.

The core challenge lies in the difference between retrieving factual knowledge and building a fictional world. While LLMs excel at pattern matching and information synthesis, they struggle to maintain consistent alternative worlds, a basic requirement for writing fiction. The study evaluates nine LLMs on consistency measures and story generation tasks, revealing significant limitations in current architectures.

2. Research Questions & Methodology

The study uses a structured evaluation framework to assess the suitability of LLMs for fiction generation, focusing on two key capabilities.

2.1. Core Research Questions

  • Consistency: Can LLMs identify and reproduce information consistently across different contexts?
  • Robustness: Are LLMs robust to changes in prompt wording when reproducing fictional information?
  • World State Maintenance: Can LLMs maintain a consistent fictional "state" throughout story generation?

2.2. Model Selection & Evaluation Framework

The study evaluates nine LLMs spanning different sizes, architectures, and training regimes (both open-source and closed-source). The evaluation framework consists of:

  1. Worldview Queries: A series of prompts designed to probe consistency in recalling fictional facts.
  2. Story Generation Task: Direct generation of short fictional stories under specific world-building constraints.
  3. Cross-Model Comparison: Analysis of narrative patterns and consistency across different architectures.
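The worldview queries in step 1 can be illustrated with a small sketch. The paper's exact scoring rule is not given here, so the agreement-based score below is an assumption, and `toy_model`, `normalize`, and `consistency_score` are hypothetical names standing in for a real LLM call and a real metric.

```python
from collections import Counter
from typing import Callable, Iterable


def normalize(answer: str) -> str:
    """Lowercase and drop punctuation so trivially different answers compare equal."""
    return "".join(ch for ch in answer.lower() if ch.isalnum() or ch.isspace()).strip()


def consistency_score(model: Callable[[str], str], paraphrases: Iterable[str]) -> float:
    """Fraction of paraphrased prompts whose answer matches the most common answer.

    This is an assumed, illustrative metric, not the paper's definition.
    """
    answers = [normalize(model(p)) for p in paraphrases]
    if not answers:
        raise ValueError("need at least one prompt")
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / len(answers)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: it only honors the premise when 'gravity' is named."""
    return "Sideways." if "gravity" in prompt.lower() else "Down"


paraphrases = [
    "In this world, which way does gravity pull?",
    "Which direction do dropped objects move here?",
    "Describe the direction of gravity in this world.",
]
score = consistency_score(toy_model, paraphrases)
print(f"consistency: {score:.2f}")
```

A score of 1.0 would mean the model gave the same answer to every rewording; lower values indicate sensitivity to prompt wording, which is what the robustness question probes.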

Evaluation Scope

Models Tested: 9 LLMs

Primary Metric: Worldview Consistency Score

Secondary Metric: Narrative Uniformity Index

3. Experimental Results & Analysis

The experimental results reveal fundamental limits on the ability of current LLMs to act as fiction generators.

3.1. Worldview Consistency Evaluation

Only two of the nine models evaluated maintained a consistent worldview across all queries. The other seven showed significant self-contradictions when asked to reproduce or elaborate on fictional facts established earlier in the interaction. This suggests that most LLMs lack a persistent internal state mechanism for tracking the parameters of a fictional world.

Key Finding: Most models fall back on statistically likely answers instead of honoring established fictional constraints, pointing to a fundamental mismatch between next-token prediction and narrative state management.

3.2. Story Generation Quality Analysis

Analysis of stories generated by four representative models revealed "surprisingly uniform narrative patterns" across architectures. Despite different training data and parameter counts, the generated stories converged on similar plot structures, character types, and resolution patterns.

Implication: This uniformity suggests that LLMs do not actually generate fiction from an internal world model but instead recombine learned narrative templates. The absence of a distinct "authorial voice" or consistent world-building points to the lack of state maintenance that genuine fiction requires.

Figure 1: Narrative Uniformity Across Models

The analysis found that 78% of generated stories followed one of three basic plot structures, regardless of the initial world-building prompt. Character development showed similar convergence, with 85% of protagonists exhibiting the same motivational patterns across different fictional settings.
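A share such as "stories following one of three plot structures" could be computed along the following lines once each story has been assigned a plot label. This is only a sketch: the labels below are invented for illustration and are not the study's data, and `top_k_share` is a hypothetical helper.

```python
from collections import Counter
from typing import Sequence


def top_k_share(labels: Sequence[str], k: int = 3) -> float:
    """Fraction of stories covered by the k most frequent plot labels."""
    if not labels:
        raise ValueError("need at least one labeled story")
    counts = Counter(labels)
    covered = sum(count for _, count in counts.most_common(k))
    return covered / len(labels)


# Hypothetical plot labels for ten generated stories (not real results).
labels = ["quest"] * 4 + ["discovery"] * 3 + ["rebellion"] * 2 + ["romance"]
share = top_k_share(labels)
print(f"top-3 share: {share:.0%}")
```

A high top-3 share across very different world-building prompts would be one signal of the convergence described above.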

4. Technical Framework & Mathematical Formulation

The core challenge can be framed as a state maintenance problem. Let $W_t$ denote the world state at time $t$, containing all established fictional facts, character traits, and narrative constraints. For an LLM generating fiction, we expect:

$P(response_{t+1} | prompt, W_t) \neq P(response_{t+1} | prompt)$

That is, the model's response should depend on both the immediate prompt and the accumulated world state $W_t$. However, current transformer-based architectures primarily optimize for:

$\max_{\theta} \sum_{i=1}^{n} \log P(w_i | w_{<i}; \theta)$

where $\theta$ denotes the model parameters and $w_i$ are the tokens. This next-token prediction objective does not encourage maintaining $W_t$ beyond the immediate context window.
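One way to make the context-window limitation concrete is to write the same objective with an explicit window. The window length $k$ and the position index $j$ are introduced here purely for illustration and do not appear in the paper.

```latex
% Next-token objective with a finite context window of k tokens
% (k is an illustrative assumption; indices below 1 are simply dropped).
\max_{\theta} \sum_{i=1}^{n} \log P\left(w_i \mid w_{i-k}, \dots, w_{i-1};\, \theta\right)

% A fact stated at position j can influence the prediction of w_i only while
i - k \le j \le i - 1
```

Under this reading, once $i - j > k$ the token that established a fictional fact has left the conditioning set, so nothing in the objective ties the prediction of $w_i$ to that part of $W_t$.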

The study suggests that successful fiction generation requires mechanisms similar to those in neuro-symbolic systems or external-memory architectures, where the world state $W_t$ is explicitly maintained and updated, as discussed in work such as the Differentiable Neural Computer (Graves et al., 2016).
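As a rough illustration of what "explicitly maintained and updated" can mean, the sketch below keeps $W_t$ in a plain key-value store and prepends it to every prompt. This is far simpler than a Differentiable Neural Computer and is not the paper's method; `WorldState` and `build_prompt` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class WorldState:
    """Explicit store for W_t: established fictional facts keyed by name."""

    facts: dict[str, str] = field(default_factory=dict)

    def update(self, key: str, value: str) -> None:
        """Record a new fact or overwrite an existing one."""
        self.facts[key] = value

    def render(self) -> str:
        """Serialize the facts in a stable order so they can be prepended to a prompt."""
        lines = [f"- {key}: {value}" for key, value in sorted(self.facts.items())]
        return "Established world facts:\n" + "\n".join(lines)


def build_prompt(state: WorldState, user_prompt: str) -> str:
    """Condition every request on the accumulated world state, not just the new prompt."""
    return f"{state.render()}\n\n{user_prompt}"


state = WorldState()
state.update("gravity", "sideways")
state.update("homes", "built into cliff faces")
prompt = build_prompt(state, "Describe a typical commute.")
print(prompt)
```

The design choice here is that the state lives outside the model, so it cannot drift between turns; whether the model actually honors the rendered facts is a separate question.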

5. Case Study: Failure to Track World State

Scenario: A model is instructed to generate a story about "a world where gravity works sideways." After this premise is established, subsequent prompts ask about daily life, architecture, and transportation in this world.

Observation: Most models quickly revert to ordinary gravity assumptions within 2-3 response turns, contradicting the established premise. For example, after describing "houses built into cliff faces," a model may mention "falling from a building" without recognizing the contradiction in a sideways-gravity world.

Analysis Framework: This can be modeled as a state-tracking failure in which the model's internal representation of $W_t$ does not properly update or persist the fictional constraint $C_{gravity} = \text{sideways}$. The probability distribution over responses gradually drifts back toward the training distribution $P_{train}(\text{gravity concepts})$ instead of remaining conditioned on $C_{gravity}$.
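A crude way to detect this kind of drift is to scan each response turn for phrasing that presupposes ordinary gravity. The sketch below is a keyword-level approximation only: the rule table `CONFLICT_PATTERNS`, the helper `find_violations`, and the sample turns are all hypothetical and were written for this illustration.

```python
import re

# Hypothetical rule table: for each (constraint, value) pair, phrasings that
# presuppose ordinary downward gravity and therefore conflict with the premise.
CONFLICT_PATTERNS = {
    ("gravity", "sideways"): [
        r"\bfall(?:s|ing|en)?\s+(?:down|from)\b",
        r"\bdropped\s+to\s+the\s+ground\b",
    ],
}


def find_violations(constraints: dict[str, str], response: str) -> list[tuple[str, str]]:
    """Return (constraint name, offending phrase) pairs found in one response."""
    violations = []
    for key, value in constraints.items():
        for pattern in CONFLICT_PATTERNS.get((key, value), []):
            match = re.search(pattern, response, flags=re.IGNORECASE)
            if match:
                violations.append((key, match.group(0)))
    return violations


constraints = {"gravity": "sideways"}
turns = [
    "Homes are built into cliff faces, with doors opening toward the pull.",
    "Workers fear falling from a building during storms.",
]
report = [find_violations(constraints, turn) for turn in turns]
print(report)
```

Pattern matching like this will miss paraphrased contradictions and can flag harmless sentences, so it is best read as a diagnostic aid rather than a measure of $W_t$ itself.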

Implication: Without dedicated mechanisms for maintaining fictional constraints, LLMs cannot be reliable fiction generators, regardless of their linguistic skill.

6. Future Applications & Research Directions

The findings point to several promising research directions for improving LLMs' fiction generation capabilities:

  • Dedicated World State Modules: Architectures that separate narrative state tracking from language generation, potentially using external memory or symbolic representations.
  • Consistency-Focused Training: Fine-tuning objectives that explicitly reward maintaining fictional constraints across extended contexts.
  • Human-in-the-Loop Systems: Collaborative interfaces in which humans manage the world state while LLMs handle linguistic realization, similar to the co-creative systems explored in Yuan et al. (2022).
  • Specialized Fiction Models: Domain-specific training on curated fiction corpora with explicit annotation of world-building elements and narrative arcs.
  • Evaluation Benchmarks: Developing standardized benchmarks for fictional consistency, going beyond conventional language modeling metrics to assess narrative coherence and world state maintenance.

These approaches could bridge the gap between current LLM capabilities and the demands of genuine fiction generation, potentially enabling new forms of computational creativity and interactive storytelling.

7. References

  1. Graves, A., et al. (2016). Hybrid computing using a neural network with dynamic external memory. Nature, 538(7626), 471-476.
  2. Patel, A., et al. (2024). Large Language Models for Interactive Storytelling: Opportunities and Challenges. Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment.
  3. Riedl, M. O., & Young, R. M. (2003). Character-focused narrative generation for storytelling in games. Proceedings of the AAAI Spring Symposium on Artificial Intelligence and Interactive Entertainment.
  4. Tang, J., Loakman, T., & Lin, C. (2023). Towards coherent story generation with large language models. arXiv preprint arXiv:2302.07434.
  5. Yuan, A., et al. (2022). Wordcraft: A Human-AI Collaborative Editor for Story Writing. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems.
  6. Yang, L., et al. (2023). Improving coherence in long-form story generation with large language models. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics.

8. Analyst's Perspective: The Fiction Generation Gap

Core Insight

The paper exposes a critical but often overlooked flaw in the LLM hype cycle: these models are fundamentally reactive pattern matchers, not proactive world builders. The industry has been selling the fiction of "creative AI" while the models themselves cannot maintain basic fictional consistency. This is not a scaling problem; it is architectural. As the study shows, even the largest models fail at what human writers treat as a basic craft: keeping their story worlds consistent.

Logical Flow

The study's methodology cleverly isolates the core problem. By testing consistency on simple fictional facts rather than measuring linguistic quality, the authors bypass the surface polish of LLM prose to reveal the structural hollowness underneath. The progression from worldview queries to story generation shows that inconsistency is not just a minor bug: it directly degrades narrative output. The uniform stories across models confirm that we are facing a systemic limitation, not the failure of a single model.

Strengths & Flaws

Strengths: The study provides a much-needed reality check for an overhyped application area. By focusing on state maintenance rather than surface features, it identifies the real bottleneck for fiction generation. The comparison across nine models provides strong evidence that this is a general LLM limitation.

Flaws: The paper understates the commercial implications. If LLMs cannot maintain fictional consistency, their value for professional writing tools is severely limited. This is not merely an academic concern; it affects product roadmaps at every major AI company currently marketing "creative writing assistants." The study also does not connect to related work in game AI and interactive narrative, where state tracking has been a solved problem for decades using symbolic methods.

Actionable Insights

First, AI companies need to stop marketing LLMs as fiction writers until they solve the state maintenance problem. Second, researchers should look beyond pure transformer architectures: hybrid neuro-symbolic approaches, such as those pioneered in DeepMind's Differentiable Neural Computer, offer proven mechanisms for persistent state management. Third, the evaluation framework developed here should become standard for any "creative AI" benchmark. Finally, there is a product opportunity in building interfaces that explicitly separate world state management from text generation, turning a limitation into a feature for human-AI collaboration.

The paper's most valuable contribution may be its implicit warning: we are building fluent language models without addressing the fundamental architectural constraints that keep them from achieving genuine narrative competence. Until we solve the state problem, LLM-generated fiction will remain what it currently is: well-written nonsense.