Anderson's Angle
Human Code From 2020 Thrashed Vibe-Coded Agents in Agentic Tests

ChatGPT and other vibe-coding tools were tested in nearly 40,000 matches, and lost to graduate student code written before large language models were invented.
In a new study from the UK, researchers pitted human-coded agents against agents vibe-coded with the latest large language models (LLMs), such as ChatGPT-5 and Claude, and found that the agents created without AI assistance beat the AI-aided versions very easily.
Both sets of agents were created by different generations of students at the Artificial Intelligence Laboratory of the Swiss Federal Institute of Technology in Lausanne (EPFL). The non-AI agents were developed as coursework in 2020, two years before the arrival of ChatGPT and the start of the LLM revolution. The new agents were created by current students, aided by the latest and best LLMs.
Even with the game rigged in their favor, the vibe-coded solutions could not win: the top five places were consistently held by the 'raw' agents, and the majority of LLM agents (33 of 40) were easily beaten by 'very simple' baseline agents, across 38,304 challenges in tournaments with varied conditions and circumstances.
The paper states:
'Our work shows that while state-of-the-art LLMs can generate code that runs (i.e., code free of syntax errors), the generated solutions are not competitive with human-designed solutions on dimensions such as strategic planning, optimization, and multi-agent competition.
'Thus, this work brings this new frontier in code generation to the forefront, and aims to facilitate the development of benchmarks, datasets and open-source baselines that stress reasoning-driven code synthesis.'
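The challenge devised was to take part creatively in an auction, using a variety of strategies, and to arrange the logistics of delivering the won items to the winners.
The authors note that the LLMs were given many advantages, such as interventions in their code to improve performance, a benefit not extended to the 2020-era code. Despite this, even when supplied with corrective code that would reliably have improved their results, the LLMs were unable to accept or make use of it: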
'In our benchmark, even when we expose a good solution in context, the LLM is still unable to make use of it.
'This result also raises interesting future research questions about the limits of in-context learning and retrieval-augmented problem solving in complex scenarios.'
The LLMs used in the tests were GPT-5 Thinking, Gemini 2.5 Pro, Claude Opus 4.1 and DeepSeek R1*.
The new paper is titled Can Vibe Coding Beat Graduate CS Students? An LLM vs. Human Coding Tournament on Market-driven Strategic Planning, and comes from one author at the University of Southampton and another at the University of Oxford and the Alan Turing Institute. According to the authors, the benchmark will be released shortly.
Method
The authors note that traditional tests in this field focus on challenges with clearly defined binary solutions (correct or incorrect), verified by unit tests. Arguing that this is not an ideal way to explore the limits of LLM-aided code, they instead devised a more complex challenge scenario with multiple internal benchmarks and milestones, in which victory is possible but by no means simple:
![A comparison of the standard unit-test-based approach (top) and the open-ended challenge scenario devised by the authors (bottom, blue). Source: https://arxiv.org/pdf/2511.20613](https://www.unite.ai/wp-content/uploads/2025/11/figure-1-2.jpg)
A comparison of the standard unit-test-based approach (top) and the open-ended challenge scenario devised by the authors (bottom, blue). Source: https://arxiv.org/pdf/2511.20613
The Auction, Pickup and Delivery Problem (APDP) used in the study was partly self-selecting, since a corpus of student work from 2020 was available at the Swiss university. That coursework aimed to create automated agents for the APDP task before AI-facilitated development became possible, so it was relatively straightforward to set modern students the same brief and let them use the latest tools.
The authors had to avoid common testing frameworks such as HumanEval, BigCodeBench and WebDev Arena (among many others), since test procedures of this kind are exposed to data contamination (i.e., cases where a system has been trained on the test data rather than on a separate training split).
APDP is based on a two-stage logistics problem: a reverse auction, followed by vehicle routing. In the first stage, agents compete for tasks by bidding the amount they should be paid for completing each delivery task. Bidding too high makes it hard to win tasks, while bidding too low can lead to losses.
In the second stage, each agent must create an efficient plan to fulfil only the tasks it has won, assigning them to vehicles of differing capacity and cost, under time and resource constraints:

In the APDP, companies bid in reverse auctions for delivery tasks, and aim to maximize profit by optimizing vehicle routes to carry out only the tasks they have won.
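The auction stage described above can be sketched in a few lines. This is a minimal illustration rather than the benchmark's code: it assumes a sealed-bid reverse auction in which the lowest bid wins and the winner is paid its own bid, and the names `run_reverse_auction` and `profit`, along with all the numbers, are hypothetical.

```python
from typing import Dict, Tuple


def run_reverse_auction(bids: Dict[str, float]) -> Tuple[str, float]:
    """Return (winner, payment) for a single delivery task.

    Assumption: the lowest bid wins and the winner is paid its own bid.
    Ties are broken by agent name so the result is deterministic.
    """
    winner = min(bids, key=lambda agent: (bids[agent], agent))
    return winner, bids[winner]


def profit(payment: float, delivery_cost: float) -> float:
    """Profit on a won task: the payment received minus the cost of delivering."""
    return payment - delivery_cost


# Hypothetical numbers: agent "a" bids below its true cost in order to win.
true_cost = {"a": 120.0, "b": 100.0}
bids = {"a": 90.0, "b": 110.0}

winner, payment = run_reverse_auction(bids)
print(winner, profit(payment, true_cost[winner]))
```

In this toy case the underbidding agent wins the task but loses money on it, which is the tension between winning tasks and staying profitable that the first stage creates.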
The aim is not simply to complete tasks, but to maximize overall profit by predicting which combinations of tasks will be most effective, and by anticipating the strategies of competitors who are attempting the same thing.
The APDP benchmark raises the difficulty of the code-generation task by introducing strategic planning into a series of interdependent auctions, where each bid reshapes the landscape of future options. Agents must therefore reason not only about immediate cost, but also about position, timing and long-term consequences.
The underlying delivery problem is NP-hard, meaning that as the number of tasks grows, no known algorithm can reliably find the optimal solution in a reasonable time. Brute force is therefore not a feasible approach, and agents have to trade precision against speed.
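To give a sense of why brute force is not feasible, the sketch below counts the possible stop orderings for a single vehicle, where every pickup must come before its own delivery. For n tasks that count is (2n)!/2^n. The single-vehicle, unlimited-capacity setting is a simplifying assumption made for this example; the benchmark's setting adds vehicle assignment and capacity limits on top.

```python
from math import factorial


def count_orderings(n_tasks: int) -> int:
    """Stop sequences for n tasks in which each pickup precedes its delivery.

    There are (2n)! orderings of the 2n stops; for each task exactly half
    of them place the pickup first, giving (2n)! / 2**n valid sequences.
    """
    return factorial(2 * n_tasks) // (2 ** n_tasks)


for n in (1, 2, 3, 10, 50):
    print(n, count_orderings(n))
```

Even at ten tasks the count is already in the quadrillions, so an agent working against a timeout has to rely on heuristics or local search rather than enumeration.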
The Race Is On
The authors' evaluation compared 40 LLM-coded agents and 17 human-coded agents in a series of head-to-head tournaments. Twelve tournaments were held, each using a different combination of four road-network topologies. In an all-play-all format, every agent faced every other opponent twice, once in control of each of two companies with differing vehicle specifications.
This setup produced 3,192 matches per tournament, for a total of 38,304 matches. In each match, 50 delivery tasks, defined by pickup point, delivery point and weight, were put up for auction, drawn at random over road layouts modeled on Switzerland, France, Great Britain and the Netherlands.
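The match counts follow directly from the double all-play-all format, and can be checked with a short calculation. The constant names below are illustrative; the figures are the ones given in the article.

```python
from math import comb

N_LLM_AGENTS = 40
N_HUMAN_AGENTS = 17
N_TOURNAMENTS = 12
GAMES_PER_PAIRING = 2  # every pair meets twice, once per company assignment

n_agents = N_LLM_AGENTS + N_HUMAN_AGENTS

# Every unordered pair of agents plays GAMES_PER_PAIRING matches.
matches_per_tournament = comb(n_agents, 2) * GAMES_PER_PAIRING

# Each agent meets every other agent GAMES_PER_PAIRING times.
matches_per_agent = (n_agents - 1) * GAMES_PER_PAIRING

total_matches = matches_per_tournament * N_TOURNAMENTS

print(matches_per_tournament, matches_per_agent, total_matches)
# 3192 112 38304
```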

The simplified road networks used in the tournaments: Great Britain (top left), Switzerland (top right), the Netherlands (bottom left) and France (bottom right). The blue and red squares denote pickup and delivery tasks. The colored triangles show the current positions of the agents' vehicles.
The student agents were selected from the 2020 course tournament: eight that had finished at the top of the single-elimination finals, and four that had performed well in head-to-head matches against the baseline agents.
The baseline agents followed fixed heuristics. Naive calculated the total distance using only one vehicle and ignoring batching, and bid accordingly. ExpCostFixedBid simulated 10 random tasks and bid the average marginal cost. Honest calculated the actual marginal cost of inserting the task into the schedule. ModelOpponent did the same, but added an estimate of the opponent's cost and bid the maximum. RiskSeeking combined a time-decaying prior with a live cost estimate and opponent modeling, again bidding the higher of the two.
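Three of these heuristics can be sketched roughly as follows. This is an illustration under stated assumptions, not the benchmark's implementation: stops are points on a plane, cost is Euclidean distance, a new task is simply appended to the end of the route rather than inserted at the cheapest position, and every function name is hypothetical.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]
Task = Tuple[Point, Point]  # (pickup, delivery)


def route_length(stops: List[Point]) -> float:
    """Total Euclidean length of visiting the stops in the given order."""
    return sum(math.dist(a, b) for a, b in zip(stops, stops[1:]))


def marginal_cost(route: List[Point], task: Task) -> float:
    """Extra distance incurred by taking on the task.

    Assumption: pickup and delivery are appended to the end of the route;
    a real agent would search for the cheapest insertion point.
    """
    pickup, delivery = task
    return route_length(route + [pickup, delivery]) - route_length(route)


def honest_bid(route: List[Point], task: Task) -> float:
    """Honest-style bid: ask for exactly the marginal cost."""
    return marginal_cost(route, task)


def expected_cost_fixed_bid(route: List[Point], rng: random.Random,
                            n_samples: int = 10, size: float = 100.0) -> float:
    """ExpCostFixedBid-style bid: mean marginal cost over random simulated tasks."""
    def random_point() -> Point:
        return (rng.uniform(0, size), rng.uniform(0, size))

    costs = [marginal_cost(route, (random_point(), random_point()))
             for _ in range(n_samples)]
    return sum(costs) / n_samples


def model_opponent_bid(route: List[Point], task: Task,
                       opponent_cost_estimate: float) -> float:
    """ModelOpponent-style bid: the higher of own cost and the opponent estimate."""
    return max(marginal_cost(route, task), opponent_cost_estimate)


route = [(0.0, 0.0)]               # vehicle currently at the origin
task = ((3.0, 4.0), (3.0, 0.0))    # pickup, then delivery
print(honest_bid(route, task))
print(model_opponent_bid(route, task, opponent_cost_estimate=12.0))
print(expected_cost_fixed_bid(route, random.Random(0)))
```

Bidding the higher of the two figures in the ModelOpponent-style rule reflects the idea that an agent only needs to undercut its rival, not to bid at its own cost.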
The evaluation included 40 LLM-coded agents built using (the aforementioned) GPT-5 Thinking, Claude Opus 4.1, Gemini 2.5 Pro and DeepSeek R1. Each model was prompted with five different strategies, each applied twice per model.
Two of the strategies used static prompts written by different authors. A third asked the model to reflect on and revise its own output, while another involved critique and revision by a different LLM. The final strategy used GPT-4 to synthesize a new prompt by reviewing all four of the preceding approaches.
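The count of 40 agents is simply four models times five strategies times two runs. The sketch below enumerates the agent labels in the format used in the results table; the strategy codes are taken from those labels, but which code corresponds to which of the five strategies is not spelled out in this article and is left open here.

```python
from itertools import product

# Model letters as given in the results table caption.
MODELS = {
    "O": "GPT-5 Thinking",
    "G": "Gemini 2.5 Pro",
    "A": "Claude Opus 4.1",
    "D": "DeepSeek R1",
}

# Strategy codes as they appear in the table labels; their mapping to the
# five prompting strategies is an assumption left unstated here.
STRATEGIES = ["A1", "A2", "IR", "CR", "GEN"]

RUNS = [1, 2]  # first or second agent generated with that prompt

labels = [f"LLM({m}, {s}, {r})" for m, s, r in product(MODELS, STRATEGIES, RUNS)]

print(len(labels))
print(labels[:3])
```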
The base prompt mirrored the original student assignment, describing the delivery environment and instructing the model to bid and plan so as to maximize profit, without relying on highly complex methods.
All LLM agents were tested in both self-play and tournament settings until every observable bug had been fixed. Bug-fixing was handled autonomously by the LLMs themselves, based on the error information supplied.
According to the paper, common LLM failures included violating timeout limits, failing to pick up or deliver assigned tasks, and violating vehicle capacity constraints. These errors often arose from ignoring explicit instructions or from faulty replanning logic†:
'Another common issue we found (mainly with Gemini, Claude and DeepSeek, and not so much with GPT) is that the LLM would frequently be unable to resolve a bug.
'For example, an agent would consistently time out, despite multiple (e.g., 5 to 15) cycles of prompting the LLM with the error and receiving an updated version of the code.
'The only solution we found for such situations (where the LLM repeatedly fails to resolve the exact same bug) was to restart from scratch. Overall, we observed the need for significant manual effort to achieve bug-free code. We had to generate substantially more agents to obtain the 40 bug-free ones we evaluated.'
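The repair-then-restart procedure the authors describe can be pictured as a loop like the one below. This is a hypothetical sketch and not the authors' tooling: `generate`, `find_error` and `repair` are placeholder callables standing in for prompting an LLM, running the agent, and feeding an error back, and the default limits are illustrative.

```python
from typing import Callable, Optional


def build_bug_free_agent(
    generate: Callable[[], str],
    find_error: Callable[[str], Optional[str]],
    repair: Callable[[str, str], str],
    max_repairs: int = 15,
    max_restarts: int = 5,
) -> Optional[str]:
    """Return agent code with no observed error, or None if all attempts fail.

    Each restart asks for a fresh agent; within a restart the error is fed
    back for up to max_repairs repair cycles before giving up on that agent.
    """
    for _ in range(max_restarts):
        code = generate()  # start again from scratch
        for attempt in range(max_repairs + 1):
            error = find_error(code)
            if error is None:
                return code
            if attempt < max_repairs:
                code = repair(code, error)  # hand the error back to the model
    return None
```

The outer loop matters because, as the quote above notes, some bugs survived every repair cycle and were only cleared by discarding the agent and generating a new one.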
The results shown below summarize 12 double all-play-all tournaments, spanning four network topologies with three tournaments per topology, which together produced nearly 40,000 matches:

| Agent | Avg #Wins / Tour | SD #Wins / Tour | Avg #Losses / Tour | SD #Losses / Tour | Total Wins | Total Losses | Win Rate |
|---|---|---|---|---|---|---|---|
| Student 1 | 108.167 | 1.193 | 3.833 | 1.193 | 1298 | 46 | 0.9658 |
| Student 2 | 104.917 | 2.539 | 7.083 | 2.539 | 1259 | 85 | 0.9368 |
| Student 3 | 103.917 | 2.466 | 8.083 | 2.466 | 1247 | 97 | 0.9278 |
| Student 4 | 103.25 | 1.815 | 8.75 | 1.815 | 1239 | 105 | 0.9219 |
| Student 5 | 96.5 | 2.908 | 15.5 | 2.908 | 1158 | 186 | 0.8616 |
| LLM(O, IR, 1) | 95.417 | 2.314 | 16.583 | 2.314 | 1145 | 199 | 0.8519 |
| LLM(O, A2, 1) | 94.583 | 2.314 | 17.417 | 2.314 | 1135 | 209 | 0.8445 |
| Student 6 | 93.167 | 1.899 | 18.833 | 1.899 | 1118 | 226 | 0.8318 |
| Student 7 | 93.167 | 3.563 | 18.833 | 3.563 | 1118 | 226 | 0.8318 |
| LLM(O, A1, 1) | 86.083 | 3.029 | 25.917 | 3.029 | 1033 | 311 | 0.7686 |
| LLM(O, GEN, 2) | 84.083 | 6.947 | 27.917 | 6.947 | 1009 | 335 | 0.7507 |
| LLM(O, CR, 2) | 83.5 | 4.442 | 28.5 | 4.442 | 1002 | 342 | 0.7455 |
| Student 8 | 83.417 | 4.122 | 28.583 | 4.122 | 1001 | 343 | 0.7448 |
| RiskSeeking | 82.417 | 3.343 | 29.583 | 3.343 | 989 | 355 | 0.7359 |
| LLM(O, GEN, 1) | 80.667 | 4.355 | 31.25 | 4.372 | 968 | 375 | 0.7208 |
| ModelOpponent | 80.583 | 3.26 | 31.417 | 3.26 | 967 | 377 | 0.7195 |
| LLM(D, A1, 1) | 79.417 | 3.965 | 32.583 | 3.965 | 953 | 391 | 0.7091 |
| ExpCostFixedBid | 77.167 | 4.951 | 34.833 | 4.951 | 926 | 418 | 0.689 |
| LLM(O, IR, 2) | 73.917 | 3.502 | 38 | 3.618 | 887 | 456 | 0.6605 |
| LLM(O, A1, 2) | 72.417 | 2.193 | 39.583 | 2.193 | 869 | 475 | 0.6466 |
| LLM(G, A1, 2) | 68.5 | 3.555 | 43.5 | 3.555 | 822 | 522 | 0.6116 |
| LLM(A, GEN, 2) | 67.917 | 2.968 | 44.083 | 2.968 | 815 | 529 | 0.6064 |
| LLM(G, IR, 2) | 65.917 | 2.314 | 46.083 | 2.314 | 791 | 553 | 0.5885 |
| Student 9 | 64.167 | 11.044 | 47.833 | 11.044 | 770 | 574 | 0.5729 |
| LLM(G, A1, 1) | 64 | 4.243 | 47.917 | 4.316 | 768 | 575 | 0.5719 |
| LLM(G, IR, 1) | 60.333 | 3.725 | 51.667 | 3.725 | 724 | 620 | 0.5387 |
| LLM(O, A2, 2) | 59.333 | 4.499 | 52.667 | 4.499 | 712 | 632 | 0.5298 |
| LLM(D, CR, 1) | 55.083 | 6.694 | 56.833 | 6.59 | 661 | 682 | 0.4922 |
| LLM(G, GEN, 2) | 53.167 | 3.664 | 58.833 | 3.664 | 638 | 706 | 0.4747 |
| LLM(D, GEN, 2) | 52.083 | 9.06 | 59.917 | 9.06 | 625 | 719 | 0.465 |
| Honest | 50.583 | 3.848 | 61.417 | 3.848 | 607 | 737 | 0.4516 |
| Student 10 | 48.833 | 2.98 | 63.167 | 2.98 | 586 | 758 | 0.436 |
| LLM(D, IR, 1) | 48.583 | 10.211 | 63.417 | 10.211 | 583 | 761 | 0.4338 |
| LLM(A, A1, 1) | 48 | 4.69 | 64 | 4.69 | 576 | 768 | 0.4286 |
| LLM(G, A2, 1) | 47.25 | 3.864 | 64.75 | 3.864 | 567 | 777 | 0.4219 |
| LLM(A, CR, 1) | 43.833 | 4.609 | 68.167 | 4.609 | 526 | 818 | 0.3914 |
| LLM(A, A1, 2) | 43.75 | 2.05 | 68.25 | 2.05 | 525 | 819 | 0.3906 |
| Student 11 | 42.083 | 5.664 | 69.917 | 5.664 | 505 | 839 | 0.3757 |
| LLM(A, IR, 1) | 39.5 | 2.541 | 72.5 | 2.541 | 474 | 870 | 0.3527 |
| Naive | 36.75 | 1.712 | 75.25 | 1.712 | 441 | 903 | 0.3281 |
| Student 12 | 36.333 | 1.775 | 75.667 | 1.775 | 436 | 908 | 0.3244 |
| LLM(D, A2, 1) | 33.917 | 2.193 | 78.083 | 2.193 | 407 | 937 | 0.3028 |
| LLM(A, GEN, 1) | 30.167 | 1.749 | 81.833 | 1.749 | 362 | 982 | 0.2693 |
| LLM(D, A2, 2) | 29.833 | 2.038 | 82.167 | 2.038 | 358 | 986 | 0.2664 |
| LLM(G, A2, 2) | 27 | 2.256 | 85 | 2.256 | 324 | 1020 | 0.2411 |
| LLM(A, A2, 1) | 26.333 | 0.985 | 85.667 | 0.985 | 316 | 1028 | 0.2351 |
| LLM(O, CR, 1) | 25 | 3.411 | 87 | 3.411 | 300 | 1044 | 0.2232 |
| LLM(A, IR, 2) | 24.333 | 8.542 | 87.667 | 8.542 | 292 | 1052 | 0.2173 |
| LLM(A, A2, 2) | 24 | 1.809 | 88 | 1.809 | 288 | 1056 | 0.2143 |
| LLM(A, CR, 2) | 23.333 | 1.557 | 88.667 | 1.557 | 280 | 1064 | 0.2083 |
| LLM(D, GEN, 1) | 22.5 | 1.784 | 89.5 | 1.784 | 270 | 1074 | 0.2009 |
| LLM(D, A1, 2) | 13.333 | 1.826 | 98.667 | 1.826 | 160 | 1184 | 0.119 |
| LLM(G, CR, 1) | 9.5 | 1.087 | 102.5 | 1.087 | 114 | 1230 | 0.0848 |
| LLM(G, GEN, 1) | 9.167 | 0.937 | 102.833 | 0.937 | 110 | 1234 | 0.0818 |
| LLM(D, IR, 2) | 7.75 | 0.622 | 104.25 | 0.622 | 93 | 1251 | 0.0692 |
| LLM(G, CR, 2) | 7.25 | 1.422 | 104.75 | 1.422 | 87 | 1257 | 0.0647 |
| LLM(D, CR, 2) | 5.667 | 0.985 | 106.333 | 0.985 | 68 | 1276 | 0.0506 |

Each agent played 112 matches per tournament, so the maximum average number of wins per agent is 112. The standard deviation (SD) reflects variation across tournaments. Human-coded agents are shown in bold. LLM-coded agents are labeled by model (O = GPT-5 Thinking, G = Gemini 2.5 Pro, A = Claude Opus 4.1, D = DeepSeek R1), followed by a prompt-strategy code and a number indicating whether the agent was the first or second generated with that prompt. Source: https://arxiv.org/pdf/2511.20613
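The derived columns in the table can be reproduced from the win and loss totals. The sketch below assumes that the win rate is total wins divided by matches recorded (wins plus losses) and that the per-tournament average is the total divided by 12; both readings are inferred from the figures in the table rather than stated in the article, and `summarize` is a hypothetical helper.

```python
N_TOURNAMENTS = 12


def summarize(total_wins: int, total_losses: int):
    """Return (average wins per tournament, win rate), rounded as in the table."""
    avg_wins = round(total_wins / N_TOURNAMENTS, 3)
    win_rate = round(total_wins / (total_wins + total_losses), 4)
    return avg_wins, win_rate


# (total wins, total losses) copied from three rows of the table.
rows = {
    "Student 1": (1298, 46),
    "LLM(O, IR, 1)": (1145, 199),
    "ExpCostFixedBid": (926, 418),
}

for name, (wins, losses) in rows.items():
    print(name, summarize(wins, losses))
```

The same calculation applied to the remaining rows gives a quick consistency check on the table as transcribed.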
Regarding the results above, the authors state†:
'LLMs did not generate code that performed as expected, or that was competitive, even on a simpler variant of the APDP problem (even though the code was almost free of syntax bugs). This identifies a new weakness of LLMs beyond auto-completion, and underscores the importance of reasoning-driven code evaluation benchmarks.
'Our results show a clear superiority of the human-coded agents: (i) the top five places are always taken by student agents, and (ii) the majority of LLM agents (33 of 40) are beaten by very simple baseline agents (such as the expected-cost fixed bid).
'Importantly, the student code was never debugged (whereas the LLM code was thoroughly tested and debugged in both self-play and tournaments). Every time a student agent crashed, the win was automatically awarded to the LLM. Many of these crashes are easy to fix (e.g., agent timeouts), so the student agents could have ranked even higher.'
In a further experiment, GPT-5 Thinking was prompted to improve the code of the best-performing human agent, Student 1. The LLM-modified agent then fell to 10th place, the worst of the human scores. Far from improving the solution, the LLM's changes degraded it by nearly 20%.
The authors conclude:
'[Our] findings highlight important limitations of LLM code generation, most notably in reasoning and planning ability while generating code. Modern LLMs can produce runnable code free of syntax bugs, but that is not an adequate benchmark for measuring progress towards advanced general AI.'
Conclusion
Towards the end of the paper, the authors state that vibe coding empowers people from all technical backgrounds, and characterize the practice positively, as a democratizing movement. At the same time, they suggest that because vibe coding has only just emerged, its limits are not yet known, and what is expected of it may well be higher than is realistic.
They close the paper with a call for a shift in goals: from code that compiles to code that competes.
One question that may occur to the casual reader of this interesting new paper is whether the authors are aiming too high or too low, since the agentic task in question is far more complex and involved than fixing or spinning off PowerShell scripts and other kinds of minor functionality, for which vibe coding is better suited.
* The paper refers to this model as 'DeepThink R1', a name that does not appear to exist; only a handful of references to it can be found on the internet (presumably from other authors miswriting 'DeepSeek R1'). If I am mistaken, please contact me via my profile details and I will amend.
† Authors' emphasis, not mine.
First published Wednesday, November 26, 2025. Amended at 17:35 EST for formatting.












