
Zero to Advanced Prompt Engineering with Langchain in Python

An important aspect of Large Language Models (LLMs) is the number of parameters they use for learning. The more parameters a model has, the better it can capture the relationships between words and phrases. This means that models with billions of parameters can generate a wide range of creative text formats and answer open-ended and challenging questions in an informative way.

LLMs such as ChatGPT, which are built on the Transformer architecture, are proficient at understanding and generating human language, which makes them useful for applications that require natural language understanding. However, they are not without limitations. These include outdated knowledge, the inability to interact with external systems, a lack of context understanding, and the occasional generation of plausible-sounding but incorrect or nonsensical responses, among others.

Addressing these limitations requires integrating LLMs with external data sources and capabilities, which adds complexity and demands substantial coding and data-handling skills. This, together with the challenge of understanding AI concepts and complex algorithms, contributes to the learning curve associated with developing applications using LLMs.

Nevertheless, combining LLMs with other tools to build LLM-powered applications could redefine our digital landscape. The potential of such applications is vast: improving efficiency and productivity, simplifying tasks, enhancing decision-making, and providing personalized experiences.

In this article, we delve deeper into these issues and explore advanced prompt engineering techniques with Langchain, offering clear explanations, practical examples, and step-by-step instructions on how to implement them.

Langchain, a state-of-the-art library, brings convenience and flexibility to designing, implementing, and tuning prompts. As we unpack the principles and practices of prompt engineering, you will learn how to use Langchain's features to leverage the strengths of SOTA generative AI models such as GPT-4.

Understanding Prompts

Before diving into the technicalities of prompt engineering, it is essential to grasp the concept of prompts and their significance.

A 'prompt' is a sequence of tokens used as input to a language model, instructing it to generate a particular type of response. Prompts play a crucial role in steering the behavior of a model. They affect the quality of the generated text and, when crafted well, help the model provide insightful, accurate, and context-specific results.

Prompt engineering is the art and science of designing effective prompts. The goal is to elicit the desired output from a language model. By carefully selecting and structuring prompts, one can guide the model toward more accurate and relevant responses. In practice, this involves fine-tuning the input phrasing to account for the model's training and structural biases.

Prompt engineering ranges from simple techniques, such as feeding the model relevant keywords, to more advanced methods that design complex, structured prompts and use the internal mechanics of the model to its advantage.

Langchain: The Fastest Growing Prompt Tool

LangChain, launched in October 2022 by Harrison Chase, became one of the most highly rated open-source frameworks on GitHub in 2023. It offers a simplified and standardized interface for incorporating Large Language Models (LLMs) into applications. It also provides a feature-rich interface for prompt engineering, allowing developers to experiment with different strategies and evaluate their results. With Langchain, you can carry out prompt engineering tasks more effectively and intuitively.

LangFlow serves as a user interface for orchestrating LangChain components into an executable flowchart, enabling quick prototyping and experimentation.

LangChain fills a crucial gap in AI development for the masses. It enables a range of NLP applications such as virtual assistants, content generators, question-answering systems, and more, to solve a variety of real-world problems.

Rather than being a standalone model or provider, LangChain simplifies interaction with diverse models, extending the capabilities of LLM applications beyond the constraints of a simple API call.

The Architecture of LangChain

LangChain's main components include Model I/O, Prompt Templates, Memory, Agents, and Chains.

Model I/O

LangChain facilitates a seamless connection with various language models by wrapping them in a standardized interface known as Model I/O. This makes it easy to switch models for optimization or better performance. LangChain supports various language model providers, including OpenAI, Hugging Face, Azure, Fireworks, and more.

Prompt Templates

These are used to manage and optimize interactions with LLMs by providing concise instructions or examples. Optimizing prompts improves model performance, and the flexibility of templates simplifies how inputs are supplied.

A simple example of a prompt template:

from langchain.prompts import PromptTemplate
prompt = PromptTemplate(input_variables=["subject"],
template="What are the recent advancements in the field of {subject}?")
print(prompt.format(subject="Natural Language Processing"))

As we advance in complexity, we encounter more sophisticated patterns in LangChain, such as the Reason and Act (ReAct) pattern. ReAct is a key pattern for action execution in which the agent assigns a task to an appropriate tool, customizes the input for it, and parses its output to accomplish the task. The Python example below showcases a ReAct pattern. It demonstrates how a prompt is structured in LangChain, using a series of thoughts and actions to reason through a problem and produce a final answer:

PREFIX = """Answer the following question using the given tools:"""
FORMAT_INSTRUCTIONS = """Follow this format:
Question: {input_question}
Thought: your initial thought on the question
Action: your chosen action from [{tool_names}]
Action Input: your input for the action
Observation: the action's outcome"""
SUFFIX = """Start!
Question: {input}
Thought:{agent_scratchpad}"""
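
The three pieces above are normally joined into one prompt string before it is sent to the model. The following is a minimal plain-Python sketch of that assembly, not LangChain's own implementation. The helper name build_react_prompt and the tool names are hypothetical, and the question line in the format instructions is written as plain text so that only three placeholders need to be filled.

```python
PREFIX = "Answer the following question using the given tools:"
FORMAT_INSTRUCTIONS = (
    "Follow this format:\n"
    "Question: the input question\n"
    "Thought: your initial thought on the question\n"
    "Action: your chosen action from [{tool_names}]\n"
    "Action Input: your input for the action\n"
    "Observation: the action's outcome"
)
SUFFIX = "Start!\nQuestion: {input}\nThought:{agent_scratchpad}"


def build_react_prompt(question, tool_names, scratchpad=""):
    """Join prefix, format instructions and suffix, then fill the placeholders."""
    template = "\n\n".join([PREFIX, FORMAT_INSTRUCTIONS, SUFFIX])
    return template.format(
        tool_names=", ".join(tool_names),  # hypothetical tool names
        input=question,
        agent_scratchpad=scratchpad,       # earlier thoughts and observations
    )


react_prompt = build_react_prompt("What is 12 * 7?", ["calculator", "search"])
print(react_prompt)
```

The prompt deliberately ends with "Thought:" so that the model continues by writing its next thought.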

Memory

Memory is a critical concept in LangChain, enabling LLMs and tools to retain information over time. This stateful behavior improves the performance of LangChain applications by storing previous responses, user interactions, the state of the environment, and the agent's goals. The ConversationBufferMemory and ConversationBufferWindowMemory strategies keep track of the full conversation or only its most recent parts, respectively. For a more sophisticated approach, the ConversationKGMemory strategy encodes the conversation as a knowledge graph that can be fed back into prompts or used to predict responses without calling the LLM.
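
To make the "window" idea concrete, here is a small plain-Python sketch of a memory that keeps only the last k exchanges. It illustrates the concept only and is not the ConversationBufferWindowMemory class itself; the class name WindowMemory and its methods are hypothetical.

```python
from collections import deque


class WindowMemory:
    """Keep only the last k (user, ai) exchanges of a conversation."""

    def __init__(self, k=2):
        self.turns = deque(maxlen=k)  # older turns are dropped automatically

    def save(self, user_message, ai_message):
        self.turns.append((user_message, ai_message))

    def as_prompt_history(self):
        """Render the retained turns as text that can be prepended to a prompt."""
        lines = []
        for user_message, ai_message in self.turns:
            lines.append(f"Human: {user_message}")
            lines.append(f"AI: {ai_message}")
        return "\n".join(lines)


memory = WindowMemory(k=2)
memory.save("Hi, I'm Ada.", "Hello Ada!")
memory.save("What is LangChain?", "A framework for LLM applications.")
memory.save("Who created it?", "Harrison Chase.")
history = memory.as_prompt_history()
print(history)
```

With k=2, the first exchange falls out of the window, which keeps the prompt short at the cost of forgetting early context.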

Agents

An agent interacts with the world by performing actions and tasks. In LangChain, agents combine tools and chains for task execution. An agent can connect to the outside world to retrieve information that augments the LLM's knowledge, thereby overcoming the model's inherent limitations. Depending on the situation, it can decide to pass a calculation to a calculator or to a Python interpreter.

Agents are equipped with the following subcomponents:

  • Tools: These are functional components.
  • Toolkits: Collections of tools.
  • Agent Executors: This is the execution mechanism that allows choosing between tools.

Agents in LangChain also follow the Zero-shot ReAct pattern, where the decision is based only on each tool's description. This mechanism can be extended with memory so that the full conversation history is taken into account. With ReAct, instead of asking an LLM to simply autocomplete your text, you prompt it to respond in a thought/action/observation loop.
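
The thought/action/observation loop can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not LangChain's agent executor: the LLM is replaced by a scripted stand-in (fake_llm), and the calculator tool, TOOLS registry, and run_agent function are all hypothetical names.

```python
def calculator(expression):
    """Toy tool: add the whitespace-separated integers in the input."""
    return str(sum(int(part) for part in expression.split()))


TOOLS = {"calculator": calculator}

# Scripted stand-in for an LLM: it returns the next reply on each call.
SCRIPTED_REPLIES = iter([
    "Thought: I need to add the numbers.\nAction: calculator\nAction Input: 2 3",
    "Thought: I now know the answer.\nFinal Answer: 5",
])


def fake_llm(prompt):
    return next(SCRIPTED_REPLIES)


def run_agent(question, max_steps=5):
    scratchpad = ""
    for _ in range(max_steps):
        reply = fake_llm(f"Question: {question}\n{scratchpad}")
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        # Parse the requested tool and its input from the reply.
        fields = dict(line.split(": ", 1) for line in reply.splitlines())
        observation = TOOLS[fields["Action"]](fields["Action Input"])
        # Record the step and its observation so the next call can see it.
        scratchpad += f"{reply}\nObservation: {observation}\n"
    raise RuntimeError("agent did not finish within max_steps")


answer = run_agent("What is 2 + 3?")
print(answer)
```

The max_steps guard matters in practice, because a real model may keep requesting tools without ever producing a final answer.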

Chains

Chains, as the term suggests, are sequences of operations that allow the LangChain library to process language model inputs and outputs seamlessly. These integral components of LangChain are made up of links, which can be other chains or primitives such as prompts, language models, or utilities.

Imagine a chain as a conveyor belt in a factory. Each step on this belt represents a certain operation, which could be invoking a language model, applying a Python function to a text, or prompting the model in a particular way.
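
The conveyor-belt picture maps naturally onto function composition. The sketch below is plain Python and does not use LangChain's chain classes; make_chain, format_prompt, fake_model, and clean_output are hypothetical names, and fake_model merely echoes its input in place of a real model call.

```python
from functools import reduce


def make_chain(*steps):
    """Compose steps left to right: the output of one is the input of the next."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def format_prompt(subject):
    return f"Name one use of {subject}."


def fake_model(prompt):
    # Stand-in for a language model call; it just echoes the prompt.
    return f"ANSWER[{prompt}]"


def clean_output(text):
    return text.strip().lower()


chain = make_chain(format_prompt, fake_model, clean_output)
result = chain("FAISS")
print(result)
```

Because each step only needs to accept the previous step's output, a whole chain can itself be used as one step of a larger chain.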

LangChain categorizes its chains into three types: Utility chains, Generic chains, and Combine Documents chains. We will focus on Utility and Generic chains in this discussion.

  • Utility Chains are specifically designed to extract precise answers from language models for narrowly defined tasks. For example, consider the LLMMathChain. This utility chain enables language models to perform mathematical calculations. It accepts a question in natural language, and the language model in turn generates a Python code snippet, which is then executed to produce the answer.
  • Generic Chains, on the other hand, serve as building blocks for other chains and are usually not used on their own. These chains, such as the LLMChain, are foundational and are often combined with other chains to accomplish intricate tasks. For instance, the LLMChain is frequently used to query a language model object by formatting the input according to a provided prompt template and then passing it to the language model.
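
The idea behind a math utility chain (the model writes an expression, ordinary code evaluates it) can be sketched as follows. This is not how LLMMathChain is implemented; it is a simplified illustration in which fake_model stands in for the LLM and a small AST walker, rather than eval(), evaluates the generated arithmetic. All names are hypothetical.

```python
import ast
import operator

# Only these arithmetic operators are allowed in model-generated expressions.
ALLOWED_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def safe_eval(expression):
    """Evaluate a purely arithmetic expression without calling eval()."""
    def visit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED_OPERATORS:
            return ALLOWED_OPERATORS[type(node.op)](visit(node.left), visit(node.right))
        raise ValueError("unsupported expression")
    return visit(ast.parse(expression, mode="eval").body)


def fake_model(question):
    # Stand-in for the LLM step that turns a question into code.
    return "13 ** 2 + 4"


def math_chain(question):
    return safe_eval(fake_model(question))


math_result = math_chain("What is 13 squared plus 4?")
print(math_result)
```

Restricting evaluation to a whitelist of operators is a design choice of this sketch: model-generated code should not be executed without some form of sandboxing.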

Step-by-step Implementation of Prompt Engineering with Langchain

We will walk you through the process of implementing prompt engineering using Langchain. Before proceeding, make sure you have installed the necessary software and packages.

You can use popular tools such as Docker, Conda, Pip, and Poetry to set up LangChain. The relevant installation files for each of these methods can be found in the repository at https://github.com/benman1/generative_ai_with_langchain. This includes a Dockerfile for Docker, a requirements.txt for Pip, a pyproject.toml for Poetry, and a langchain_ai.yml file for Conda.

In this article we use Pip, the standard package manager for Python, to install and manage third-party libraries. If it is not included in your Python distribution, you can install Pip by following the instructions at https://pip.pypa.io/.

To install a library with Pip, use the command pip install library_name.

However, Pip does not manage environments on its own. To handle separate environments, we use the virtualenv tool.
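
For readers who prefer to stay within Python, the standard library's venv module offers a comparable way to create an isolated environment. Note that this swaps in venv for the third-party virtualenv tool named above; the directory name langchain_env is hypothetical, and a temporary directory is used here only so the sketch leaves no files in your project.

```python
import tempfile
import venv
from pathlib import Path

# Create an isolated environment in a temporary directory.
# with_pip=False keeps the sketch fast; pass with_pip=True to bootstrap pip.
env_dir = Path(tempfile.mkdtemp()) / "langchain_env"
venv.create(env_dir, with_pip=False)

# Every environment created by venv contains a pyvenv.cfg file at its root.
config_file = env_dir / "pyvenv.cfg"
print(config_file.exists())
```

After activating such an environment, packages installed with Pip stay local to it and do not affect other projects.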

In the next section, we will discuss model integrations.

Step 1: Setting up Langchain

First, you need to install the Langchain package. We are using Windows OS. Run the following command in your terminal to install it:

pip install langchain

Step 2: Importing Langchain and other necessary modules

Next, import Langchain along with the other necessary modules. Here, we also import the transformers library, which is used extensively in NLP tasks.

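import langchain
# AutoModelWithLMHead is deprecated in recent transformers releases;
# AutoModelForSeq2SeqLM is its replacement for encoder-decoder models such as T5.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer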

Step 3: Load Pretrained Model

OpenAI

OpenAI models can be conveniently accessed through the LangChain library or the OpenAI Python client library. Notably, OpenAI provides an Embedding class for text embedding models. The two key LLM families are GPT-3.5 and GPT-4, which differ mainly in token length. Pricing for each model can be found on OpenAI's website. While there are more sophisticated models such as GPT-4-32K that accept more tokens, their availability via the API is not always guaranteed.

Accessing these models requires an OpenAI API key. You can obtain one by creating an account on OpenAI's platform, setting up billing information, and generating a new secret key.

import os
os.environ["OPENAI_API_KEY"] = 'your-openai-token'

After successfully creating the key, you can set it as an environment variable (OPENAI_API_KEY) or pass it as a parameter when instantiating the class used for OpenAI calls.
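
The two options (explicit parameter or environment variable) can be combined into one lookup. The sketch below is plain Python and independent of any client library; resolve_api_key is a hypothetical helper, and the key strings are placeholders rather than real credentials.

```python
import os


def resolve_api_key(explicit_key=None, env_var="OPENAI_API_KEY"):
    """Return the explicit key if given, else fall back to the environment."""
    key = explicit_key or os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"No API key found; set {env_var} or pass one explicitly.")
    return key


os.environ["OPENAI_API_KEY"] = "key-from-environment"  # placeholder, not a real key
from_env = resolve_api_key()
from_argument = resolve_api_key("key-from-argument")
print(from_env, from_argument)
```

Failing early with a clear message is usually easier to debug than an authentication error raised later inside a request.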

Consider the following LangChain script, which showcases interaction with the OpenAI models:

from langchain.llms import OpenAI
llm = OpenAI(model_name="text-davinci-003")
# The LLM takes a prompt as an input and outputs a completion
prompt = "who is the president of the United States of America?"
completion = llm(prompt)
print(completion)
# Example output reported in the original article:
# The current President of the United States of America is Joe Biden.

In this example, the OpenAI model is initialized and called directly. It takes a prompt, here a simple factual question, processes it with the specified OpenAI model, and returns the completion.

Hugging Face

Hugging Face provides the free-to-use Transformers Python library, which is compatible with PyTorch, TensorFlow, and JAX and includes implementations of models such as BERT and T5.

Hugging Face also offers the Hugging Face Hub, a platform for hosting code repositories, machine learning models, datasets, and web applications.

To use Hugging Face as a provider for your models, you need an account and API keys, which can be obtained from their website. The token can be made available in your environment as HUGGINGFACEHUB_API_TOKEN.

Consider the following Python snippet, which uses an open-source model developed by Google, the Flan-T5-XXL model:

from langchain.llms import HuggingFaceHub
llm = HuggingFaceHub(model_kwargs={"temperature": 0.5, "max_length": 64},repo_id="google/flan-t5-xxl")
prompt = "In which country is Tokyo?"
completion = llm(prompt)
print(completion)

This script takes a question as input and returns an answer, showcasing the knowledge and prediction capabilities of the model.

Step 4: Basic Prompt Engineering

To start, we will write a simple prompt and see how the model responds.

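from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
# Assumption: the article does not name a checkpoint; a small Flan-T5 model is used here for illustration.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
prompt = 'Translate the following English text to French: "{0}"'
input_text = 'Hello, how are you?'
input_ids = tokenizer.encode(prompt.format(input_text), return_tensors='pt')
# temperature only takes effect when sampling is enabled
generated_ids = model.generate(input_ids, max_length=100, do_sample=True, temperature=0.9)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))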

In the code snippet above, we provide a prompt asking for English text to be translated to French. The language model then tries to translate the given text based on the prompt.

Step 5: Advanced Prompt Engineering

While the approach above works, it does not take full advantage of prompt engineering. Let's improve on it by introducing a more elaborate prompt structure.

# Reuses the tokenizer and model from the previous step
prompt = 'As a highly proficient French translator, translate the following English text to French: "{0}"'
input_text = 'Hello, how are you?'
input_ids = tokenizer.encode(prompt.format(input_text), return_tensors='pt')
# temperature only takes effect when sampling is enabled
generated_ids = model.generate(input_ids, max_length=100, do_sample=True, temperature=0.9)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))

In this code snippet, we modify the prompt to suggest that the translation is being done by a 'highly proficient French translator'. This change can lead to improved translations, because the model now adopts the persona of an expert.
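
When experimenting with prompt variants like these, it helps to keep them side by side as named templates so the same input can be run through each. The following plain-Python sketch shows one way to organize that; the constant and function names are hypothetical, and no model is called.

```python
BASIC_TEMPLATE = 'Translate the following English text to French: "{text}"'
PERSONA_TEMPLATE = (
    "As a highly proficient French translator, "
    'translate the following English text to French: "{text}"'
)


def build_prompts(text):
    """Return both prompt variants for the same input, keyed by variant name."""
    return {
        "basic": BASIC_TEMPLATE.format(text=text),
        "persona": PERSONA_TEMPLATE.format(text=text),
    }


prompts = build_prompts("Hello, how are you?")
for name, prompt_text in prompts.items():
    print(f"{name}: {prompt_text}")
```

Each variant can then be sent to the same model with the same generation settings, so that any difference in output is attributable to the prompt alone.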

Building an Academic Literature Q&A System with Langchain

We will build an academic literature question-and-answer system using LangChain that can answer questions about recently published academic papers.

First, to set up our environment, we install the necessary dependencies.

pip install langchain arxiv openai transformers faiss-cpu

After the installation, we create a new Python notebook and import the necessary libraries:

from langchain.llms import OpenAI
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.docstore.document import Document
import arxiv

The core of our Q&A system is the ability to fetch relevant academic papers from a given field, here Natural Language Processing (NLP), using the arXiv academic database. To do this, we define a function get_arxiv_data(max_results=10). This function collects the most recent NLP paper summaries from arXiv and wraps them in LangChain Document objects, using the summary as the content and the unique entry id as the source.

We will use the arXiv API to fetch recent papers related to NLP:

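def get_arxiv_data(max_results=10):
    """Fetch the most recent NLP paper summaries from arXiv as LangChain Documents."""
    search = arxiv.Search(
        query="NLP",
        max_results=max_results,
        sort_by=arxiv.SortCriterion.SubmittedDate,  # newest submissions first
    )

    documents = []

    # Note: newer releases of the arxiv package deprecate search.results()
    # in favor of arxiv.Client().results(search).
    for result in search.results():
        documents.append(Document(
            page_content=result.summary,           # abstract text
            metadata={"source": result.entry_id},  # URL-style identifier of the paper
        ))
    return documents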

This function retrieves the summaries of the most recent NLP papers from arXiv and converts them into LangChain Document objects. We use each paper's summary and its unique entry id (the URL of the paper) as the content and the source, respectively.

def print_answer(question):
    print(
        chain(
            {
                "input_documents": sources,
                "question": question,
            },
            return_only_outputs=True,
        )["output_text"]
    )                 

Let's define our corpus and set up LangChain:

sources = get_arxiv_data(2)
chain = load_qa_with_sources_chain(OpenAI(temperature=0))

With our academic Q&A system now ready, we can test it by asking a question:

print_answer("What are the recent advancements in NLP?")

The output will be the answer to your question, citing the sources from which the information was extracted. For instance:

Recent advancements in NLP include Retriever-augmented instruction-following models and a novel computational framework for solving alternating current optimal power flow (ACOPF) problems using graphics processing units (GPUs).
SOURCES: http://arxiv.org/abs/2307.16877v1, http://arxiv.org/abs/2307.16830v1

You can easily switch models or alter the system to suit your needs. For example, here we switch to GPT-4, which gives us a much better and more detailed response.

sources = get_arxiv_data(2)
chain = load_qa_with_sources_chain(OpenAI(model_name="gpt-4",temperature=0))
print_answer("What are the recent advancements in NLP?")

Recent advancements in Natural Language Processing (NLP) include the development of retriever-augmented instruction-following models for information-seeking tasks such as question answering (QA). These models can be adapted to various information domains and tasks without additional fine-tuning. However, they often struggle to stick to the provided knowledge and may hallucinate in their responses. Another advancement is the introduction of a computational framework for solving alternating current optimal power flow (ACOPF) problems using graphics processing units (GPUs). This approach utilizes a single-instruction, multiple-data (SIMD) abstraction of nonlinear programs (NLP) and employs a condensed-space interior-point method (IPM) with an inequality relaxation strategy. This strategy allows for the factorization of the KKT matrix without numerical pivoting, which has previously hampered the parallelization of the IPM algorithm.
SOURCES: http://arxiv.org/abs/2307.16877v1, http://arxiv.org/abs/2307.16830v1

A token in GPT-4 can be as short as one character or as long as one word. For instance, GPT-4-32K can process up to 32,000 tokens in a single run, while GPT-4-8K and GPT-3.5-turbo support 8,000 and 4,000 tokens, respectively. However, it is important to note that every interaction with these models comes with a cost that is directly proportional to the number of tokens processed, whether input or output.
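
The two constraints described here, a fixed context limit and a cost that grows linearly with token count, can be expressed as a short calculation. The sketch below uses the limits quoted in the paragraph above; the per-1,000-token prices are hypothetical placeholders, not OpenAI's actual pricing, and the function names are illustrative.

```python
# Context limits as quoted in the text above (in tokens).
CONTEXT_LIMITS = {"gpt-4-32k": 32_000, "gpt-4-8k": 8_000, "gpt-3.5-turbo": 4_000}


def fits_context(model, prompt_tokens, max_output_tokens):
    """Check that prompt plus requested output stays within the model limit."""
    return prompt_tokens + max_output_tokens <= CONTEXT_LIMITS[model]


def estimate_cost(prompt_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Cost grows linearly with the number of input and output tokens."""
    return prompt_tokens / 1000 * price_in_per_1k + output_tokens / 1000 * price_out_per_1k


# Hypothetical prices per 1,000 tokens, for illustration only.
cost = estimate_cost(3_000, 500, price_in_per_1k=0.5, price_out_per_1k=1.0)
print(fits_context("gpt-3.5-turbo", 3_000, 500))
print(fits_context("gpt-3.5-turbo", 3_800, 500))
print(cost)
```

Checking the budget before sending a request avoids paying for a call that the model would reject or truncate.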

In the context of our Q&A system, if a piece of academic literature exceeds the maximum token limit, the system will fail to process it in its entirety, which affects the quality and completeness of the responses. To work around this issue, the text can be broken down into smaller parts that comply with the token limit.
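
A minimal sketch of such splitting is shown below. It approximates tokens by whitespace-separated words, which is an assumption made for simplicity; real tokenizers count differently, so an actual limit should be checked with the model's own tokenizer. The function name split_into_chunks is hypothetical.

```python
def split_into_chunks(text, max_tokens):
    """Split text into consecutive chunks of at most max_tokens words."""
    words = text.split()
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]


sample_text = "one two three four five six seven"
chunks = split_into_chunks(sample_text, max_tokens=3)
print(chunks)
```

Chunks produced this way do not overlap, so a sentence that straddles a boundary is cut in two; adding overlap between chunks is a common way to soften that.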

FAISS (Facebook AI Similarity Search) helps to quickly find the text chunks most relevant to the user's query. It creates a vector representation of each text chunk and uses these vectors to identify and retrieve the chunks most similar to the vector representation of a given question.
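
The retrieval idea can be illustrated without FAISS or a real embedding model. In the sketch below, each text is "embedded" as a bag-of-words count vector and chunks are ranked by cosine similarity to the query. Both the toy embedding and the choice of cosine similarity are assumptions of this sketch; FAISS itself uses learned embeddings supplied by the caller and optimized index structures.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(query, chunks, k=1):
    """Return the k chunks whose vectors are closest to the query vector."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(query_vector, embed(c)), reverse=True)
    return ranked[:k]


corpus = [
    "transformers improve machine translation quality",
    "power flow optimization on graphics processing units",
    "retrieval augmented models for question answering",
]
top = similarity_search("question answering with retrieval", corpus, k=1)
print(top)
```

Only the top-ranked chunks are passed to the language model, which keeps the prompt within the token limit.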

It is important to remember that even with tools like FAISS, the need to divide the text into smaller chunks because of token limits can sometimes lead to a loss of context, which affects the quality of the answers. Careful management and optimization of token usage are therefore crucial when working with these large language models.

pip install faiss-cpu langchain openai tiktoken

After making sure the libraries above are installed, run:

 
from langchain.embeddings.openai import OpenAIEmbeddings 
from langchain.vectorstores.faiss import FAISS 
from langchain.text_splitter import CharacterTextSplitter 
documents = get_arxiv_data(max_results=10) # We can now feed in more data
document_chunks = []
splitter = CharacterTextSplitter(separator=" ", chunk_size=1024, chunk_overlap=0)
for document in documents:
    for chunk in splitter.split_text(document.page_content):
        document_chunks.append(Document(page_content=chunk, metadata=document.metadata))
search_index = FAISS.from_documents(document_chunks, OpenAIEmbeddings())
chain = load_qa_with_sources_chain(OpenAI(temperature=0))
def print_answer(question):
    print(
        chain(
            {
                "input_documents": search_index.similarity_search(question, k=4),
                "question": question,
            },
            return_only_outputs=True,
        )["output_text"]
    )

With the code complete, we now have a powerful tool for querying the latest academic literature in the field of NLP.

 
Recent advancements in NLP include the use of deep neural networks (DNNs) for automatic text analysis and natural language processing (NLP) tasks such as spell checking, language detection, entity extraction, author detection, question answering, and other tasks. 
SOURCES: http://arxiv.org/abs/2307.10652v1, http://arxiv.org/abs/2307.07002v1, http://arxiv.org/abs/2307.12114v1, http://arxiv.org/abs/2307.16217v1 

Conclusion

The integration of Large Language Models (LLMs) into applications has accelerated adoption across several domains, including language translation, sentiment analysis, and information retrieval. Prompt engineering is a powerful tool for getting the most out of these models, and Langchain is leading the way in simplifying this complex task. Its standardized interface, flexible prompt templates, robust model integrations, and innovative use of agents and chains help get the best performance out of LLMs.

However, despite these advancements, there are a few tips to keep in mind. When you use Langchain, it is essential to understand that the quality of the output depends heavily on how the prompt is phrased. Experimenting with different prompt styles and structures can yield improved results. Also, remember that while Langchain supports a variety of language models, each one has its strengths and weaknesses. Choosing the right one for your specific task is crucial. Lastly, using these models comes with cost considerations, as the number of tokens processed directly influences the cost of each interaction.

As demonstrated in the step-by-step guide, Langchain can power robust applications, such as the academic literature Q&A system. With a growing user community and increasing prominence in the open-source landscape, Langchain promises to be a pivotal tool for harnessing the full potential of LLMs like GPT-4.

I have spent the past five years immersing myself in the fascinating world of Machine Learning and Deep Learning. My passion and expertise have led me to contribute to over 50 software engineering projects, with a particular focus on AI/ML. My ongoing curiosity has also drawn me toward Natural Language Processing, a field I am eager to explore further.