Rescale ๋ฏธํŒ… ์˜ˆ์•ฝ

ํŒŒ์ด์ฌ์—์„œ ๋น„๋™๊ธฐ LLM API ํ˜ธ์ถœ: ํฌ๊ด„์ ์ธ ๊ฐ€์ด๋“œ

์ธ๊ณต์ง€๋Šฅ

ํŒŒ์ด์ฌ์—์„œ ๋น„๋™๊ธฐ LLM API ํ˜ธ์ถœ: ํฌ๊ด„์ ์ธ ๊ฐ€์ด๋“œ

mm
ํŒŒ์ด์ฌ์—์„œ ๋น„๋™๊ธฐ LLM API ํ˜ธ์ถœ: ํฌ๊ด„์ ์ธ ๊ฐ€์ด๋“œ

๊ฐœ๋ฐœ์ž์ด์ž dta ๊ณผํ•™์ž๋กœ์„œ ์šฐ๋ฆฌ๋Š” ์ข…์ข… API๋ฅผ ํ†ตํ•ด ์ด๋Ÿฌํ•œ ๊ฐ•๋ ฅํ•œ ๋ชจ๋ธ๊ณผ ์ƒํ˜ธ ์ž‘์šฉํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์˜ ๋ณต์žก์„ฑ๊ณผ ๊ทœ๋ชจ๊ฐ€ ์ปค์ง์— ๋”ฐ๋ผ ํšจ์œจ์ ์ด๊ณ  ์„ฑ๋Šฅ์ด ๋›ฐ์–ด๋‚œ API ์ƒํ˜ธ ์ž‘์šฉ์— ๋Œ€ํ•œ ํ•„์š”์„ฑ์ด ์ค‘์š”ํ•ด์ง‘๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์„œ ๋น„๋™๊ธฐ ํ”„๋กœ๊ทธ๋ž˜๋ฐ์ด ๋น›์„ ๋ฐœํ•˜๋ฉฐ, LLM API๋กœ ์ž‘์—…ํ•  ๋•Œ ์ฒ˜๋ฆฌ๋Ÿ‰์„ ๊ทน๋Œ€ํ™”ํ•˜๊ณ  ์ง€์—ฐ ์‹œ๊ฐ„์„ ์ตœ์†Œํ™”ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์ด ์ข…ํ•ฉ ๊ฐ€์ด๋“œ์—์„œ๋Š” Python์—์„œ ๋น„๋™๊ธฐ LLM API ํ˜ธ์ถœ์˜ ์„ธ๊ณ„๋ฅผ ์‚ดํŽด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. ๋น„๋™๊ธฐ ํ”„๋กœ๊ทธ๋ž˜๋ฐ์˜ ๊ธฐ๋ณธ๋ถ€ํ„ฐ ๋ณต์žกํ•œ ์›Œํฌํ”Œ๋กœ์šฐ๋ฅผ ์ฒ˜๋ฆฌํ•˜๋Š” ๊ณ ๊ธ‰ ๊ธฐ์ˆ ๊นŒ์ง€ ๋ชจ๋“  ๊ฒƒ์„ ๋‹ค๋ฃน๋‹ˆ๋‹ค. ์ด ๊ธ€์„ ๋งˆ์น˜๋ฉด ๋น„๋™๊ธฐ ํ”„๋กœ๊ทธ๋ž˜๋ฐ์„ ํ™œ์šฉํ•˜์—ฌ LLM ๊ธฐ๋ฐ˜ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์„ ๊ฐ•ํ™”ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ํ™•์‹คํžˆ ์ดํ•ดํ•˜๊ฒŒ ๋  ๊ฒƒ์ž…๋‹ˆ๋‹ค.

๋น„๋™๊ธฐ LLM API ํ˜ธ์ถœ์˜ ์„ธ๋ถ€ ์‚ฌํ•ญ์„ ์‚ดํŽด๋ณด๊ธฐ ์ „์— ๋จผ์ € ๋น„๋™๊ธฐ ํ”„๋กœ๊ทธ๋ž˜๋ฐ ๊ฐœ๋…์— ๋Œ€ํ•œ ํŠผํŠผํ•œ ๊ธฐ์ดˆ๋ฅผ ๋‹ค์ ธ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

๋น„๋™๊ธฐ ํ”„๋กœ๊ทธ๋ž˜๋ฐ์€ ์‹คํ–‰์˜ ์ฃผ ์Šค๋ ˆ๋“œ๋ฅผ ์ฐจ๋‹จํ•˜์ง€ ์•Š๊ณ  ์—ฌ๋Ÿฌ ์ž‘์—…์„ ๋™์‹œ์— ์‹คํ–‰ํ•  ์ˆ˜ ์žˆ๊ฒŒ ํ•ด์ค๋‹ˆ๋‹ค. Python์—์„œ ์ด๋Š” ์ฃผ๋กœ ๋‹ค์Œ์„ ํ†ตํ•ด ๋‹ฌ์„ฑ๋ฉ๋‹ˆ๋‹ค. ๋น„๋™๊ธฐ ์ฝ”๋ฃจํ‹ด, ์ด๋ฒคํŠธ ๋ฃจํ”„, ํ“จ์ฒ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋™์‹œ์„ฑ ์ฝ”๋“œ๋ฅผ ์ž‘์„ฑํ•˜๊ธฐ ์œ„ํ•œ ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ์ œ๊ณตํ•˜๋Š” ๋ชจ๋“ˆ์ž…๋‹ˆ๋‹ค.

์ฃผ์š” ๊ฐœ๋…:

  • ์ฝ”๋ฃจํ‹ด: ์ •์˜๋œ ํ•จ์ˆ˜ ๋น„๋™๊ธฐ ์ •์˜ ์ผ์‹œ ์ •์ง€ ๋ฐ ์žฌ๊ฐœ๊ฐ€ ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.
  • ์ด๋ฒคํŠธ ๋ฃจํ”„: ๋น„๋™๊ธฐ ์ž‘์—…์„ ๊ด€๋ฆฌํ•˜๊ณ  ์‹คํ–‰ํ•˜๋Š” ์ค‘์•™ ์‹คํ–‰ ๋ฉ”์ปค๋‹ˆ์ฆ˜์ž…๋‹ˆ๋‹ค.
  • ๊ธฐ๋Œ€๋˜๋Š” ๊ฒƒ๋“ค: await ํ‚ค์›Œ๋“œ์™€ ํ•จ๊ป˜ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ๊ฐ์ฒด(์ฝ”๋ฃจํ‹ด, ํƒœ์Šคํฌ, ํ“จ์ฒ˜).

์ด๋Ÿฌํ•œ ๊ฐœ๋…์„ ์„ค๋ช…ํ•˜๊ธฐ ์œ„ํ•œ ๊ฐ„๋‹จํ•œ ์˜ˆ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

import asyncio

async def greet(name):
    await asyncio.sleep(1)  # Simulate an I/O operation
    print(f"Hello, {name}!")

async def main():
    await asyncio.gather(
        greet("Alice"),
        greet("Bob"),
        greet("Charlie")
    )

asyncio.run(main())

์ด ์˜ˆ์—์„œ ์šฐ๋ฆฌ๋Š” ๋น„๋™๊ธฐ ํ•จ์ˆ˜๋ฅผ ์ •์˜ํ•ฉ๋‹ˆ๋‹ค. greet I/O ์ž‘์—…์„ ์‹œ๋ฎฌ๋ ˆ์ด์…˜ํ•˜๋Š” asyncio.sleep(). ๊ทธ๋งŒํผ main ๊ธฐ๋Šฅ ์‚ฌ์šฉ asyncio.gather() ์—ฌ๋Ÿฌ ์ธ์‚ฌ๋ง์„ ๋™์‹œ์— ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค. sleep ์ง€์—ฐ์—๋„ ๋ถˆ๊ตฌํ•˜๊ณ  ์„ธ ์ธ์‚ฌ๋ง์ด ๋ชจ๋‘ ์•ฝ 1์ดˆ ํ›„์— ์ธ์‡„๋˜์–ด ๋น„๋™๊ธฐ ์‹คํ–‰์˜ ํž˜์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.

LLM API ํ˜ธ์ถœ์—์„œ ๋น„๋™๊ธฐ์˜ ํ•„์š”์„ฑ

LLM API๋กœ ์ž‘์—…ํ•  ๋•Œ, ์šฐ๋ฆฌ๋Š” ์ข…์ข… ์—ฌ๋Ÿฌ API ํ˜ธ์ถœ์„ ์ˆœ์„œ๋Œ€๋กœ ๋˜๋Š” ๋ณ‘๋ ฌ๋กœ ํ•ด์•ผ ํ•˜๋Š” ์‹œ๋‚˜๋ฆฌ์˜ค์— ์ง๋ฉดํ•ฉ๋‹ˆ๋‹ค. ๊ธฐ์กด์˜ ๋™๊ธฐ ์ฝ”๋“œ๋Š” ์ƒ๋‹นํ•œ ์„ฑ๋Šฅ ๋ณ‘๋ชฉ ํ˜„์ƒ์œผ๋กœ ์ด์–ด์งˆ ์ˆ˜ ์žˆ์œผ๋ฉฐ, ํŠนํžˆ LLM ์„œ๋น„์Šค์— ๋Œ€ํ•œ ๋„คํŠธ์›Œํฌ ์š”์ฒญ๊ณผ ๊ฐ™์€ ๊ณ  ์ง€์—ฐ ์ž‘์—…์„ ์ฒ˜๋ฆฌํ•  ๋•Œ ๊ทธ๋ ‡์Šต๋‹ˆ๋‹ค.

LLM API๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ 100๊ฐœ์˜ ์„œ๋กœ ๋‹ค๋ฅธ ๊ธฐ์‚ฌ์— ๋Œ€ํ•œ ์š”์•ฝ์„ ์ƒ์„ฑํ•ด์•ผ ํ•˜๋Š” ์‹œ๋‚˜๋ฆฌ์˜ค๋ฅผ ์ƒ๊ฐํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. ๋™๊ธฐ์  ์ ‘๊ทผ ๋ฐฉ์‹์„ ์‚ฌ์šฉํ•˜๋ฉด ๊ฐ API ํ˜ธ์ถœ์€ ์‘๋‹ต์„ ๋ฐ›์„ ๋•Œ๊นŒ์ง€ ์ฐจ๋‹จ๋˜์–ด ๋ชจ๋“  ์š”์ฒญ์„ ์™„๋ฃŒํ•˜๋Š” ๋ฐ ๋ช‡ ๋ถ„์ด ๊ฑธ๋ฆด ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋ฐ˜๋ฉด ๋น„๋™๊ธฐ์  ์ ‘๊ทผ ๋ฐฉ์‹์„ ์‚ฌ์šฉํ•˜๋ฉด ์—ฌ๋Ÿฌ API ํ˜ธ์ถœ์„ ๋™์‹œ์— ์‹œ์ž‘ํ•  ์ˆ˜ ์žˆ์–ด ์ „์ฒด ์‹คํ–‰ ์‹œ๊ฐ„์„ ํฌ๊ฒŒ ์ค„์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

ํ™˜๊ฒฝ ์„ค์ •

๋น„๋™๊ธฐ LLM API ํ˜ธ์ถœ์„ ์‹œ์ž‘ํ•˜๋ ค๋ฉด ํ•„์š”ํ•œ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ Python ํ™˜๊ฒฝ์„ ์„ค์ •ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ํ•„์š”ํ•œ ์‚ฌํ•ญ์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

  • ํŒŒ์ด์ฌ 3.7 ๋˜๋Š” ๊ทธ ์ด์ƒ(๋„ค์ดํ‹ฐ๋ธŒ asyncio ์ง€์›์˜ ๊ฒฝ์šฐ)
  • aiohttp: ๋น„๋™๊ธฐ HTTP ํด๋ผ์ด์–ธํŠธ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ
  • Openai: ๊ณต์‹ OpenAI Python ํด๋ผ์ด์–ธํŠธ (OpenAI์˜ GPT ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜๋Š” ๊ฒฝ์šฐ)
  • ๋žญ์ฒด์ธ: LLM์„ ์‚ฌ์šฉํ•˜์—ฌ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์„ ๊ตฌ์ถ•ํ•˜๊ธฐ ์œ„ํ•œ ํ”„๋ ˆ์ž„์›Œํฌ(์„ ํƒ ์‚ฌํ•ญ์ด์ง€๋งŒ ๋ณต์žกํ•œ ์›Œํฌํ”Œ๋กœ์— ๊ถŒ์žฅ๋จ)

pip๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ด๋Ÿฌํ•œ ์ข…์†์„ฑ์„ ์„ค์น˜ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

pip install aiohttp openai langchain

asyncio ๋ฐ aiohttp๋ฅผ ์‚ฌ์šฉํ•œ ๊ธฐ๋ณธ ๋น„๋™๊ธฐ LLM API ํ˜ธ์ถœ

aiohttp๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ LLM API์— ๋Œ€ํ•œ ๊ฐ„๋‹จํ•œ ๋น„๋™๊ธฐ ํ˜ธ์ถœ์„ ๋งŒ๋“ค์–ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. OpenAI์˜ GPT-3.5 API๋ฅผ ์˜ˆ๋กœ ๋“ค์–ด ์„ค๋ช…ํ•˜๊ฒ ์ง€๋งŒ, ์ด ๊ฐœ๋…์€ ๋‹ค๋ฅธ LLM API์—๋„ ์ ์šฉ๋ฉ๋‹ˆ๋‹ค.

import asyncio
import aiohttp
from openai import AsyncOpenAI

async def generate_text(prompt, client):
    response = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

async def main():
    prompts = [
        "Explain quantum computing in simple terms.",
        "Write a haiku about artificial intelligence.",
        "Describe the process of photosynthesis."
    ]
    
    async with AsyncOpenAI() as client:
        tasks = [generate_text(prompt, client) for prompt in prompts]
        results = await asyncio.gather(*tasks)
    
    for prompt, result in zip(prompts, results):
        print(f"Prompt: {prompt}\nResponse: {result}\n")

asyncio.run(main())

์ด ์˜ˆ์—์„œ ์šฐ๋ฆฌ๋Š” ๋น„๋™๊ธฐ ํ•จ์ˆ˜๋ฅผ ์ •์˜ํ•ฉ๋‹ˆ๋‹ค. generate_text AsyncOpenAI ํด๋ผ์ด์–ธํŠธ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ OpenAI API๋ฅผ ํ˜ธ์ถœํ•ฉ๋‹ˆ๋‹ค. main ์ด ๊ธฐ๋Šฅ์€ ๋‹ค์–‘ํ•œ ํ”„๋กฌํ”„ํŠธ์™€ ์šฉ๋„์— ๋Œ€ํ•ด ์—ฌ๋Ÿฌ ์ž‘์—…์„ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. asyncio.gather() ๋™์‹œ์— ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

์ด ์ ‘๊ทผ ๋ฐฉ์‹์„ ์‚ฌ์šฉํ•˜๋ฉด LLM API์— ์—ฌ๋Ÿฌ ์š”์ฒญ์„ ๋™์‹œ์— ๋ณด๋‚ผ ์ˆ˜ ์žˆ์–ด ๋ชจ๋“  ํ”„๋กฌํ”„ํŠธ๋ฅผ ์ฒ˜๋ฆฌํ•˜๋Š” ๋ฐ ํ•„์š”ํ•œ ์ด ์‹œ๊ฐ„์„ ํฌ๊ฒŒ ์ค„์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๊ณ ๊ธ‰ ๊ธฐ์ˆ : ๋ฐฐ์น˜ ๋ฐ ๋™์‹œ์„ฑ ์ œ์–ด

์ด์ „ ์˜ˆ์ œ๋Š” ๋น„๋™๊ธฐ LLM API ํ˜ธ์ถœ์˜ ๊ธฐ๋ณธ์„ ๋ณด์—ฌ์ฃผ์ง€๋งŒ, ์‹ค์ œ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์—์„œ๋Š” ๋” ์ •๊ตํ•œ ์ ‘๊ทผ ๋ฐฉ์‹์ด ํ•„์š”ํ•œ ๊ฒฝ์šฐ๊ฐ€ ๋งŽ์Šต๋‹ˆ๋‹ค. ๋‘ ๊ฐ€์ง€ ์ค‘์š”ํ•œ ๊ธฐ์ˆ , ์ฆ‰ ์š”์ฒญ ์ผ๊ด„ ์ฒ˜๋ฆฌ์™€ ๋™์‹œ์„ฑ ์ œ์–ด๋ฅผ ์‚ดํŽด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

์š”์ฒญ ์ผ๊ด„ ์ฒ˜๋ฆฌ: ๋งŽ์€ ์ˆ˜์˜ ํ”„๋กฌํ”„ํŠธ๋ฅผ ์ฒ˜๋ฆฌํ•  ๋•Œ, ๊ฐ ํ”„๋กฌํ”„ํŠธ์— ๋Œ€ํ•ด ๊ฐœ๋ณ„ ์š”์ฒญ์„ ๋ณด๋‚ด๋Š” ๊ฒƒ๋ณด๋‹ค ์—ฌ๋Ÿฌ ๊ฐœ์˜ ํ”„๋กฌํ”„ํŠธ๋ฅผ ๋ฌถ์–ด ์ผ๊ด„ ์ฒ˜๋ฆฌํ•˜๋Š” ๊ฒƒ์ด ๋” ํšจ์œจ์ ์ธ ๊ฒฝ์šฐ๊ฐ€ ๋งŽ์Šต๋‹ˆ๋‹ค. ์ด๋ ‡๊ฒŒ ํ•˜๋ฉด ์—ฌ๋Ÿฌ API ํ˜ธ์ถœ๋กœ ์ธํ•œ ์˜ค๋ฒ„ํ—ค๋“œ๋ฅผ ์ค„์ด๊ณ  ์„ฑ๋Šฅ์„ ํ–ฅ์ƒ์‹œํ‚ฌ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

import asyncio
from openai import AsyncOpenAI

async def process_batch(batch, client):
    responses = await asyncio.gather(*[
        client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}]
        ) for prompt in batch
    ])
    return [response.choices[0].message.content for response in responses]

async def main():
    prompts = [f"Tell me a fact about number {i}" for i in range(100)]
    batch_size = 10
    
    async with AsyncOpenAI() as client:
        results = []
        for i in range(0, len(prompts), batch_size):
            batch = prompts[i:i+batch_size]
            batch_results = await process_batch(batch, client)
            results.extend(batch_results)
    
    for prompt, result in zip(prompts, results):
        print(f"Prompt: {prompt}\nResponse: {result}\n")

asyncio.run(main())

๋™์‹œ์„ฑ ์ œ์–ด: ๋น„๋™๊ธฐ ํ”„๋กœ๊ทธ๋ž˜๋ฐ์€ ๋™์‹œ ์‹คํ–‰์„ ํ—ˆ์šฉํ•˜์ง€๋งŒ, API ์„œ๋ฒ„์— ๊ณผ๋ถ€ํ•˜๊ฐ€ ๊ฑธ๋ฆฌ๊ฑฐ๋‚˜ ์†๋„ ์ œํ•œ์„ ์ดˆ๊ณผํ•˜์ง€ ์•Š๋„๋ก ๋™์‹œ์„ฑ ์ˆ˜์ค€์„ ์ œ์–ดํ•˜๋Š” โ€‹โ€‹๊ฒƒ์ด ์ค‘์š”ํ•ฉ๋‹ˆ๋‹ค. ์ด๋ฅผ ์œ„ํ•ด asyncio.Semaphore๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

import asyncio
from openai import AsyncOpenAI

async def generate_text(prompt, client, semaphore):
    async with semaphore:
        response = await client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content

async def main():
    prompts = [f"Tell me a fact about number {i}" for i in range(100)]
    max_concurrent_requests = 5
    semaphore = asyncio.Semaphore(max_concurrent_requests)
    
    async with AsyncOpenAI() as client:
        tasks = [generate_text(prompt, client, semaphore) for prompt in prompts]
        results = await asyncio.gather(*tasks)
    
    for prompt, result in zip(prompts, results):
        print(f"Prompt: {prompt}\nResponse: {result}\n")

asyncio.run(main())

์ด ์˜ˆ์—์„œ๋Š” ์„ธ๋งˆํฌ์–ด๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋™์‹œ ์š”์ฒญ ์ˆ˜๋ฅผ 5๊ฐœ๋กœ ์ œํ•œํ•˜์—ฌ API ์„œ๋ฒ„์— ๊ณผ๋ถ€ํ•˜๊ฐ€ ๊ฑธ๋ฆฌ์ง€ ์•Š๋„๋ก ํ•ฉ๋‹ˆ๋‹ค.

๋น„๋™๊ธฐ LLM ํ˜ธ์ถœ์—์„œ์˜ ์˜ค๋ฅ˜ ์ฒ˜๋ฆฌ ๋ฐ ์žฌ์‹œ๋„

์™ธ๋ถ€ API๋ฅผ ์‚ฌ์šฉํ•  ๋•Œ๋Š” ๊ฐ•๋ ฅํ•œ ์˜ค๋ฅ˜ ์ฒ˜๋ฆฌ ๋ฐ ์žฌ์‹œ๋„ ๋ฉ”์ปค๋‹ˆ์ฆ˜์„ ๊ตฌํ˜„ํ•˜๋Š” ๊ฒƒ์ด ๋งค์šฐ ์ค‘์š”ํ•ฉ๋‹ˆ๋‹ค. ์ผ๋ฐ˜์ ์ธ ์˜ค๋ฅ˜๋ฅผ ์ฒ˜๋ฆฌํ•˜๊ณ  ์žฌ์‹œ๋„์— ๋Œ€ํ•œ ์ง€์ˆ˜ ๋ฐฑ์˜คํ”„๋ฅผ ๊ตฌํ˜„ํ•˜๋„๋ก ์ฝ”๋“œ๋ฅผ ๊ฐœ์„ ํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

import asyncio
from openai import AsyncOpenAI
from tenacity import retry, stop_after_attempt, wait_exponential

class APIError(Exception):
    pass

@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10))
async def generate_text_with_retry(prompt, client):
    try:
        response = await client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content
    except Exception as e:
        print(f"Error occurred: {e}")
        raise APIError("Failed to generate text")

async def process_prompt(prompt, client, semaphore):
    async with semaphore:
        try:
            result = await generate_text_with_retry(prompt, client)
            return prompt, result
        except APIError:
            return prompt, "Failed to generate response after multiple attempts."

async def main():
    prompts = [f"Tell me a fact about number {i}" for i in range(20)]
    max_concurrent_requests = 5
    semaphore = asyncio.Semaphore(max_concurrent_requests)
    
    async with AsyncOpenAI() as client:
        tasks = [process_prompt(prompt, client, semaphore) for prompt in prompts]
        results = await asyncio.gather(*tasks)
    
    for prompt, result in results:
        print(f"Prompt: {prompt}\nResponse: {result}\n")

asyncio.run(main())

์ด ํ–ฅ์ƒ๋œ ๋ฒ„์ „์—๋Š” ๋‹ค์Œ์ด ํฌํ•จ๋ฉ๋‹ˆ๋‹ค.

  • ์‚ฌ์šฉ์ž ์ง€์ • APIError API ๊ด€๋ จ ์˜ค๋ฅ˜์— ๋Œ€ํ•œ ์˜ˆ์™ธ์ž…๋‹ˆ๋‹ค.
  • A generate_text_with_retry ์žฅ์‹๋œ ๊ธฐ๋Šฅ @retry tenacity ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ์—์„œ ์ง€์ˆ˜ ๋ฐฑ์˜คํ”„๋ฅผ ๊ตฌํ˜„ํ•ฉ๋‹ˆ๋‹ค.
  • ์˜ค๋ฅ˜ ์ฒ˜๋ฆฌ process_prompt ์˜ค๋ฅ˜๋ฅผ ํฌ์ฐฉํ•˜๊ณ  ๋ณด๊ณ ํ•˜๋Š” ๊ธฐ๋Šฅ.

์„ฑ๋Šฅ ์ตœ์ ํ™”: ์ŠคํŠธ๋ฆฌ๋ฐ ์‘๋‹ต

์žฅ๋ฌธ ์ฝ˜ํ…์ธ  ์ƒ์„ฑ์˜ ๊ฒฝ์šฐ ์ŠคํŠธ๋ฆฌ๋ฐ ์‘๋‹ต์€ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์˜ ์ธ์ง€๋œ ์„ฑ๋Šฅ์„ ํฌ๊ฒŒ ๊ฐœ์„ ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ „์ฒด ์‘๋‹ต์„ ๊ธฐ๋‹ค๋ฆฌ๋Š” ๋Œ€์‹ , ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•ด์ง€๋ฉด ํ…์ŠคํŠธ ์ฒญํฌ๋ฅผ ์ฒ˜๋ฆฌํ•˜๊ณ  ํ‘œ์‹œํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

import asyncio
from openai import AsyncOpenAI

async def stream_text(prompt, client):
    stream = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        stream=True
    )
    
    full_response = ""
    async for chunk in stream:
        if chunk.choices[0].delta.content is not None:
            content = chunk.choices[0].delta.content
            full_response += content
            print(content, end='', flush=True)
    
    print("\n")
    return full_response

async def main():
    prompt = "Write a short story about a time-traveling scientist."
    
    async with AsyncOpenAI() as client:
        result = await stream_text(prompt, client)
    
    print(f"Full response:\n{result}")

asyncio.run(main())

์ด ์˜ˆ์ œ๋Š” API์—์„œ ์‘๋‹ต์„ ์ŠคํŠธ๋ฆฌ๋ฐํ•˜๊ณ  ๋„์ฐฉํ•˜๋Š” ๋Œ€๋กœ ๊ฐ ์ฒญํฌ๋ฅผ ์ธ์‡„ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค. ์ด ์ ‘๊ทผ ๋ฐฉ์‹์€ ํŠนํžˆ ์ฑ„ํŒ… ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์ด๋‚˜ ์‚ฌ์šฉ์ž์—๊ฒŒ ์‹ค์‹œ๊ฐ„ ํ”ผ๋“œ๋ฐฑ์„ ์ œ๊ณตํ•˜๋ ค๋Š” ๋ชจ๋“  ์‹œ๋‚˜๋ฆฌ์˜ค์— ์œ ์šฉํ•ฉ๋‹ˆ๋‹ค.

LangChain์„ ์‚ฌ์šฉํ•˜์—ฌ ๋น„๋™๊ธฐ ์›Œํฌํ”Œ๋กœ ๊ตฌ์ถ•

๋” ๋ณต์žกํ•œ LLM ๊ธฐ๋ฐ˜ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์˜ ๊ฒฝ์šฐ LangChain ํ”„๋ ˆ์ž„์›Œํฌ ์—ฌ๋Ÿฌ LLM ํ˜ธ์ถœ์„ ์ฒด์ธ์œผ๋กœ ์—ฐ๊ฒฐํ•˜๊ณ  ๋‹ค๋ฅธ ๋„๊ตฌ๋ฅผ ํ†ตํ•ฉํ•˜๋Š” ๊ณผ์ •์„ ๊ฐ„์†Œํ™”ํ•˜๋Š” ๊ณ ์ˆ˜์ค€ ์ถ”์ƒํ™”๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. ๋น„๋™๊ธฐ ๊ธฐ๋Šฅ์„ ๊ฐ–์ถ˜ LangChain์„ ์‚ฌ์šฉํ•˜๋Š” ์˜ˆ๋ฅผ ์‚ดํŽด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

์ด ์˜ˆ์—์„œ๋Š” LangChain์„ ์‚ฌ์šฉํ•˜์—ฌ ์ŠคํŠธ๋ฆฌ๋ฐ ๋ฐ ๋น„๋™๊ธฐ ์‹คํ–‰์„ ํ†ตํ•ด ๋” ๋ณต์žกํ•œ ์›Œํฌํ”Œ๋กœ๋ฅผ ๋งŒ๋“œ๋Š” ๋ฐฉ๋ฒ•์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค. AsyncCallbackManager ๊ทธ๋ฆฌ๊ณ  StreamingStdOutCallbackHandler ์ƒ์„ฑ๋œ ์ฝ˜ํ…์ธ ์˜ ์‹ค์‹œ๊ฐ„ ์ŠคํŠธ๋ฆฌ๋ฐ์„ ํ™œ์„ฑํ™”ํ•ฉ๋‹ˆ๋‹ค.

import asyncio
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.callbacks.manager import AsyncCallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

async def generate_story(topic):
    llm = OpenAI(temperature=0.7, streaming=True, callback_manager=AsyncCallbackManager([StreamingStdOutCallbackHandler()]))
    prompt = PromptTemplate(
        input_variables=["topic"],
        template="Write a short story about {topic}."
    )
    chain = LLMChain(llm=llm, prompt=prompt)
    return await chain.arun(topic=topic)

async def main():
    topics = ["a magical forest", "a futuristic city", "an underwater civilization"]
    tasks = [generate_story(topic) for topic in topics]
    stories = await asyncio.gather(*tasks)
    
    for topic, story in zip(topics, stories):
        print(f"\nTopic: {topic}\nStory: {story}\n{'='*50}\n")

asyncio.run(main())

FastAPI๋ฅผ ์‚ฌ์šฉํ•œ ๋น„๋™๊ธฐ LLM ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ์ œ๊ณต

๋น„๋™๊ธฐ LLM ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์„ ์›น ์„œ๋น„์Šค๋กœ ์ œ๊ณตํ•˜๋ ค๋ฉด ๋น„๋™๊ธฐ ์ž‘์—…์„ ๊ธฐ๋ณธ์ ์œผ๋กœ ์ง€์›ํ•˜๋Š” FastAPI๊ฐ€ ๋งค์šฐ ์ ํ•ฉํ•ฉ๋‹ˆ๋‹ค. ๋‹ค์Œ์€ ํ…์ŠคํŠธ ์ƒ์„ฑ์„ ์œ„ํ•œ ๊ฐ„๋‹จํ•œ API ์—”๋“œํฌ์ธํŠธ๋ฅผ ๋งŒ๋“œ๋Š” ๋ฐฉ๋ฒ•์˜ ์˜ˆ์ž…๋‹ˆ๋‹ค.

import asyncio

from fastapi import FastAPI, BackgroundTasks
from pydantic import BaseModel
from openai import AsyncOpenAI

app = FastAPI()
client = AsyncOpenAI()

class GenerationRequest(BaseModel):
    prompt: str

class GenerationResponse(BaseModel):
    generated_text: str

@app.post("/generate", response_model=GenerationResponse)
async def generate_text(request: GenerationRequest, background_tasks: BackgroundTasks):
    response = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": request.prompt}]
    )
    generated_text = response.choices[0].message.content
    
    # Simulate some post-processing in the background
    background_tasks.add_task(log_generation, request.prompt, generated_text)
    
    return GenerationResponse(generated_text=generated_text)

async def log_generation(prompt: str, generated_text: str):
    # Simulate logging or additional processing
    await asyncio.sleep(2)
    print(f"Logged: Prompt '{prompt}' generated text of length {len(generated_text)}")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)

์ด FastAPI ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์€ ์—”๋“œํฌ์ธํŠธ๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. /generate ํ”„๋กฌํ”„ํŠธ๋ฅผ ๋ฐ›์•„๋“ค์ด๊ณ  ์ƒ์„ฑ๋œ ํ…์ŠคํŠธ๋ฅผ ๋ฐ˜ํ™˜ํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ๋˜ํ•œ ์‘๋‹ต์„ ์ฐจ๋‹จํ•˜์ง€ ์•Š๊ณ  ์ถ”๊ฐ€ ์ฒ˜๋ฆฌ๋ฅผ ์œ„ํ•ด ๋ฐฑ๊ทธ๋ผ์šด๋“œ ์ž‘์—…์„ ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•๋„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.

๋ชจ๋ฒ” ์‚ฌ๋ก€ ๋ฐ ์ผ๋ฐ˜์ ์ธ ํ•จ์ •

๋น„๋™๊ธฐ LLM API๋ฅผ ์‚ฌ์šฉํ•  ๋•Œ ๋‹ค์Œ ๋ชจ๋ฒ” ์‚ฌ๋ก€๋ฅผ ์—ผ๋‘์— ๋‘์‹ญ์‹œ์˜ค.

  1. ์—ฐ๊ฒฐ ํ’€๋ง ์‚ฌ์šฉ: ์—ฌ๋Ÿฌ ์š”์ฒญ์„ ํ•˜๋Š” ๊ฒฝ์šฐ, ์˜ค๋ฒ„ํ—ค๋“œ๋ฅผ ์ค„์ด๊ธฐ ์œ„ํ•ด ์—ฐ๊ฒฐ์„ ์žฌ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.
  2. ์ ์ ˆํ•œ ์˜ค๋ฅ˜ ์ฒ˜๋ฆฌ๋ฅผ ๊ตฌํ˜„ํ•˜์„ธ์š”: ๋„คํŠธ์›Œํฌ ๋ฌธ์ œ, API ์˜ค๋ฅ˜, ์˜ˆ์ƒ์น˜ ๋ชปํ•œ ์‘๋‹ต์— ํ•ญ์ƒ ๋Œ€๋น„ํ•˜์„ธ์š”.
  3. ์š”๊ธˆ ์ œํ•œ์„ ์กด์ค‘ํ•˜์„ธ์š”: API์— ๊ณผ๋ถ€ํ•˜๊ฐ€ ๊ฑธ๋ฆฌ๋Š” ๊ฒƒ์„ ๋ฐฉ์ง€ํ•˜๋ ค๋ฉด ์„ธ๋งˆํฌ์–ด๋‚˜ ๊ธฐํƒ€ ๋™์‹œ์„ฑ ์ œ์–ด ๋ฉ”์ปค๋‹ˆ์ฆ˜์„ ์‚ฌ์šฉํ•˜์„ธ์š”.
  4. ๋ชจ๋‹ˆํ„ฐ ๋ฐ ๊ธฐ๋ก: ์„ฑ๋Šฅ์„ ์ถ”์ ํ•˜๊ณ  ๋ฌธ์ œ๋ฅผ ์‹๋ณ„ํ•˜๊ธฐ ์œ„ํ•ด ํฌ๊ด„์ ์ธ ๋กœ๊น…์„ ๊ตฌํ˜„ํ•ฉ๋‹ˆ๋‹ค.
  5. ์žฅํŽธ ์ฝ˜ํ…์ธ ์—๋Š” ์ŠคํŠธ๋ฆฌ๋ฐ์„ ์‚ฌ์šฉํ•˜์„ธ์š”: ์‚ฌ์šฉ์ž ๊ฒฝํ—˜์„ ํ–ฅ์ƒ์‹œํ‚ค๊ณ  ๋ถ€๋ถ„์ ์ธ ๊ฒฐ๊ณผ์˜ ์กฐ๊ธฐ ์ฒ˜๋ฆฌ๊ฐ€ ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.

์ €๋Š” ์ง€๋‚œ 50๋…„ ๋™์•ˆ ๊ธฐ๊ณ„ ํ•™์Šต๊ณผ ๋”ฅ ๋Ÿฌ๋‹์˜ ๋งคํ˜น์ ์ธ ์„ธ๊ณ„์— ๋ชฐ๋‘ํ–ˆ์Šต๋‹ˆ๋‹ค. ์ €์˜ ์—ด์ •๊ณผ ์ „๋ฌธ โ€‹โ€‹์ง€์‹์€ ํŠนํžˆ AI/ML์— ์ค‘์ ์„ ๋‘” XNUMX๊ฐœ ์ด์ƒ์˜ ๋‹ค์–‘ํ•œ ์†Œํ”„ํŠธ์›จ์–ด ์—”์ง€๋‹ˆ์–ด๋ง ํ”„๋กœ์ ํŠธ์— ๊ธฐ์—ฌํ•˜๋„๋ก ์ด๋Œ์—ˆ์Šต๋‹ˆ๋‹ค. ๋‚˜์˜ ๊ณ„์†๋˜๋Š” ํ˜ธ๊ธฐ์‹ฌ์€ ๋˜ํ•œ ๋‚ด๊ฐ€ ๋” ํƒ๊ตฌํ•˜๊ณ  ์‹ถ์€ ๋ถ„์•ผ์ธ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ๋กœ ๋‚˜๋ฅผ ์ด๋Œ์—ˆ์Šต๋‹ˆ๋‹ค.