O txtai com OpenAI oferece uma solução completa para pesquisa semântica, orquestração de LLM e fluxos de trabalho de modelos de linguagem. Com a capacidade de hospedar um serviço FastAPI e clientes para diversas linguagens, como Python e JavaScript, o txtai facilita a integração e o uso. A compatibilidade com endpoints da OpenAI permite utilizar clientes OpenAI padrão, ideal para testar o txtai e desenvolver localmente.
Serviço de API txtai com OpenAI
Para este artigo, vamos usar o txtai através do Docker. Primeiro, salve a seguinte configuração no arquivo /tmp/config/config.yml
:
config.yml
# Enable OpenAI compat endpoint
openai: True
# Load Wikipedia Embeddings index
cloud:
provider: huggingface-hub
container: neuml/txtai-wikipedia
# LLM instance
llm:
path: llava-hf/llava-interleave-qwen-0.5b-hf
# RAG pipeline configuration
rag:
path: hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4
output: flatten
system: You are a friendly assistant. You answer questions from users.
template: |
Answer the following question using only the context below. Only include information
specifically discussed.
question: {question}
context: {context}
# Text to Speech
texttospeech:
path: neuml/kokoro-fp16-onnx
# Transcription
transcription:
path: distil-whisper/distil-large-v3
Inicie o serviço Docker com o seguinte comando:
docker run -it -p 8000:8000 -v /tmp/config:/config -e CONFIG=/config/config.yml \
--entrypoint uvicorn neuml/txtai-gpu --host 0.0.0.0 txtai.api:app
Alternativamente, o txtai pode ser instalado diretamente e executado da seguinte forma:
pip install txtai[all] autoawq autoawq-kernels
CONFIG=/tmp/config/config.yml uvicorn "txtai.api:app"
A API possui autorização baseada em token integrada. Leia mais sobre isso aqui.
Executando uma conclusão de chat de texto
O primeiro exemplo executará uma conclusão de chat de texto. O modelo é um pipeline RAG – mais sofisticado do que uma simples chamada de LLM! Agentes, pipelines e fluxos de trabalho podem ser executados através desta interface!
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8000/v1",
api_key="api-key",
)
response = client.chat.completions.create(
messages=[{
"role": "user",
"content": "Tell me about the iPhone",
}],
model="rag",
stream=True
)
for chunk in response:
if chunk.choices:
print(chunk.choices[0].delta.content, end="")
The iPhone is a line of smartphones designed and marketed by Apple Inc. that uses Apple's iOS mobile operating system. The first-generation iPhone was announced by former Apple CEO Steve Jobs on January 9, 2007.
Since then, Apple has annually released new iPhone models and iOS updates. The most recent models being the iPhone 16 and 16 Plus, and the higher-end iPhone 16 Pro and 16 Pro Max.
More than 2.3 billion iPhones have been sold as of January 1, 2024, making Apple the largest vendor of mobile phones in 2023.
Como mencionado, o modelo suporta grande parte do que está disponível no txtai. Por exemplo, vamos executar uma conclusão de chat que executa uma pesquisa de embeddings.
response = client.chat.completions.create(
messages=[{
"role": "user",
"content": "Tell me about the iPhone",
}],
model="embeddings",
)
print(response.choices[0].message.content)
The iPhone is a line of smartphones designed and marketed by Apple Inc. that uses Apple's iOS mobile operating system. The first-generation iPhone was announced by former Apple CEO Steve Jobs on January 9, 2007. Since then, Apple has annually released new iPhone models and iOS updates. iPhone naming has followed various patterns throughout its history.
Modelos de visão
Qualquer LLM txtai suportado pode ser executado através da API de conclusão de chat. Vamos executar um exemplo que descreve uma imagem.
response = client.chat.completions.create(
model="llm",
messages = [{
"role": "user",
"content": [
{"type": "text", "text": "What is in this image?"},
{"type": "image_url", "image_url": {
"url": "https://raw.githubusercontent.com/neuml/txtai/master/logo.png",
}}
]}
]
)
print(response.choices[0].message.content)
The image shows a logo with the text "Txtai" in blue and green colors.
API de embeddings
Em seguida, vamos gerar embeddings.
for x in client.embeddings.create(input="This is a test", model="vectors").data:
print(x)
Embedding(embedding=[-0.01969079300761223, 0.024085085839033127, 0.0043829963542521, -0.027423616498708725, 0.040405914187431335, 0.017446696758270264, 0.028464825823903084, 0.000792442646343261, -0.03107883222401142, -0.024745089933276176, -0.013542148284614086, 0.039981111884117126, -0.01401221938431263, -0.011294773779809475, -0.04346214607357979, 0.015698621049523354, 0.03775031119585037, -0.009020405821502209, 0.046784739941358566, -0.017400527372956276, -0.0670166090130806, -0.05122058466076851, 0.027725063264369965, -0.023947732523083687, -0.044582683593034744, 0.04960233345627785, 0.029517438262701035, 0.05424104258418083, -0.06027599796652794, -0.035852570086717606, 0.01336587406694889, -0.008941668085753918, 0.00014064145216252655, -0.05230511724948883, -0.02150369994342327, 0.04969678074121475, -0.05967864394187927, -0.029450856149196625, -0.01113089732825756, -0.01256561279296875, -0.012282170355319977, 0.03466389700770378, -0.005313237197697163, -0.037443146109580994, -0.04366842657327652, -0.019057273864746094, -0.04015717655420303, 0.050483088940382004, -0.011932676658034325, -0.026569517329335213, -0.048395730555057526, 0.021978085860610008, 0.03273716941475868, -0.009176520630717278, -0.05367470160126686, 0.01982428878545761, -0.00373812741599977, 0.009933742694556713, 0.044389136135578156, -0.06162404641509056, 0.03372818976640701, 0.006638737395405769, 0.029836857691407204, 0.014663859270513058, 0.04531872272491455, 0.03151382878422737, -0.007935297675430775, -0.02053055912256241, -0.06477595120668411, -0.017908524721860886, -0.014721200801432133, -0.0072138686664402485, -0.03244556114077568, -0.018965184688568115, 0.04862097278237343, -0.02961636148393154, 0.005204972345381975, 0.015699708834290504, 0.05033862590789795, -0.017976371571421623, -0.05143386870622635, -0.014295309782028198, -0.018152274191379547, 0.04641849547624588, 0.007279090117663145, -0.0060980431735515594, -0.04208022356033325, 0.05402654781937599, 0.0001357585861114785, 0.044958386570215225, -0.03261513262987137, 0.02126067876815796, 0.020893605425953865, -0.007570710498839617, -0.015284491702914238, -0.011333705857396126, -0.006006874144077301, 0.03481211140751839, 0.04163122922182083, -0.0683935135602951, -0.030256368219852448, -0.024272358044981956, 0.04630593582987785, -0.05253031477332115, 0.011599390767514706, -0.034757863730192184, 0.0033751465380191803, -0.03200560435652733, -0.04386962205171585, 0.015501669608056545, -0.01703309454023838, -0.029905665665864944, -0.03208091855049133, -0.027883553877472878, 0.007325653452426195, 0.03735042363405228, 0.08069189637899399, -0.044986918568611145, 0.030896944925189018, 0.017477652058005333, -0.0063758366741240025, 0.02287706732749939, 0.016398170962929726, 0.01946086250245571, 0.012854589149355888, 0.04439576715230942, 0.04581235349178314, -0.0008034319616854191, -0.03105296567082405, -0.024504775181412697, 0.023659462109208107, 0.04492054134607315, -0.025883017107844353, -0.002515115775167942, 0.052770763635635376, 0.009667950682342052, -0.022283289581537247, -0.07817766815423965, 0.03883073106408119, -0.04804662615060806, 0.011968107894062996, -0.03163604810833931, 0.030380938202142715, -0.022775596007704735, 0.03142687678337097, -0.11540865898132324, -0.062065351754426956, 0.003252241527661681, -0.016604064032435417, 0.046795569360256195, -0.01973356492817402, 0.005612187087535858, 0.04902602732181549, -0.029760321602225304, -0.0006560107576660812, 0.02137850970029831, 0.021465344354510307, -0.030499190092086792, -0.013952907174825668, 0.015388991683721542, -0.004734670277684927, -0.02678225375711918, 0.056917935609817505, -0.0031489196699112654, -0.000562859873753041, 0.08021821081638336, 0.045039497315883636, 0.051955677568912506, -0.06851264089345932, -0.0202163215726614, -0.020257024094462395, 0.009915929287672043, 0.027132542803883553, -0.039319392293691635, -0.06750114262104034, 0.00721193291246891, 0.011379252187907696, -0.00012379158579278737, 0.021098755300045013, -0.017165066674351692, -0.06655416637659073, 0.03575438633561134, -0.0449126660823822, 0.024580610916018486, 0.0027450474444776773, -0.07029049843549728, -0.0058233728632330894, -0.0031324869487434626, -0.022562572732567787, -0.002092051785439253, -0.01972377672791481, -0.014447340741753578, 0.02001781575381756, -0.04224644973874092, 0.08794320374727249, -0.05012425035238266, -0.03000028431415558, -0.006967171560972929, -0.0206689964979887, 0.042854372411966324, 0.018307263031601906, 0.04896565154194832, 0.025682201609015465, -0.013927857391536236, -0.026135331019759178, 0.05985535308718681, -0.022972915321588516, -0.06837267428636551, 0.03858938440680504, 0.01297465804964304, -0.01869095303118229, -0.014788917265832424, 0.05812034010887146, -0.005296449176967144, -0.03188127279281616, -0.014335273765027523, 0.029694614931941032, -0.006149643566459417, 0.0199541375041008, -0.04401557520031929, 0.08680693805217743, 0.02373044192790985, -0.05719068646430969, 0.00026498522493056953, -0.047968123108148575, 0.05128588527441025, 0.08984201401472092, 0.018948959186673164, -0.019343748688697815, -0.02114059403538704, -0.000319077109452337, -0.0483400821685791, 0.02235756441950798, -0.04526951164007187, -0.016685402020812035, 0.04920167103409767, 0.0009292830363847315, 0.0066963727585971355, 0.06434790045022964, -0.07675006985664368, 0.025055741891264915, 0.039694759994745255, -0.04413995519280434, 0.053703855723142624, 0.022806784138083458, -0.02683648094534874, 0.04088520258665085, -0.02505207061767578, 0.038970883935689926, -0.011933756060898304, 0.017762111499905586, -0.052576545625925064, -0.02732933685183525, 0.024120833724737167, -0.011316879652440548, -0.04519795626401901, 0.012005027383565903, 0.016074027866125107, -0.019522851333022118, 0.07912492007017136, -0.010790158063173294, 0.003584112972021103, -0.018683504313230515, -0.03872854635119438, -0.0293426550924778, -0.028616394847631454, 0.0034447587095201015, 0.008824280463159084, 0.0267381239682436, -0.014405295252799988, 0.01340708788484335, 0.022090492770075798, 0.041456740349531174, 0.01306570041924715, 0.012696513906121254, -0.05636722221970558, 0.05526677146553993, 0.014159836806356907, -0.05075988918542862, -0.03631533309817314, 0.04115152731537819, 0.06140957027673721, -0.06539256870746613, -0.01610933430492878, 0.08129005879163742, -0.054096464067697525, 0.021539339795708656, -0.009134260006248951, 0.04177645593881607, 0.026524635031819344, 0.016892578452825546, 0.037963252514600754, -0.06906059384346008, 0.050708942115306854, 0.06792867928743362, -0.0004703162703663111, 0.018694648519158363, -0.031178174540400505, -0.03567223250865936, -0.035771071910858154, 0.05392008647322655, 0.06253240257501602, -0.020289720967411995, 0.034436099231243134, 0.03414503112435341, 0.0034774248488247395, -0.04452746734023094, -0.03509671986103058, -0.10872040688991547, 0.016063231974840164, 0.047865595668554306, -0.04542273283004761