This project sends multiple asynchronous requests to the 'completions' or 'embeddings' endpoint of an OpenAI API-compatible server, enabling batch processing of large request volumes.
It is intended as example code and is meant to be extended for your specific use case.
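At its core, the approach is to fan out one coroutine per prompt and gather the responses concurrently. The sketch below illustrates that pattern with aiohttp; it is a simplified illustration rather than the project's actual implementation, and the helper names, example base URL, and model name are placeholders.

```python
import asyncio
import aiohttp

async def send_one(session, base_url, model, prompt, api_token=None):
    # Hypothetical helper: POST a single prompt to the completions endpoint.
    headers = {"Authorization": f"Bearer {api_token}"} if api_token else {}
    payload = {"model": model, "prompt": prompt}
    async with session.post(base_url + "completions", json=payload, headers=headers) as resp:
        return await resp.json()

async def send_all(prompts, base_url, model, api_token=None):
    # Fire off all requests concurrently and wait for every response.
    async with aiohttp.ClientSession() as session:
        tasks = [send_one(session, base_url, model, p, api_token) for p in prompts]
        return await asyncio.gather(*tasks)

if __name__ == "__main__":
    # Placeholder prompts, URL, and model name for illustration only.
    prompts = ["Hello, world!", "What is async IO?"]
    results = asyncio.run(send_all(prompts, "http://localhost:8000/v1/", "my-org/my-model"))
    print(results)
```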
- Install uv
pip install uv
- Create an environment
uv venv
- Activate environment
source .venv/bin/activate
- In activated environment, install requirements
uv pip install -r requirements.txt
- Run the script
python openai_async.py --filename prompts.txt --num_requests 10 --model modelname --base_url https://your/server/v1/ --api_token $API_TOKEN --api_endpoint completions
Arguments:
--filename - Path to file with prompts, one prompt per line
--num_requests - Number of requests to send, defaults to all prompts in the file
--model - Model name in HuggingFace naming format, must be specified
--base_url - Base URL of OpenAI API compliant server. E.g., https://localhost:$API_PORT/v1/
--api_token - JWT token for auth header, defaults to 'None'
--api_endpoint - 'completions' or 'embeddings', defaults to 'completions'
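For reference, the options above map naturally onto an argparse parser. The following is a sketch of how such a parser might be wired up, with defaults mirroring the descriptions in this list; it is an assumption about the interface, not the project's actual source.

```python
import argparse

def parse_args():
    # Hypothetical argument parser mirroring the options listed above.
    parser = argparse.ArgumentParser(description="Batch async requests to an OpenAI API-compatible server")
    parser.add_argument("--filename", required=True, help="File with one prompt per line")
    parser.add_argument("--num_requests", type=int, default=None, help="Number of requests; default sends all prompts")
    parser.add_argument("--model", required=True, help="Model name in HuggingFace naming format")
    parser.add_argument("--base_url", required=True, help="Base URL of an OpenAI API-compliant server")
    parser.add_argument("--api_token", default=None, help="JWT token for the Authorization header")
    parser.add_argument("--api_endpoint", default="completions", choices=["completions", "embeddings"])
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    prompts = open(args.filename).read().splitlines()
    if args.num_requests is not None:
        prompts = prompts[: args.num_requests]
    print(f"Loaded {len(prompts)} prompts for the {args.api_endpoint} endpoint")
```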