Last login: Thu, Oct 16, 4:05 AM
# caponier makes LLM serving faster and more reliable
vx --profile-model llama-3.1-8b
# analyzing inference performance...
pref --cache-predictions --model llama-3.1-8b
# warming cache for peak traffic...
# vx & pref coming soon