3 docs tagged with "monitoring"

Active Monitoring

Once your inference stack is deployed, set up active monitoring so that you are alerted to issues before they affect your users.

Health Probes

Kubernetes offers liveness, readiness, and startup probes to monitor the health of your applications. You can configure these probes in the Inference Stack to check that your models are running correctly.
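
As a rough illustration of what a probe talks to, the sketch below shows a minimal HTTP handler that a liveness or readiness probe could call. The paths `/healthz` and `/readyz`, the `STATE` flag, and the `serve` helper are assumptions made for this example; they are not taken from the Inference Stack itself.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical flag: a real server would set this once the model has finished loading.
STATE = {"model_loaded": False}


def probe_response(path: str, model_loaded: bool) -> tuple[int, str]:
    """Map a probe path to an HTTP status code and a short body."""
    if path == "/healthz":
        # Liveness: the process is up and able to answer requests.
        return 200, "ok"
    if path == "/readyz":
        # Readiness: only report ready once the model is loaded.
        return (200, "ready") if model_loaded else (503, "loading")
    return 404, "not found"


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = probe_response(self.path, STATE["model_loaded"])
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Start the probe server; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Calling `serve()` starts the endpoint on port 8080. A Kubernetes `httpGet` probe pointed at `/readyz` would then see a 503 response, which it treats as a failure, until `STATE["model_loaded"]` is set to `True`. Keeping liveness and readiness on separate paths lets a slow model load delay traffic without triggering a container restart.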

Metrics

We use Prometheus to aggregate metrics from your LLM applications. Prometheus is an open source time series database and monitoring system.
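
Prometheus collects metrics by scraping a plain-text endpoint exposed by each application. The sketch below renders one counter in that text exposition format using only the Python standard library. The metric name `llm_requests_total` and its labels are hypothetical, and in practice an official Prometheus client library would normally handle this step.

```python
def render_counter(name: str, help_text: str, samples: list) -> str:
    """Render a counter in the Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs. Label values are
    written as-is; this sketch does not escape quotes, backslashes, or newlines.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        # Sort labels so the output is stable between scrapes.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric for illustration only.
text = render_counter(
    "llm_requests_total",
    "Total inference requests.",
    [({"model": "demo", "status": "200"}, 42)],
)
```

Serving `text` from a `/metrics` path would give Prometheus a target to scrape on its usual pull schedule.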