Product Discovery API Rate Limits – Klevu

What is a rate limit?

A rate limit defines the maximum number of API requests that can be made to our product discovery services within a specific time period. It protects our shared infrastructure from unexpected traffic spikes, abusive usage, and accidental request loops that could degrade performance for every customer hosted on it.

Why do we have rate limits?

We host multiple customers on shared infrastructure to deliver reliable, fast search performance globally. Occasionally, a site receives unusually high traffic from bots or crawlers, which can exhaust server capacity and affect other customers on the same hardware.

Rate limits help us:

Ensure stable, fair usage for all customers.
Protect service quality and response times.
Prevent misuse or accidental overloads.

What happens if a store hits a rate limit?

If a store's requests exceed the allowed limit, our API responds with an HTTP 429 (Too Many Requests) error. When this happens:

The excess requests will be rejected.
If a store uses a custom API implementation, it should handle HTTP 429 responses by backing off and retrying later. Please note that Klevu’s JavaScript library does not handle this automatically by default: if the rate limit is exceeded, no results are displayed on the frontend.
The limit resets automatically after a short period.
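
For custom API implementations, the back-off-and-retry behaviour described above could look roughly like the Python sketch below. It is an illustration under stated assumptions, not Klevu's client code: `call_with_backoff`, the `send` callable, and the `(status, headers, body)` response shape are hypothetical names chosen for this example, and the `Retry-After` header is honoured only if the response happens to include it (this article does not state that the API sends one).

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status_code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and retry on HTTP 429 using exponential backoff with jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        status, headers, _body = response
        if status != 429:
            return response  # success or a non-rate-limit error: do not retry
        if attempt == max_attempts - 1:
            break  # out of attempts: hand the 429 back to the caller
        # Assumption: Retry-After may be absent. Use it only when it is a
        # plain number of seconds; otherwise fall back to backoff.
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = min(max_delay, float(retry_after))
        else:
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            delay = random.uniform(0, ceiling)  # jitter spreads out retries
        sleep(delay)
    return response
```

The `send` callable wraps whatever HTTP client the store already uses, which keeps the retry logic independent of any particular library. If every attempt returns 429, the last response is returned so the caller can show a fallback instead of an empty result list.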

How are limits set?

Your rate limit is based on:

The capacity of the server hosting your index.
The type of operations and typical traffic patterns.

At this stage, limits are fixed and cannot be increased ad hoc. We carefully calculate thresholds to balance your needs against overall platform stability.

We do not expect rate limiting to affect most customers. The limits are designed to prevent excessive or abusive traffic that can overload shared infrastructure, such as repeated hits by bots or poorly configured crawlers. For normal, intended usage, your service should continue to run without interruption. Rate limiting helps us identify and block misuse so that everyone on our platform continues to benefit from reliable performance.

How can you avoid hitting rate limits?

Optimize requests: Only make necessary calls.
Use caching: Cache responses whenever possible so that repeated identical requests are not sent to the server. We also use edge caching to further reduce the number of requests that reach our servers.
Monitor usage: Track your request volumes to stay within safe thresholds.
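
As a rough illustration of the caching and monitoring tips above, the Python sketch below keeps responses in memory for a short time and counts how many requests actually reach the API. Everything in it is an assumption made for the example: `CachedFetcher`, the `fetch` callable, and the 60-second defaults are hypothetical and are not part of any Klevu library, and a suitable cache lifetime depends on how often your catalog changes.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, Tuple


class CachedFetcher:
    """Serve repeated identical requests from memory and count upstream calls."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 60.0,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch            # hypothetical function that calls the API
        self._ttl = ttl_seconds        # how long a cached response stays fresh
        self._window = window_seconds  # monitoring window for request counts
        self._clock = clock
        self._cache: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, value)
        self._calls: Deque[float] = deque()             # upstream call timestamps

    def get(self, key: str) -> str:
        """Return a fresh cached response, or fetch and cache a new one."""
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no API request is made
        value = self._fetch(key)
        self._cache[key] = (now + self._ttl, value)
        self._calls.append(now)
        return value

    def recent_request_count(self) -> int:
        """Number of upstream requests made within the monitoring window."""
        cutoff = self._clock() - self._window
        while self._calls and self._calls[0] <= cutoff:
            self._calls.popleft()
        return len(self._calls)
```

Calling `get` with the same key twice within the lifetime triggers only one upstream request, and `recent_request_count()` can be logged or alerted on to see how close an integration runs to its threshold.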

Need help?

If you have questions about your rate limits or need help optimizing your integration, please email our support team at support@klevu.com.