Commit 4fa6b10

add huggingface_hub smart rate limit handling section in Rate Limits doc (#2089)
* add python sdk smart rate limit handling section
* fix
* Update docs/hub/rate-limits.md

Co-authored-by: Julien Chaumond <[email protected]>

---------

Co-authored-by: Julien Chaumond <[email protected]>
1 parent b1bda18 commit 4fa6b10

1 file changed: +9 -1 lines changed

docs/hub/rate-limits.md

Lines changed: 9 additions & 1 deletion
@@ -8,7 +8,7 @@ We define different rate limits for distinct classes of requests. We distinguish
   - e.g. model or dataset search, repo creation, user management, etc. All endpoints that belong to this bucket are documented in [Hub API Endpoints](./api).
 - **Resolvers**
   - They're all the URLs that contain a `/resolve/` segment in their path, which serve user-generated content from the Hub. Concretely, those are the URLs that are constructed by open source libraries (transformers, datasets, vLLM, llama.cpp, …) or AI applications (LM Studio, Jan, ollama, …) to download model/dataset files from HF.
-  - Specifically, this is the ["Resolve a file" endpoint](https://lnkd.in/eesDKirG) documented in our OpenAPI spec.
+  - Specifically, this is the ["Resolve a file" endpoint](https://huggingface-openapi.hf.space/#tag/models/get/apiresolve-cachemodelsnamespacereporevpath) documented in our OpenAPI spec.
   - Resolve requests are heavily used by the community, and since we optimize our infrastructure to serve them with maximum efficiency, the rate limits for Resolvers are the highest.
 - **Pages**
   - All the Web pages we host on huggingface.co.
@@ -89,6 +89,14 @@ Despite passing `HF_TOKEN` if you are still rate limited, you can:
 - replace Hub API calls with Resolver calls, whenever possible (Resolver rate limits are much higher and much more optimized).
 - upgrade to PRO, Team, or Enterprise.
 
+## Smart rate limit handling with `huggingface_hub`
+
+The Hub Python Library [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/index) (version **1.2.0+**) includes smart retry handling for rate limit errors.
+
+When a 429 error occurs, the SDK automatically parses the `RateLimit` header to extract the exact number of seconds until the rate limit resets, then waits precisely that duration before retrying. This applies to file downloads (i.e. Resolvers) and paginated Hub API calls (list models, datasets, spaces, etc.).
+
+**We strongly recommend using `huggingface_hub` for all programmatic access to the Hub** to benefit from this optimized retry behavior and avoid implementing custom rate limit handling.
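
For readers who want to see what the retry behavior described in the added section amounts to, here is a minimal standard-library sketch. It is not the `huggingface_hub` implementation. It assumes the `RateLimit` header carries the seconds until reset as a `t=` parameter (for example `"api";r=0;t=55`), which the text above does not spell out. The names `seconds_until_reset`, `call_with_rate_limit_retry`, `send`, `max_retries` and `fallback_wait` are hypothetical.

```python
import re
import time
from typing import Callable, Mapping, Optional, Tuple

# ASSUMPTION: the header looks like  RateLimit: "api";r=0;t=55
# where t is the number of seconds until the current window resets.
_RESET_RE = re.compile(r"(?:^|;)\s*t=(\d+)")

Response = Tuple[int, Mapping[str, str]]  # (HTTP status code, response headers)


def seconds_until_reset(headers: Mapping[str, str]) -> Optional[int]:
    """Return the `t` value of the RateLimit header, or None if it is missing."""
    # Header names are case-insensitive, so compare in lowercase.
    value = next((v for k, v in headers.items() if k.lower() == "ratelimit"), None)
    if value is None:
        return None
    match = _RESET_RE.search(value)
    return int(match.group(1)) if match else None


def call_with_rate_limit_retry(
    send: Callable[[], Response],
    max_retries: int = 3,
    fallback_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, and on a 429 wait for the advertised reset before retrying."""
    status, headers = send()
    for _ in range(max_retries):
        if status != 429:
            break
        wait = seconds_until_reset(headers)
        # Fall back to a fixed pause when the header is absent or unparseable.
        sleep(fallback_wait if wait is None else wait)
        status, headers = send()
    return status, headers
```

Here `send` stands in for whatever performs the HTTP request and returns its status and headers. With `huggingface_hub` itself none of this should be necessary, since the section states that the library already retries file downloads and paginated API calls this way.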
+
 ## Granular user action Rate limits
 
 In addition to those main classes of rate limits, we enforce limits on certain specific kinds of user actions, like: