

SERVERLESS INFERENCE API

Test and evaluate, for free, over 150,000 publicly accessible machine learning
models (or your own private models) via simple HTTP requests, with fast
inference hosted on Hugging Face's shared infrastructure.
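
For example, a single inference call is one authenticated HTTP POST. The
following is a minimal sketch, assuming the standard
api-inference.huggingface.co/models/<model_id> endpoint pattern and a User
Access Token from your Hugging Face account settings:

    import requests

    API_URL = "https://api-inference.huggingface.co/models/gpt2"
    # Replace with a User Access Token (hf_...) from your account settings.
    headers = {"Authorization": "Bearer hf_xxxxxxxxxxxxxxxxxxxxxxxx"}

    def query(payload):
        # One POST per inference request; the body is task-specific JSON.
        response = requests.post(API_URL, headers=headers, json=payload)
        response.raise_for_status()
        return response.json()

    output = query({"inputs": "The answer to the universe is"})
    print(output)  # e.g. [{"generated_text": "The answer to the universe is ..."}]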

The Inference API is free to use and rate limited. If you need an inference
solution for production, check out our Inference Endpoints service. With
Inference Endpoints, you can easily deploy any machine learning model on
dedicated and fully managed infrastructure. Select the cloud, region, compute
instance, autoscaling range and security level to match your model, latency,
throughput, and compliance needs.


MAIN FEATURES:

 * Get predictions from 150,000+ Transformers, Diffusers, or Timm models (T5,
   Blenderbot, Bart, GPT-2, Pegasus...)
 * Use built-in integrations with over 20 open-source libraries (spaCy,
   SpeechBrain, Keras, etc.)
 * Switch from one model to the next by just switching the model ID (see the
   sketch after this list)
 * Upload, manage, and serve your own models privately
 * Run Classification, Image Segmentation, Automatic Speech Recognition, NER,
   Conversational, Summarization, Translation, Question Answering, and
   Embeddings Extraction tasks
 * Out-of-the-box accelerated inference on CPUs powered by Intel Xeon Ice Lake
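
To make the model-switching point concrete, here is a sketch of a small
helper that parameterizes the model ID; the two model IDs
(facebook/bart-large-cnn, deepset/roberta-base-squad2) are ordinary Hub
models chosen for illustration, and the payload shapes follow the task
formats described in the Detailed parameters page:

    import requests

    headers = {"Authorization": "Bearer hf_xxxxxxxxxxxxxxxxxxxxxxxx"}

    def query(model_id, payload):
        # The model ID is the only part of the URL that changes per model.
        url = f"https://api-inference.huggingface.co/models/{model_id}"
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()
        return response.json()

    # Summarization with one model...
    summary = query("facebook/bart-large-cnn",
                    {"inputs": "Long article text ..."})

    # ...then question answering with another, just by switching the model ID.
    answer = query(
        "deepset/roberta-base-squad2",
        {"inputs": {"question": "What is the capital of France?",
                    "context": "Paris is the capital of France."}},
    )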


THIRD-PARTY LIBRARY MODELS:

 * The Hub supports many third-party libraries, such as spaCy, Timm, Keras,
   fastai, and more; you can read the full list in the Hub documentation.

 * These models are enabled on the API thanks to a Docker integration,
   api-inference-community.

Please note, however, that these models will not allow you (see the tracking
issue):

 * To get full optimization
 * To run private models
 * To get access to GPU inference
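
Calling one of these third-party library models uses the same HTTP pattern as
the examples above; only the model ID changes. A sketch using a Sentence
Transformers model on the sentence-similarity task (the model ID and payload
shape are illustrative of the documented task format, keeping in mind the
CPU-only limitation noted above):

    import requests

    API_URL = ("https://api-inference.huggingface.co/models/"
               "sentence-transformers/all-MiniLM-L6-v2")
    headers = {"Authorization": "Bearer hf_xxxxxxxxxxxxxxxxxxxxxxxx"}

    # Sentence similarity: score a source sentence against candidates.
    payload = {
        "inputs": {
            "source_sentence": "That is a happy person",
            "sentences": [
                "That is a happy dog",
                "That is a very happy person",
                "Today is a sunny day",
            ],
        }
    }
    response = requests.post(API_URL, headers=headers, json=payload)
    print(response.json())  # e.g. [0.69, 0.94, 0.26] -- one score per sentence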


IF YOU ARE LOOKING FOR CUSTOM SUPPORT FROM THE HUGGING FACE TEAM



HUGGING FACE IS TRUSTED IN PRODUCTION BY OVER 10,000 COMPANIES



