Model
A model is a trained neural network (like BERT, GPT, etc.) that performs a task: classification, translation, generation, etc. On the Hub, a model is the architecture plus its learned weights.
Example: "bert-base-uncased" is a pre-trained BERT model.
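A minimal sketch of loading the model named above, assuming the `transformers` library is installed (weights are downloaded on first use, then cached):

```python
from transformers import AutoModel

# Download (or load from cache) the pre-trained bert-base-uncased weights.
model = AutoModel.from_pretrained("bert-base-uncased")

# bert-base has 12 transformer layers with 768-dimensional hidden states.
print(model.config.num_hidden_layers, model.config.hidden_size)  # 12 768
```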
Tokenizer
A tokenizer splits raw text into tokens (subword units) and maps them to numeric IDs that models can understand.
Example: "Hello" → [101, 7592, 102] for BERT.
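The example above can be reproduced with BERT's own tokenizer, assuming `transformers` is installed:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Encoding adds BERT's special tokens: [CLS] (ID 101) at the start, [SEP] (ID 102) at the end.
ids = tokenizer("Hello")["input_ids"]
print(ids)  # [101, 7592, 102]

# Decoding maps the IDs back to text (note the lowercasing from the "uncased" model).
print(tokenizer.decode(ids))  # [CLS] hello [SEP]
```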
Dataset
A dataset is a collection of labeled or unlabeled examples used to train or evaluate models.
Example: "imdb" for sentiment analysis, "squad" for Q&A.
Pipeline
A pipeline is a high-level interface that bundles everything (tokenizer + model + post-processing) for quick inference.
Example: pipeline("sentiment-analysis")("I love this!")
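The one-liner above, written out in full. With no model specified, the task name picks a default checkpoint (at the time of writing, a DistilBERT fine-tuned on SST-2 for sentiment analysis):

```python
from transformers import pipeline

# Build a ready-to-use sentiment classifier (tokenizer + model + post-processing).
classifier = pipeline("sentiment-analysis")

result = classifier("I love this!")
print(result)  # [{'label': 'POSITIVE', 'score': 0.99...}]
```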
Space
A Space is a public or private app hosted on Hugging Face (like a mini web app), often built using Gradio or Streamlit.
Example: A web demo where you paste text and get a sentiment prediction.
Token (API Key)
A token is your personal access key to use Hugging Face Hub programmatically (e.g., to upload models or access private ones).
Get it from https://huggingface.co/settings/tokens
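A sketch of using a token programmatically with `huggingface_hub`. The environment variable name `HF_TOKEN` is an assumption; reading public repos also works anonymously, but a token is required for private repos and uploads:

```python
import os
from huggingface_hub import HfApi

# Pass your access token (assumed to live in the HF_TOKEN environment variable);
# if it is unset, the client falls back to anonymous access for public repos.
api = HfApi(token=os.environ.get("HF_TOKEN"))

# Query metadata for a public model on the Hub.
info = api.model_info("bert-base-uncased")
print(info.id)  # bert-base-uncased
```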