Hugging Face Transformers

Hugging Face Transformers is one of the most widely used open-source NLP/AI libraries. It provides thousands of pre-trained models covering text, image, audio, and multimodal tasks. Its core value is that it wraps complex model loading, inference, and training workflows in a few lines of code.

### Supported Task Types

* * *

## Core Principles of Transformer Architecture

Before using the library, it helps to understand the underlying architecture, because it explains why the parameters are set the way they are.

### Overall Architecture: Encoder-Decoder

![Encoder-decoder architecture](https://example.com/wp-content/uploads/2026/05/tutorial-cf08ae6f-a6fc-4a1b-8977-364df.png)

### Three Major Model Families

* * *

## Installation and Environment Setup

### Installation

    # Basic installation
    pip install transformers

    # Full installation (includes training dependencies)
    pip install "transformers[torch]"    # PyTorch backend (recommended)
    pip install "transformers[tf-cpu]"   # TensorFlow backend
    pip install "transformers[flax]"     # JAX/Flax backend

    # Common companion libraries
    pip install datasets      # Hugging Face datasets library
    pip install evaluate      # Model evaluation metrics
    pip install accelerate    # Multi-GPU/mixed-precision training
    pip install peft          # Parameter-efficient fine-tuning (LoRA, etc.)
    pip install tokenizers    # High-performance tokenizer
    pip install sentencepiece # Required for some models (T5/LLaMA)

    # Verify installation
    python -c "import transformers; print(transformers.__version__)"

### Environment Variable Configuration

    # Set the model cache directory (downloads are cached here; default: ~/.cache/huggingface)
    export HF_HOME=/data/huggingface_cache

    # For users in mainland China: use a mirror site to speed up downloads (hf-mirror.com recommended)
    export HF_ENDPOINT=https://hf-mirror.com

    # Offline mode (use only cached models when the network is unavailable)
    export TRANSFORMERS_OFFLINE=1

    # Disable download progress bars (CI/CD environments)
    export HF_HUB_DISABLE_PROGRESS_BARS=1

<!-- -->

    # Can also be set in code; do this before importing transformers
    import os
    os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

    # View the current cache directory
    from transformers.utils import TRANSFORMERS_CACHE
    print(TRANSFORMERS_CACHE)

* * *

## Pipeline: Run AI in Five Lines of Code

Pipeline is the highest-level abstraction in Transformers. It wraps model loading, preprocessing, inference, and post-processing, so inference takes only three to five lines of code.

### Pipeline Quick Examples

    from transformers import pipeline

    # 1. Sentiment analysis (text classification)
    classifier = pipeline("sentiment-analysis")
    result = classifier("I love using Hugging Face Transformers!")
    # -> [{'label': 'POSITIVE', 'score': 0.9998}]

    # 2. Text generation (do_sample=True is needed for temperature to take effect)
    generator = pipeline("text-generation", model="gpt2")
    result = generator("Once upon a time in a land far away,",
                       max_new_tokens=50, num_return_sequences=1,
                       do_sample=True, temperature=0.8)

    # 3. Fill in the blank (masked language model)
    unmasker = pipeline("fill-mask", model="bert-base-uncased")
    result = unmasker("The capital of France is [MASK].")
    # -> [{'token_str': 'paris', 'score': 0.9823}, ...]

    # 4. Named Entity Recognition (NER)
    ner = pipeline("ner", aggregation_strategy="simple")
    result = ner("My name is John and I work at Google in New York.")
    # -> [{'entity_group': 'PER', 'word': 'John', 'score': 0.998}, ...]

    # 5. Extractive QA
    qa = pipeline("question-answering")
    result = qa(question="Who invented Python?",
                context="Python was created by Guido van Rossum in 1991.")
    # -> {'answer': 'Guido van Rossum', 'score': 0.9887}

    # 6. Text summarization
    article = "Replace this placeholder with the long text you want to summarize."
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    result = summarizer(article, max_length=60, min_length=20)

    # 7. Machine translation (English -> Chinese)
    translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-zh")
    result = translator("Hello, how are you today?")
    # -> [{'translation_text': '...'}]  (the Chinese translation of the input)

    # 8. Zero-shot classification (no task-specific training required)
    zero_shot = pipeline("zero-shot-classification")
    result = zero_shot("I love playing football",
                       candidate_labels=["sports", "politics", "technology"])
    # -> {'labels': ['sports', ...], 'scores': [0.972, ...]}

### Advanced Pipeline Configuration

    import torch
    from transformers import pipeline

    # Specify the GPU (device index 0)
    pipe = pipeline("text-generation", model="gpt2", device=0)

    # Specify precision (saves VRAM); device_map="auto" requires accelerate
    pipe = pipeline("text-generation", model="meta-llama/Llama-2-7b-hf",
                    torch_dtype=torch.float16, device_map="auto")

    # Batch processing (improves throughput)
    large_text_list = ["I love this!", "This is awful."] * 16  # any list of strings
    pipe = pipeline("sentiment-analysis", batch_size=32)
    results = pipe(large_text_list)  # Automatic batched inference

    # Long audio chunking
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v2",
                   chunk_length_s=30, stride_length_s=5)
    result = asr("long_audio.wav", return_timestamps=True)

* * *

## Deep Dive into the Tokenizer

Tokenization is the first step in an NLP pipeline: it converts raw text into a sequence of numbers (token IDs) that the model can process.
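
The text-to-numbers step described above can be sketched with a toy example. This is only an illustration under stated assumptions: the vocabulary `TOY_VOCAB` and the helpers `toy_encode` and `toy_decode` are hypothetical and split on whitespace, whereas the library's tokenizers use subword algorithms such as WordPiece or BPE.

```python
# Toy illustration only: a hypothetical whitespace tokenizer, not the library's algorithm.
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "hello": 4, "world": 5, "transformers": 6}
ID_TO_TOKEN = {idx: tok for tok, idx in TOY_VOCAB.items()}


def toy_encode(text, max_length=8):
    """Lowercase, split on whitespace, add special tokens, map to IDs, then pad."""
    words = text.lower().split()[: max_length - 2]  # leave room for [CLS] and [SEP]
    tokens = ["[CLS]"] + words + ["[SEP]"]
    input_ids = [TOY_VOCAB.get(tok, TOY_VOCAB["[UNK]"]) for tok in tokens]
    attention_mask = [1] * len(input_ids)           # 1 = real token
    pad = max_length - len(input_ids)
    return {
        "input_ids": input_ids + [TOY_VOCAB["[PAD]"]] * pad,
        "attention_mask": attention_mask + [0] * pad,  # 0 = padding
    }


def toy_decode(input_ids):
    """Map IDs back to tokens and drop the special tokens."""
    special = {"[PAD]", "[CLS]", "[SEP]"}
    tokens = [ID_TO_TOKEN[i] for i in input_ids]
    return " ".join(tok for tok in tokens if tok not in special)


encoded = toy_encode("Hello transformers fans")
print(encoded["input_ids"])              # -> [2, 4, 6, 1, 3, 0, 0, 0]
print(encoded["attention_mask"])         # -> [1, 1, 1, 1, 1, 0, 0, 0]
print(toy_decode(encoded["input_ids"]))  # -> hello transformers [UNK]
```

The word "fans" is not in the toy vocabulary, so it maps to `[UNK]`. Subword tokenizers largely avoid this by splitting unknown words into smaller known pieces.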
### Complete Tokenization Workflow

### Core Tokenizer Usage

    from transformers import AutoTokenizer

    # Load the tokenizer
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    # One-step encoding
    encoding = tokenizer(
        "Hello, I'm learning Transformers!",
        return_tensors="pt",  # Return PyTorch tensors
        padding=True,         # Pad to the longest sequence
        truncation=True,      # Truncate when exceeding max_length
        max_length=128,       # Maximum length
    )
    print(encoding.keys())
    # -> dict_keys(['input_ids', 'token_type_ids', 'attention_mask'])
    print(encoding["input_ids"][0][:8])
    # -> tensor([101, 7592, 1010, 1045, 1005, 1049, 4083, 19081])
    print(encoding["attention_mask"][0][:8])
    # -> tensor([1, 1, 1, 1, 1, 1, 1, 1])  # 1=real token, 0=padding

    # Decode (IDs -> text)
    decoded = tokenizer.decode(encoding["input_ids"][0], skip_special_tokens=True)
    print(decoded)  # -> "hello, i'm learning transformers!"

    # Batch encoding (automatic padding alignment)
    texts = ["Short.", "This is a much longer sentence for testing."]
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    print(batch["input_ids"].shape)  # -> torch.Size([2, 11])

    # Vocabulary information
    print(f"Vocabulary size: {tokenizer.vocab_size}")   # -> 30522
    print(f"[CLS] ID: {tokenizer.cls_token_id}")        # -> 101
    print(f"[SEP] ID: {tokenizer.sep_token_id}")        # -> 102
    print(f"Max length: {tokenizer.model_max_length}")  # -> 512

### Common Tokenizer Type Comparison

* * *

## Model Loading and Inference

### AutoClass: Automatically Select the Correct Model Class

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    model_name = "bert-base-uncased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # The classification head is newly initialized here, so predictions are
    # not meaningful until the model is fine-tuned.
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=2,
        torch_dtype=torch.float16,  # assumes a GPU; use torch.float32 on CPU
        device_map="auto",          # requires accelerate
    )

    # Complete manual inference workflow
    text = "Transformers is an amazing library!"

    # 1. Encode
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}

    # 2. Forward pass (no gradients needed for inference)
    with torch.no_grad():
        outputs = model(**inputs)

    # 3. Parse output
    logits = outputs.logits  # shape: [1, 2]
    probs = torch.softmax(logits, dim=-1)
    pred = torch.argmax(probs, dim=-1).item()
    id2label = model.config.id2label  # {0: 'LABEL_0
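
The output-parsing step (logits to probabilities to predicted class) can be checked by hand. The following is a minimal sketch in plain Python, assuming made-up logits `[1.0, 3.0]` and a hypothetical `softmax` helper. It mirrors what `torch.softmax` and `torch.argmax` compute for a single example but is not the library's code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)                      # subtract the max to avoid overflow
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [1.0, 3.0]                          # made-up scores for a 2-class model
probs = softmax(logits)
pred = max(range(len(probs)), key=probs.__getitem__)  # argmax: index of the largest probability

print([round(p, 4) for p in probs])          # -> [0.1192, 0.8808]
print(pred)                                  # -> 1
```

Because softmax preserves the ordering of the logits, taking the argmax of the logits directly gives the same predicted class. The probabilities matter only when a confidence score is needed.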