This book takes you step-by-step through creating your own AI models that can generate images from text. You’ll explore two methods of image generation—vision transformers and diffusion models—and learn vital AI development techniques as you go.

Dive into the powerful models behind AI image generators. The best way to learn is to build something from scratch, and in this book you’ll build your very own diffusion model and vision transformer. As you work through each stage of development, you’ll develop an understanding of how these models can be customized, applied, and integrated for impressive multimodal AI.

Build a Text-to-Image Generator (from Scratch) teaches you how to:

• Build and train models to generate high-resolution images based on text descriptions
• Edit an existing image based on text prompts
• Build and train a model to add captions to images
• Build and train a vision transformer to classify images
• Fine-tune LLMs for downstream tasks such as classification and text or image generation
• Better differentiate real images from deepfakes

About the technology

AI-generated images appear everywhere from high-end advertising to casual social media feeds. Text-to-image tools like DALL-E, Midjourney, and Flux make it easy to create AI art, but how do they work? In this book, you’ll find out by building your own text-to-image generator!

About the book

Build a Text-to-Image Generator (from Scratch) explores both transformer-based image generation and diffusion models. You’ll work hands-on to build a pair of simple generation models that can classify images, automatically add captions, reconstruct images, and enhance existing graphics. Author Mark Liu guides you every step of the way with clear explanations, informative diagrams, and eye-opening examples you can build on your own laptop.

What's inside

• Build a vision transformer to classify images
• Edit images using text prompts
• Fine-tune image models

About the reader

Readers need basic knowledge of generative AI models and intermediate Python skills.

About the author

Mark Liu is the founding director of the Master of Science in Finance program at the University of Kentucky. He is also the author of Learn Generative AI with PyTorch.

Table of Contents

Part 1
1 A tale of two models: Transformers and diffusions
2 Build a transformer
3 Classify images with a vision transformer
4 Add captions to images
Part 2
5 Generate images with diffusion models
6 Control what images to generate in diffusion models
7 Generate high-resolution images with diffusion models
Part 3
8 CLIP: A model to measure the similarity between image and text
9 Text-to-image generation with latent diffusion
10 A deep dive into Stable Diffusion
Part 4
11 VQGAN: Convert images into sequences of integers
12 A minimal implementation of DALL-E
Part 5
13 New developments and challenges in text-to-image generation
Appendix A Installing PyTorch and enabling GPU training locally and in Colab

Specifications

Author: Mark Liu

Number of pages: 360

Language: English

Product code (EAN): 9781638357803

Publication date: 12/01/2026

Edition: E-book

Protected with: Adobe DRM

File format: ePub
