Day 11 – Introduction to AI Image Generation with ChatGPT & Gemini Lesson

Create images from text with the two most popular, accessible tools today. You will compare ChatGPT’s built-in image generator and Gemini’s Nano Banana model, then apply a simple prompt formula. Watch the video for the full walkthrough and live examples.

What you'll learn

  • Generate images: Use plain-text prompts to create images in both ChatGPT and Gemini.

  • Choose models wisely: Select the image tool in each platform and understand free vs paid limits and quality.

  • Build better prompts: Apply a reusable formula that includes image type, subject, key features, setting, color and mood.

  • Control style and focus: Add artist references, color palettes, and emphasis so the right elements stand out.

  • Edit with text: Make follow-up changes to images, including adding or removing text, without starting over.

  • Decide on outputs: Know when to use aspect ratios, how downloads work, and what Gemini’s watermark means.

Lesson Overview

Text models like ChatGPT and Google Gemini write words. Diffusion models create images and video by learning from millions of images. Today you will work with both worlds by generating images directly from text inside ChatGPT and inside Gemini using its Nano Banana image model. Both tools are free to try. Paid tiers raise your usage limits and, in Gemini, unlock the stronger Nano Banana Pro model, which handles settings like aspect ratio more reliably.

This lesson matters because output quality depends far more on your prompt than on the button you click. You will see how simple prompts produce generic results, while descriptive prompts with a clear style and focus get you much closer to what you want. You will also learn how to edit an image with a follow-up text instruction, which is fast and practical for real projects.

The lesson fits neatly after your earlier experience with ChatGPT and Gemini for text. Now you will use them purely for images, including realistic photos, illustrations, logos, and even graphics with accurate text such as YouTube thumbnails. If you need realistic results, Gemini often performs well, though it adds a small watermark. By the end, you will be ready to move on to a more controllable image tool.

Who This Is For

If you need quality visuals quickly and want control through clear instructions, this lesson is for you. It is especially useful if you:

  • Create social or marketing visuals from scratch
  • Need quick logos, icons, or branded graphics for drafts or concepts
  • Produce thumbnails or banners with specific text
  • Build mood boards or style references for stakeholders
  • Want realistic photos or stylized art without professional design software
  • Teach or present with custom visuals tailored to a lesson or talk

Where This Fits in a Workflow

Use what you learn here early in a project to explore ideas, test styles, and produce first drafts. It is great for concepting a logo, producing a quick hero image, or building a thumbnail that must include precise text. You can generate a few options, refine the strongest result with short follow-up prompts, then export what you need.

For example, a content creator can draft a YouTube thumbnail with the exact title text, then ask the model to remove a year or swap logos. A small business can create a clean product image with a consistent background and lighting, then add variations by changing color tones or mood. The prompting formula keeps outputs consistent session to session, which helps teams agree on a clear look.

Technical & Workflow Benefits

Old way: staring at a blank canvas, searching stock sites, or passing multiple briefs back and forth to get the right look. New way: describe the image, specify the type, style, and focus, then iterate with short text updates. You can produce many variations quickly and keep the best version moving with small edits.

Two improvements stand out:

  • Speed and iteration: You can generate four or more options in seconds and nudge them toward your goal with follow-up prompts instead of rebuilding from scratch.
  • Control and clarity: The prompt formula pushes you to define subject, key features, setting, color, mood, optional style, and emphasis. This reduces vague prompts and inconsistent results.
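
One way to keep the formula consistent between sessions is to fill it in as named fields and join them into a single prompt. The sketch below is a hypothetical helper, not part of ChatGPT or Gemini: the function name `build_prompt`, its parameters, and the wording "style" and "emphasis on" are assumptions chosen for illustration. The result is plain text that you paste into either tool.

```python
def build_prompt(image_type, subject, key_features, setting, color, mood,
                 style="", emphasis=""):
    """Join the formula's fields into one comma-separated prompt.

    Hypothetical helper for illustration only. The field order follows the
    lesson's formula; style and emphasis are optional and skipped when empty.
    """
    parts = [image_type, subject, key_features, setting, color, mood]
    if style.strip():
        parts.append(f"{style.strip()} style")
    if emphasis.strip():
        parts.append(f"emphasis on {emphasis.strip()}")
    # Drop any empty fields so the prompt has no stray commas.
    return ", ".join(part.strip() for part in parts if part.strip())


prompt = build_prompt(
    image_type="Photo",
    subject="a clean studio desk with a laptop",
    key_features="AI art on the laptop screen",
    setting="minimal studio",
    color="black and white with a red accent",
    mood="dramatic lighting",
    style="minimal modern",
    emphasis="readable text",
)
print(prompt)
```

To create a variation, change one field at a time, for example only `color` or only `mood`, and leave the rest as they are, so the results stay comparable.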

Gemini often produces highly realistic images and supports accurate in-image text, though it adds a small watermark. With a paid plan, its Nano Banana Pro model responds better to aspect ratios such as 16:9. ChatGPT also creates and edits images from prompts and is easy to use directly inside your conversation thread.

Practice Exercise

Scenario: Create a YouTube thumbnail concept for a video about AI image tools.

  • Step 1: Draft two prompts, one for each platform.
    • ChatGPT prompt: Photo, a clean studio desk with a laptop showing AI art on screen, include large text “Best Tools for AI Images”, minimal modern style, high contrast, black and white with a red accent, dramatic lighting, emphasis on readable text.
    • Gemini prompt: Illustration, a bold flat-design collage of AI icons around a camera, include the exact text “Best Tools for AI Images”, vibrant colors with a cool-blue palette, emphasis on the title text, 16:9 aspect ratio if your plan supports it.
  • Step 2: Iterate.
    • Ask for one round of changes by text. For example: remove any year that appears in the image, replace outdated model names with Midjourney, ChatGPT, Gemini, and Flux, and keep the layout the same.
  • Step 3: Export and compare.
    • Download both final images. If Gemini adds a watermark, note where it appears.

Reflection: Which tool produced more accurate text and the style you wanted? Did specifying image type, emphasis, and color improve clarity? If aspect ratio was required, did your result match?
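
To answer the aspect ratio question with numbers rather than by eye, you can compare the pixel dimensions of each downloaded file against the ratio you asked for. The sketch below is a minimal check under stated assumptions: the function names, the 1% tolerance, and the example dimensions are all illustrative and are not the actual output sizes of either tool. Read the real width and height from your file's properties.

```python
from fractions import Fraction


def matches_aspect_ratio(width, height, target="16:9", tolerance=0.01):
    """Return True if width:height is within `tolerance` of the target ratio.

    `tolerance` is a relative difference (0.01 means 1%), an assumed
    threshold that allows for small rounding in pixel dimensions.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target_w, target_h = (int(value) for value in target.split(":"))
    target_ratio = target_w / target_h
    actual_ratio = width / height
    return abs(actual_ratio - target_ratio) / target_ratio <= tolerance


def simplest_ratio(width, height):
    """Reduce pixel dimensions to their simplest whole-number ratio."""
    reduced = Fraction(width, height)
    return f"{reduced.numerator}:{reduced.denominator}"


# Example dimensions only; substitute the values from your own downloads.
print(simplest_ratio(1920, 1080), matches_aspect_ratio(1920, 1080))
print(simplest_ratio(1024, 1024), matches_aspect_ratio(1024, 1024))
```

A square result for a 16:9 request suggests the model ignored the ratio. In that case, try restating it in a follow-up prompt.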

Course Context Recap

This lesson comes after your first look at ChatGPT and Gemini for text generation. Today you focused on image creation, prompt structure, and quick edits by text inside both tools. Next, you will see a third image model that offers more control over settings and highly realistic results. Continue through the course to compare workflows and decide which model best fits your projects. The full boot camp expands these skills with deeper examples and practical use cases.