I spent a week with GPT Image 2 — here's what surprised me

By Mira Chen · 7 min read

I picked up GPT Image 2 last Monday because I had a backlog. Two YouTube thumbnails, one paid social set for a client, and a book cover I had been avoiding for about three weeks. I wanted to see if I could actually move through that list with this thing, not just generate pretty pictures for a Twitter screenshot.

So this is not a benchmark post. This is a "did it survive a real week" post.

The first day was annoying

I tried to make the YouTube thumbnail first. The video was about a new espresso machine I had been testing. I typed something like "a moody, cinematic shot of a chrome espresso machine on a dark wooden bar, steam rising, dramatic lighting" and got back something that looked like stock photography from 2014. Not bad. Just generic.

I almost gave up. Then I remembered something a friend told me about prompting these new models — be specific about the camera, not just the scene. I added "shot on a 35mm lens, shallow depth of field, the back of the machine slightly out of focus" and the second attempt was usable. Not finished, but usable.

The lesson, I guess, is that GPT Image 2 rewards specificity in a way I wasn't used to with older models. Vague prompts give vague results. With Midjourney I could throw "moody cinematic" at the wall and it would do something interesting. Here I had to actually direct it.
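Since then I've started keeping my prompt fragments in a scratch script so I can see exactly what changed between attempts. A minimal sketch of that habit, with helper names that are entirely my own (nothing to do with any official GPT Image 2 API):

```python
# Illustrative only: the "be specific about the camera" habit as code.
# build_prompt is a made-up helper, not part of any image API.

def build_prompt(scene, camera_notes=None):
    """Join a scene description with explicit camera direction."""
    parts = [scene]
    if camera_notes:
        parts.extend(camera_notes)
    return ", ".join(parts)

scene = "a moody, cinematic shot of a chrome espresso machine on a dark wooden bar"

# First attempt: scene only. This is the one that came back looking generic.
vague = build_prompt(scene)

# Second attempt: same scene, plus the camera direction that fixed it.
specific = build_prompt(
    scene,
    camera_notes=[
        "shot on a 35mm lens",
        "shallow depth of field",
        "the back of the machine slightly out of focus",
    ],
)
print(specific)
```

The point isn't the code, it's the diff: the scene stayed identical between attempts, and only the camera direction changed, which is how I convinced myself the specificity was what mattered.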

Where it actually surprised me: text

The thing I had been dreading was the book cover. The author wanted the title rendered in a specific way — handwritten, but legible, "like a postcard from someone you don't know very well yet." Try prompting that.

I expected the usual mess of letterforms that almost spell the title. What I got, on the third try, was a cover where the title actually said the right words. In a script that was close to what the author wanted. I sent it to her and she replied with the wide-eyes emoji and "WAIT we can use this?"

So we used it. We tweaked it in Affinity afterwards — she wanted the kerning a little tighter — but the heavy lifting was done by the model. I have not been able to say that about any other image model when text is involved.

Where I had to fight it

The client work was harder. The brand has a very specific visual identity — a particular muted teal, a particular type of grain texture, a particular way they shoot product. I tried to describe all of this in a prompt and the results were all over the place. Two looked like Apple ads. One looked like a Kinfolk page. One looked like nothing in particular.

Reference images saved me. I dropped in three of their previous posts and asked the model to "match the color palette and grain of these references, but build a new composition for a winter campaign." That worked. Not on the first try. Maybe the fourth. But it worked, and the final asset went out the door.

If you do client work and you care about brand consistency: do not skip the reference image feature. It is the difference between this being a toy and being a tool.
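For what it's worth, here is roughly how I think about a reference-image request before sending it: one prompt, several brand references riding along. Everything below (the function, the field names, the model id) is a hypothetical sketch for illustration — check the actual GPT Image 2 documentation for the real request shape:

```python
# Hypothetical sketch only. The payload fields and model id are
# assumptions, not the real GPT Image 2 API.

def build_edit_request(prompt, reference_paths, model="gpt-image-2"):
    """Assemble a request: one text prompt plus N reference images.

    In a real API the references would be uploaded as file attachments;
    here we just record their paths to show the structure.
    """
    return {
        "model": model,
        "prompt": prompt,
        "images": list(reference_paths),
    }

req = build_edit_request(
    "match the color palette and grain of these references, "
    "but build a new composition for a winter campaign",
    ["brand_post_1.png", "brand_post_2.png", "brand_post_3.png"],
)
```

Three references was the sweet spot for me — enough for the model to triangulate the palette and grain, not so many that the composition started averaging out.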

The credit thing

I was on the Pro plan for the week. I burned through about 800 credits. That sounds like a lot but most of it was iteration — I was doing maybe 6 to 8 attempts per final asset, which is honestly fewer than I would do in Photoshop just exploring directions. The math worked out for me. If you only need one image a month, the pay-as-you-go pack is the obvious choice.

Would I keep using it?

Yes, but I am not throwing out my other tools. GPT Image 2 is now my first stop for anything with text on it, anything where I need to riff on a reference, and anything that needs to look "designed" rather than "rendered." For raw photographic realism I still reach for other models. For illustration with a specific artist's hand I still draw it myself, or hire someone who can.

This is a tool. A very capable one, with rough edges in places I didn't expect and surprising polish in places I did. Worth a week of your time if you are doing anything visual for work.

Mira is a freelance brand designer based in Lisbon. She writes about the tools she uses, mostly so she remembers what worked.

#review #workflow #gpt-image-2