This study examined three AI image synthesis models—DALL-E 2, Stable Diffusion, and Midjourney—for generating urban design imagery from scene descriptions. A total of 240 images were evaluated by two independent evaluators using a modified Sensibleness and Specificity Average (SSA) metric. Results revealed significant differences among the three models, with scores varying across urban scene types and indicating challenges in representing certain design elements. While common features such as skyscrapers and lawns were accurately depicted, distinctive elements such as sculptures and transit stops appeared less frequently. AI-generated urban designs offer potential for rapid ideation and visual brainstorming during early design exploration. Future research should expand the range of styles and incorporate more diverse evaluative metrics to better adapt AI models to nuanced urban design applications, benefiting architects and urban planners.
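As a point of reference, the sketch below illustrates how an SSA-style score might be computed from per-image ratings. The exact modification used in this study is not described in the abstract, so the binary rating scheme, the averaging across two evaluators, and the function name `ssa_score` are assumptions for illustration only.

```python
# Hypothetical sketch of an SSA-style score (assumed binary ratings,
# averaged per image and across two independent evaluators).
from statistics import mean

def ssa_score(ratings):
    """ratings: list of (sensibleness, specificity) pairs in {0, 1},
    one pair per evaluator for a single generated image."""
    per_evaluator = [(sens + spec) / 2 for sens, spec in ratings]
    return mean(per_evaluator)  # average across evaluators

# Example: two evaluators rating one AI-generated urban scene
image_ratings = [(1, 0), (1, 1)]   # evaluator A, evaluator B
print(ssa_score(image_ratings))    # 0.75
```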