CLOSE AD ×

Translating from ChatGPT to Midjourney could be a productive part of early design thinking

AI All the Way

Translating from ChatGPT to Midjourney could be a productive part of early design thinking

A mixed-use development generated by DALL-E 2. (Courtesy Hickok Cole)

By now, most of us have been exposed to the rapid rise of artificial intelligence. Since the November 2022 launch of ChatGPT from U.S.-based artificial intelligence research company OpenAI, the subject has inundated mainstream media with opinions from everyday users, designers, and even “congressmen who code.” For those unfamiliar, the titular GPT is an acronym for “Generative Pre-Trained Transformer,” which leverages deep learning to provide real-time, human-like responses in text-based conversations; essentially, a sentient form of instant messaging.

The extent of ChatGPT capabilities have been astonishing, to say the least, as echoed by thousands across the internet. For architecture, the profession has been thrust unwillingly into a crossroads debating the optimal use of such software in the real-world capacity of designing and constructing buildings. Thus, advanced AI saturates the air with familiar feelings of existential dread, confusion, and excitement as it stands on the precipice of becoming either the next great leap forward or the final nail in the coffin for the ever-deliberate “hand of the architect.”

With the future of my livelihood in the balance, I sought to face ChatGPT head-on and took the opportunity to test its abilities firsthand through a research grant awarded to me by my firm Hickok Cole. The results are more nuanced than initially expected.

What began as exploring a “fad” evolved into an unexpected display of competence and creativity. My takeaway was not so much a frightful panic that automation might render my role obsolete, nor had I found a free and convenient means of cutting corners; rather, I signed off with a creative burst of new ideas and a fresh perspective. ChatGPT didn’t tell me what was right or wrong, nor incept its own miraculous notions of design from scratch. It certainly cannot build a building or present inspiring designs—yet. ChatGPT simply scoured the internet for references, research, and proofs-of-concept to point me in various directions, compounding on previous discourse to build a layered process of iterative design suggestions.

AI generation of a mixed use building
Prompt 1 (Keyword: Colorful, produced more diagrammatic looking images w/ colors similar to MVRDV), produced in Midjourney. The full text of the prompt is provided at the end of the article. (Courtesy Hickok Cole)

The initial goal of (what became) our two-hour conversation was straightforward: Design a building. The opening dialogue was vague and arbitrary. “What do you know about vertical mixed-use buildings in an urban setting?” “Do you have any favorite architects or design styles?”  “What technical challenges need to be addressed in a complex building structure?”  The questions were meant to gauge the AI’s deep-learning capabilities and reveal any hidden biases embedded in its code. Because they were open-ended questions, I got open-ended answers.

Up to this point, I’d categorized ChatGPT as a research tool; it’s fast, provides real-time information, and, for the most part, refrains from subjective thought. But architecture encompasses a wide breadth of specialized operations. What begins with research leads to design thinking and the creative formulation (and construction) of space; to materiality, color, and atmospheric sensitivities; to detailed project documentation and systems organization; to responsible business practices, scheduling, and pricing. Architects rely on layers of information and inherent knowledge to bring a concept to life. The same could be said of AI.

AI generation of a mixed use building
Prompt 2 (Removed Keyword: Colorful, produced more “realistic” material images stripped of odd color overlays) (includes other minor modifications to the input description) produced in Midjourney. The full text of the prompt is provided at the end of the article. (Courtesy Hickok Cole)

It wasn’t until I began introducing constraints that the AI began producing astounding textual responses. It listed potential locations and floors when asked to demonstrate the ideal distribution of separate programs in a mixed-use building. After some discussion about program adjacencies—which, after all, was the basis of my research here at Hickok Cole—I added another constraint, limiting the building to 24 stories, and asked the AI to resolve its design accordingly. Then I introduced area restrictions for specific spaces and the software continued its revisions. This iterative back-and-forth proceeded for over an hour as ChatGPT refined what it colloquially referred to as “our design” based on a growing list of parameters—which included privacy, acoustics, and accessibility—while providing explanations for each modification. Its simple, direct responses eventually evolved into proactive ideas for architectural features, program breakdowns, structural considerations, and even code restrictions. When I broached the subject of stakeholders, suggesting the building be a joint venture of various developers owning respective zones, ChatGPT drastically revised its written response, relocating programs on a vertical and horizontal level to address the recently introduced developers and their interests, mindful of where spatial overlaps and proximity of entries occurred. The response suggests AI is not only capable of detecting that the introduction of stakeholders implies a shift in priorities but that it also understands there is a difference between speculative design and the real world application of design.

AI generation of a mixed use building
A mixed-use development generated by DALL-E 2. (Courtesy Hickok Cole)

By conversing with ChatGPT as I might a colleague in a conference room—postulating ideas and sketching out concepts—it achieved, upon conclusion, an admirable summary of a vertical mixed-use building approaching its schematic design phase.

I took the AI a step further by copying ChatGPT’s summary into other AI-based visual softwares including DALL-E and Midjourney. Both utilize the same GPT-3 code developed by OpenAI. Unlike text-based ChatGPT software, Midjourney converts text input into images through a deep-learning neural net of reference images sourced from the web. In minutes, Midjourney produced rendered depictions of the projects ChatGPT had described.

Perhaps it’s when the design thinking and visualization components of an architect’s job are on the cusp of automation, leaving only the dreary grind of project documentation, that the existentialism creeps into an architect’s psyche. Even among my colleagues, a combination of intrigue, excitement, and skepticism surfaced. “Well, I’d like to see AI solve solutions as complicated as this!” an esteemed principal exclaimed in jest during a particularly convoluted presentation.

While some fear AI might make architects’ jobs obsolete, in its current state AI is not equipped to navigate the complexities of design. Throughout our conversation, ChatGPT regularly yielded responses that were simply wrong. It made suggestions that were at odds with existing conventions, tripped up on its recollection of established constraints, and even went against its previous statements. In one instance, it defended a misguided decision, failing to read between the lines when I suggested potential inconsistencies in its logic and pivoting when the error was addressed directly.

A mixed-use development generated by DALL-E 2. (Courtesy Hickok Cole)
A mixed-use development generated by DALL-E 2. (Courtesy Hickok Cole)

There’s certainly a learning curve in likeness to the advent and proliferation of computing technology not all that long ago. Today’s takeaway though, is that despite its abilities, AI severely lacks intuition. Shortcomings aside, the moral and ethical implications are valid and warrant further investigation.

Still, the ChatGPT-to-visualization chain of AI produced a design. I walked away from the conversation excited about the possibilities ChatGPT offers architecture: A means of record-keeping, information sourcing, and design ideation that could easily integrate into architecture processes. I’d like to imagine AI incorporated into the day-to-day of an architect as a tool—particularly alongside its visual AI counterpart—similar to BIM or CAD: an improvement on our tasks in project management, construction documentation, and yes, even design thinking. Rather than manually designing by hand, the conversational tone and forum in which we interact with a chatbot puts architects in a narrative role, supervising development and riffing off the open-sourced web to formulate and advance unforeseen imaginings.

The expertise, creative intuition, and social engagement we as designers bring to the table are qualities that will continue to keep the profession afloat—at least for now.

John W. Lynch is a project architect at Hickok Cole, where he specializes in multifamily projects with a background in master planning, landscape design, project development, and consulting.

Prompt 1 (Keyword: Colorful, produced more diagrammatic looking images w/ colors similar to MVRDV): A high-resolution, colorful rendering of a Vertical Mixed-Use building in an urban setting using the following information: The building should be 24 stories high with retail, office, residential, hotel, and library programs housed within. The wellness center and pool have been placed near the top of the building, with the green roof above, to create a visual connection between the indoor and outdoor spaces. The building’s look should involve a majority of the façade as floor-to-ceiling glass windows to allow natural light into the building and offer views of the surrounding cityscape. The base of the building, up to the third floor, would be clad in a darker-colored stone or masonry material to provide a solid, grounded feel to the building. The upper floors would be covered in a perforated metal panel system that would create a unique texture and allow for some degree of transparency while also providing sun shading for the spaces behind. This combination of materials would allow for a modern and sleek aesthetic while also providing some variety in texture and depth to the building’s overall appearance. The building’s form is relatively simple and efficient, extruding straight up with some subtle gestures or articulations added to the building’s form to break up the massing and add visual interest. For example, the corners of the building could be chamfered or rounded to create a softer, more dynamic silhouette. The corners could also be angled or rotated slightly to create different views and vantage points from the interior spaces. Additionally, the height of the building can be varied slightly to create a more interesting roofline. This could involve stepping back the upper floors slightly to create a roof terrace or garden space, or simply varying the height of the top floor to create a more dynamic silhouette.

Prompt 2 (Removed Keyword: Colorful, produced more “realistic” material images stripped of odd color overlays) (includes other minor modifications to the input description): A high-resolution, rendering of a Vertical Mixed-Use building in the downtown of a city that is 24 stories high with retail, office, residential, hotel, and library programs inside; with a wellness center and pool placed near the top of the building, with the green roof above, to create a visual connection between the indoor and outdoor spaces. The middle floors of the building has floor-to-ceiling glass windows to allow natural light into the building and offer views of the surrounding cityscape. The base of the building, up to the third floor, would be clad in a darker-colored stone or masonry material to provide a solid, grounded feel to the building. The upper floors would be covered in a perforated metal panel system that would create a unique texture and allow for some degree of transparency while also providing sun shading for the spaces behind. This combination of materials would allow for a modern and sleek aesthetic while also providing some variety in texture and depth to the building’s overall appearance. The building’s form is relatively simple and efficient, extruding straight up with some subtle gestures or articulations added to the building’s form to break up the massing and add visual interest. For example, the corners of the building could be chamfered or rounded to create a softer, more dynamic silhouette. The corners could also be angled or rotated slightly to create different views and vantage points from the interior spaces. Additionally, the height of the building can be varied slightly to create a more interesting roofline.

CLOSE AD ×