Mastering AI Art: A Concise Guide to Midjourney and Prompt Engineering - Unite.AI


Mastering AI Art: A Concise Guide to Midjourney and Prompt Engineering

Midjourney Generated UNITE AI LOGO

Introduction to Midjourney AI-Generated Art

AI is swiftly breaking through the barriers of impossibility and has most recently invaded the domain of art, transforming it entirely. Now, you need not be a master artist or a Photoshop expert to bring the figments of your imagination to life. A simple, well-articulated prompt is all you need, thanks to Midjourney.

It all began with the release of groundbreaking tools like DALL-E 2, Midjourney, and Stable Diffusion in 2022. While each of these brought its own distinct touch to the canvas of generative AI, Midjourney in particular has continued to make noteworthy strides.

Midjourney is currently the leading high-resolution text-to-image AI generator in the market and it stands tall with its unique blend of text-to-image generation, media editing and upscaling, and active art community access, all starting at $10 per month. This comprehensive suite of features presents an exciting canvas for artists, tech enthusiasts, and AI professionals alike, building an environment for creativity and innovation.

The art world is certainly taking notice, with generative AI in the art market projected to grow at a CAGR of 40.5%. Midjourney is widely regarded as one of the strongest tools for crafting realistic, high-quality visuals with AI.

Effective prompt engineering goes beyond mere creation; it encompasses best practices. Prompts should be clear and succinct, giving the AI enough guidance without being overly prescriptive. The target audience should also be considered during design, taking into account variables such as age, gender, and cultural background, among others.

How does Midjourney work?

Midjourney's inner workings are largely undisclosed. Nevertheless, it appears to combine two relatively recent machine-learning technologies: large language models and diffusion models. The language model, similar in spirit to the one behind AI chatbots like ChatGPT, interprets the meaning of your prompt and converts it into a vector (an embedding). That vector then guides the diffusion process.

Midjourney has not published details of its training. Text-to-image systems of this kind commonly pair the image model with a text-image encoder such as CLIP, which is a model (not a dataset) described on OpenAI's research page, but whether Midjourney does so has not been confirmed.

Despite the limited information, it's possible to sketch a broad picture of how such a diffusion model works by looking at a publicly documented one, Stable Diffusion. Stable Diffusion is a separate, open-source model, not Midjourney's own, that transforms text prompts into images of varying styles and content. It does this through a diffusion model, a generative model that learns the dependencies between textual inputs and image outputs.

Diffusion models are built on the foundation of the Denoising Diffusion method, an approach influenced by non-equilibrium thermodynamics. This method systematically dismantles the structure of data and later restores it. This approach was adapted for image generation by Ho et al. in 2020, leading to the inception of the diffusion models we see today.

Training a diffusion model involves two primary stages. First, the forward (or diffusion) process incrementally adds random noise to the input image until it turns completely into noise. This process is governed by a fixed Markov chain, which adds Gaussian noise over many successive steps.
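The forward process described above can be sketched in a few lines of Python. This is a minimal illustration, not Midjourney's implementation: the function names are hypothetical, the "image" is a short list of numbers standing in for pixel values, and the linear noise schedule (variances rising from 0.0001 to 0.02 over 1000 steps) is the one reported by Ho et al. (2020), assumed here purely for demonstration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return noise variances beta_1..beta_T that grow linearly (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s): the surviving signal power."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_step(x_prev, beta, rng):
    """One Markov step: x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * Gaussian noise."""
    return [
        math.sqrt(1.0 - beta) * value + math.sqrt(beta) * rng.gauss(0.0, 1.0)
        for value in x_prev
    ]


def forward_diffuse(x0, betas, rng):
    """Apply every step of the fixed chain and return the final noisy sample."""
    x = list(x0)
    for beta in betas:
        x = forward_step(x, beta, rng)
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    betas = linear_beta_schedule(1000)
    alpha_bars = cumulative_alphas(betas)
    image = [0.8, -0.3, 0.5, 0.1]  # stand-in for pixel values scaled to [-1, 1]
    noisy = forward_diffuse(image, betas, rng)
    # The scale of the original signal left after all steps is close to zero.
    print(f"signal scale after T steps: {math.sqrt(alpha_bars[-1]):.4f}")
    print("noisy sample:", [round(v, 3) for v in noisy])
```

Each step depends only on the previous state, which is what makes the chain Markov. Nothing is learned in this stage: the schedule is fixed in advance.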

Midjourney working demonstration

Subsequently, in the reverse (or reconstruction) phase, the model restores the original data from the noise-dominated state reached in the diffusion process. This process is driven by a Markov chain with learned Gaussian transitions, meaning that the probability distribution at any given time step depends only on the state reached in the preceding step. Because the latents x_1, …, x_T have the same dimensionality as the data, diffusion models are classified as latent variable models.
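For readers who prefer notation, the two chains can be written as follows. This follows the formulation in Ho et al. (2020); the symbols \beta_t (the fixed noise variance at step t), \mu_\theta and \Sigma_\theta (the learned mean and covariance) do not appear in the text above and are introduced here for illustration.

```latex
% Forward (fixed) chain: add Gaussian noise at each step
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right)

% Reverse (learned) chain: Gaussian transitions with learned parameters
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)

% Joint distribution of the reverse process, starting from pure noise
p_\theta(x_{0:T}) = p(x_T)\prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t),
\qquad p(x_T) = \mathcal{N}(x_T;\ 0,\ I)
```

Here x_0 is the data and x_1, …, x_T are the latents, all of the same dimensionality.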

Cost and Subscription of Midjourney

While many chatbots like ChatGPT and Bing Chat offer almost unlimited usage for free, the scenario differs for image generators like Midjourney. Because the denoising process demands substantial computing power, especially from graphics processing units (GPUs) and their video memory, Midjourney's service comes with a price tag.

The basic plan starts from $10 per month, providing around 3.3 hours of GPU time, enough for approximately 200 image generations. However, there are higher-end plans offering unlimited images in Relaxed mode, albeit with a longer waiting time.
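To put those figures in perspective, here is a small back-of-the-envelope calculation. It uses only the numbers quoted above ($10, about 3.3 GPU hours, roughly 200 images); the function names are hypothetical, and actual usage per image varies with the settings chosen.

```python
def seconds_per_image(gpu_hours, image_count):
    """Average GPU seconds available for each image generation."""
    return gpu_hours * 3600 / image_count


def cost_per_image(monthly_price, image_count):
    """Average price paid per image generation, in the same currency as the plan."""
    return monthly_price / image_count


if __name__ == "__main__":
    # Basic plan figures as quoted in the text above (approximate).
    print(f"GPU time per image: {seconds_per_image(3.3, 200):.1f} s")
    print(f"Cost per image: ${cost_per_image(10, 200):.2f}")
```

On these assumptions each generation gets just under a minute of GPU time and costs about five cents.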

Setting Up Midjourney

  1. Getting started with Midjourney involves signing up on the official website and subscribing to a plan, after which you are redirected to Discord.
  2. Once you have joined the Midjourney server on Discord, navigate to the Newcomer Groups on the left side. From there, you can observe other users creating prompts, learn the mechanics of Midjourney, and interact in a bustling environment.
  3. After familiarizing yourself with the environment, invite the bot to your private server to create images undisturbed. The bot generates four preview images based on your prompt, allowing you to select the closest match to your original idea and further refine the image.

Prompt Structure for Midjourney

  1. The /imagine command, entered in a Discord channel on the Midjourney server, generates a unique image from a short text description (the prompt).
  2. To recreate a specific style across various images, simply input the image URL alongside your text prompt. Your new, consistent outputs will merge elements from both your chosen image and text.
    /imagine http://link-to-your-image <image description> --parameter1 --parameter2
    You can generate a link to your image by uploading it to the Discord channel. Once uploaded, right-click the image and select ‘Copy Link'.
    Here, http://link-to-your-image and the parameters are optional. Parameters are typed with two hyphens (--).
  3. Following this, the Bot gets to work on your image, taking approximately a minute to offer four alternatives. This process involves the use of robust Graphics Processing Units (GPUs) to process and interpret each prompt.
  4. Keep track of your GPU usage by using the /info command. It allows you to check your ‘Fast Time Remaining' and monitor your subscription's GPU time.
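The prompt structure above (optional image link, then the description, then optional parameters) can be assembled programmatically. The helper below is a hypothetical sketch that only builds the text you would type into Discord; it does not call any Midjourney or Discord API. Parameters are prefixed with a double hyphen, which is how Midjourney expects them to be typed.

```python
def build_imagine_command(description, image_url=None, parameters=None):
    """Assemble an /imagine command string.

    description: the text prompt.
    image_url:   optional link to a reference image, placed before the text.
    parameters:  optional mapping of parameter name to value; a value of None
                 produces a bare flag such as --upbeta.
    """
    parts = ["/imagine"]
    if image_url:
        parts.append(image_url)
    parts.append(description.strip())
    for name, value in (parameters or {}).items():
        parts.append(f"--{name}" if value is None else f"--{name} {value}")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_imagine_command(
        "a solitary tree under a starlit sky",
        image_url="http://link-to-your-image",
        parameters={"ar": "16:9", "q": 2},
    ))
```

Keeping the pieces separate makes it easy to reuse one description with different parameter sets while iterating.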

/info prompt midjourney

Image Upscaling and Alterations

For a more refined image, use the ‘U' buttons under the images to upscale your preferred choice. You can also use the ‘V' buttons to make adjustments to specific images. For further changes to an upscaled image, use the ‘Make variations', ‘Light Upscale Redo', and ‘Beta Upscale Redo' options. The ‘Web' button allows you to view the image in a larger size in a separate window.

Midjourney allows for image upscaling to 2048×2048 (square) and 2720×1530 (widescreen) resolutions via its beta upscale redo feature, with a default generation grid size of 1024×1024 (square) and 1456×816 (widescreen). Each image can be further enhanced through the “U” upscale options, which improve specific parts of the image.
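A quick calculation shows what those resolutions mean in practice. The sketch below uses only the pixel dimensions quoted above; the function names are hypothetical.

```python
from fractions import Fraction


def aspect_ratio(width, height):
    """Reduce width:height to lowest terms."""
    return Fraction(width, height)


def pixel_ratio(base, upscaled):
    """How many times more pixels the upscaled image has than the base image."""
    return (upscaled[0] * upscaled[1]) / (base[0] * base[1])


# (default size, beta-upscaled size) as quoted in the text above
RESOLUTIONS = {
    "square": ((1024, 1024), (2048, 2048)),
    "widescreen": ((1456, 816), (2720, 1530)),
}

if __name__ == "__main__":
    for name, (base, upscaled) in RESOLUTIONS.items():
        ratio = aspect_ratio(*upscaled)
        print(
            f"{name}: {base[0]}x{base[1]} -> {upscaled[0]}x{upscaled[1]}, "
            f"aspect {ratio.numerator}:{ratio.denominator}, "
            f"{pixel_ratio(base, upscaled):.2f}x the pixels"
        )
```

The square upscale doubles each side and so quadruples the pixel count, while the widescreen upscale lands on an exact 16:9 frame with roughly three and a half times the pixels.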

Take a look at this prompt that produces fantastic artwork with Midjourney's V5.2 version.

/imagine Artwork portrays a solitary tree under a starlit sky, with a child reading beneath, in the hues of serene blue and warm orange, inspired by the brushstrokes of French Impressionism, Persian miniatures, Bauhaus simplicity, evocative of classic children's fairy tale illustrations, achieving an asymmetrical harmony, expressed in an enchanting, folk/naïve style --ar 15:19 --upbeta --q 2

Midjourney Prompt Guide example

Creating your First Midjourney AI Art

  1. Crafting the Basic Blueprint: Think of yourself as an artist. Begin with a straightforward, vivid description of the image you aspire to bring to life. Outline the main subject, the ambiance, or even the minute details you wish to embed. Use punctuation such as commas, brackets, and hyphens to structure your thoughts. For improved results, be explicit about your design's context and details. Elements such as subject (e.g., Dragon, vintage car, Abraham Lincoln), medium (e.g., digital art, pencil sketch), environment (e.g., outer space, underwater, bustling city), lighting (e.g., soft, neon, backlit), color (e.g., earth tones, vibrant, muted), mood (e.g., melancholic, whimsical, peaceful), and composition (e.g., landscape, closeup, wide-angle) can be critical. Examples:
    • An idyllic forest bathed in sunlight, a footpath meandering into the distance
    • A city that never sleeps, with neon lights reflecting off the pavements and a diverse crowd milling about
  2. Infusing Style and Keywords: Midjourney's AI is capable of illustrating images in a myriad of styles such as abstract, surreal, or realistic. By integrating a style or related keywords, you can guide the AI to create an image that mirrors your vision. Experiment with various styles and keywords to discover the perfect blend. Examples:
    • A landscape painting depicting a desert at dawn, mirroring the style of Georgia O'Keeffe, featuring a pastel color palette and organic forms.
    • An abstract rendering of a peaceful forest, with geometric patterns forming trees and foliage, inspired by Piet Mondrian's compositions.
  3. Harnessing Advanced Settings: Consider Midjourney as your creative toolbox, brimming with advanced settings that allow you to fine-tune your generated images. It’s like wielding a magic wand, enabling you to conjure the ideal balance of randomness, stylization, and image variation. Unleash your creative prowess by tinkering with these settings until you find the perfect mix that resonates with your vision. Examples:
    • A serene Japanese garden with a pond reflecting the cherry blossom trees --seed 22 --s 150 --c 40
    • A dystopian cyberpunk city, illuminated by neon lights --seed 88 --s 600 --c 60
  4. Highlighting Elements with Weights: Visualize your image as a symphony, with every element contributing to the grand ensemble. Using the “::” notation, you can dictate the significance of various elements in your image, allowing you to control the spotlight. Examples:
    • [An elegant peacock]::3 perched on a [wisteria tree]::1 blooming with vibrant flowers
    • [A majestic elephant]::2 basking in the glow of a [setting sun]::1 in the savannah
  5. Working with Midjourney is a process of trial and error: Experimenting with different elements and features is necessary. Each iteration will bring you closer to the image you set out to bring to life.
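The elements listed in step 1 (subject, medium, environment, lighting, color, mood, composition) lend themselves to a small template. The sketch below is a hypothetical helper, not a Midjourney feature: it simply joins whichever elements you supply into a comma-separated description, which makes it easier to change one element at a time while iterating.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptElements:
    """The descriptive elements from step 1; only the subject is required."""
    subject: str
    medium: str = ""
    environment: str = ""
    lighting: str = ""
    color: str = ""
    mood: str = ""
    composition: str = ""


def compose_description(elements):
    """Join the non-empty elements, in declaration order, with commas."""
    values = (getattr(elements, f.name).strip() for f in fields(elements))
    return ", ".join(value for value in values if value)


if __name__ == "__main__":
    draft = PromptElements(
        subject="vintage car",
        medium="pencil sketch",
        environment="bustling city",
        lighting="backlit",
        mood="whimsical",
    )
    print(compose_description(draft))
```

Swapping a single field, for example changing the lighting from "backlit" to "neon", gives a controlled way to see how each element affects the result.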

Midjourney parameters

The model of Midjourney operates using adjustable parameters that control the outcome of the image generation process. These parameters allow users to tweak and tailor their generated art, fine-tuning the model to create outputs that perfectly suit their goal.

Below are the basic and the advanced parameters, their functions, and how to use them to fully harness Midjourney's capabilities:

  • Aspect Ratios (--aspect or --ar): This parameter controls the ratio between the width and height of the generated image. For example, a ratio of 16:9 is perfect for YouTube thumbnails, while 1:1 produces a square image great for Instagram.
  • Chaos (--chaos): This parameter adjusts the diversity of the initial image grid and ranges from 0 to 100. Higher chaos values will give you unpredictable and unique outcomes, while lower values will ensure more consistent results.
  • No (--no): This parameter helps you eliminate specific elements or characteristics from the generated image. For instance, if you want a picture without any red, you can use “--no red”.
  • Quality (--quality or --q): This setting adjusts the time spent generating an image. Higher quality requires more processing time but yields more intricate details. This parameter can take on values of .25, .5, 1, or 2.
  • Seed (--seed): This parameter determines the starting visual noise, acting as a baseline for the generated image. Using the same seed number with the same prompt will give similar outputs. It accepts integer values between 0–4294967295.
  • Stop (--stop): With this parameter, you can terminate a job early, producing less detailed but potentially interesting outputs. The range is 10-100. For instance, if you specify ‘--stop 50', the image generation process will halt at 50% completion, resulting in a less detailed, possibly abstract image.
  • Stylize (--stylize or --s): This controls how strongly Midjourney's own artistic style is applied to the generated image. Lower stylization values yield results closer to the initial prompt, while higher values result in more abstract and artistic interpretations. In v5, the default value is 100, but you can set it anywhere from 0-1000.
  • Model Version: You can select from various versions of the Midjourney model by using the --version or --v parameter.
  • Niji: A model specialized in anime-style images. It can be accessed using the --niji parameter.
  • High Definition: For abstract and landscape images, the --hd parameter activates an early model version that yields larger, less consistent images.
  • Test Models: Midjourney offers special models for specific use cases. --test and --testp activate the standard and photography-focused test models, respectively.
  • Upscaler: Midjourney's algorithm starts with a low-resolution image grid. It offers several upscaling models to enhance image size and detail.
    • Uplight: An alternative light upscaler (--uplight) provides upscaled images that are less detailed but smoother.
    • Upbeta: The --upbeta parameter leads to images with significantly fewer additional details, staying closer to the original grid image.
    • Upanime: The --upanime upscaler is designed specifically to work with the --niji Midjourney Model.
  • Image Weight: Use --iw to adjust the image prompt weight relative to text weight. The default value is 0.25 in early model versions; in v5 the default is 1.
  • Sameseed: The --sameseed parameter ensures that all images in the initial grid use the same starting noise, creating very similar generated images.
  • Video: Midjourney can save a progress video of the initial image grid generation process using the --video parameter.
  • Creative: With the --creative parameter, the test and testp models output more varied and creative images.
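Several of the parameters above have explicit ranges, so it can be handy to check values before pasting a prompt into Discord. The sketch below encodes only the ranges quoted in this list (chaos, quality, seed, stop, stylize); the rule table and function name are hypothetical, and Midjourney itself remains the authority on what it accepts.

```python
# Validation rules taken from the ranges quoted in the parameter list above.
PARAMETER_RULES = {
    "chaos": lambda v: 0 <= v <= 100,
    "quality": lambda v: v in (0.25, 0.5, 1, 2),
    "seed": lambda v: isinstance(v, int) and 0 <= v <= 4294967295,
    "stop": lambda v: 10 <= v <= 100,
    "stylize": lambda v: 0 <= v <= 1000,
}


def invalid_parameters(params):
    """Return the sorted names of parameters whose values break a known rule.

    Parameters without a rule in PARAMETER_RULES are not checked.
    """
    return sorted(
        name
        for name, value in params.items()
        if name in PARAMETER_RULES and not PARAMETER_RULES[name](value)
    )


if __name__ == "__main__":
    print(invalid_parameters({"seed": 22, "stylize": 150, "chaos": 40}))
    print(invalid_parameters({"chaos": 120, "stop": 5, "quality": 3}))
```

The first call reports no problems, while the second flags all three values as out of range.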

Midjourney consistently rolls out updates to enhance user experience, with the latest being version 5.2, launched in June 2023. By appending --v 5.2 to your prompt or selecting it through the /settings command, users can access this advanced model. Version 5.2 offers superior image detailing and understands prompts more intuitively, bringing brighter colors and improved compositions.

Understanding Copyrights for AI-Generated Artwork

Midjourney Image of Mix of AI and copyright laws

In March 2023, the US Copyright Office clarified its stance on the copyrighting of AI-generated works. The policy states that while the human-made elements in AI creations (like writings or unique designs) can be protected, AI-produced images do not qualify for copyright, in line with the principle that only human creations are eligible for copyright protection.

In the context of AI art, copyright is not straightforward. While digital art has the human artist's input, AI-generated art is created without direct human intervention, which complicates the issue of authorship and ownership. As per the US Copyright Office, initial ownership is granted to the work's author – a human creator. However, as AI cannot be considered an author, AI-generated art lacks clear ownership.

The latest guidance from the US Copyright Office allows copyrighting of AI art only when it contains sufficient human authorship. The level of ‘sufficient human authorship' remains undefined and depends on the degree of human involvement in creating the AI artwork.

Interestingly, Midjourney, an AI-based platform for image creation, has established its own policies for usage rights. Free trial users can use the images for non-commercial purposes under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), with proper credit to Midjourney. However, paying subscribers can use the images for any purpose, including commercial, under the General Commercial Terms. This development in the copyright space presents an intriguing dynamic between AI and human creativity.

Utilizing Midjourney for Dynamic UI Designs and Creative Logo Generation

From designing intuitive UIs for websites or mobile apps to crafting unique logos and banners, Midjourney empowers content creators by generating an array of design alternatives within seconds.

Here's how it works. Each design begins with a prompt, acting as a blueprint for the AI to follow. Suppose you're designing a UI for an online tutoring platform app. A typical prompt might be: “/imagine Online tutoring platform user interface, Dribbble, High Resolution, 4K, like Khan Academy”.

Initial outcomes might not hit the mark perfectly. For instance, adding “Adobe XD” into the mix may help Midjourney tailor its designs to be more Adobe XD-compatible. An optimized prompt will be:

/imagine Online tutoring platform, user interface, Adobe XD, Dribbble, High Resolution, 4K, minimalist design

Midjourney Image of Desktop App UI/UX designs


Text Inspired Logo or Banners using Midjourney

Let's explore how to create a banner with a logo for UNITE AI.

First, you need to have a simple image of the text you want to display. You can create this using any graphic design tool or text editor and upload it to your Discord channel.

sample text for UNITE LOGO
A simple image of text used to create UNITE Logo

The prompt to create the banner is:

/imagine Letters: <link to a simple image of text to be displayed> UNITE in a futuristic, AI-inspired typeface logo with letters UNITE --v 5 --ar 16:9

Midjourney Prompt Guide Feature Screen

Take a look at these example prompts for more ideas:

/imagine A lone musician performing a serene melody on a floating city at dusk, art nouveau style

Midjourney Prompt Guide: Image of Indian Art

 

/imagine An image of a future person working at a futuristic desk, surrounded by holographic screens and advanced technology. The person is wearing a sleek, silver jumpsuit and has virtual reality goggles on. The environment is filled with neon lights and floating holograms. The atmosphere is futuristic and high-tech, with a sense of excitement and innovation. The camera is a high-resolution digital camera, capturing every detail with precision. The artistic style is a blend of cyberpunk and minimalism, with a focus on clean lines and bold colors. The directors, cinematographers, photographers, fashion designers, cartoonists, and artists collaborating in this unique juxtaposition are Christopher Nolan, Roger Deakins, Annie Leibovitz, Virgil Abloh, Hayao Miyazaki, and Kaws.

Midjourney prompt for a future person working

/imagine 1940s-style Barbie as a wartime nurse, in a vintage army hospital setting, tending to the wounded soldiers, in the style of classic Mattel illustrations, with the atmosphere of sepia-toned World War II photography 8k --v 5 --ar 16:9

Midjourney Prompt Guide: Image of Barbie in Unique settings

/imagine Frame of a woman leaning against a cyberpunk, hoverbike, Japanese anime, sprawling cityscapes, 32k, intricate spaceport, fleeting, skyscraper panoramas, sleek

Midjourney Image of cyberpunk style girl

 

Final Thoughts: Navigating the AI Art World with Midjourney

Remember, “A picture is worth a thousand words”, and with Midjourney the reverse holds too: a detailed, vibrant description can work wonders. Midjourney is not free to use, yet it is revolutionizing the art world and expanding our creative possibilities through its state-of-the-art text-to-image AI technology. With the ability to convert a simple text prompt into a high-resolution image, it's a tool that promises boundless opportunities, not just for artists, but also for UI/UX designers, tech enthusiasts, and AI professionals.

Here are some essential takeaways to remember as you embark on your Midjourney adventure:

  • Learn the basics of Midjourney prompts: Use clear, succinct, and comprehensive descriptions that encapsulate your vision to guide the AI effectively. Remember to consider your audience, and don't hesitate to experiment with various styles, moods, and contexts.
  • Utilize parameters: Enhance your creative experience by leveraging the multitude of advanced settings that Midjourney offers. From controlling the aspect ratio to adjusting the chaos parameter for unique outcomes, every detail can be tailored to your preference.
  • Embrace the iterative process: Your first AI-generated artwork may not be perfect. Embrace this iterative process and learn to refine and optimize your prompts for better results.
  • Understand the copyright implications: While AI-generated artworks themselves are not eligible for copyright, the human-made components within them can be protected.

In essence, the integration of AI into art has democratized creativity and blurred the lines between human and machine-made masterpieces. As we continue to witness the remarkable growth of generative AI in the art market, it is undeniable that the AI art revolution, led by platforms like Midjourney, is just beginning.

I have spent the past five years immersing myself in the fascinating world of Machine Learning and Deep Learning. My passion and expertise have led me to contribute to over 50 diverse software engineering projects, with a particular focus on AI/ML. My ongoing curiosity has also drawn me toward Natural Language Processing, a field I am eager to explore further.