Thursday, November 7, 2024

The Non-Existent Artist: A Look At AI Art


an example of AI-generated art

By René Huels, Contributor to The London Financial

The online populace has recently been tooling and fooling around with an open-source AI image generator formerly called DALL-E Mini. If you are anything like me, you will undoubtedly have seen these outlandish, artificially generated pictures in memes; “Donald Trump in a waterlogged basement”, “Mario 64 Capitol Riot” and “Joe Rogan in Fallout 4” are just some examples.

But while resources like DALL-E Mini, now Craiyon, are mostly used for comedic effect by internet personalities and those seeking online clout, AI art has recently seen wider use among digital and traditional artists, with tools like Midjourney and DALL-E 2 being adopted by many. This rise has revealed a new pocket of artistic creation, where artists from all traditions and teachings can use these AI systems as tools to further their creative output.

First, we’ll take a look at the generators themselves and how they work from a user’s perspective. To start, let’s look at the aforementioned Craiyon.

Craiyon, originally called DALL-E Mini, took its original name from an entirely separate and unaffiliated AI image generation system, DALL-E. The original DALL-E was created by OpenAI, an AI research and development company out of San Francisco, while the “Mini” version is an independently developed, reverse-engineered take on the OpenAI system, created by Boris Dayma, a French programmer based in Houston. Dayma’s version was made to give the public access to what is otherwise a mostly exclusive utility, since most of these AI generation systems are limited to those who have applied for a closed beta.

In both the independent Mini version and the original OpenAI version (in fact, in most AI image generation systems) all that is required from the user is a line of text. In certain systems, this can be as simple as one sentence directly describing what the user wants generated, while others allow a bit more artistic liberty: additional adjectives, less directly descriptive prompts, or a main prompt that is itself an eclectic phrase or even a short poem, leaving the AI to interpret the text for itself. While more primitive and less accurate than other AI tools, Dayma’s Mini version can certainly create accurate enough images from a given text. As the pet project of a sole developer, though, it does occasionally miss the mark, often interpreting a request very literally or confusing a word in the prompt with a similar but separate meaning.

OpenAI’s first iteration of DALL-E was a much more simplistic system, freely available on their website and limited to a set number of variations in a sentence, such as “An Aerial Shot of a Capybara, sitting on a Mountain”. The text in bold and underlined marks the sections of the sentence that can be changed to one of a set number of variables, such as exchanging an aerial shot for a wide shot or a close-up. This version of the system, while simple in nature, acts as a proof of concept, generating fairly accurate images despite the user only being given a limited scope. The second iteration of the DALL-E system, DALL-E 2, is the much more advanced and much more accurate older sibling of the first. It is part of that “exclusive” group of AI systems that require an invitation to join. Mostly, these invitations are needed because of bandwidth limitations, so as not to overload the system’s servers, which is a problem for the publicly accessible Craiyon. After all, having almost the entire internet constantly requesting to see “Kermit the Frog beating up Jimmy Neutron”, among other incessantly online ideas, can overload an independent server very fast.

DALL-E 2 has proven to be a beast of AI image generation, often creating hauntingly accurate pictures of exactly what you request. Some of these can be seen among the samples on the DALL-E 2 webpage, but a quick search for #dalle2 on most social media platforms will show you even more images generated through the system, all by real users given access to DALL-E 2. Some are direct but very realistic images of “Elf Girl on a Beach” or “Hamster with a backpack”, while others are more artistically stimulating but still very realistic. However, some have voiced wariness about this new breed of AI-generated content: because these tools are so accurate, they could put commissioned artists and even stock image photographers out of a job. Yes, some users of tools such as DALL-E 2 and even Google’s own AI generator, Imagen, are business owners or company representatives who plan to use these systems as a cheaper alternative to hiring an artist to make their logo or paying a stock image website for a picture of “a dog playing the piano”. Even so, I’d argue that these tools are better suited to artists, as something to experiment and create with.

To expand on that, I want to take a look at an AI system that I was recently given the personal privilege of toying around with, Midjourney. 

Midjourney, like DALL-E Mini, is an independently developed AI image generator hosted on a private, invite-only Discord server. It is developed by 8 programmers, including Seb (SebbyLaw), Sam (gadgetsam), Red, Nadir (NCPlayz), Jack, Dominique (duschendestroyer) and Daniel (danielrussruss), as they are known on the server. Midjourney started its closed beta tests in February to March 2022, though its official Twitter account has been around since November 2020, presumably around the time the developers ran private alpha tests among themselves. Although it runs through a Discord server (and runs remarkably well), Midjourney operates the same way DALL-E 2, DALL-E Mini and Google Imagen do: a line of text serves as the prompt that starts the AI algorithm generating an image. It also accepts additional commands and arguments, such as adjusting the aspect ratio or the importance of certain words in the prompt, among other things. However, Midjourney has a quirk to its image generation that other AI systems don’t have, one that gears it towards the more creative types. Whether it’s down to the material used to train the algorithm or whatever lines of code the developers programmed into the AI, Midjourney often creates uniquely artistic and almost human-like images. What it creates gives me, as a user, the impression that the AI has its own quasi-style and artistic philosophy, such as the random strands and grains that often appear in its output.

When generating scenery or objects, it can be pretty straightforward, producing what you want with its own occasional and specific flair, but with humanoid figures, it can create mashed and amalgamated bodies and silhouettes.

Midjourney also operates particularly well with more expressive and open-ended prompts. I often use random bits of poetry or poetic sentences when generating, sometimes single lines from songs, and add more adjectives and prompt words at the end to specify the atmosphere and mood.

It’s unfortunate that AI image generators are seen by many as an opportunity for monetary gain (via cryptocurrency) more often than as devices to further art. Perhaps the name “Image Generator” evokes a formal and strict function, almost industrial, which understandably causes some worry among creators that even the job of an artist can be replaced by machines. This is a reasonable suspicion, especially at a time when creativity is overlooked in favour of exceptional profit. But with a simple user interface that is only a quick Google search away (sometimes for a price), AI art invites any and all, professionals and hobbyists alike, to create fantastical pictures and to perpetuate the creative flow.


