A Beginner’s Experience using Generative AI text-to-image modeling

26 September 2023


I have little experience using generative AI to create images and visuals, so I decided to do some exploring. There are already a couple of big generative AI programs, such as Midjourney and DALL-E, so I started with those. To my disappointment, however, these programs did not offer free versions for users to try at the time of writing. According to Midjourney, this is because there were too many users, so you must now subscribe to use their services. In the past, many of these technologies were free to use and try out; the fact that this is no longer the case shows how much both the technologies and their user bases have grown in the last few years. The generative AI image programs that I could find with a free option typically offered a couple of free image generations before you had to pay for more.

After some more research, I found a website called Feng My Shui, where I was able to experiment with the Stable Diffusion XL model. Stable Diffusion is an open-source text-to-image model that lets users create images from prompts and descriptions. I experimented with some basic prompts to test how the software worked and was highly impressed with the results. One of the prompts I entered was “Rotterdam Skyline in Cartoon Style”. From that, I got a selection of images that were accurate and generally of good quality (Figures 1 and 2).

I decided to try another program to see how it would compare to the results I received with Stable Diffusion XL. I found another program with a free trial, Freepik, that also allowed me to generate images from prompts and descriptions. I again used the prompt “Rotterdam Skyline in a Cartoon Style” and received very different results. The main difference I noticed was that Freepik had a very different style from that of Stable Diffusion XL, even though it was given the same prompt (Figures 3 and 4). The images were again high-quality renders, but they were less accurate than what Stable Diffusion XL provided. It is clear that they show a skyline of some sort in a cartoon style, but it is harder to tell that it is Rotterdam than in Figures 1 and 2. I also tried some more complex prompts in Freepik to test the accuracy of the generations. I used the prompts “A student studying economics using a computer in a library” and “A forest with vibrant plants and a small stream of water running through it during a storm” (Figures 5 and 6). Again, I was impressed: both generations were largely accurate and of high quality, but they did not capture every detail. For example, it is not clear that the student is studying economics, and certain parts of the image were not well generated, such as the student’s fingers.

After trying these generative AI image programs, I was very impressed, and I look forward to continuing to use them in the future. I did notice that not every program produces the same type of result from the same prompt; therefore, it is important to explore and find a generative AI image program that fits your personal needs and preferences. However, as discussed earlier, these programs keep getting harder to use for free, and the biggest platforms are not currently available to try for free. I hope to be able to try them in the future and compare them to the programs that I have used so far. I plan to keep using programs like these to experiment and learn about them, and I expect that they will continue to develop at a great rate.


1 thought on “A Beginner’s Experience using Generative AI text-to-image modeling”

  1. Thank you for this interesting case of exploring generative AI images! The comparison between Stable Diffusion XL and Freepik using the Rotterdam Skyline prompt really shows how different platforms can interpret the same input in unique ways and how unlimited the options are when it comes to generative AI. I agree with your observation about the shift towards subscription-based models as these technologies gain popularity. It’s a natural progression, but it does make it more challenging for users looking for free options to experiment and learn, which has its positive and negative sides.

    Thanks again for sharing your post!
