Insights

ChatGPT, Bing AI, Midjourney, and what comes next

Mike Redden — 14 March, 2023

Artificial Intelligence

When I wrote my last blog on GPT-3 back in October 2021, I had no idea that ChatGPT, Dall-E, Midjourney, Stable Diffusion, or the rest were going to become the household names they now are. It was obvious how disruptive those technologies would be, but we’ve had plenty of industry disruption before that never entered the common vernacular. There are still people out there who would struggle to explain how you can take a photo on your phone and then open it on your computer a few minutes later.

We’re now seeing iterative development come faster than anyone could have imagined. We went from GPT-3 being a whitepaper to ChatGPT in a year or so, and then Bing Chat, GPT-4, and amazing tools like Midjourney v5 all came out within what feels like weeks. Microsoft has just announced that their ‘Copilot’ system will be ubiquitous across all the Office products, and Google has just started testing their ‘Bard’ product with users in the US and UK. Facebook won’t be far behind with their ‘LLaMA’, and the open source models will undoubtedly follow soon after.

Today, we’ve got Getty Images refusing to accept AI-generated images (Getty Images bans AI-generated content over fears of legal challenges – The Verge). Clarkesworld, a science-fiction story publisher, has banned AI-generated or AI-enhanced stories (‘Out-of-hand’ flood of ChatGPT-like A.I.-generated stories forces prominent science fiction magazine to stop accepting submissions | Fortune). Stack Overflow is banning users who submit AI-generated responses (AI-generated answers temporarily banned on coding Q&A site Stack Overflow – The Verge), and China (yes, the entire country) has banned all AI-generated media that isn’t watermarked (China bans AI-generated media without watermarks | Ars Technica).

Entire online communities are up in arms. For every person posting AI-generated content as their own masterpiece, there is someone criticising it as ‘Soulless’ or ‘Plagiarism’. The problem is that it’s already hard to tell the difference between genuine new art and something AI-generated, and it’s only set to become harder. It raises the question: ‘What happens when there is no human way to identify what a human has done, versus an AI?’ Even Sam Altman, the CEO of OpenAI (the company behind ChatGPT and driving all of Microsoft’s innovation in this space), is ‘a little bit scared’ of what might happen next (‘We are a little bit scared’: OpenAI CEO warns of risks of artificial intelligence | Artificial intelligence (AI) | The Guardian).

An AI generated image of a woman at a desk with miniature robots.

An AI generated drawn image of a woman at a desk with a robot standing beside her.

AI generated images

An AI generated image of a woman at a computer with a robot beside her.

But it’s important to note that these tools are limited in their ability. Whether it’s text-based tools like ChatGPT or image-based tools like Midjourney, they can only reproduce something where plenty of existing content on the subject is available as examples. Sure, the tools are fantastic at introducing randomness, and at working in detail that you might have provided, but the entire process could be summed up as follows:

If you formed a committee of everyone that had previously written on topics similar to your prompt, how would they vote to respond to it?

Naturally, anything that is produced by the above process won’t be exceptional. It won’t be revolutionary, novel, or (technically) creative. It will be, by the mathematical definition of the word: Average. It’s the middle point between everything that came before it. The first versions of these tools did exactly and only that – given the same prompt, they would return the same output every time (unless the underlying model was updated), and the outputs were considered ‘wooden’. To introduce ‘warmth’ to the responses, the models simply added randomness. Going back to the committee metaphor, rather than always going with whatever received the most ‘Votes’, you randomly select from the top 20% of suggestions. You’re still getting something that a lot of the experts think is correct – it’s only slightly less popular than the ‘Best’ choice – but now you’ve introduced an element of ‘Creativity’ into the outputs.
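To make the committee metaphor concrete, here is a minimal Python sketch of the two approaches described above: always taking the most popular suggestion, versus picking at random from the top 20%. It is a toy illustration of the metaphor only, not how any particular model is implemented. The vote counts, the example sentence, and the function names are all made up for the purpose of the example.

```python
import random

# Hypothetical vote counts: how many committee members suggested each word
# to finish the sentence "The sky is ...". The numbers are invented.
HYPOTHETICAL_VOTES = {
    "blue": 40, "grey": 25, "clear": 12, "dark": 8, "overcast": 5,
    "cloudy": 4, "bright": 3, "vast": 1, "falling": 1, "green": 1,
}


def rank_by_votes(votes):
    """Return the suggestions ordered from most to fewest votes."""
    return sorted(votes, key=votes.get, reverse=True)


def pick_most_popular(votes):
    """The 'wooden' approach: always return the top-voted suggestion."""
    return rank_by_votes(votes)[0]


def pick_from_top_fraction(votes, fraction=0.2, rng=random):
    """The 'warmer' approach: pick at random from the top slice of suggestions."""
    ranked = rank_by_votes(votes)
    # Keep at least one suggestion so there is always something to choose from.
    shortlist_size = max(1, round(len(ranked) * fraction))
    return rng.choice(ranked[:shortlist_size])


if __name__ == "__main__":
    # Same answer on every run.
    print(pick_most_popular(HYPOTHETICAL_VOTES))
    # Varies between the two most popular suggestions from run to run.
    for _ in range(5):
        print(pick_from_top_fraction(HYPOTHETICAL_VOTES))
```

With ten suggestions, the top 20% is a shortlist of two, so the second function alternates between the two most popular answers. Real tools are more nuanced than this: rather than a flat pick from a fixed slice, they typically weight the choice by how likely each option is, with settings such as temperature, top-k, and top-p controlling how adventurous the selection gets.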

Understanding that it’s purely derivative with an element of randomness certainly doesn’t make it useless, though. What it means is that you can effectively build this ‘Committee of experts’ whenever you want and get them to generate content for you. The practical applications for this across the world are incredible – any time a document is written, unless you are the leading expert in the world for that specific paragraph, you can get an AI to write much of it for you. Even if the committee struggles to find consensus and what comes out is a bit garbled, in seconds you can read and tweak what has been written, rather than losing hours trying to work out the absolute best way to phrase your point in the first place.

Throughout my entire consulting career, I’ve always found it to be the case that if you want to get feedback from anyone on anything, a blank slate is the worst place to start. Once you have something to work with, expanding, contracting, tweaking, shifting, or adapting that to be what you want is simple by comparison. Use AI to get your starting position, and then adapt what has been generated to get where you want to be.

This becomes even more useful if you can train the AI itself on your own data. Again, coming back to the ‘Committee of experts’: what if those experts were the people who have written your documents, knowledge articles, intranet pages, news bulletins, and any other public-internal documents? What if it was the people who had written the standards and practices for software development? What if you could train it on everyone’s instant messages or meeting transcripts?

Right now, the examples we’ve seen have been specifically trained to provide two outputs: Conversation and Pictures. But there’s absolutely nothing about GPT technology that will limit it to that. I guarantee you there are already models being produced to create 3D models, business documentation, schematics and engineering diagrams, or anything else we can produce. It’s even theoretically possible to connect this type of technology up with hardware and have it produce physical outputs based on whatever inputs you give it, although I suspect that’s a little further down the road.

The use of AI technology has exploded and it’s not going to slow down any time soon. If you’d like to know how your organisation can use this technology to significantly speed up almost any part of its work, get in touch. We’d love to chat.

