Mees – Social Blog


As we all know, generative AI such as ChatGPT is disrupting many industries right now. But, as we have all seen by now, it has some flaws. The models are prone to ‘hallucinations’, give inaccurate answers, have limited specialized knowledge, and struggle to create coherent long-form content. This limits the value the models can create for businesses.

To remedy this, companies can fine-tune the model on a large dataset of questions and answers that teaches it how to behave. They can also vectorize company documents and store them in a database; the passages most relevant to each question are then retrieved and handed to the model as reference material. This gives the model company-specific, specialized data to interpret, which further reduces the number of hallucinations and inaccurate answers it might generate.
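To make the vectorize-and-retrieve idea a bit more concrete, here is a minimal sketch in Python. It is not how any particular company or product does it: the "embedding" is just a bag-of-words count standing in for a real embedding model, the "database" is a plain list, and the function names and example documents are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts (an assumption, standing in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(documents):
    """Vectorize every document once and keep text and vector together."""
    return [(doc, embed(doc)) for doc in documents]


def retrieve(question, index, top_k=1):
    """Return the top_k documents most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [doc for doc, _ in ranked[:top_k]]


def build_prompt(question, index):
    """Put the retrieved passages in front of the question before sending it to a model."""
    context = "\n".join(retrieve(question, index))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical company documents, invented for this example
documents = [
    "Refunds are processed within 14 days of the return request.",
    "The office is closed on public holidays.",
    "Support is available by email on weekdays.",
]
index = build_index(documents)
print(build_prompt("How long do refunds take?", index))
```

In a real setup the counting function would be replaced by an embedding model and the list by a vector database, but the flow stays the same: vectorize, find the closest passages, and paste them into the prompt.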

This significantly improves the model's performance in shorter chat-based interactions, but the model still struggles to give coherent, lengthy replies on complex topics. So, as you might imagine, ChatGPT cannot make your video games or write long multipage reports. But with the advent of LLM-based AI agents, this might change.

Agents are actors that perceive their environment and make decisions based on it. A chatbot is a kind of agent: it perceives the input question and provides an answer. This is nothing new. What is new is a system of AI chatbots that interact with each other, each with a different role, and work iteratively to perfect a project. ChatDev, a multi-agent development framework, was able to create a game with an LLM workforce of developers, a CTO, a COO, and a CEO, each given certain responsibilities.

By iteratively generating code, having it audited by other models, and implementing revised code, the system was able to tackle a task as complex as game development. The game might have been fairly simple, but imagine this technology developing over the next two years the way generative AI has developed over the last year. In the future, entire companies or sections of companies might be run by AI.
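The generate, audit, and rewrite cycle described above can be sketched as a simple loop. This is only an illustration of the idea and not ChatDev's actual code: the two "agents" are hard-coded stub functions where a real system would call an LLM, and all names here are hypothetical.

```python
def run_review_loop(task, write, review, max_rounds=5):
    """Alternate between a writer and a reviewer until the reviewer has no feedback.

    Returns the final draft and the number of review rounds that were run.
    If max_rounds is reached, the last draft is returned even though it was
    not approved.
    """
    draft = write(task, feedback=None, previous=None)
    for round_number in range(1, max_rounds + 1):
        feedback = review(task, draft)
        if not feedback:
            return draft, round_number
        draft = write(task, feedback=feedback, previous=draft)
    return draft, max_rounds


def stub_programmer(task, feedback, previous):
    """Stand-in for an LLM 'developer': the first draft has a bug, the revision fixes it."""
    if previous is None:
        return "def add(a, b): return a - b"
    return previous.replace("a - b", "a + b")


def stub_reviewer(task, draft):
    """Stand-in for an LLM 'reviewer': runs the draft and reports a failing check."""
    namespace = {}
    exec(draft, namespace)  # acceptable here only because the draft is our own stub
    if namespace["add"](2, 3) != 5:
        return "add(2, 3) should return 5"
    return ""


final_code, rounds = run_review_loop(
    "Write an add function", stub_programmer, stub_reviewer
)
print(final_code)
print("Review rounds:", rounds)
```

The point of the sketch is the division of labour: one role produces, another role criticises, and the loop keeps going until the critic is satisfied or the round limit is hit.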

In my opinion, this technology has great potential and might change the way we work, so I strongly advise you to do your own research into this topic.


So, here I am, sitting in my student room, watching the Information Strategy lectures on Canvas. I am doing my very best to soak up every single bit of knowledge Professor Li bestows upon me, but I am unable to focus. The inevitable seems to have finally happened: short-form content on TikTok and YouTube has reduced my attention span to that of a goldfish.

With the last two brain cells I could muster, I thought to myself: “I bet you could write a script in Python to summarize videos in some way or form… If only I knew how to do that…”.

As I don’t know much about coding, I decided to do research, give up, and ask ChatGPT instead. And? Surprise, surprise: together with ChatGPT I succeeded in writing a program that takes an MP4 as input and writes a summary as output. Here is how I did it.

First, I needed a function that takes an MP4 and extracts the audio. This was really easy (for ChatGPT). Within 10 seconds I had a working code snippet. The next steps required me to actually think for myself. I know! Unimaginable!

Next, I asked ChatGPT how to make an API request to the OpenAI Whisper model. But with its knowledge cut-off of 2021, this large language model doesn’t even know how to access its own API. So I turned to the API documentation, copied the example code, and changed the variables to fit my script. ChatGPT helped me troubleshoot the code when it was not working and helped me wrap the API call in a Python function I could use later in the script.

The next task is summarisation. This is done with the OpenAI API as well. Here too I copied the example code and changed the parameters and variables. The code needs to be adjusted to use the text transcribed by the Whisper model. According to ChatGPT, we can do this by inserting the transcribed text into the messages parameter with a formatted string. Here, ‘content’ is the string read from the output text file.
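To show just the formatted-string part in isolation, here is a small sketch that builds the messages list without calling the API. The helper name build_messages and its audience parameter are my own additions for illustration; the structure of the list (a system message followed by a user message) is the same one the script uses.

```python
def build_messages(transcript, audience="Business Information Management students"):
    """Build the chat messages, inserting the transcript with an f-string.

    build_messages and the audience parameter are hypothetical helpers,
    not part of the OpenAI library.
    """
    return [
        {"role": "system", "content": "You are a summarisation expert."},
        {
            "role": "user",
            "content": f"Summarize the following text for {audience}:\n\n{transcript}\n",
        },
    ]


messages = build_messages("WeChat is a super app that bundles many services.")
print(messages[1]["content"])
```

Whatever text is passed in ends up inside the user message, so swapping in the transcript from the text file is just a matter of passing a different string.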

Now that these three functions are defined, we can use them together at the end of the script, where the functions actually get executed. This all results in the following Python code:

import subprocess
import openai
import os

openai.api_key = 'Your_api_key'

def extract_audio(input_file, output_file):
    try:
        subprocess.run(['ffmpeg', '-i', input_file, '-vn', '-acodec', 'libmp3lame', '-ar', '16000', '-ac',
        '1', output_file], check=True)
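    except (subprocess.CalledProcessError, FileNotFoundError) as e:
        # FileNotFoundError is raised when ffmpeg itself is not installed or not on PATH
        print("Error converting video to audio:", e)
        raise SystemExit(1)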


def transcribe_audio(audio_file):
    with open(audio_file, 'rb') as audio:
        return openai.Audio.transcribe(
            model = 'whisper-1',
            file = audio
        )

def summarize_text_from_file(filename):
    # Read the content of the file (same encoding the transcript is written with)
    with open(filename, 'r', encoding='utf-8') as file:
        content = file.read()

    # Use GPT-3.5 Turbo to summarize the content
    response = openai.ChatCompletion.create(
        model= "gpt-3.5-turbo",
        messages = [
            {"role": "system", "content": "You are a summarisation expert."},
            {"role": "user", "content": f"Summarize the following text focussing on the academic principle that might be relevant for Business Information Management students:\n\n{content}\n"},
        ],
        max_tokens= 1000  # Adjust as needed
    )

    # Extract the summary from the response
    summary = response['choices'][0]['message']['content']
    return summary


if __name__ == "__main__":
   
    input_file = input("Enter the filepath to your video: ")  # Ask where to find the MP4
    audio_file = 'output.MP3'  # Temporary audio file
    output_text_file = input_file + '.txt'  # Output text file

    extract_audio(input_file, audio_file)
    transcribed_text = transcribe_audio(audio_file)

    with open(output_text_file, 'w', encoding='utf-8') as text_file:
        text_file.write(transcribed_text['text'])

    # Clean up temporary audio file
    os.remove(audio_file)

    print("Transcription complete. Text saved to", output_text_file)

    filename = output_text_file
    summary = summarize_text_from_file(filename)
    print("\nSummary:")
    print(summary)

When you run the code, it asks you for the file path to your MP4 video. In my code editor (VS Code) you can copy and paste the relative path of files in the directory, which makes this pretty easy. Otherwise, move the MP4 file to the same folder as the script and just copy-paste the name of the file. Then press Enter and watch some TikToks, Shorts, and Reels while the script does all the work.

So, the big reveal! Here is a summary of the WeChat Video from session 2.5 generated by this script:

“Summary:
The text focuses on the academic principle of the emergence and influence of Chinese apps, particularly WeChat, on the development of the Internet and its potential implications for Western tech companies and individuals. It highlights the unique characteristics of the Chinese Internet, which operates as more of an intranet due to the Great Firewall that blocks foreign sites. As a result, Chinese copycat apps have filled this void and evolved to become successful companies. WeChat, in particular, is described as a super app that offers a wide range of services within one platform. The convenience and transformative nature of this technology are emphasized. However, concerns are raised regarding data privacy and governmental control due to the concentration of personal data in the hands of these companies and the Chinese government’s history of human rights violations. It is suggested that Western tech companies are now attempting to replicate the success of super apps like WeChat, which could have both powerful and problematic implications.”

So yeah, writing this code took way longer than just watching the lectures, but that’s not the point! There is a wise lesson to be learned here: never hold yourself accountable for any problems that arise from your excessive use of social media, and great things will come from it! Therefore, just keep scrolling.

P.S.: If you have any other ideas for automation that increase the amount of time I can spend on my phone, post them in the comments. Would love to try your ideas next.

AI agents. Gateway into autonomous companies?


Tired of watching lectures? Why not write a Python script for automated summarisation using GPT-4
