Bits & Neurons

Is GPT-Vision on the horizon?

Ts Yotov
September 19, 2023

Today's issue is brought to you by our sponsor - Koala. Accelerate Your Content Creation with this (better than ChatGPT) AI Tool!

In today’s edition:

🗞️ News

📰 Microsoft’s ‘special’ Surface and AI event (in two days)
📰 Google’s Bard chatbot can now find answers in your Gmail, Docs, Drive
📰 OpenAI races to launch multimodal LLM GPT-Vision, aiming to beat Google Gemini’s debut

🛠️ 2 Trending Tools

😎 Nerd section - What is LoRA and AI for Developers

🐦 Interesting from the Community - Registration for OpenAI DevDay is now open, and more

💡 Small bits - ChatGPT Cut-Off Date is Now January 2022, and more

🎨 BONUS - Viral Spiral Prompt - see it at the end :D

📰 Microsoft’s ‘special’ Surface and AI event (in two days)

Microsoft is gearing up for a "special event" in New York City, expected to focus on Surface devices and AI-powered features across its software suite, including Windows, Office, and Bing. The event comes on the heels of the resignation of Panos Panay, former head of Windows and Surface at Microsoft.

What to expect (AI related):

Windows AI: The company is likely to unveil Windows Copilot, an AI "personal assistant," as part of an update to Windows 11.
Surface AI: Features like Windows Studio Effects that use a dedicated Neural Processing Unit (NPU) could make their way into the new Surface Laptop Studio 2.
Office and Bing: The event might also shed light on Microsoft’s Copilot plans for Office and provide updates on Bing Chat Enterprise.

Why it Matters: The event is significant given the leadership changes at Microsoft and the company’s ambitious plans to integrate AI deeply into both hardware and software products. Plus, Microsoft is the main partner of OpenAI, the maker of ChatGPT.

📰 Google’s Bard chatbot can now find answers in your Gmail, Docs, Drive

Google's Bard AI chatbot now extends its search capabilities to your Gmail, Docs, and Drive, offering a range of new functionalities and integrating with your personal data.

Bard can scan Gmail, Docs, and Drive to find and summarize information.
The feature is opt-in and can be disabled at any time, which helps address privacy concerns.
Beyond Gmail, Docs, and Drive, Bard will also connect with Maps, YouTube, and Google Flights.
A "Google It" button allows users to validate Bard's responses using Google Search.
Google has been progressively enhancing Bard's capabilities, including code generation and Google Lens support.

Why it Matters: This move transforms Bard from a simple web-based chatbot to an integrated assistant capable of interacting with your personal data. It streamlines information gathering and task management, but also brings new layers of privacy considerations.

📰 OpenAI races to launch multimodal LLM GPT-Vision, aiming to beat Google Gemini’s debut

In a race against Google's forthcoming large language model, Gemini, OpenAI is speeding up its development of multimodal LLMs, according to reports. Codenamed Gobi, these next-generation models will go beyond text, also handling images and other types of data, aiming to maintain OpenAI's lead in the AI industry.

OpenAI's next-generation model, codenamed Gobi, is a multimodal LLM that can handle text and images.
It aims to outpace the launch of Google's Gemini, which is currently in testing with enterprise clients.
Unlike text-only language models, Gobi can interpret both images and text, making it suitable for applications such as code generation and image interpretation.
The image features were previously limited to the company Be My Eyes, but are now being prepared for a wider rollout, dubbed GPT-Vision.
Google CEO Sundar Pichai acknowledged OpenAI's speed in innovation but emphasized Google's cautious approach.

Why It Matters: This puts OpenAI and Google in a head-to-head race for advanced AI capabilities. OpenAI's multimodal model could redefine how we interact with technology, offering a more holistic and integrated user experience. It also pressures both companies to balance rapid innovation with ethical considerations.

🛠️ Trending Tools 🛠️

WatermarkRemover.io - Get rid of the watermarks from your images using their powerful AI stack.

MonsterImage.AI - for creating those viral spiral images 😄

😎 Nerd section 😎

📺 What is LoRA? Low-Rank Adaptation for finetuning LLMs EXPLAINED
Low-Rank Adaptation for Parameter-Efficient LLM Finetuning explained.
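
For the curious, here is a rough sketch of the core LoRA idea in plain Python. It is not taken from the video and is not any library's API: the function names and the layer sizes are made up for illustration. The idea is that the big pretrained weight matrix W stays frozen, and only two small matrices B and A are trained, so the effective weight becomes W + scale * (B @ A).

```python
# Illustrative sketch of the LoRA idea (hypothetical helper names, not a library API).
# W is a frozen weight matrix (d_out x d_in). The trainable update is B @ A,
# where B is (d_out x r) and A is (r x d_in), with a small rank r.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_weight(W, A, B, scale=1.0):
    """Return the effective weight W + scale * (B @ A)."""
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

def trainable_params(d_out, d_in, r):
    """Compare parameter counts: full fine-tuning vs. the two LoRA factors."""
    return {"full": d_out * d_in, "lora": r * (d_in + d_out)}

# Tiny example with rank r = 1 (values chosen only for illustration).
W = [[1, 0], [0, 1]]
B = [[1], [2]]
A = [[3, 4]]
effective = lora_weight(W, A, B)

# Assumed layer size for illustration: a 4096 x 4096 layer with rank 8.
counts = trainable_params(d_out=4096, d_in=4096, r=8)
```

With these assumed sizes, the LoRA factors hold 65,536 trainable values against 16,777,216 for the full matrix, which is where the "parameter-efficient" part comes from.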

⌨️ AI for Developers - Leverage AI and x10 your output! Don't be afraid to explore different AI helpers, multiply your output, and eliminate writing boring/basic code.

⌨️ Exploring GitHub Copilot - as a continuation of the topic 🙂

🐦 From the Community 🐦

🚀 Registration to attend OpenAI DevDay in-person is now live! Apply here: https://devday.openai.com

— OpenAI (@OpenAI)
5:07 PM • Sep 18, 2023

🆕 /r InstaFlow: A Novel One-Step Generative AI Model Derived from the Open-Source StableDiffusion (SD) - outperforms existing techniques in text-to-image generation by scoring significantly better on key benchmarks while requiring less computational power.

✖️ (Our own) Matrix is around the corner - 3D scanning and rendering are moving that fast - check this mind-blowing 3D rendering made from just 334 photos.

💡 Small Bits 💡

👋 OpenAI releases new language model GPT-3.5 Turbo Instruct (gpt-3.5-turbo-instruct) - a new instruction-focused language model designed to replace older models, offering the same cost and performance as GPT-3.5 Turbo and optimized for directly answering questions or completing tasks.

📅 ChatGPT Cut-Off Date Now January 2022 - the knowledge cutoff for ChatGPT Plus (based on GPT-4) has moved to January 2022, while free ChatGPT (based on GPT-3.5) remains at a September 2021 cutoff. This move has led to speculation about OpenAI's increased capabilities and possible pricing changes.

🦉 Bird Buddy, the AI-powered bird feeder startup, now lets anyone use its app to birdwatch - relax watching some birds :)

To date, Bird Buddy customers have installed 150,000 feeders around the world, which are capable of recognizing more than 1,000 species of birds.

👨‍⚕️ Google and the Department of Defense are building an AI-powered microscope to help doctors spot cancer - good to see AI used for good, and to know that the development started before the AI hype.

📀 Microsoft exposed 38 terabytes of sensitive data while working on an AI model - basic cloud security practices were lacking.

🎨 BONUS - Viral Spiral Prompt 🎨

Using the Trending Tool (MonsterImage.AI), I generated this image and then upscaled it:

Prompt: Medieval German village scene with castle in the distance, hyper realistic, ultra realistic, high quality

Negative prompt: watermark, signature, cut off, low contrast, underexposed, overexposed, bad art, beginner, out of frame

Original output:

Upscaled output:

Thank you for reading today’s edition!

We love to hear back from you!

Feel free to reply to our emails with questions, suggestions, or topics you'd like to see covered, or drop us a message on Twitter or Facebook.

Until tomorrow,
- Tsvetelin (Bits and Neurons)
