A Daily chronicle of AI Innovations May 10th 2024:
TikTok introduces "AI-generated" labels for third-party content
Anthropic CEO defends dual funding from Google and Amazon
Krea AI introduces video generation for paid subscribers
Synthflow launches AI voice agent teams for streamlined customer support
Microsoft and LinkedIn have published their ‘2024 Work Trend Index Annual Report’, revealing the rapid adoption of AI tools by employees, with 75% of knowledge workers using AI and nearly half starting within the last six months.
Why does this matter?
The study serves as a wake-up call for organizations to move beyond experimentation and develop comprehensive strategies for AI implementation. As AI permeates all sectors, generations, and skill sets, early adopters will likely emerge as the leaders, while those hesitant to explore could risk falling behind.
Stability AI has launched Stable Artisan, a multimodal-gen AI Discord bot that enables users to create images and videos using the Stable Diffusion 3 (SD3) and Stable Video Diffusion (SVD) models.
https://youtu.be/MWfb30kWqTM?si=_TePwQX1A8xEj3hU
Stable Artisan incorporates several editing and customization features, including Search and Replace, Remove Background, Creative Upscale, Outpaint, Control Sketch, and Control Structure. The service is available through a paid subscription, with monthly plans ranging from $9 to $99, and a 3-day free trial.
Stability AI is also working on a larger conversational chatbot called Stable Assistant, which will incorporate the company's text-to-image and LLM technologies to assist users with various tasks through natural language conversations. While Stable Artisan currently does not include access to Stable Audio, Stable Code, or Stable LM, these features may be added in the future as the service continues to evolve.
Why does this matter?
Stable Artisan could empower creators lacking experience with complex AI models to generate high-quality content directly within their familiar Discord environment.
However, the paid subscription model could limit access, and the missing features hint at a future with a complete AI creative suite.
https://stability.ai/news/stable-artisan
ElevenLabs, a company that specializes in AI-powered voice cloning and synthesis, has revealed a new model that creates song lyrics based on user prompts.
With this new model, ElevenLabs aims to impact the music industry by allowing users to generate custom lullabies, jingles, podcast intros, and potentially even popular songs. The company also plans to launch a marketplace where users can sell their AI-generated music.
While ElevenLabs has not yet shared details about the maximum length of songs the AI can generate, an example posted by the company's Head of Design suggests that it will likely produce lyrics for a standard three-minute song.
Why does this matter?
This AI tool has the potential to democratize songwriting, allowing even those without musical expertise to craft lyrics. This could be particularly impactful for budget-conscious creators or those with specific lyrical needs. However, it remains to be seen if it will integrate with composing melodies like Udio or Suno, which offer a more complete song creation experience.
Also, questions remain about the originality of the AI-generated lyrics and whether the tool was trained on copyrighted music.
TikTok introduces "AI-generated" labels for third-party content
TikTok will automatically label AI-generated content on its platform and on third-party platforms, becoming the first social media platform to support Content Credentials metadata for AI transparency. (Link)
Anthropic CEO defends dual funding from Google and Amazon
Anthropic's CEO says partnering with Google and Amazon ensures more independence than OpenAI's Microsoft reliance. However, regulators are examining the impact on AI competition as Anthropic's future training costs could reach $100 billion. (Link)
Krea AI introduces video generation for paid subscribers
Krea AI, a generative AI startup, has launched video generation capabilities for its highest-tier subscribers. The new feature allows users to create videos using a combination of key frame images and text prompts, with a timeline-based interface reminiscent of traditional video editing software. (Link)
Synthflow launches AI voice agent teams for streamlined customer support
Synthflow launches “Conversational AI Teams,” a feature that allows businesses to create multiple AI voice assistants to interact with customers and each other, all through a single phone number. These intelligent agents can handle tasks like scheduling, updating CRMs, and more, providing a seamless and efficient customer support experience. (Link)
A lesser-known feature of ChatGPT’s new Memory feature is that it can be programmed to store shortcuts, which can save you a lot of time in chat conversations when used effectively.
Llama-3’s ability to compete with top-tier models in certain areas is a testament to the rapid progress of open-source — and that’s with Meta’s largest model still pending. The more granular comparison also provides useful details often lost in more general model benchmarking.
A Daily chronicle of AI Innovations May 09th 2024:
Enjoying these updates? Support us by visiting our App and subscribing at Read Aloud For Me for Daily AI News, Tools, Games and Bedtime stories
OpenAI posts Model Spec revealing how it wants AI to behave
https://youtu.be/9ufplEgtq8w?si=XAREDQVZwYLBtcss
Microsoft has developed a top-secret generative AI model entirely disconnected from the internet so US intelligence agencies can safely harness the powerful technology to analyze top-secret info. The model based on GPT-4 is now live, answering questions, and will also write code.
Microsoft spent 18 months developing the model, which is "air-gapped" to ensure it is secure. This is the first time a model is fully isolated, meaning it's not connected to the internet but runs on a special network accessible only to the U.S. government.
It can read and analyze files but cannot learn from them, which prevents sensitive information from entering the platform. It has yet to be tested and accredited by the intelligence agencies.
Why does this matter?
Intelligence agencies all over the world have been racing to be the first to harness generative AI. I guess we know who’s going to be the winner. If this AI tool is successful, it will fundamentally change the way intelligence agencies operate.
Copilot for Microsoft 365 to get auto-complete and rewrite to improve prompts
In coming months, Microsoft Copilot will be updated with new features like auto-complete and ‘elaborate your prompt’ that offer suggestions to improve AI prompts. It aims to solve the problem of coming up with good prompts for generative AI. (Link)
New AI data center to be built at the failed Foxconn project site in Wisconsin
President Joe Biden announced an AI data center to be built on the same site as the failed Foxconn project in Racine, Wisconsin. According to a White House press release, Microsoft is investing $3.3B in the project, creating up to 2,000 permanent jobs. (Link)
Sam Altman says we are not taking AI’s impact on the economy seriously
At a Brookings Institution panel about AI and geopolitics on Tuesday, Altman said the discussions around AI's effect on the economy, like how it may lead to mass job replacement, died down this year compared to last. He said if we don’t take these concerns seriously enough going forward, it could be a massive issue. (Link)
Typeface Arc replaces prompts; uses AI agent approach to ease marketing workflows
Typeface is launching its Arc technology, which enables a user to state a high-level marketing objective and then have the system automatically plan and generate all the assets, including emails, images, and notifications that are all connected. (Link)
Altera’s gaming AI agents get backed by Eric Schmidt, Former Google CEO
Altera is the newest startup joining the fray to build a new guard of AI agents. It raised $9 million in an oversubscribed seed round, co-led by Eric Schmidt’s deep-tech fund, First Spark Ventures and Patron, the seed-stage fund co-founded by Riot Games alums. (Link)
Midjourney’s website is now accessible to anyone with more than 100 generated images, improving the experience when prompting images over its standard Discord group.
Microsoft and LinkedIn just published their Work Trend Index Annual Report, revealing that AI adoption is surging in the workplace — calling 2024 the ‘year AI at work gets real’.
Why it matters: Employees are adopting AI at a rapid pace, regardless of whether their own organizations are ready for the shift. As AI spreads across all sectors, generations, and skill sets, the early adopters are rising to the top, while those that aren't at least exploring the tech are quickly running out of time.
A Daily chronicle of AI Innovations May 08th 2024:
OpenAI has developed a new tool to detect if an image was created by DALL-E 3, its AI image generator. The tool can detect DALL-E 3 images with around 98% accuracy, even if the image has been cropped, compressed, or had its saturation changed. However, the tool is not as effective at detecting images generated by other AI models, only flagging 5-10% of images.
This image detection classifier is only available to a group of testers, including research labs and research-oriented journalism nonprofits through OpenAI’s Research Access Program.
OpenAI has also added watermarking to Voice Engine, its text-to-speech platform, which is currently in limited research preview.
Why does it matter?
Early experiences have shown that AI detectors don’t work. In fact, if they have high error rates, they could lead to false accusations. In 2023, OpenAI had to shut down its own AI detection software for text because of its poor accuracy.
So, if this detector is as good as OpenAI claims, we may be on the precipice of a revolutionary new capability to reliably detect AI-generated content, with huge implications across domains.
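One way to see why error rates matter so much: even a detector with 98% accuracy can produce many false positives when AI-generated images are a small fraction of what gets checked. A quick Bayes-rule sketch, using illustrative numbers rather than OpenAI's published figures:

```python
# Illustrative base-rate check: how often a "flagged" image is truly AI-generated.
# All numbers are assumptions for illustration, not OpenAI's published figures.
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(image is AI-generated | detector flags it), via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# ~98% detection rate on DALL-E 3 images; assume a 99% true-negative rate
# and that 1 in 100 images checked is actually AI-generated.
ppv = positive_predictive_value(0.98, 0.99, 0.01)
print(f"Chance a flagged image is really AI-generated: {ppv:.0%}")
```

With these assumptions, roughly half of all flags would be false positives, which is why base rates matter as much as raw accuracy.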
Meta has expanded its generative AI tools for advertisers. Advertisers can request AI to generate entirely new images, including product variations in different colors, angles, and scenarios. The AI tools can add text overlays with different fonts, expand images to fit different aspect ratios like Reels and Feed, and generate ad headlines that match the brand's voice.
The AI features will roll out globally to advertisers by the end of 2024.
Meta is also expanding its paid Meta Verified service for businesses to more countries. Different pricing tiers offer features like account support, profile enhancements, and better customer service access.
Why does it matter?
Integrating such powerful AI features could lead to more effective advertising campaigns and improved customer engagement with targeted marketing and personalized ads. However, it could also raise questions about transparency and potential misuse.
OpenAI is developing Media Manager, a tool that will enable creators and content owners to decide what they own and specify how they want their works to be included or excluded from machine learning research and training. This first-ever tool of its kind will help OpenAI identify copyrighted text, images, audio, and video across multiple sources and reflect creator preferences.
OpenAI aims to have the tool in place by 2025 and set a standard across the AI industry with it.
Why does it matter?
Media Manager seems to be OpenAI’s response to growing criticism of its approach to developing AI models, which heavily scrapes publicly available data from the web for training. Recently, 8 prominent U.S. newspapers sued OpenAI for copyright infringement.
On the other hand, OpenAI has formed mutually beneficial partnerships with platforms like Stack Overflow, Shutterstock, The Financial Times, and more to use their content.
So, OpenAI may be trying to meet creators in the middle, but if it is positioning itself as a fully ethical actor with this, we’ll take it with a grain of salt.
Apple just revealed its new line of iPads at a company event in Cupertino, CA — featuring a custom M4 chip that enables advanced AI capabilities and a slew of new AI-powered features.
https://www.youtube.com/live/f1J38FlDKxo?si=eiFyCYXTYgqvcV7i
🍎 Apple releases M4 chip at the ‘Let Loose' event with powerful AI capabilities
Apple released its much-anticipated M4 chip at the "Let Loose" event. The M4 is slated to spearhead Apple's next generation of devices, with the forthcoming OLED iPad Pro leading the charge. (Link)
📰 OpenAI strikes licensing deal with People magazine publisher
OpenAI has inked a licensing deal with Dotdash Meredith to bring the People magazine publisher’s content to ChatGPT and help train its AI models. Under the partnership, OpenAI will be able to display lifestyle and entertainment content in its chatbot from the many websites of one of the US's largest digital and print publishers. (Link)
🤖 Amazon announces Bedrock Studio to simplify Gen AI app development
Amazon is launching a new tool, Bedrock Studio, designed to let organizations experiment with generative AI models, collaborate on those models, and ultimately build generative AI-powered apps. Bedrock Studio is a “rapid prototyping environment” for generative AI. It also guides developers in evaluating, analyzing, fine-tuning, and sharing generative AI models. (Link)
👨‍💻 Oracle introduces Code Assistant to accelerate enterprise software development
Oracle has announced Code Assistant, an AI-powered service to help developers rapidly program apps based on Java, SQL, and the Oracle Cloud infrastructure. It will join tools like GitHub Copilot and Amazon CodeWhisperer to accelerate the app development lifecycle. However, Oracle hasn’t yet specified when this feature will be released. (Link)
🚀 Red Hat launches RHEL AI and InstructLab to democratize enterprise AI
At Red Hat Summit 2024, Red Hat announced two major initiatives to bring the power of generative AI to the enterprise: Red Hat Enterprise Linux AI (RHEL AI), a foundation model platform for developing and running open-source language models, and InstructLab, a community project to enable domain experts to enhance AI models with their knowledge. (Link)
Google Gemini’s new “Extensions” feature allows users to access external tools such as YouTube to chat with videos and get answers for free.
Step-by-step:
Pro tip: Try asking Gemini to explain advanced concepts discussed in a video, generating concrete examples, creating practice questions, and even asking for code snippets
Here’s what you need to know:
🌐 By accurately modeling the shapes of proteins, DNA, RNA, and more, this next generation model could help scientists unlock new discoveries in biology.
🔬 We have also launched AlphaFold Server, a free platform that scientists around the world can use for non-commercial research. They can harness AlphaFold 3’s predictions and test hypotheses with just a few clicks - no matter their technical expertise.
💊 Isomorphic Labs is applying this next generation model to design new drugs and tackle real-world therapeutic challenges.
🧪 Millions of researchers around the world have used AlphaFold predictions in areas like developing an experimental malaria vaccine, designing plastic-eating enzymes and more. Find out more ↓ https://dpmd.ai/3URDiNo
Paper ➡️ https://go.fb.me/o0yd06
We show that replacing next-token prediction with multi-token prediction can result in substantially better code generation performance with the exact same training budget and data, while also increasing inference speed by 3x. While similar approaches have previously been used in fine-tuning, this new paper extends them to pre-training for large models, showing notable behaviors and results at these scales.
A Daily chronicle of AI Innovations May 07th 2024:
Apple is developing its own AI chip for data center servers, known internally as Project ACDC (Apple Chips in Data Center). The chip will likely focus on running AI models (inference) rather than training them, which is where Nvidia currently dominates.
The company is working closely with TSMC (Taiwan Semiconductor Manufacturing Co) to design and produce these chips, although the timeline for launch is uncertain. With this move, the company aims to keep up with rivals like Microsoft and Meta, who have made significant investments in generative AI.
Why does it matter?
Apple has a long history of designing custom chips for its devices like iPhones, iPads, and Macs, which is probably what makes them stand out. Having custom AI chips could allow the tech giant more control over its "AI destiny" versus relying on suppliers like Nvidia.
OpenAI will use OverflowAPI to improve model performance and provide attribution to the Stack Overflow community within ChatGPT. Stack Overflow will use OpenAI models to develop OverflowAI and to maximize model performance.
The partnership aims to improve the user and developer experience on both platforms. The first set of integrations and capabilities will be available in the first half of 2024, and the partnership will enable Stack Overflow to reinvest in community-driven features.
Why does this matter?
Stack Overflow partnered with Google Cloud to develop Overflow API and to give Google’s Gemini models access to its knowledge communities. Now it is forming a similar partnership with OpenAI. Despite concerns about copyright breaches, such partnerships seem to be trending, with both parties having much to gain, but it just reaffirms that the big AI players remain hungry for data.
Microsoft is developing a new, large-scale AI language model called MAI-1 to compete with Google and OpenAI. The model is overseen by Mustafa Suleyman, recently hired co-founder of Google DeepMind.
MAI-1 will be larger and more expensive than Microsoft's previous smaller, open-source models, with roughly 500 billion parameters. Microsoft could preview the new model as soon as its Build developer conference later this month.
Why does this matter?
Microsoft's development of MAI-1 shows that it is not relying entirely on its OpenAI investment to go big in AI. Now it has truly entered the AI race, competing with state-of-the-art models from Google, Anthropic, even Meta's Llama 400B, which is still in training, and OpenAI itself.
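For a sense of scale, here is a back-of-envelope sketch of what 500 billion parameters means for raw weight storage alone (assuming 16-bit weights, a common but unconfirmed choice):

```python
# Back-of-envelope memory estimate for a ~500B-parameter model.
# Assumes 2 bytes per parameter (fp16/bf16); the actual precision is unconfirmed.
params = 500e9
bytes_per_param = 2
total_bytes = params * bytes_per_param
print(f"Weights alone: ~{total_bytes / 1e12:.0f} TB")
```

Serving a model of this size spans many accelerators, which is part of why it will be "larger and more expensive" than Microsoft's earlier small open-source models.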
Hugging Face has launched LeRobot, an open-source robotics toolkit
It is a comprehensive platform for developers, researchers, and hobbyists to train AI models, share data, and simulate environments, all while seamlessly integrating with various robotic hardware. The toolkit offers pre-trained models and integrates with physics simulators for testing without physical robots. Hugging Face is also collaborating with diverse partners to build the largest crowdsourced robotics dataset. (Link)
Apple is testing a new "Clean Up" feature in its Photos app
By using gen AI for advanced image editing, this feature will allow you to effortlessly remove unwanted objects from your photos using a simple brush. Apple may preview this new feature during its upcoming "Let Loose" iPad event or at WWDC in June. (Link)
Google has launched Google Threat Intelligence
It is a combination of Mandiant's expertise, VirusTotal's community insights, and Google's vast threat visibility. Google Threat Intelligence assists with external threat monitoring, attack surface management, digital risk protection, IoC analysis, and expertise. With Gemini, organizations can now quickly search through vast amounts of threat data to protect against cyber threats. (Link)
US invests $285M in AI 'Digital Twin' technology
The Biden administration is investing $285 million for a new “CHIPS Manufacturing USA institute” focused on digital twins for the semiconductor industry. This approach uses AI to create virtual chip replicas, accelerating the production of next-gen processors. Intel and Micron are also set to receive funding to boost the development of new processors. (Link)
Anduril Industries introduces Pulsar: AI modular electromagnetic warfare (EW) systems
Pulsar uses AI to quickly identify and counter current and future threats across the electromagnetic spectrum, including small and medium-size drones. With its integration of software-defined radio, GPUs, and diverse compute capabilities, Pulsar is changing how we defend against rapidly evolving threats in an increasingly complex battlefield. (Link)
Adobe's AI-powered ‘Enhance Speech’ tool dramatically improves the quality of audio voice recordings with just a few clicks.
Step-by-step:
Pro tip: If you have a video file, you can extract the audio using free websites that extract audio from video and add the enhanced audio back to your video using free video editors like CapCut
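If you prefer a local tool over a website, ffmpeg is a common way to extract audio. The sketch below only builds the command (filenames are placeholders, and you'd need ffmpeg installed to actually run it):

```python
import shlex

# Build an ffmpeg command to pull the audio track out of a video file.
# "-vn" drops the video stream; "-acodec copy" keeps the original audio codec
# so no re-encoding happens. Filenames are placeholders.
video_in, audio_out = "input.mp4", "audio.m4a"
cmd = ["ffmpeg", "-i", video_in, "-vn", "-acodec", "copy", audio_out]
print(shlex.join(cmd))
```

Run the printed command in a terminal, enhance the extracted audio, then add it back to your video in a free editor like CapCut.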
A series of studies from several German universities found that both novice and experienced teachers struggled to accurately distinguish between student-written and AI-generated texts.
The details:
Why it matters: AI’s writing capabilities are only getting better — and relying on teacher intuition or unreliable tools may be no more effective than guessing. Unless better tools become available, it may be time to pivot to enabling students to work with AI instead of penalizing them for it.
A Daily chronicle of AI Innovations May 06th 2024:
In robotics, one of the biggest challenges is transferring skills learned in simulation to real-world environments. NVIDIA researchers have developed a groundbreaking algorithm called DrEureka that uses LLMs to automate the design of reward functions and domain randomization parameters—key components in the sim-to-real transfer process.
https://youtu.be/vCYsKCbPTTU?si=QnYyWkA4BojE4YFL
The algorithm works in three stages: first, it creates reward functions with built-in safety instructions; then, it runs simulations to determine the best range of physics parameters; finally, it generates domain randomization configurations based on the data gathered in the previous stages.
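The three-stage loop above can be sketched structurally; every function body here is a stub standing in for the LLM and simulator calls, not NVIDIA's actual implementation:

```python
# Structural sketch of a DrEureka-style pipeline. All stage internals are
# stubs; in the real system these are LLM calls and physics simulations.
def generate_reward_function(task: str) -> str:
    # Stage 1: an LLM drafts a reward function with safety terms baked in.
    return f"reward({task}) = task_progress - safety_penalty"

def probe_physics_ranges(reward_fn: str) -> dict:
    # Stage 2: run simulations to find the range of physics parameters the
    # trained policy tolerates (values here are invented placeholders).
    return {"friction": (0.3, 1.2), "mass_scale": (0.8, 1.1)}

def build_domain_randomization(ranges: dict) -> dict:
    # Stage 3: turn the probed ranges into a randomization config used
    # during policy training for sim-to-real transfer.
    return {param: {"low": lo, "high": hi} for param, (lo, hi) in ranges.items()}

reward = generate_reward_function("quadruped_walk")
ranges = probe_physics_ranges(reward)
dr_config = build_domain_randomization(ranges)
print(dr_config)
```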
When tested on various robots, including quadrupeds and dexterous manipulators, DrEureka-trained policies outperformed those designed by human experts.
Why does it matter?
DrEureka makes robot training accessible and cost-effective for businesses and researchers alike. We may witness increased adoption of robotics in industries that have previously been hesitant to invest in the technology due to the complexity and cost of training robots for real-world applications.
Prometheus 2, a free and open-source language model developed by KAIST AI, has shown impressive capabilities in evaluating other language models, approaching the performance of commercial models like GPT-4.
The model was trained on a new pairwise comparison dataset called the "Preference Collection," which includes over 1,000 evaluation criteria beyond basic characteristics. By combining two separate models - one for direct ratings and another for pairwise comparisons - the researchers achieved the best results.
In tests across eight datasets, Prometheus 2 showed the highest agreement with human judgments and commercial language models among all freely available rating models, significantly closing the gap with proprietary models.
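One way to combine two such evaluators is linear weight merging; here is a toy sketch (small arrays stand in for real model weights, and the 0.5 merge ratio is an illustrative choice, not the paper's tuned value):

```python
import numpy as np

# Toy sketch of linear weight merging: blend a direct-rating model with a
# pairwise-comparison model into a single evaluator. Small arrays stand in
# for real model weights; alpha=0.5 is an illustrative merge ratio.
def merge_weights(state_a: dict, state_b: dict, alpha: float = 0.5) -> dict:
    return {name: alpha * state_a[name] + (1 - alpha) * state_b[name]
            for name in state_a}

rating_model = {"layer1": np.ones((2, 2)), "layer2": np.zeros(2)}
pairwise_model = {"layer1": np.full((2, 2), 3.0), "layer2": np.ones(2)}
merged = merge_weights(rating_model, pairwise_model)
print(merged["layer1"])  # each entry is 0.5*1 + 0.5*3 = 2.0
```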
Why does this matter?
By enabling user-defined evaluation criteria, Prometheus 2 can be tailored to assess language models based on specific preferences and real-life scenarios, opening up new possibilities for developing specialized AI applications across various domains. It’s also an opportunity to create niche models that are culturally sensitive and relevant.
X (formerly Twitter) has launched a new feature, Stories, that provides AI-generated summaries of trending news on the platform. Powered by Elon Musk's chatbot Grok, Stories offers Premium subscribers brief overviews of the most popular posts and conversations happening on X.
With Stories, users can quickly catch up on the day's trending topics without having to scroll through countless posts. Grok generates these summaries based solely on the conversations happening on X about each news story rather than analyzing the original news articles themselves. While this approach is controversial, X believes it will pique users' curiosity and potentially drive them deeper into the source material.
Why does this matter?
X's Grok-powered Stories feature may reshape the way we consume news. As more platforms integrate AI news summarization tools, traditional media outlets may face challenges in maintaining reader engagement and revenue. However, the reliance on platform-specific conversations for generating summaries raises concerns about the potential spread of misinformation and the creation of echo chambers.
Privacy complaint filed against OpenAI
The maker of ChatGPT is facing a privacy complaint in the European Union (EU) for its "hallucination problem." The complaint alleges violations of GDPR, including misinformation generation and lack of transparency on data sources. The report highlights concerns about accuracy, data access, and the inability of ChatGPT to correct incorrect information. (Link)
JPMorgan launches an AI-powered tool for thematic investing
IndexGPT is a new range of thematic investment baskets created using OpenAI's GPT-4 model. The tool generates keywords associated with a theme, which are then used to identify relevant companies through natural language processing of news articles. IndexGPT aims to improve the selection of stocks for thematic indexes, going beyond obvious choices and potentially enhancing trend-following strategies. (Link)
YouTube Premium introduces AI-powered "Jump ahead" feature
The AI-powered feature allows users to skip past commonly skipped sections of a video and jump to the next best point. It is currently available for the YouTube Android app in the US with English videos and can be enabled through the experiments page. (Link)
AI is now set to transform the drug discovery industry
Generative AI is now rapidly generating novel molecules and proteins that humans may not have considered. AI models, such as Google's AlphaFold, are accelerating the drug development process from years to months while increasing success rates. Experts predict that AI-designed drugs will become the norm in the near future, but they will still need to prove their efficacy in human trials. (Link)
AI helps bring back Randy Travis' voice in new song
Country singer Randy Travis has released a new song, "Where That Came From," his first since losing his voice to a stroke in 2013.
The vocals were created using AI software and a surrogate singer under the supervision of Travis and his producer. The result is a gentle tune that captures Travis' relaxed style, reinforcing the potential of AI voice cloning in the right hands. (Link)
OpenAI has rolled out a new feature called “Memory” for ChatGPT Plus users, enabling it to remember specific user details across chats.
Step-by-step:
That’s it! You can now have more personalized conversations across all your chats.
A Daily chronicle of AI Innovations May 04th 2024:
University of Pennsylvania engineers have developed a new chip that radically accelerates processing by using light waves rather than electricity to perform the complex math essential to training AI.
The silicon-photonic (SiPh) chip has the potential to accelerate the processing speed of computers while also reducing their energy consumption, according to the researchers.
iOS 18 may have OpenAI-powered gen AI Capabilities
China's Vidu generates 16-second 1080P videos, matching OpenAI's Sora
New S1 robot mimics human-like movements, speed, and precision
Gradient AI releases Llama-3 8B with 1M context
Mysterious “gpt2-chatbot” AI model bemuses experts
GitHub’s Copilot Workspace turns ideas into AI-powered software
Amazon launches Amazon Q, the world’s most capable Gen AI assistant
Google’s Med-Gemini models outperform doctors
Apple has set up a secretive AI lab in Switzerland
Better and faster LLMs via multi-token prediction: New research
Anthropic launches an iOS app and a new plan for teams
Google's AI advancements urged Microsoft's billion-$ OpenAI investment
Scale AI’s study finds popular LLMs overfit public benchmarks
Ukraine debuts the world's first AI diplomat, Victoria Shi
Sam Altman is ready to spend $50 billion a year to build AGI
A Daily chronicle of AI Innovations May 03rd 2024:
A new study by Scale AI raises concerns about the reliability of LLM benchmark tests. It uncovers LLM overfitting by evaluating models on a new dataset designed from scratch, GSM1k, which mimics a popular benchmark, GSM8k.
Key findings:
Overall, the study highlights the need for more robust and reliable methods for evaluating LLM reasoning abilities.
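The core measurement behind such a study is simple: compare each model's accuracy on the public benchmark against the freshly written look-alike set; a large gap suggests memorization rather than reasoning. A sketch with invented scores:

```python
# Compute the "overfit gap" between a public benchmark (GSM8k) and a
# held-out look-alike set (GSM1k). Model names and scores are invented
# for illustration, not taken from the Scale AI study.
scores = {
    "model_a": {"gsm8k": 0.92, "gsm1k": 0.78},
    "model_b": {"gsm8k": 0.88, "gsm1k": 0.86},
}
for name, s in scores.items():
    gap = s["gsm8k"] - s["gsm1k"]
    flag = "possible overfitting" if gap > 0.05 else "generalizes"
    print(f"{name}: gap={gap:+.2f} ({flag})")
```

A small gap on the fresh set is what you'd expect from genuine reasoning ability; a double-digit drop hints that benchmark examples leaked into training.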
Why does it matter?
The dataset proves that overfitting may be creating major false impressions of model performance. As AI capabilities continue to advance, it is crucial to develop evaluation approaches that can keep pace and provide a more accurate picture of a model's real-world potential.
During a recent appearance at Stanford University, Altman talked about the future of AI, calling GPT-4, a currently impressive AI model, the “dumbest model” compared to future iterations. According to Altman, the future is dominated by "intelligent agents": AI companions that can not only follow instructions but also solve problems, brainstorm solutions, and even ask clarifying questions.
OpenAI isn't just talking about the future; it's actively building it. Its next-generation model, GPT-5, is rumored for a mid-2024 release and might boast video generation capabilities alongside text and image.
But the real moonshot is their active participation in developing AGI.
Despite the significant costs involved, Altman remains undeterred. He believes that the potential benefits, such as solving complex problems across various industries, outweigh the financial burden.
Watch the whole Q&A session here.
Why does this matter?
Altman’s bold comments on GPT-4 being the dumbest model suggest that OpenAI is aiming for something even grander, and GPT-5 could be a stepping stone toward it (the next-gen AI framework).
OpenAI prepares to challenge Google with ChatGPT-powered search: OpenAI is building a search engine, search.chatgpt.com, potentially powered by Microsoft Bing. This leverages their existing web crawler and Bing's custom GPT-4 for search, posing a serious threat to Google's dominance. (Link)
Microsoft bans U.S. police use of Azure OpenAI for facial recognition
Microsoft has banned U.S. police from using Azure OpenAI Service for facial recognition, including integrations with OpenAI's image-analyzing models. The move follows Axon's controversial GPT-4-powered tool for summarizing body camera audio. However, the ban has exceptions and doesn't cover Microsoft's other AI law enforcement contracts. (Link)
IBM expands AI and data software on AWS marketplace
IBM has significantly expanded its software offerings on the AWS Marketplace, making 44 products accessible to customers in 92 countries, up from just five. The move, part of a strategic collaboration with AWS, focuses on AI and data technologies like watsonx.data, watsonx.ai, and the upcoming watsonx.governance. (Link)
Google Cloud supports Azure and AWS; integrates AI for security
Google Cloud now supports Azure and AWS, enabling enterprises to manage security across multi-cloud platforms. AI integration with existing solutions streamlines user experience and addresses the security talent gap. The AI-powered design manages risks efficiently amid increasing cyber threats, while extensive support simplifies tasks for enterprises. (Link)
Microsoft invests $2.2B in Malaysia's cloud and AI transformation
Microsoft is investing $2.2 billion over the next four years to support Malaysia's digital transformation, its largest investment in its 32-year history in the country. The investment includes building cloud and AI infrastructure, creating AI skilling opportunities for 200,000 people, establishing a national AI Centre of Excellence, enhancing cybersecurity capabilities, and supporting the growth of Malaysia's developer community. (Link)
A Daily chronicle of AI Innovations May 02nd 2024:
New research, apparently from Meta, proposes a novel approach to training language models (LMs): training them to predict multiple future tokens at once, rather than only the next token in a sequence, yields higher sample efficiency. The architecture is simple, with no training-time or memory overhead.
Figure: Overview of multi-token prediction
The research also provides experimental evidence that this training paradigm is increasingly useful for larger models and in particular, shows strong improvements for code tasks. Multi-token prediction also enables self-speculative decoding, making models up to 3 times faster at inference time across a wide range of batch sizes.
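To make the training objective concrete, here is a minimal plain-Python sketch (illustrative only, not Meta's implementation) of how the prediction targets differ: under multi-token prediction, each position is trained against the next n future tokens, typically via n independent output heads on a shared trunk, rather than just the single next token.

```python
def next_token_targets(tokens):
    """Standard objective: position t predicts token t+1."""
    return [tokens[t + 1] for t in range(len(tokens) - 1)]

def multi_token_targets(tokens, n_future):
    """Multi-token objective: position t predicts tokens t+1 .. t+n_future,
    one target per output head; the trunk computation is shared."""
    return [tokens[t + 1 : t + 1 + n_future]
            for t in range(len(tokens) - n_future)]

# toy token ids
seq = [10, 11, 12, 13, 14]
print(next_token_targets(seq))       # [11, 12, 13, 14]
print(multi_token_targets(seq, 2))   # [[11, 12], [12, 13], [13, 14]]
```

In training, the loss is simply the sum of the cross-entropies over the n_future heads, which is why the paradigm adds no training-time or memory overhead beyond the extra output heads.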
Why does it matter?
LLMs such as GPT and Llama rely on next-token prediction. Despite their recent impressive achievements, next-token prediction remains an inefficient way of acquiring language, world knowledge, and reasoning capabilities: it latches onto local patterns and overlooks “hard” decisions.
Perhaps multi-token prediction could bring a shift in how LMs learn, equipping them with deeper understanding and complex problem-solving capabilities. (Or Meta just wasted their compute.)
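The self-speculative decoding speed-up the research reports can be sketched greedily: the extra heads draft several tokens from one forward pass, and the main next-token head keeps the longest agreeing prefix. The functions below are hypothetical stand-ins for model calls; a real implementation verifies the whole draft in a single batched forward pass rather than one call per token.

```python
def self_speculative_decode(prompt, draft_fn, verify_fn, new_tokens):
    """Greedy self-speculative decoding sketch.
    draft_fn(seq)  -> list of drafted next tokens (from the extra heads)
    verify_fn(seq) -> the main head's single next token
    Draft tokens the main head agrees with are accepted without extra
    sequential steps, which is where the ~3x inference speed-up comes from."""
    seq = list(prompt)
    target_len = len(prompt) + new_tokens
    while len(seq) < target_len:
        for tok in draft_fn(seq):
            if len(seq) >= target_len:
                break
            if verify_fn(seq) == tok:      # main head agrees: accept draft token
                seq.append(tok)
            else:                          # disagreement: take the main head's token
                seq.append(verify_fn(seq))
                break
    return seq

# toy integer "tokens": the draft heads happen to always agree with the verifier
draft_fn = lambda seq: [seq[-1] + 1, seq[-1] + 2]
verify_fn = lambda seq: seq[-1] + 1
print(self_speculative_decode([0], draft_fn, verify_fn, 4))  # [0, 1, 2, 3, 4]
```

Because the output is always whatever the main head would have produced, the speed-up comes for free in quality terms; only the number of sequential passes changes.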
Anthropic, the creator of the Claude 3 AI models, released a new iOS app named Claude. The app enables users to access AI models, chat with them, and analyze images by uploading them.
Anthropic also introduced a paid team plan, offering enhanced features like more chat queries and admin control for groups of five or more. The app is free for all users of Claude AI models, including free users, Claude Pro subscribers, and team plan members. The company will also roll out an Android version soon.
Why does it matter?
Though a little late with its mobile app, Anthropic has caught up with competitors like OpenAI and Google, whose apps have been available for quite a while. The company decided to offer an app version because many users were already accessing its AI models through the web.
Internal emails have revealed that Microsoft invested $1 billion in OpenAI in 2019 out of fear that Google was significantly ahead in its AI efforts.
Microsoft CTO Kevin Scott sent a lengthy email to CEO Satya Nadella and Bill Gates stating that Google’s AI-powered “auto complete in Gmail” was getting “scarily good” and that Microsoft was years behind in terms of ML scale.
The emails, with the subject line “Thoughts on OpenAI,” were made public on Tuesday as part of the Department of Justice's antitrust case against Google. A large section of Scott's email was redacted. Check out the email here.
Why does it matter?
While some might call it paranoia, the well-timed move has undeniably paid off: the initial $1 billion has now turned into a multi-billion-dollar partnership with OpenAI.
While the surfacing of these emails highlights the growing scrutiny of competition in the tech industry, it also makes me wonder whether Microsoft's investment in OpenAI influenced the overall direction of AI research and development.
Sanctuary AI teams up with Microsoft to advance general-purpose robot AI
Sanctuary AI has announced a collaboration with Microsoft to develop AI models for general-purpose humanoid robots. The partnership will leverage Microsoft's Azure cloud computing platform and AI technologies to enhance the capabilities of Sanctuary AI's robots. (Link)
Nvidia's ChatRTX now supports voice queries and Google's Gemma model
Nvidia has updated its ChatRTX chatbot to support Google's Gemma model, voice queries, and additional AI models. The chatbot, which runs locally on a PC, enables users to search personal documents and YouTube videos using various AI models, including ChatGLM3 and OpenAI's CLIP model. (Link)
Atlassian launches Rovo: An AI assistant for enhanced teamwork
Atlassian has launched Rovo, an AI assistant designed to improve teamwork and productivity. Rovo integrates with Atlassian's products and offers features such as AI-powered search, workflow automation, and integration with third-party tools like Google Drive, Microsoft SharePoint, and Slack. (Link)
MongoDB launches an AI app-building toolkit to help businesses use gen AI
MongoDB has launched the MongoDB AI Applications Program (MAAP) to help companies accelerate the building and deployment of AI-powered applications. It brings together consultancies, foundation model providers, cloud infrastructure, generative AI frameworks, and model hosting with MongoDB Atlas to develop solutions for business problems. (Link)
Ideogram introduces Pro Tier: 12,000 fast AI image generations monthly
Ideogram has launched a paid Pro tier for its AI image generation platform, allowing users to generate up to 12,000 images per month at faster speeds. The platform utilizes AI algorithms to create high-quality images for various applications, including design, marketing, and content creation. (Link)
The details:
Why it matters: Gemini just got a whole lot more accessible. The shortcut and integrations not only boost the chatbot’s reach but also introduce a wave of non-AI users to the tech. Subtle but impactful changes like these are what drive serious shifts in user habits.
Midjourney’s new parameter, --sref random, lets users generate images in completely random styles to help spark creativity.
Step-by-step:
Example prompt: “Portrait of a woman smiling --sref https://www....”
AI RESEARCH
AI model predicts drug effectiveness
Trending AI Tools
New AI Job Opportunities
A Daily chronicle of AI Innovations May 01st 2024:
The details:
https://youtu.be/fQITLL5WncE?si=JQPozECjFSTWsykb
Amazon has launched Amazon Q, a generative AI assistant designed for developers and businesses. It comes in three distinct offerings:
Amazon is driving real-world impact by offering a free tier for Q Developer and reporting early customer productivity gains of over 80%. Amazon Q Developer Pro is available for $19/user/month and Amazon Q Business Pro for $20/user/month. A free trial of both Pro tiers is available until June 30, 2024.
Why does it matter?
By introducing a free tier for Q Developer and the user-friendly nature of Q Apps, Amazon could accelerate innovation across the software development lifecycle and business workflows. This could empower domain experts and business leaders to use AI to solve their specific challenges directly, leading to more tailored AI applications across various industries.
Researchers from Google and DeepMind have introduced Med-Gemini, a family of highly capable multimodal AI models specialized in medicine. Based on the strengths of the Gemini models, Med-Gemini shows significant improvements in clinical reasoning, multimodal understanding, and long-context understanding. Models can be customized to fit novel medical modalities through specialized encoders, and web searches can be used for up-to-date information.
Med-Gemini has shown state-of-the-art performance on 10 of 14 medical benchmarks, including text, multimodal, and long-context applications. Moreover, the models achieved 91.1% accuracy on the MedQA (USMLE) benchmark, exceeding the previous best models by 4.6%. Its strong performance in summarizing medical notes, generating clinical referral letters, and answering electronic health record questions confirms Med-Gemini's potential real-world use.
Why does it matter?
These models can reduce the administrative burden on healthcare professionals by outperforming human experts in tasks like medical text summarization and referral letter generation. Moreover, Med-Gemini's ability to engage in multimodal medical dialogues and explain its reasoning can lead to more personalized and transparent care, reduce misdiagnosis due to lack of physician knowledge, and save lives and money.
Since 2018, Apple has quietly hired 36 AI experts from Google, including notable figures like Samy Bengio and Ruoming Pang, for its secretive "Vision Lab." The lab focuses on building advanced AI models and products and is particularly interested in text- and visual-based AI systems akin to OpenAI's ChatGPT. Apple has also acquired AI startups FaceShift and Fashwell, which likely contributed to the establishment of the new lab.
Why does it matter?
Apple may have been fashionably late to AI development, but quietly setting up the Zurich lab and primary AI development centers in California and Seattle signifies the company's AI ambitions.
Google to pay News Corp $5-6 million per year to develop AI content and products
While News Corp denies any specific AI licensing deal, the arrangement highlights a growing trend of tech giants licensing news archives for language model training. Similar deals were inked between OpenAI and the Financial Times, showing the importance of quality data. (Link)
Yelp is launching an AI chatbot to help consumers connect with relevant businesses
The chatbot uses OpenAI's LLMs and Yelp's data to understand user problems and provide relevant professional suggestions. Yelp also introduces a "Project Ideas" section for personalized recommendations and checklists. Meanwhile, restaurants are getting a revamped guest management system for better staff utilization, real-time table status, and customer updates. (Link)
Apple is testing Safari 18 with new features: Intelligent Search and Web Eraser
Intelligent Search uses Apple's on-device AI to identify topics and key phrases for summarization. Web Eraser allows users to persistently remove unwanted content from web pages. Apple is also working on an AI Visual Lookup feature for 2025, allowing users to obtain product information from images. These AI enhancements will debut with iOS 18 and macOS 15 at WWDC in June. (Link)
Eight US newspapers have sued Microsoft and OpenAI for copyright infringement
These newspapers, owned by Alden Global Capital's MediaNews Group, allege that the companies misused their articles to train Copilot and ChatGPT without permission or payment. The New York Times, The Intercept, Raw Story, and AlterNet have filed similar lawsuits. The newspapers claim that the AI systems reproduce their content verbatim and generate fake articles that damage their reputation. (Link)
A study of 16,000 patients reveals that AI ECG alert systems significantly lower all-cause mortality
The AI was trained on over 450,000 ECG tests and survival data to predict a patient's risk of death. Physicians were alerted when a patient's ECG indicated they were in the top 5% risk category. The AI reduced overall deaths by 17% and cardiac deaths by 93% for high-risk patients. (Link)
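The alerting rule described here, flagging patients whose predicted risk falls in the top 5%, amounts to a simple percentile cutoff over the model's risk scores. A minimal plain-Python sketch follows; the scores and threshold handling are illustrative assumptions, not the study's actual pipeline.

```python
def flag_top_risk(scores, top_frac=0.05):
    """Return a True/False alert flag per patient, flagging roughly the
    top `top_frac` fraction of predicted mortality-risk scores."""
    n_flag = max(1, int(len(scores) * top_frac))
    cutoff = sorted(scores, reverse=True)[n_flag - 1]
    return [s >= cutoff for s in scores]

# 100 illustrative risk scores; the five highest (0.95 .. 0.99) get flagged
scores = [i / 100 for i in range(100)]
flags = flag_top_risk(scores)
print(sum(flags))  # 5 patients alerted
```

Note that ties at the cutoff can flag slightly more than the target fraction; a production alerting system would also need to handle that case and calibrate the threshold clinically.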
New AI Job Opportunities
Enjoying these updates?
Experience the transformative capabilities of AI with "Read Aloud For Me - AI Dashboard - AI Tools Catalog - AI Tools Recommender" – your ultimate AI Dashboard and Hub. Seamlessly access a comprehensive suite of top-tier AI tools within a single app, meticulously crafted to enhance your efficiency and streamline your digital interactions. Now available on the web at readaloudforme.com and across popular app platforms including Apple, Google, and Microsoft, "Read Aloud For Me - AI Dashboard" places the future of AI at your fingertips, blending convenience with cutting-edge innovation. Whether for professional endeavors, educational pursuits, or personal enrichment, our app serves as your portal to the forefront of AI technologies. Embrace the future today by downloading our app and revolutionize your engagement with AI tools.