2023 AI & TECH UPDATES

A collection of updates, mostly from the AI field but also other tech stuff that I personally liked. There is just so much going on now that it is impossible to keep up with everything as a human lol. And this isn't even the singularity.

JANUARY

1. Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models

2. Muse: Text-To-Image Generation via Masked Generative Transformers

3. Massive Language Models Can Be Accurately Pruned in One-Shot

4. Teaching AI advanced mathematical reasoning

5. Apple just released AI narration of audio books that sound shockingly human

6. Microsoft VALL-E Text to Audio Generation

7. Some Cool Stuff Shown At CES 2023

8. GLM-130B Open Source Large Language Model

9. More than 450 start-ups are now working on generative AI

10. DEEPMIND Introducing DreamerV3: the first general algorithm to collect diamonds in Minecraft from scratch

11. Introducing Latent Blending: a new #stablediffusion method for generating incredibly smooth transition videos

12. One step closer to Reverse Aging in Humans

13. ChatGPT is coming soon to the Azure OpenAI Service

14. Microsoft Plans to Build OpenAI Capabilities Into All Products

15. 17 - Audio Generation with Diffusion

16. 18 - ChatGPT & Wolfram Language together gives better mathematical results

17. 18 - OpenAI's Forecasting Potential Misuses of Language Models for Disinformation Campaigns —and How to Reduce Risk

18. 18 - Google is finally going to release their AI models too, in their products and through API

19. 18 - DEEPMIND Human-Timescale Adaptation in an Open-Ended Task Space

20. 19 - Boston Dynamics Atlas Grip Update

21. 19 - META AI: ESM Metagenomic Atlas: The first view of the ‘dark matter’ of the protein universe

22. 20 - Accurate Pose Estimation Via WiFi Signals

23. 21 - Introducing Eye Contact by NVIDIA

24. 22 - Instruct-Pix2Pix: image editing in natural language

25. 23 - A high-performance speech neuroprosthesis (brain computer interface)

26. 23 - META AI E3B is a method for exploring complex environments which vary across episodes

27. 23 - StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

28. 23 - InfiniCity: Infinite-Scale City Synthesis

29. 24 - Replacing clothes using AI in video. InstructPix2Pix + EbSynth

30. 24 - Scientists claim to have found the biological cause of aging

31. 24 - Deciphering Clinical Abbreviations with Privacy Protecting ML

32. 24 - A demo of TrueSync – AI-powered visual dialogue translation from Flawless AI

33. 24 - Introducing Demonstrate–Search–Predict (DSP), a framework for composing search and LMs with up to 120% gains over GPT-3.5

34. 25 - New ViT-G/14 CLIP model

35. 25 - A Watermark for Large Language Models

36. 25 - Atomic AI Launches To Treat Undruggable Diseases

37. 25 - CERN's machine learning could help self-driving cars

38. 25 - Google AI Learning with queried hints

39. 26 - Google AI MusicLM: Generating Music From Text

40. 27 - Text-To-4D Dynamic Scene Generation

41. 27 - Nvidia AI introduces ORBIT on IsaacSim, a GPU-powered virtual Gym for robots to work out

42. 27 - Luma AI Text-to-3D released

43. 27 - On the Importance of Noise Scheduling for Diffusion Models

44. 28 - New Text-to-Audio model, AudioLDM

45. 28 - Noise2Music AI Music Generation From Text

46. 28 - Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion

47. 29 - Msanii: High Fidelity Music Synthesis on a Shoestring Budget

48. 29 - ElevenLabs Currently The Best Voice Cloning and Synthesis AI

49. 29 - Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion

50. 29 - Channel - Asking Questions Through GPT-3 To Get Answers From Database

51. 29 - Meta AI Multiview Compressive Coding for 3D Reconstruction

52. 29 - SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient

53. 30 - Glass AI can generate a differential diagnosis or clinical plan based on your problem representation

54. 30 - SingSong: Generating musical accompaniments from singing

55. 30 - BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

56. 31 - OpenAI now offers an official AI plagiarism detector with AI Text Classifier

57. 31 - PADL: Language-Directed Physics-Based Character Control

58. 31 - Introducing Rose AI — a new way to interface with data

59. 31 - FLAME: A small language model for spreadsheet formulas

60. 31 - DetectGPT, a method for detecting if text comes from an LM

61. 31 - January 2023 was already mind blowing in AI progress, hard to imagine upcoming months

FEBRUARY

1. 1 - Meta AI Emergence of Maps in the Memories of Blind Navigation Agents

2. 1 - OpenAI announces ChatGPT Plus for $20 per month

3. 1 - Samsung reveals its plans to start working on extended reality devices (AR/VR)

4. Microsoft Teams Premium: Cut costs and add AI-powered productivity (GPT-3.5)

5. Perplexity Ask is now available as a Chrome extension

6. MultiRay is Meta’s platform for efficiently running large-scale, state-of-the-art AI models

7. Google AI Open Source Vizier: Towards reliable and flexible hyperparameter and blackbox optimization

8. Microsoft boosts Viva Sales with new GPT seller experience

9. Facebook releases a 30B param “OPT-IML”

10. Dreamix: Video Diffusion Models are General Video Editors

11. SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections

12. SceneScape: Text-Driven Consistent Scene Generation

13. RobustNeRF: Ignoring Distractors with Robust Losses

14. OpenAI Self-critiquing models for assisting human evaluators

15. Amazon AI: Multimodal Chain-of-Thought Reasoning in Language Models

16. Deepmind: Accelerating Large Language Model Decoding with Speculative Sampling

17. Sundar Pichai: LaMDA is releasing for the general public in the coming weeks and months

18. AutumnSynth synthesizes the source code of a video game from seconds of play

19. Nvidia AI Synthesizing Physical Character-Scene Interactions

20. Poe, a bot from Quora that can answer questions and have conversations

21. BLIP-2 demo available on Huggingface: LLM that can understand Images

22. Humata.ai launched: Basically ChatGPT for your own files

23. Google invests $300 million in Anthropic AI

24. Google AI Real-time tracking of wildfire boundaries using satellite imagery

25. LAION AI introduces Open Assistant: chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically

26. Apple CEO Tim Cook says AI will eventually 'affect every product and service we have'

27. Epic-Sounds: A Large-scale Dataset of Actions That Sound

28. Announcing Stable Attribution - a tool which lets anyone find the human creators behind AI-generated images

29. TEXTure: a novel method for text-guided generation, editing, and transfer of textures for 3D shapes

30. Tune-A-Video available to use and also open sourced (turns AI Generated Images into gifs or videos)

31. Filechat.io now available - ChatGPT for your own data and no limits (with premium tier)

32. BioGPT-Large was just released by Microsoft

33. Google announces Bard, powered by LaMDA coming soon as an AI conversational service. It will be integrated with Search

34. Language Models Secretly Perform Gradient Descent as Meta-Optimizers

35. Seek AI introduces DeepCuts, the AI SQL app that lets you explore your Spotify data with natural language

36. Microsoft & OpenAI: Bing and Edge + AI: a new way to search starts today

37. Introducing Polymath: The open-source tool that converts any music-library into a sample-library with machine learning

38. Galileo AI: the first AI product that uses natural language to generate UI designs

39. Introducing Genius, your AI design companion in Figma: It comprehends your design and autocompletes it with elements from your design system

40. Meta AI Toolformer: Language Models Can Teach Themselves to Use Tools

41. ERNIE-Music: music generation model to generate music audio based on free-form text

42. Code release for pix2pix-zero (Zero-shot Image-to-Image Translation) is out

43. Adding Conditional Control to Text-to-Image Diffusion Models

44. Q-Diffusion: Quantizing Diffusion Models

45. Announcing the launch of the MedARC

46. Opera announces plans for upcoming Generative AI integrations in their web browser

47. Scaling Vision Transformers to 22 Billion Parameters

48. ConceptFusion builds open-set multimodal 3D maps by fusing features to 3D. These maps can be queried by text, image, click, and audio

49. Google research released a paper on a neural net that can forecast rain up to 12 hours ahead

50. ALAN: Autonomously Exploring Robotic Agents in the Real World

51. GitHub Copilot for Business is now available

52. Introducing "text-2-commercial" – the unique text-2-video experience we're building

53. Introducing researchGPT - An open-source research assistant that allows you to have a conversation with a research paper or any pdf

54. NASA Turns to AI to Design Mission Hardware

55. Announcing Replit Ghostwriter Chat - Generate, debug, refactor, and understand code faster than ever

56. Stability AI partners with Krikey App to launch text-to-animation tools that generate 3D, animated, interactive characters

57. Anthropic AI Paper: The Capacity for Moral Self-Correction in Large Language Models

58. Information on ChatGPT’s alignment, plans to improve it, giving users more control, and early thoughts on public input

59. Meta AI's GenAug

60. Introducing Type AI - An AI-first document editor that helps you write remarkably fast

61. pix2pix 3d: 3D-aware Conditional Image Synthesis

62. PersonNeRF: Personalized Reconstruction from Photo Collections

63. PhotoRoom is finally releasing generative AI on the web after millions of images generated on mobile

64. Genius Sheets AI: Ask questions, generate reports, and query data using text interface powered by AI

65. ZoomInfo announced that they are incorporating GPT technology into their platform now

66. Generative AI on Roblox: Vision for the Future of Creation

67. Adding Conditional Control to Text-to-Image Diffusion Models

68. ChatGPT for Robotics: Design Principles and Model Abilities

69. Bing AI Update - Increasing Limits on Chat Sessions

70. RealFusion 360 Reconstruction of Any Object from a Single Image

71. OpenAI has privately announced a new developer product called Foundry, which enables customers to run OpenAI model inference at scale with dedicated capacity

72. Amazon Web Services (AWS) is partnering with Hugging Face

73. Spotify launches ‘DJ,’ a new feature offering personalized music with AI-powered commentary (using OpenAI's tech)

74. T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models

75. Notion AI is now available to everyone (no waitlist or limited preview)

76. Google Research: Suppressing quantum errors by scaling a surface code logical qubit (towards scalable fault-tolerant quantum computing)

77. Stanford Human Preferences Dataset (SHP) released: a collection of 385K *naturally occurring* *collective* human preferences over text (for training RLHF models)

78. Runway announces the official launch of Runway Studios (for the next generation of AI storytellers)

79. ROSIE: Scaling RObot Learning with Semantically Imagined Experience (Text-to-image generative models, meet robotics)

80. Uizard AI Autodesigner (basically ChatGPT for product design)

81. CarperAI showcases how users can RLHF their own assistants with trlX using AnthropicAI's Helpful & Harmless (HH) dataset

82. Your memories - how you remember them (Step inside your memories - on your phone, in VR, and in AR) - Wist: Immersive Memories

83. Clone announces that they are making synthetic humans, starting with a 1:1 copy of the human hand

84. RoboNinja: Learning an Adaptive Cutting Policy for Multi-Material Objects (Robots using Knives)

85. Runway's Gen-1: The Next Step Forward for Generative AI (AI Videos)

86. MimicPlay - an imitation learning algorithm that uses "cheap human play data" (teaches robots to perform long-horizon tasks efficiently and robustly)

87. Google AI: Pre-training generalist agents using offline reinforcement learning

88. Luma AI introduces full volumetric photorealistic NeRF rendering on the web in realtime

89. MERF: Memory-Efficient Radiance Fields for Real-time View Synthesis in Unbounded Scenes (MERF that achieves real-time rendering of large-scale scenes in a browser)

90. Google AI: Aligning Text-to-Image Models using Human Feedback

91. Designing an Encoder for Fast Personalization of Text-to-Image Models (basically fast Dreambooth, only 5-15 steps to train using a single image)

92. Stable Diffusion running on an Android phone through full-stack AI optimization (by Qualcomm AI Research)

93. Teaching CLIP to Count to Ten (by Google Research & More) 

94. Introducing Raycast AI: The magic of AI, right on your Mac (Write smarter, code faster and answer questions quicker with ChatGPT in Raycast)

95. Nvidia predicts AI models one million times more powerful than ChatGPT within 10 years

96. Meta AI announces a new SOTA large language model named LLaMA

97. OpenAI's Blog: Planning for AGI and beyond

98. Google AI: Spotlight, a vision-only model that achieves general user interface (UI) understanding from raw pixels

99. Composer is a large (5 billion parameters) controllable diffusion model trained on billions of (text, image) pairs

100. You can now transform text into 360-degree worlds using a tool from Blockade Labs

101. Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback (by Microsoft Research)

102. Chinese technology giant Tencent has revealed plans to develop its own large language model chatbot, “Hunyuan.”

103. Snapchat is introducing a chatbot powered by the latest version of OpenAI’s ChatGPT

104. Elon Musk is starting a new AI company to fight OpenAI & Others. I am so happy!

105. Microsoft: KOSMOS-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot)

106. Mark Zuckerberg has officially announced a team dedicated to building Generative AI for WhatsApp, Messenger and Instagram

107. ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation (Basically Dreambooth but trains in 0.05 seconds)

108. Introducing Typeface AI – a generative AI app to supercharge personalized content creation for businesses

109. Bing AI added to the Windows 11 taskbar in the latest OS update

110. Introducing Xiaomi Wireless AR Glass Discovery Edition

111. Beating OpenAI CLIP with 100x less data and compute

112. Meta AI: Introducing CACTI — a framework for scalable multi-task multi-scene imitation learning

113. Google Research: Monocular Depth Estimation using Diffusion Models

114. Meta: the company’s next 4 years of AR and VR hardware plans (till 2027)

MARCH

1. OpenAI: Introducing ChatGPT and Whisper APIs

2. Scientists unveil plan to create biocomputers powered by human brain cells

3. Collage Diffusion creates globally harmonized images from complex compositions of several objects

4. WhisperX: Improves transcription quality and enables a 12x transcription speedup via batched inference

5. S-NeRF: Neural Radiance Fields for Street Views

6. Introducing KTN, a new model that transfers knowledge from label-abundant node types in a heterogeneous graph to zero-labeled node types using existing relational information (Google AI)

7. Google: Introducing MOO: Manipulation of Open-World Objects

8. Stable Diffusion's Official Integration With Blender (Generative AI meets 3D)

9. Consistency Models: a new family of generative models that achieve high sample quality without adversarial training

10. New Open Source Flan-UL2 20B checkpoints released

11. 3D generation on ImageNet

12. Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

13. Grounded Decoding: Guiding Text Generation with Grounded Models for Robot Control

14. Leia Inc Partners With Stability AI: Convert Any 2D Content to 3D & Upcoming AI Powered 3D Tablet

15. Figure - the AI Robotics company building the world's first commercially viable autonomous humanoid robot

16. Stable Diffusion with Brain Activity: reconstructing visual images from functional Magnetic Resonance Imaging (fMRI) signals using SD

17. Meta's LLaMA gets leaked on 4chan

18. Performer-MPC: A robot system that learns to move around in different situations using fast and smart machines (by Google)

19. SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with NeRFs

20. Announcing roomGPT: Redesign your room in seconds with AI! 100% free and open source

21. Introducing: Self-Learning Agent for Performing APIs (SLAPA): AI agents can now teach themselves HOW to use tools (ie. any API) in real time, completely automated!

22. OpenAI's DALL-E 3 is currently in development. They are inviting people to alpha test it

23. Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement

24. MobileBrick: Building LEGO for 3D Reconstruction on Mobile Devices

25. Unleashing Text-to-Image Diffusion Models for Visual Perception

26. Stability AI's Pick a Pic: an app for collecting human feedback on AI-generated images for supporting academic research in AI (completely open sourced)

27. AI speeds up design of new antibodies that could target breast cancer

28. Deep Agency: AI photo studio & modelling agency

29. StyO: Stylize Your Face in Only One-Shot

30. PaLM-E, a 562-billion parameter, general-purpose, embodied visual-language generalist - across robotics, vision, and language

31. Microsoft: new generative AI features for Power Virtual Agents and AI Builder, enabled by Azure Open AI service

32. Prismer: A Vision-Language Model with An Ensemble of Experts 

33. Learning Humanoid Locomotion with Transformers

34. Nerflets: Local Radiance Fields for Efficient Structure-Aware 3D Scene Representation from 2D Supervision

35. Taming Stable Diffusion with Human Ranking Feedback

36. Stability AI has acquired the industry leader in AI-powered imaging tools Clipdrop app

37. Salesforce to add ChatGPT to Slack as part of OpenAI partnership

38. Google AI's Paper: Foundation Models for Decision Making: Problems, Methods, and Opportunities

39. Microsoft: Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling

40. Diffusion Policy: Visuomotor Policy Learning via Action Diffusion (Robotics)

41. Word-As-Image for Semantic Typography

42. Salesforce Ventures is investing in Anthropic as part of their generative AI fund (Slack integration coming)

43. Video-P2P: Video Editing with Cross-attention Control (generating new characters while optimally preserving their original poses and scenes)

44. Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models (by Microsoft)

45. TRACT: Denoising Diffusion Models with Transitive Closure Time-Distillation

46. DuckDuckGo's DuckAssist (beta), an AI-assisted search Instant Answer that uses Wikipedia to answer questions

47. Google Open Source: OpenXLA is available now to accelerate and simplify machine learning

48. Meta open sources Casual Conversations v2

49. Discord introduces new AI experiments, including an AI chatbot named Clyde, AutoMod AI, Conversation Summaries, Avatar remix, Whiteboard AI and launching an AI Incubator

50. ChatGPT is now available in Azure OpenAI Service

51. Anthropic AI's Blog: Core Views on AI Safety: When, Why, What, and How

52. Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models

53. Wonder Studio: An AI tool that automatically animates, lights and composes CG characters into a live-action scene

54. PAC-NeRF: Physics Augmented Continuum Neural Radiance Fields for Geometry-Agnostic System Identification

55. Open-world Instance Segmentation: Top-down Learning with Bottom-up Supervision (by Meta AI & UC San Diego)

56. Cherry-Picking with Reinforcement Learning (Robotics)

57. MathPrompter: Mathematical Reasoning using Large Language Models (by Microsoft)

58. Cones: Concept Neurons in Diffusion Models for Customized Generation

59. GigaGAN: Scaling up GANs for Text-to-Image Synthesis

60. 3DGen: Triplane Latent Diffusion for Textured Mesh Generation

61. Fini allows you to turn your knowledge base into AI chat in 2 minutes

62. MVImgNet: A Large-scale Dataset of Multi-view Images

63. Tag2Text: Guiding Vision-Language Model via Image Tagging

64. Rewarding Chatbots for Real-World Engagement with Millions of Users

65. NeuralLift-360: Lifting An In-the-wild 2D Photo to A 3D Object with 360° Views

66. NeRFshop: Interactive Editing of Neural Radiance Fields

67. Meta's 7B LLaMA model running on 64GB M2 MacBook Pro using llama.cpp by @ggerganov

68. Meta's 7B LLaMA model running on Pixel 6

69. LLaMA 7B model on 4GB RAM Raspberry Pi 4

70. LLaMA has been fine-tuned by Stanford: Alpaca 7B (Performance similar to text-davinci-003)

71. NeRFLiX: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer

72. Self-planning Code Generation with Large Language Model (Proposes a self-planning code generation method with LLM, which leads to substantial improvements on code generation tasks like HumanEval)

73. Erasing Concepts from Diffusion Models (Can remove concepts from a diffusion model permanently unlike previous methods)

74. Apple: Stabilizing Transformer Training by Preventing Attention Entropy Collapse

75. Introducing Dalai, a super simple way to run LLaMA AI on your computer

76. Google AI Announcement: PaLM API & MakerSuite, AI in Gmail, Google Docs & Workspace, Generative AI support in Vertex AI, Generative AI App Builder, Partnerships, programs, and resources for each segment of the ecosystem

77. Anthropic AI's Claude now available for Early Access (Waitlist)

78. Med-PaLM 2, Google's new SOTA medical LLM (Med-PaLM 2 reaches an accuracy of over 85% on USMLE MedQA, going from "passing score" to "expert performance"!)

79. GPT-4 Release — a large multimodal model (image & text in, text out)

80. Confirmed: the new Bing AI runs on OpenAI’s GPT-4

81. Khan Academy introduces GPT-4 powered guide Khanmigo

82. Be My Eyes: Introducing Virtual Volunteer Tool Powered by OpenAI’s GPT-4 (Image to Text)

83. Poe Subscriptions: access to bots based on two powerful new language models: GPT-4 from OpenAI and Claude+ from Anthropic

84. Introducing Duolingo Max. A subscription tier that gives you access to your own personal, AI-powered language tutor through Explain My Answer and Roleplay, powered by GPT-4

85. Introducing Milo co-parent for parents, powered by GPT-4

86. Morgan Stanley is testing an OpenAI-powered chatbot (GPT-4) for its 16,000 financial advisors

87. Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation

88. Edit-A-Video: Single Video Editing with Object-Aware Consistency

89. MeshDiffusion: Score-based Generative 3D Mesh Modeling

90. ViperGPT: Visual Inference via Python Execution for Reasoning

91. FreeNeRF: Improving Few-shot Neural Rendering with Free Frequency Regularization

92. DoNotPay is working on using GPT-4 to generate "one click lawsuits" to sue robocallers for $1,500

93. Launching Fin, a new product built on GPT-4 for AI Customer Service

94. magicaltome can ingest an entire Wikipedia article, understand it, and output a summarized tome with just the key points (GPT-4)

95. Keeper is using GPT-4 for matchmaking

96. Stripe partners up with OpenAI to enhance their documentation: GPT-4 powered Stripe Docs

97. Introducing Conformer-1: our latest state-of-the-art speech recognition model

98. Midjourney v5 has released and it is mind blowing

99. Introducing a GPT-powered AI tool for text to world building (oncyber)

100. Re-ReND: Real-time Rendering of NeRFs across Devices

101. UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation

102. Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion

103. PyTorch 2.0 Release

104. Zipline unveiled their next generation delivery drone system, designed to provide the best home delivery service the planet has ever seen

105. AlphaFold 2 Code Update

106. Microsoft 365 Copilot & Business Chat

107. Code for reproducing the Stanford Alpaca Instruct LLaMA

108. Microsoft’s GPT-4-Powered Bing AI Is Now Available Without A Waitlist

109. NeRFMeshing: Distilling Neural Radiance Fields into Geometrically-Accurate 3D Meshes

110. LERF: Language Embedded Radiance Fields (Search objects inside NeRF)

111. FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

112. P+: Extended Textual Conditioning in Text-to-Image Generation

113. Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image Generation

114. Stability AI is excited to announce the launch of Stable Diffusion Reimagine

115. Introducing Vid2Seq, a visual language model for dense video captioning that simply predicts all event boundaries and captions as a single sequence of tokens

116. Dalai Alpaca is here (run the Alpaca LLM on your computer (Mac, Windows, Linux) with just ONE command)

117. Stable Diffusion running in browser without server

118. Modelscope's text to video model (open sourced)

119. Open Assistant now available in Early Preview (open sourced)

120. Introducing ChatLLAMA, Your Custom Personal Assistant!

121. Unscripted AI NPCs in a first-of-its-kind Unreal Engine Game demo

122. CoAdapter (Composable Adapter) by jointly training T2I-Adapters and an extra fuser

123. CoLT5: Faster Long-Range Transformers with Conditional Computation (64k context window length)

124. CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos

125. HIVE: Harnessing Human Feedback for Instructional Visual Editing

126. DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion

127. Runway: Introducing Text to Video with Gen-2

128. Agility Robotics' Digit - World's First Commercialized Humanoid Robot

129. Legs as Manipulator: Pushing Quadrupedal Agility Beyond Locomotion (Robotics)

130. Rotating without Seeing: Towards In-hand Dexterity through Touch

131. Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers (Makes longer Text 2 videos)

132. CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition

133. Zero-1-to-3: Zero-shot One Image to 3D Object

134. Localizing Object-level Shape Variations with Text-to-Image Diffusion Models

135. SVDiff: the proposed method has a significantly smaller model size (1.7MB for Stable Diffusion)

136. SKED: Sketch-guided Text-based 3D Editing

137. Create images with your words – Bing Image Creator comes to the new Bing AI

138. Google's Bard AI now available in US & UK (Waitlist)

139. Adobe announces Firefly Generative AI

140. Introducing GPT-4 in Azure OpenAI Service

141. Announcing Luma AR!

142. Nvidia announced AI Foundations for AI models training

143. Introducing the release 2.0 of GPT-NeoX, the open-source Megatron-DeepSpeed based library used to train GPT-NeoX-20B and the Pythia model suite

144. Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models

145. Vox-E: Text-guided Voxel Editing of 3D Objects

146. Text2Tex: Text-driven Texture Synthesis via Diffusion Models

147. Nvidia & Shutterstock partnered to create Generative AI 3D tools

148. NVIDIA and Getty Images are partnering to train responsible generative text-to-image and text-to-video foundation models

149. GitHub Copilot X (GPT-4) is a new ChatGPT-like assistant to help developers write and fix code

150. Unity teases Generative AI and it's coming soon

151. Lindy AI Assistant (That Can Use Apps)

152. Opera Browser Now Live With OpenAI Chat Assistant & ChatSonic

153. Canva announces a suite of new Generative AI tools

154. Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions

155. Wavelet Diffusion Models are fast and scalable Image Generators (0.1 seconds to generate image)

156. CC3D: Layout-Conditioned Generation of Compositional 3D Scenes

157. NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation (Reduces time to generate HQ videos)

158. Compositional 3D Scene Generation using Locally Conditioned Diffusion

159. Pix2Video: Video Editing using Image Diffusion

160. Introducing Objaverse, a massive open dataset of text-paired 3D Objects (1 Million Objects)

161. Introducing Mozilla.ai: Investing in trustworthy AI

162. Plugins For ChatGPT Announced

163. Visual language maps for robot navigation

164. Persistent Nature: A Generative Model of Unbounded 3D Worlds

165. ReBotNet: Fast Real-time Video Enhancement

166. Plotting Behind the Scenes: Towards Learnable Game Engines

167. The effectiveness of MAE pre-pretraining for billion-scale pretraining

168. ReVersion: Diffusion-Based Relation Inversion from Images

169. DreamBooth3D: Subject-Driven Text-to-3D Generation

170. CoBIT: A Contrastive Bi-directional Image-Text Generation Model

171. Set-the-Scene: Global-Local Training for Generating Controllable NeRF Scenes

172. Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators

173. Artificial Intelligence Predicts Genetics of Cancerous Brain Tumors in Under 90 Seconds

174. Stable Diffusion “unCLIP” model is finally released

175. We are open sourcing Dolly, a ChatGPT-like model that can do instruction following, created for $30, trained 3 hours on 1 server

176. Levi’s to Use AI-Generated Models to ‘Increase Diversity’

177. Unreal Engine 5.2 Demo

178. Introducing LLaMA voice chat

179. Grid-guided Neural Radiance Fields for Large Urban Scenes

180. Progressively Optimized Local Radiance Fields for Robust View Synthesis

181. SCADE: NeRFs from Space Carving with Ambiguity-Aware Depth Estimates (Improves NeRF Quality)

182. High Fidelity Image Synthesis With Deep VAEs In Latent Space

183. Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior

184. Luma AI releases Luma Video-to-3D API

185. GoogleAI's Pix2Struct is now available in Huggingface Transformers!

186. 'NEO' Humanoid Robot by 1X Coming Summer 2023

187. Google AI's PRESTO – A multilingual dataset for parsing realistic task-oriented dialogues

188. CelebV-Text: A Large-Scale Facial Text-Video Dataset

189. Learning to Zoom and Unzoom (Improves AR/VR quality and more)

190. GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents

191. PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters

192. Better Aligning Text-to-Image Models with Human Preference

193. ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks (AGI is close now)

194. Chatbase lets you create a custom ChatGPT from your data, customize its UI, and embed it on your website as a chat bubble or an iframe

195. ACT: Action Chunking with Transformers (Robotics)

196. We're excited to release Lit-LLaMA, a minimal, optimized rewrite of LLaMA for training and inference licensed under Apache 2.0

197. Replit and Google Cloud Partner to Advance Generative AI for Software Development

198. Introducing Microsoft Security Copilot: Empowering defenders at the speed of AI

199. Cerebras-GPT, a family of 7 GPT models from 111M to 13B parameters trained using the Chinchilla formula (Open Source)

200. Introducing OpenFlamingo! A framework for training and evaluating Large Multimodal Models (LMMs) capable of processing images and text

201. Announcing Genmo Chat, a creative copilot that uses GPT-4 and a large suite of generative AI tools to create and then edit any video or image you ask for

202. GPT4ALL: a chatbot trained on a massive collection of clean assistant data including code, stories and dialogue

203. F2-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories

204. LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention

205. Your Diffusion Model is Secretly a Zero-Shot Classifier

206. Introducing "Task-driven Autonomous Agent" powered by GPT-4 (AGI is closer now)

207. Sony AI: Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion

208. DeepMind and Google Brain have joined forces to compete with OpenAI

209. Respell's GPT-4 powered AI learning app

210. revelxyz have launched a free consumer service that enables animated avatar creation from a single pic

211. AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotator

212. TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

213. HOLODIFFUSION: Training a 3D Diffusion Model using 2D Images

214. OPUS AI: Text-to-Video Game, the future of video gaming where you type and a 3D World emerges: A Demo

215. AnthropicAI releasing the new Claude App for Slack, in beta

216. Introducing Vicuna, an open-source chatbot impressing GPT-4

217. Language Models can Solve Computer Tasks

218. Apple: NeILF++: Inter-Reflectable Light Fields for Geometry and Material Estimation

219. BloombergGPT: A Large Language Model for Finance

220. WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research

221. HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace

222. AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control

223. Token Merging for Fast Stable Diffusion

224. PAIR-Diffusion: Object-Level Image Editing with Structure-and-Appearance Paired Diffusion Models

225. DiffCollage: Parallel Generation of Large Content with Diffusion Models

226. HyperDiffusion: Generating Implicit Neural Fields with Weight-Space Diffusion

227. Meta AI sharing two major advancements in their work towards general-purpose embodied AI agents: VC-1 & ASC

228. Ameca The Robot Update

229. LLaMA 30B model running on 5.8GB RAM

230. Claude by AnthropicAI - newest AI assistant tool integrated with Zapier

APRIL

MAY

JUNE

JULY

AUGUST

SEPTEMBER

OCTOBER

NOVEMBER

DECEMBER
