2023 AI & TECH UPDATES
A collection of updates, mostly from the AI field but also other tech stuff that I personally liked. There is just so much going on now that it is impossible for a human to keep up with everything lol. And this isn't even the singularity.
JANUARY
1. Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models
2. Muse: Text-To-Image Generation via Masked Generative Transformers
3. Massive Language Models Can Be Accurately Pruned in One-Shot
4. Teaching AI advanced mathematical reasoning
5. Apple just released AI narration for audiobooks that sounds shockingly human
6. Microsoft VALL-E Text to Audio Generation
7. Some Cool Stuff Shown At CES 2023
8. GLM-130B Open Source Large Language Model
9. More than 450 start-ups are now working on generative AI
12. One step closer to Reverse Aging in Humans
13. ChatGPT is coming soon to the Azure OpenAI Service
14. Microsoft Plans to Build OpenAI Capabilities Into All Products
15. 17 - Audio Generation with Diffusion
16. 18 - ChatGPT & Wolfram Language together give better mathematical results
18. 18 - Google is finally going to release their AI models too, in their products and through an API
19. 18 - DEEPMIND Human-Timescale Adaptation in an Open-Ended Task Space
20. 19 - Boston Dynamics Atlas Grip Update
21. 19 - META AI: ESM Metagenomic Atlas: The first view of the ‘dark matter’ of the protein universe
22. 20 - Accurate Pose Estimation Via WiFi Signals
23. 21 - Introducing Eye Contact by NVIDIA
24. 22 - Instruct-Pix2Pix: image editing in natural language
25. 23 - A high-performance speech neuroprosthesis (brain computer interface)
26. 23 - META AI E3B is a method for exploring complex environments which vary across episodes
27. 23 - StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
28. 23 - InfiniCity: Infinite-Scale City Synthesis
29. 24 - Replacing clothes using AI in video. InstructPix2Pix + EbSynth
30. 24 - Scientists claim to have found the biological cause of aging
31. 24 - Deciphering Clinical Abbreviations with Privacy Protecting ML
32. 24 - A demo of TrueSync – AI-powered visual dialogue translation from Flawless AI
34. 25 - New ViT-G/14 CLIP model
35. 25 - A Watermark for Large Language Models
36. 25 - Atomic AI Launches To Treat Undruggable Diseases
37. 25 - CERN's machine learning could help self-driving cars
38. 25 - Google AI Learning with queried hints
39. 26 - Google AI MusicLM: Generating Music From Text
40. 27 - Text-To-4D Dynamic Scene Generation
41. 27 - Nvidia AI introduces ORBIT on IsaacSim, a GPU-powered virtual Gym for robots to work out
42. 27 - Luma AI Text-to-3D released
43. 27 - On the Importance of Noise Scheduling for Diffusion Models
44. 28 - New Text-to-Audio model, AudioLDM
45. 28 - Noise2Music AI Music Generation From Text
46. 28 - Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion
47. 29 - Msanii: High Fidelity Music Synthesis on a Shoestring Budget
48. 29 - ElevenLabs Currently The Best Voice Cloning and Synthesis AI
49. 29 - Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion
50. 29 - Channel - Asking Questions Through GPT-3 To Get Answers From Database
51. 29 - Meta AI Multiview Compressive Coding for 3D Reconstruction
52. 29 - SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
54. 30 - SingSong: Generating musical accompaniments from singing
56. 31 - OpenAI now offers an official AI plagiarism detector with AI Text Classifier
57. 31 - PADL: Language-Directed Physics-Based Character Control
58. 31 - Introducing Rose AI — a new way to interface with data
59. 31 - FLAME: A small language model for spreadsheet formulas
60. 31 - DetectGPT, a method for detecting if text comes from an LM
61. 31 - January 2023 was already mind-blowing for AI progress; hard to imagine what the upcoming months will bring
FEBRUARY
1. 1 - Meta AI Emergence of Maps in the Memories of Blind Navigation Agents
2. 1 - OpenAI announces ChatGPT Plus for $20 per month
3. 1 - Samsung reveals its plans to start working on extended reality devices (AR/VR)
4. Microsoft Teams Premium: Cut costs and add AI-powered productivity (GPT-3.5)
5. Perplexity Ask is now available as a Chrome extension
6. MultiRay is Meta’s platform for efficiently running large-scale, state-of-the-art AI models
7. Google AI Open Source Vizier: Towards reliable and flexible hyperparameter and blackbox optimization
8. Microsoft boosts Viva Sales with new GPT seller experience
9. Facebook releases a 30B param “OPT-IML”
10. Dreamix: Video Diffusion Models are General Video Editors
11. SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections
12. SceneScape: Text-Driven Consistent Scene Generation
13. RobustNeRF: Ignoring Distractors with Robust Losses
14. OpenAI Self-critiquing models for assisting human evaluators
15. Amazon AI: Multimodal Chain-of-Thought Reasoning in Language Models
16. DeepMind: Accelerating Large Language Model Decoding with Speculative Sampling
17. Sundar Pichai: LaMDA is releasing for the general public in the coming weeks and months
18. AutumnSynth synthesizes the source code of a video game from seconds of play
19. Nvidia AI Synthesizing Physical Character-Scene Interactions
20. Poe, a bot from Quora that can answer questions and have conversations
21. BLIP-2 demo available on Huggingface: LLM that can understand Images
22. Humata.ai launched: Basically ChatGPT for your own files
23. Google invests $300 million in Anthropic AI
24. Google AI Real-time tracking of wildfire boundaries using satellite imagery
26. Apple CEO Tim Cook says AI will eventually 'affect every product and service we have'
27. Epic-Sounds: A Large-scale Dataset of Actions That Sound
30. Tune-A-Video available to use and also open sourced (turns AI Generated Images into gifs or videos)
31. Filechat.io now available - ChatGPT for your own data and no limits (with premium tier)
32. BioGPT-Large was just released by Microsoft
34. Language Models Secretly Perform Gradient Descent as Meta-Optimizers
36. Microsoft & OpenAI: Bing and Edge + AI: a new way to search starts today
38. Galileo AI: the first AI product that uses natural language to generate UI designs
40. Meta AI Toolformer: Language Models Can Teach Themselves to Use Tools
41. ERNIE-Music: a music generation model that generates music audio from free-form text
42. Code release for pix2pix-zero: Zero-shot Image-to-Image Translation is out
43. Adding Conditional Control to Text-to-Image Diffusion Models
44. Q-Diffusion: Quantizing Diffusion Models
45. Announcing the launch of MedARC
46. Opera announces plans for upcoming Generative AI integrations in their web browser
47. Scaling Vision Transformers to 22 Billion Parameters
49. Google research released a paper on a neural net that can forecast rain up to 12 hours ahead
50. ALAN: Autonomously Exploring Robotic Agents in the Real World
51. GitHub Copilot for Business is now available
52. Introducing "text-2-commercial" – the unique text-2-video experience we're building
54. NASA Turns to AI to Design Mission Hardware
55. Announcing Replit Ghostwriter Chat - Generate, debug, refactor, and understand code faster than ever
57. Anthropic AI Paper: The Capacity for Moral Self-Correction in Large Language Models
59. Meta AI's GenAug
60. Introducing Type AI - An AI-first document editor that helps you write remarkably fast
61. pix2pix 3d: 3D-aware Conditional Image Synthesis
62. PersonNeRF: Personalized Reconstruction from Photo Collections
63. PhotoRoom is finally releasing generative AI on the web after millions of images generated on mobile
64. Genius Sheets AI: Ask questions, generate reports, and query data using text interface powered by AI
65. ZoomInfo announced that they are incorporating GPT technology into their platform now
66. Generative AI on Roblox: Vision for the Future of Creation
68. ChatGPT for Robotics: Design Principles and Model Abilities
69. Bing AI Update - Increasing Limits on Chat Sessions
70. RealFusion 360 Reconstruction of Any Object from a Single Image
72. Amazon Web Services (AWS) is partnering with Hugging Face
75. Notion AI is now available to everyone (no waitlist or limited preview)
78. Runway announces the official launch of Runway Studios (for the next generation of AI storytellers)
80. Uizard AI Autodesigner (basically ChatGPT for product design)
83. Clone announces that they are making synthetic humans, starting with a 1:1 copy of the human hand
84. RoboNinja: Learning an Adaptive Cutting Policy for Multi-Material Objects (Robots using Knives)
85. Runway's Gen-1: The Next Step Forward for Generative AI (AI Videos)
87. Google AI: Pre-training generalist agents using offline reinforcement learning
88. Luma AI introduces full volumetric photorealistic NeRF rendering on the web in realtime
90. Google AI: Aligning Text-to-Image Models using Human Feedback
93. Teaching CLIP to Count to Ten (by Google Research & More)
95. Nvidia predicts AI models one million times more powerful than ChatGPT within 10 years
96. Meta AI announces a new SOTA large language model named LLaMA
97. OpenAI's Blog: Planning for AGI and beyond
100. You can now transform text into 360-degree worlds using a tool from Blockade Labs
103. Snapchat is introducing a chatbot powered by the latest version of OpenAI’s ChatGPT
104. Elon Musk is starting a new AI company to fight OpenAI & Others. I am so happy!
109. Bing AI added to the Windows 11 taskbar in the latest OS update
110. Introducing Xiaomi Wireless AR Glass Discovery Edition
111. Beating OpenAI CLIP with 100x less data and compute
112. Meta AI: Introducing CACTI — a framework for scalable multi-task multi-scene imitation learning
113. Google Research: Monocular Depth Estimation using Diffusion Models
114. Meta: the company’s next 4 years of AR and VR hardware plans (till 2027)
MARCH
1. OpenAI: Introducing ChatGPT and Whisper APIs
2. Scientists unveil plan to create biocomputers powered by human brain cells
3. Collage Diffusion creates globally harmonized images from complex compositions of several objects
5. S-NeRF: Neural Radiance Fields for Street Views
7. Google: Introducing MOO: Manipulation of Open-World Objects
8. Stable Diffusion's Official Integration With Blender (Generative AI meets 3D)
10. New Open Source Flan-UL2 20B checkpoints released
12. Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
13. Grounded Decoding: Guiding Text Generation with Grounded Models for Robot Control
14. Leia Inc Partners With Stability AI: Convert Any 2D Content to 3D & Upcoming AI Powered 3D Tablet
17. Meta's LLaMA gets leaked on 4chan
19. SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with NeRFs
20. Announcing roomGPT: Redesign your room in seconds with AI! 100% free and open source
22. OpenAI's DALL-E 3 is currently in development. They are inviting people to alpha test it
23. Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement
24. MobileBrick: Building LEGO for 3D Reconstruction on Mobile Devices
25. Unleashing Text-to-Image Diffusion Models for Visual Perception
27. AI speeds up design of new antibodies that could target breast cancer
28. Deep Agency: AI photo studio & modelling agency
29. StyO: Stylize Your Face in Only One-Shot
32. Prismer: A Vision-Language Model with An Ensemble of Experts
33. Learning Humanoid Locomotion with Transformers
35. Taming Stable Diffusion with Human Ranking Feedback
36. Stability AI has acquired the industry leader in AI-powered imaging tools Clipdrop app
37. Salesforce to add ChatGPT to Slack as part of OpenAI partnership
38. Google AI's Paper: Foundation Models for Decision Making: Problems, Methods, and Opportunities
39. Microsoft: Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling
40. Diffusion Policy: Visuomotor Policy Learning via Action Diffusion (Robotics)
41. Word-As-Image for Semantic Typography
44. Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models (by Microsoft)
45. TRACT: Denoising Diffusion Models with Transitive Closure Time-Distillation
47. Google Open Source: OpenXLA is available now to accelerate and simplify machine learning
48. Meta open sources Casual Conversations v2
50. ChatGPT is now available in Azure OpenAI Service
51. Anthropic AI's Blog: Core Views on AI Safety: When, Why, What, and How
52. Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
56. Cherry-Picking with Reinforcement Learning (Robotics)
57. MathPrompter: Mathematical Reasoning using Large Language Models (by Microsoft)
58. Cones: Concept Neurons in Diffusion Models for Customized Generation
59. GigaGAN: Scaling up GANs for Text-to-Image Synthesis
60. 3DGen: Triplane Latent Diffusion for Textured Mesh Generation
61. Fini allows you to turn your knowledge base into AI chat in 2 minutes
62. MVImgNet: A Large-scale Dataset of Multi-view Images
63. Tag2Text: Guiding Vision-Language Model via Image Tagging
64. Rewarding Chatbots for Real-World Engagement with Millions of Users
65. NeuralLift-360: Lifting An In-the-wild 2D Photo to A 3D Object with 360° Views
66. NeRFshop: Interactive Editing of Neural Radiance Fields
67. Meta's 7B LLaMA model running on 64GB M2 MacBook Pro using llama.cpp by @ggerganov
68. Meta's 7B LLaMA model running on Pixel 6
69. LLaMA 7B model on 4GB RAM Raspberry Pi 4
70. LLaMA has been fine-tuned by Stanford: Alpaca 7B (Performance similar to text-davinci-003)
71. NeRFLiX: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer
74. Apple: Stabilizing Transformer Training by Preventing Attention Entropy Collapse
75. Introducing Dalai, a super simple way to run LLaMA AI on your computer
77. Anthropic AI's Claude now available for Early Access (Waitlist)
79. GPT-4 Release — a large multimodal model (image & text in, text out)
80. Confirmed: the new Bing AI runs on OpenAI’s GPT-4
81. Khan Academy introduces GPT-4 powered guide Khanmigo
82. Be My Eyes: Introducing Virtual Volunteer Tool Powered by OpenAI’s GPT-4 (Image to Text)
85. Introducing Milo co-parent for parents, powered by GPT-4
86. Morgan Stanley is testing an OpenAI-powered chatbot (GPT-4) for its 16,000 financial advisors
87. Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
88. Edit-A-Video: Single Video Editing with Object-Aware Consistency
89. MeshDiffusion: Score-based Generative 3D Mesh Modeling
90. ViperGPT: Visual Inference via Python Execution for Reasoning
91. FreeNeRF: Improving Few-shot Neural Rendering with Free Frequency Regularization
92. DoNotPay is working on using GPT-4 to generate "one click lawsuits" to sue robocallers for $1,500
93. Launching Fin, a new product built on GPT-4 for AI Customer Service
95. Keeper is using GPT-4 for matchmaking
96. Stripe partners up with OpenAI to enhance their documentation: GPT-4 powered Stripe Docs
97. Introducing Conformer-1: our latest state-of-the-art speech recognition model
98. Midjourney v5 has been released and it is mind-blowing
99. Introducing a GPT-powered AI tool for text to world building (oncyber)
100. Re-ReND: Real-time Rendering of NeRFs across Devices
101. UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation
102. Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion
103. PyTorch 2.0 Release
106. Microsoft 365 Copilot & Business Chat
107. Code for reproducing the Stanford Alpaca Instruct LLaMA
108. Microsoft’s GPT-4-Powered Bing AI Is Now Available Without A Waitlist
109. NeRFMeshing: Distilling Neural Radiance Fields into Geometrically-Accurate 3D Meshes
110. LERF: Language Embedded Radiance Fields (Search objects inside NeRF)
111. FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
112. P+: Extended Textual Conditioning in Text-to-Image Generation
113. Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image Generation
114. Stability AI is excited to announce the launch of Stable Diffusion Reimagine
117. Stable diffusion running in browser without server
118. Modelscope's text to video model (open sourced)
119. Open Assistant now available in Early Preview (open sourced)
120. Introducing ChatLLAMA, Your Custom Personal Assistant!
121. Unscripted AI NPCs in a first-of-its-kind Unreal Engine Game demo
122. CoAdapter (Composable Adapter) by jointly training T2I-Adapters and an extra fuser
123. CoLT5: Faster Long-Range Transformers with Conditional Computation (64k context window length)
124. CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos
125. HIVE: Harnessing Human Feedback for Instructional Visual Editing
126. DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion
127. Runway: Introducing, Text to Video. With Gen-2
128. Agility Robotics' Digit - World's First Commercialized Humanoid Robot
129. Legs as Manipulator: Pushing Quadrupedal Agility Beyond Locomotion (Robotics)
130. Rotating without Seeing: Towards In-hand Dexterity through Touch
132. CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition
133. Zero-1-to-3: Zero-shot One Image to 3D Object
134. Localizing Object-level Shape Variations with Text-to-Image Diffusion Models
135. SVDiff: the proposed method has a significantly smaller model size (1.7MB for Stable Diffusion)
136. SKED: Sketch-guided Text-based 3D Editing
137. Create images with your words – Bing Image Creator comes to the new Bing AI
138. Google's Bard AI now available in US & UK (Waitlist)
139. Adobe announces Firefly Generative AI
140. Introducing GPT-4 in Azure OpenAI Service
141. Announcing Luma AR!
142. Nvidia announced AI Foundations for AI models training
144. Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models
145. Vox-E: Text-guided Voxel Editing of 3D Objects
146. Text2Tex: Text-driven Texture Synthesis via Diffusion Models
147. Nvidia & Shutterstock partnered to create Generative AI 3D tools
149. GitHub Copilot X (GPT-4) is a new ChatGPT-like assistant to help developers write and fix code
150. Unity teases Generative AI and it's coming soon
151. Lindy AI Assistant (That Can Use Apps)
152. Opera Browser Now Live With OpenAI Chat Assistant & ChatSonic
153. Canva announces a suite of new Generative AI tools
154. Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions
155. Wavelet Diffusion Models are fast and scalable Image Generators (0.1 seconds to generate image)
156. CC3D: Layout-Conditioned Generation of Compositional 3D Scenes
158. Compositional 3D Scene Generation using Locally Conditioned Diffusion
159. Pix2Video: Video Editing using Image Diffusion
160. Introducing Objaverse, a massive open dataset of text-paired 3D Objects (1 Million Objects)
161. Introducing Mozilla.ai: Investing in trustworthy AI
162. Plugins For ChatGPT Announced
163. Visual language maps for robot navigation
164. Persistent Nature: A Generative Model of Unbounded 3D Worlds
165. ReBotNet: Fast Real-time Video Enhancement
166. Plotting Behind the Scenes: Towards Learnable Game Engines
167. The effectiveness of MAE pre-pretraining for billion-scale pretraining
168. ReVersion: Diffusion-Based Relation Inversion from Images
169. DreamBooth3D: Subject-Driven Text-to-3D Generation
170. CoBIT: A Contrastive Bi-directional Image-Text Generation Model
171. Set-the-Scene: Global-Local Training for Generating Controllable NeRF Scenes
172. Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
173. Artificial Intelligence Predicts Genetics of Cancerous Brain Tumors in Under 90 Seconds
174. Stable Diffusion “unCLIP” model is finally released
176. Levi’s to Use AI-Generated Models to ‘Increase Diversity’
178. Introducing LLaMA voice chat
179. Grid-guided Neural Radiance Fields for Large Urban Scenes
180. Progressively Optimized Local Radiance Fields for Robust View Synthesis
181. SCADE: NeRFs from Space Carving with Ambiguity-Aware Depth Estimates (Improves NeRF Quality)
182. High Fidelity Image Synthesis With Deep VAEs In Latent Space
183. Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior
184. Luma AI releases Luma Video-to-3D API
185. Google AI's Pix2Struct is now available in Hugging Face Transformers!
186. 'NEO' Humanoid Robot by 1X Coming Summer 2023
187. Google AI's PRESTO – A multilingual dataset for parsing realistic task-oriented dialogues
188. CelebV-Text: A Large-Scale Facial Text-Video Dataset
189. Learning to Zoom and Unzoom (Improves AR/VR quality and more)
190. GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents
191. PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters
192. Better Aligning Text-to-Image Models with Human Preference
193. ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks (AGI is close now)
195. ACT: Action Chunking with Transformers (Robotics)
197. Replit and Google Cloud Partner to Advance Generative AI for Software Development
198. Introducing Microsoft Security Copilot: Empowering defenders at the speed of AI
203. F2-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories
204. LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
205. Your Diffusion Model is Secretly a Zero-Shot Classifier
206. Introducing "Task-driven Autonomous Agent" powered by GPT-4 (AGI is closer now)
207. Sony AI: Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D Conversion
208. DeepMind and Google Brain have joined forces to compete with OpenAI
209. Respell's GPT-4 powered AI learning app
211. AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators
212. TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs
213. HOLODIFFUSION: Training a 3D Diffusion Model using 2D Images
215. AnthropicAI releasing the new Claude App for Slack, in beta
216. Introducing Vicuna, an open-source chatbot impressing GPT-4
217. Language Models can Solve Computer Tasks
218. Apple: NeILF++: Inter-Reflectable Light Fields for Geometry and Material Estimation
219. BloombergGPT: A Large Language Model for Finance
221. HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
222. AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control
223. Token Merging for Fast Stable Diffusion
224. PAIR-Diffusion: Object-Level Image Editing with Structure-and-Appearance Paired Diffusion Models
225. DiffCollage: Parallel Generation of Large Content with Diffusion Models
226. HyperDiffusion: Generating Implicit Neural Fields with Weight-Space Diffusion
229. LLaMA 30B model running on 5.8GB RAM
230. Claude by AnthropicAI - newest AI assistant tool integrated with Zapier