APRIL
1. CarperAI: new trlX release, v0.6.0 is out now
4. Self-Refine: Iterative Refinement with Self-Feedback
5. 3D-aware Image Generation using 2D Diffusion Models
11. Latent Video Diffusion Models for High-Fidelity Long Video Generation
13. Kandinsky 2.1: A new open source image generation model
14. Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos
15. OpenAI's Blog - Our approach to AI Safety
16. Introducing DribbleBot: A robot that can dribble a soccer ball on diverse natural terrains
17. SegGPT: Segmenting Everything In Context
18. DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model
21. Google AI: How Project Starline improves remote communication
22. Nvidia's Generative Novel View Synthesis with 3D-Aware Diffusion Models
23. Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models
25. InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning
27. OpenAGI: When LLM Meets Domain Experts
31. Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition
34. Teaching Large Language Models to Self-Debug
35. DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales
37. OpenAI released an implementation of Consistency Models
38. Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators
39. Emergent autonomous scientific research capabilities of large language models
40. ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
41. DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion
44. Microsoft DeepSpeed Chat offers an end-to-end RLHF pipeline to train ChatGPT-like Models
46. Segment Everything Everywhere All at Once
47. Zip-NeRF: Anti-Aliased Grid-Based Neural Radiance Fields
48. Expressive Text-to-Image Generation with Rich Text
50. OpenAssistant Conversational AI Release
51. A New Paella: Simple & Efficient Text-to-Image Generation
52. MiniGPT-4, an open-sourced model performing complex vision-language tasks like GPT-4!
53. Inpaint Anything: Segment Anything Meets Image Inpainting
54. Google's Delta Denoising Score
55. Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved With Text
56. Meta's DINOv2: Learning Robust Visual Features without Supervision
58. Elon Musk to create TruthGPT in rival to ChatGPT
60. LLaVA - Visual Instruction Tuning
61. MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
62. Tool Learning with Foundation Models
63. Google AI - Video Generation Beyond a Single Clip
65. Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models (Text to Video)
66. Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model
67. Generative Disco: Text-to-Video Generation for Music Visualization
68. NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
69. Today, Airbus announced its LOOP space station module designed for long-term space missions
72. Bark - an Open Source Audio Generation Model
74. AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation
75. Reference-based Image Composition with Sketch via Structure-aware Diffusion Model
76. Reference-guided Controllable Inpainting of Neural Radiance Fields
77. Pretrained Language Models as Visual Planners for Human Assistance
78. Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
79. NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models
80. DeepMind and the Brain team from Google Research will become a new unit: Google DeepMind
81. Google's Bard adds coding & debug abilities in 20+ languages
82. 3DCoMPaT++: a richly annotated, multimodal 2D/3D dataset of more than 10 million stylized 3D shapes
83. Ask-Anything, a tool for chatting about video with ChatGPT, MiniGPT-4 and StableLM
84. Scaling Transformer to 2 Million tokens and beyond with RMT
85. Introducing Chatbot Arena - Which LLM is better?
86. This AI Can Design Complex Proteins Perfectly Tailored to Our Needs
87. Track Anything: Segment Anything Meets Videos
88. Segment Anything in 3D with NeRFs
89. HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video
90. AutoNeRF: Training Implicit Scene Representations with Autonomous Agents
91. Speed Is All You Need: Optimizing Stable Diffusion For Mobile Devices (under 12 seconds)
92. HuggingChat: open source alternative to ChatGPT
93. Stability AI's Image Upscaling API - upscale any image without losing any sharpness
94. Towards Realistic Generative 3D Face Models
95. Patch-based 3D Natural Scene Generation from a Single Example
96. AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
97. Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models
98. TextMesh: Generation of Realistic 3D Meshes From Text Prompts
99. Replit announces replit-code-v1-3b: an open-source code language model with 2.7B parameters
100. Code for IF by deepfloydai is up - an open-source text-to-image diffusion model
101. TextDeformer: Geometry Manipulation using Text Guidance
102. Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware
103. DeepMind: Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
105. Microsoft Designer expands preview with new AI design features
106. Introducing Eleven Multilingual v1: Our New Speech Synthesis Model
107. ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System