PrO_RaZe 2.0

FastChat-T5: our compact and commercial-friendly chatbot - Outperforms Dolly-V2 with 4x fewer parameters #1041 Source

Announcing StableVicuna, the AI World’s First Open Source RLHF LLM Chatbot #1040

Announcing StableVicuna, the AI World’s First Open Source RLHF LLM Chatbot #1040 Source

Introducing DataComp, a new benchmark for multimodal datasets #1039

Introducing DataComp, a new benchmark for multimodal datasets #1039 Source

Announcing the release of DeepFloyd IF text to image model - with research weights (open source) #1038

Announcing the release of DeepFloyd IF text to image model - with research weights #1038 Source Official Post

SAD is able to perform 3D segmentation (segment out any 3D object) with RGBD inputs (or rendered depth images only) #1037

SAD is able to perform 3D segmentation (segment out any 3D object) with RGBD inputs (or rendered depth images only) #1037 Source

Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model #1036

Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model #1036 Source

ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System #1035

ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System #1035 Source

Introducing Eleven Multilingual v1: Our New Speech Synthesis Model #1034

April 27, 2023

Introducing Eleven Multilingual v1: Our New Speech Synthesis Model #1034 Source

Microsoft Designer expands preview with new AI design features #1033

April 27, 2023

Microsoft Designer expands preview with new AI design features #1033 Source

Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System #1032

Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System #1032 Source

DeepMind: Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning #1031

DeepMind: Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning #1031 Source

Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware #1030

Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware #1030 Source

TextDeformer: Geometry Manipulation using Text Guidance #1029

TextDeformer: Geometry Manipulation using Text Guidance #1029 Source

Code for IF by deepfloydai is up - a text to image diffusion model (open source) #1028

Code for IF by deepfloydai is up - a text to image diffusion model (open source) #1028 Source

Replit announces replit-code-v1-3b: an open-source code language model with 2.7B parameters #1027

Replit announces replit-code-v1-3b: an open-source code language model with 2.7B parameters #1027 Source

TextMesh: Generation of Realistic 3D Meshes From Text Prompts #1026

TextMesh: Generation of Realistic 3D Meshes From Text Prompts #1026 Source

Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models #1025

Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models #1025 Source

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head #1024

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head #1024 Source

Patch-based 3D Natural Scene Generation from a Single Example #1023

Patch-based 3D Natural Scene Generation from a Single Example #1023 Source

Towards Realistic Generative 3D Face Models #1022

Towards Realistic Generative 3D Face Models #1022 Source

Stability AI's Image Upscaling API - upscale any image without losing any sharpness #1021

Stability AI's Image Upscaling API - upscale any image without losing any sharpness #1021 Source

Bill Gates says A.I. chatbots will teach kids to read within 18 months: "You’ll be ‘stunned by how it helps.’" #1020

Bill Gates says A.I. chatbots will teach kids to read within 18 months: "You’ll be ‘stunned by how it helps.’" #1020 Source

HuggingChat: open source alternative to ChatGPT #1019

HuggingChat: open source alternative to ChatGPT #1019 Source

‘Avengers’ Director Joe Russo Predicts AI Could Be Making Movies in ‘Two Years’: It Will ‘Engineer and Change Storytelling’ #1018

‘Avengers’ Director Joe Russo Predicts AI Could Be Making Movies in ‘Two Years’: It Will ‘Engineer and Change Storytelling’ #1018 Source My Reddit Post

Speed Is All You Need: Optimizing Stable Diffusion For Mobile Devices (under 12 seconds) #1017

Speed Is All You Need: Optimizing Stable Diffusion For Mobile Devices (under 12 seconds) #1017 Source

AutoNeRF: Training Implicit Scene Representations with Autonomous Agents #1016

AutoNeRF: Training Implicit Scene Representations with Autonomous Agents #1016 Source

HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video #1015

HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video #1015 Source

Segment Anything in 3D with NeRFs #1014

Segment Anything in 3D with NeRFs #1014 Source

Track Anything: Segment Anything Meets Videos #1013

Track Anything: Segment Anything Meets Videos #1013 Source

This AI Can Design Complex Proteins Perfectly Tailored to Our Needs #1012

April 24, 2023

This AI Can Design Complex Proteins Perfectly Tailored to Our Needs #1012 Source

Introducing Chatbot Arena - Which LLM is better? #1011

April 24, 2023

Introducing Chatbot Arena - Which LLM is better? #1011 Source

Some AI videos generated using Gen-2 by runwayml #1010

Some AI videos generated using Gen-2 by runwayml #1010 Vid1 Vid2 Vid3 Vid4 Vid5 Vid6 Vid7 Vid8 Vid9 Vid10

Scaling Transformer to 2 Million tokens and beyond with RMT #1009

Scaling Transformer to 2 Million tokens and beyond with RMT #1009 Source

Grimes allows people to use her voice in AI-generated songs without copyright penalty #1008

Grimes allows people to use her voice in AI-generated songs without copyright penalty #1008 Source

Ask-Anything, a tool for chatting about video with ChatGPT, MiniGPT-4 and StableLM #1007

Ask-Anything, a tool for chatting about video with ChatGPT, MiniGPT-4 and StableLM #1007 Source

3DCoMPaT++: a richly annotated, multimodal 2D/3D dataset of more than 10 million stylized 3D shapes #1006

April 22, 2023

3DCoMPaT++: a richly annotated, multimodal 2D/3D dataset of more than 10 million stylized 3D shapes #1006 Source

SpaceX's Starship could have 2nd launch test within the next 2 months #1005

April 21, 2023

SpaceX's Starship could have 2nd launch test within the next 2 months #1005 Source

Google's Bard adds coding & debug abilities in 20+ languages #1004

April 21, 2023

Google's Bard adds coding & debug abilities in 20+ languages #1004 Source

DeepMind and the Brain team from Google Research will become a new unit: Google DeepMind #1003

DeepMind and the Brain team from Google Research will become a new unit: Google DeepMind #1003 Source

NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models #1002

NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models #1002 Source

Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models #1001

Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models #1001 Source

Pretrained Language Models as Visual Planners for Human Assistance #1000

Pretrained Language Models as Visual Planners for Human Assistance #1000 Source

Reference-guided Controllable Inpainting of Neural Radiance Fields #999

Reference-guided Controllable Inpainting of Neural Radiance Fields #999 Source

Reference-based Image Composition with Sketch via Structure-aware Diffusion Model #998

Reference-based Image Composition with Sketch via Structure-aware Diffusion Model #998 Source

AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation #997

AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation #997 Source

h2oGPT is out. A new 20 billion parameter instruction-following large language model licensed for commercial use #996

h2oGPT is out. A new 20 billion parameter instruction-following large language model licensed for commercial use #996 Source

Bark - an Open Source Audio Generation Model #995

Bark - an Open Source Audio Generation Model #995 Source Examples

Whisper JAX ⚡️ is a highly optimised Whisper implementation for both GPU and TPU (70x faster than Whisper) #994

Whisper JAX ⚡️ is a highly optimised Whisper implementation for both GPU and TPU (70x faster than Whisper) #994 Source

SpaceX Starship's First Orbital Flight Test #993