Posts

Showing posts from April, 2023

FastChat-T5: our compact and commercial-friendly chatbot - Outperforms Dolly-V2 with 4x fewer parameters #1041

Source

Announcing StableVicuna, the AI World’s First Open Source RLHF LLM Chatbot #1040

Source

Introducing DataComp, a new benchmark for multimodal datasets #1039

Source

Announcing the release of DeepFloyd IF text to image model - with research weights (open source) #1038

Source Official Post

SAD is able to perform 3D segmentation (segment out any 3D object) with RGBD inputs (or rendered depth images only) #1037

Source

Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model #1036

Source

ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System #1035

Source

Introducing Eleven Multilingual v1: Our New Speech Synthesis Model #1034

Source

Microsoft Designer expands preview with new AI design features #1033

Source

Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System #1032

Source

DeepMind: Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning #1031

Source

Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware #1030

Source

TextDeformer: Geometry Manipulation using Text Guidance #1029

Source

Code for IF by deepfloydai is up - a text to image diffusion model (open source) #1028

Source

Replit announces replit-code-v1-3b: a 2.7B-parameter, open-source code language model #1027

Source

TextMesh: Generation of Realistic 3D Meshes From Text Prompts #1026

Source

Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models #1025

Source

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head #1024

Source

Patch-based 3D Natural Scene Generation from a Single Example #1023

Source

Towards Realistic Generative 3D Face Models #1022

Source

Stability AI's Image Upscaling API - upscale any image without losing any sharpness #1021

Source

Bill Gates says A.I. chatbots will teach kids to read within 18 months: You’ll be ‘stunned by how it helps’ #1020

Source

HuggingChat: open source alternative to ChatGPT #1019

Source

‘Avengers’ Director Joe Russo Predicts AI Could Be Making Movies in ‘Two Years’: It Will ‘Engineer and Change Storytelling’ #1018

Source My Reddit Post

Speed Is All You Need: Optimizing Stable Diffusion For Mobile Devices (under 12 seconds) #1017

Source

AutoNeRF: Training Implicit Scene Representations with Autonomous Agents #1016

Source

HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video #1015

Source

Segment Anything in 3D with NeRFs #1014

Source

Track Anything: Segment Anything Meets Videos #1013

Source

This AI Can Design Complex Proteins Perfectly Tailored to Our Needs #1012

Source

Introducing Chatbot Arena - Which LLM is better? #1011

Source

Some AI videos generated using Gen-2 by runwayml #1010

Vid1 Vid2 Vid3 Vid4 Vid5 Vid6 Vid7 Vid8 Vid9 Vid10

Scaling Transformer to 2 Million tokens and beyond with RMT #1009

Source

Grimes is allowing people to use her voice in AI-generated songs without facing copyright claims #1008

Source

Ask-Anything, a tool for chatting about video with ChatGPT, MiniGPT-4 and StableLM #1007

Source

3DCoMPaT++: a richly annotated, multimodal 2D/3D dataset of more than 10 million stylized 3D shapes #1006

Source

SpaceX's Starship could have 2nd launch test within the next 2 months #1005

Source

Google's Bard adds coding & debug abilities in 20+ languages #1004

Source

DeepMind and the Brain team from Google Research will become a new unit: Google DeepMind #1003

Source

NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models #1002

Source

Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models #1001

Source

Pretrained Language Models as Visual Planners for Human Assistance #1000

Source

Reference-guided Controllable Inpainting of Neural Radiance Fields #999

Source

Reference-based Image Composition with Sketch via Structure-aware Diffusion Model #998

Source

AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation #997

Source

h2oGPT is out. A new 20 billion parameter instruction-following large language model licensed for commercial use #996

Source

Bark - an Open Source Audio Generation Model #995

Source Examples

Whisper JAX ⚡️ is a highly optimised Whisper implementation for both GPU and TPU (70x faster than Whisper) #994

Source

SpaceX Starship's First Orbital Flight Test #993

Source