Unite.AI

Kunal Kejriwal

"An engineer by profession, a writer by heart". Kunal is a technical writer with a deep love & understanding of AI and ML, dedicated to simplifying complex concepts in these fields through his engaging and informative documentation.

Artificial Intelligence May 23, 2024

CameraCtrl: Enabling Camera Control for Text-to-Video Generation

Recent frameworks attempting text-to-video (T2V) generation leverage diffusion models to add stability to their training process, and the Video Diffusion Model, one...
Artificial Intelligence May 17, 2024

BrushNet: Plug and Play Image Inpainting with Dual Branch Diffusion

Image inpainting is one of the classic problems in computer vision, and it aims to restore masked regions in an image with plausible and natural content....
Artificial Intelligence May 3, 2024

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Over the years, the creation of realistic and expressive portrait animations from static images and audio has found a range of applications including gaming, digital media,...
Artificial Intelligence April 26, 2024

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

The advancements in large language models have significantly accelerated the development of natural language processing, or NLP. The introduction of the transformer framework proved to be...
Artificial Intelligence April 25, 2024

AIOS: Operating System for LLM Agents

Over the past six decades, operating systems have evolved progressively, advancing from basic systems to the complex and interactive operating systems that power today’s devices. Initially,...
Artificial Intelligence April 19, 2024

Instant-Style: Style-Preservation in Text-to-Image Generation

Over the past few years, tuning-based diffusion models have demonstrated remarkable progress across a wide array of image personalization and customization tasks. However, despite their potential,...
Artificial Intelligence April 18, 2024

LoReFT: Representation Finetuning for Language Models

Parameter-efficient fine-tuning (PEFT) methods seek to adapt large language models via updates to a small number of weights. However, a majority of existing interpretability work...
Artificial Intelligence April 11, 2024

POKELLMON: A Human-Parity Agent for Pokemon Battles with LLMs

Large Language Models and Generative AI have demonstrated unprecedented success on a wide array of Natural Language Processing tasks. After conquering the NLP field, the next...
Artificial Intelligence April 10, 2024

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

The advent of GPT models, along with other autoregressive (AR) large language models, has unfurled a new epoch in the field of machine learning, and...
Artificial Intelligence April 2, 2024

InstructIR: High-Quality Image Restoration Following Human Instructions

An image can convey a great deal, yet it may also be marred by various issues such as motion blur, haze, noise, and low dynamic range....
Artificial Intelligence April 1, 2024

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

Recent advancements in Large Vision Language Models (LVLMs) have shown that scaling these frameworks significantly boosts performance across a variety of downstream tasks. LVLMs, including MiniGPT,...
Artificial Intelligence March 26, 2024

BlackMamba: Mixture of Experts for State-Space Models

The development of Large Language Models (LLMs) built from decoder-only transformer models has played a crucial role in transforming the Natural Language Processing (NLP) domain, as...
Artificial Intelligence March 25, 2024

DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors

Computer vision is one of the most exciting and well-researched fields within the AI community today, and despite the rapid enhancement of the computer vision models,...
Artificial Intelligence March 19, 2024

AnimateLCM: Accelerating the Animation of Personalized Diffusion Models

Over the past few years, diffusion models have achieved massive success and recognition for image and video generation tasks. Video diffusion models, in particular, have been...
Artificial Intelligence March 15, 2024

YOLO-World: Real-Time Open-Vocabulary Object Detection

Object detection has been a fundamental challenge in the computer vision industry, with applications in robotics, image understanding, autonomous vehicles, and image recognition. In recent years,...
