Unity package for real-time talking head generation and character animation using LiveTalk on-device models.
LiveTalk is a unified, high-performance talking head generation system that combines the power of LivePortrait and MuseTalk open-source repositories. The PyTorch models from these projects have been ported to ONNX format and optimized for CoreML to enable efficient on-device inference in Unity.
LivePortrait provides facial animation and expression transfer capabilities, while MuseTalk handles real-time lip synchronization with audio. Together, they create a complete pipeline for generating natural-looking talking head videos from avatar images and audio input. Spark-TTS-Unity is a dependency of this package and provides TTS generation.
- 🎮 Unity-Native Integration: Complete API designed specifically for Unity with singleton pattern
- 🎭 Dual-Pipeline Processing: LivePortrait for facial animation + MuseTalk for lip sync
- 👤 Advanced Character System: Create, save, and load characters with multiple expressions and voices
- 💻 Runs Offline: All processing happens on-device with ONNX Runtime
- ⚡ Real-time Performance: Optimized for real-time inference with frame streaming
- 🎨 Multiple Expression Support: 7 built-in expressions (talk-neutral, approve, disapprove, smile, sad, surprised, confused)
- 🔊 Integrated TTS: Built-in SparkTTS integration for voice generation
- 📦 Cross-Platform Character Format: Supports both folder and macOS bundle formats
- 🎥 Flexible Input: Supports images, videos, and directory-based driving frames
- 🧠 Memory Management: Configurable memory usage modes for desktop and mobile devices
- 🎭 Flexible Creation Modes: Voice-only, single expression, or full character creation
- AI-driven NPCs in games
- Virtual assistants and chatbots
- Real-time character animation
- Interactive storytelling applications
- Video content generation
- Accessibility features
- Virtual avatars and digital humans
- Open your Unity project
- Open the Package Manager (Window > Package Manager)
- Click the "+" button in the top-left corner
- Select "Add package from git URL..."
- Enter the repository URL: https://github.com/arghyasur1991/LiveTalk-Unity.git
- Click "Add"
- Clone this repository
- Copy the contents into your Unity project's Packages folder
This package requires the following Unity packages:
- com.genesis.sparktts.unity
Some dependencies require additional scoped registry configuration. Add the following to your project's Packages/manifest.json file:
{
  "scopedRegistries": [
    {
      "name": "NPM",
      "url": "https://registry.npmjs.com",
      "scopes": [
        "com.github.asus4"
      ]
    }
  ],
  "dependencies": {
    "com.genesis.LiveTalk.unity": "https://github.com/arghyasur1991/LiveTalk-Unity.git",
    // ... other dependencies
  }
}

Note: The git URL https://github.com/arghyasur1991/LiveTalk-Unity.git will automatically fetch the latest version of the package.
LiveTalk requires ONNX models from both LivePortrait and MuseTalk in the following location:
Assets/StreamingAssets/LiveTalk/
└── models/
    ├── LivePortrait/
    │   └── *.onnx
    └── MuseTalk/
        └── *.onnx
SparkTTS models are required for voice generation and should be placed in:
Assets/StreamingAssets/SparkTTS/
├── *.onnx
└── LLM/
    ├── model.onnx
    ├── model.onnx_data
    └── ...
LiveTalk includes a built-in Editor tool that automatically analyzes your codebase and copies only the required models from Assets/Models to StreamingAssets with the correct precision settings (FP16, FP32, etc.).
Access the tool: Window > LiveTalk > Model Deployment Tool
- Precision-Aware: Copies only the required precision variants (FP16/FP32) based on code analysis
- Size Optimization: Reduces build size by excluding unused models
- Folder Structure Preservation: Maintains the correct directory structure in StreamingAssets
- Backup Support: Creates backups of existing models before overwriting
- Dry Run Mode: Preview changes without actually copying files
- Open the tool: Go to Window > LiveTalk > Model Deployment Tool
- Configure paths:
  - Source: Assets/Models (automatically detected)
  - Destination: Assets/StreamingAssets/LiveTalk (automatically configured)
- Select components: Choose which model categories to deploy:
  - ✅ SparkTTS Models (deployed via SparkTTS-Unity package)
  - ✅ LivePortrait Models (deployed directly)
  - ✅ MuseTalk Models (deployed directly)
- Review selection: The tool shows you exactly which LiveTalk models will be copied and their file sizes
- Deploy: Click "Deploy All Models" to copy both LiveTalk and SparkTTS models using their respective deployment systems
The tool selects the precision to use for each model based on the LiveTalk codebase:
| Model Category | Precision | Execution Provider | Notes |
|---|---|---|---|
| LivePortrait | | | |
| warping_spade | FP16 | CoreML | GPU-accelerated rendering |
| Other LivePortrait | FP32 | CoreML | Full precision for facial features |
| MuseTalk | | | |
| unet, vae_encoder, vae_decoder | FP16 | CoreML | GPU-accelerated inference |
| whisper_encoder, positional_encoding | FP32 | CPU | Audio processing precision |
| SparkTTS | | | |
| Models deployed via SparkTTS-Unity package | See SparkTTS documentation | Various | Handled by SparkTTS deployment tool |
- Overwrite Existing: Replace existing models in StreamingAssets
- Create Backup: Keep .backup copies of replaced files (includes .onnx.data files)
- Dry Run: Preview operations without copying files
The tool automatically handles large models that use separate data files:
- MuseTalk UNet: unet.onnx (710KB) + unet.onnx.data (3.2GB) - uses dot notation
- SparkTTS LLM: Handled by SparkTTS-Unity deployment tool with model.onnx_data files
LiveTalk model and data files are copied together and included in size calculations and backup operations. SparkTTS models are handled by the SparkTTS-Unity package's own deployment system.
This tool ensures your Unity project includes only the models you actually need, significantly reducing build size while maintaining optimal performance.
SparkTTS models can also be deployed independently using the SparkTTS-Unity package's standalone tool:
Access: Window > SparkTTS > Model Deployment Tool
This allows you to:
- Deploy only SparkTTS models without LiveTalk models
- Use SparkTTS in projects that don't include LiveTalk
- Have fine-grained control over SparkTTS model deployment
Download the pre-exported ONNX models from Google Drive.
- Download the ZIP file from the link
- Extract the contents
- Copy the extracted LiveTalk folder with models to your Unity project's Assets/Models/ directory
- Use the Model Deployment Tool (recommended): Go to Window > LiveTalk > Model Deployment Tool to automatically copy only the required models with optimal precision settings
Check the Model Setup section of Spark-TTS-Unity
Coming Soon - conversion scripts to export models from the original Python repositories:
- LivePortrait: https://github.com/KwaiVGI/LivePortrait
- MuseTalk: https://github.com/TMElyralab/MuseTalk
The export scripts will convert PyTorch models to ONNX format and apply CoreML optimizations for Unity integration.
using UnityEngine;
using LiveTalk.API;
using System.Threading.Tasks;
public class LiveTalkExample : MonoBehaviour
{
async void Start()
{
// Initialize the LiveTalk system
LiveTalkAPI.Instance.Initialize(
logLevel: LogLevel.INFO,
characterSaveLocation: "", // Uses default location (persistentDataPath/Characters)
parentModelPath: "", // Uses StreamingAssets
memoryUsage: MemoryUsage.Performance // Load all models at startup
);
// Wait for all models to be loaded (Performance mode)
// This includes LivePortrait, MuseTalk, and SparkTTS models
await LiveTalkAPI.Instance.WaitForAllModelsAsync(
onProgress: (modelName, progress) => {
Debug.Log($"Loading {modelName}: {progress * 100:F0}%");
}
);
Debug.Log("All models loaded and ready!");
// Now you can create characters and generate speech...
}
}

When using MemoryUsage.Performance mode, you can track model loading progress:
// Simple usage - just wait for completion
await LiveTalkAPI.Instance.WaitForAllModelsAsync();
// With progress tracking
await LiveTalkAPI.Instance.WaitForAllModelsAsync(
onProgress: (modelName, progress) => {
// progress is 0.0 to 1.0 for each model group
// modelName will be: "LivePortrait Animation", "MuseTalk Animation", or "Voice Synthesis"
UpdateLoadingUI($"Loading {modelName}", progress);
}
);

Note: Model loading is only necessary in MemoryUsage.Performance mode. In Balanced and Optimal modes, models load on-demand during first use.
using UnityEngine;
using LiveTalk.API;
public class LiveTalkExample : MonoBehaviour
{
void Start()
{
// Initialize the LiveTalk system
LiveTalkAPI.Instance.Initialize(
logLevel: LogLevel.INFO,
characterSaveLocation: "", // Uses default location (persistentDataPath/Characters)
parentModelPath: "", // Uses StreamingAssets
memoryUsage: MemoryUsage.Balanced // Recommended for desktop
);
}
}

LiveTalk supports different memory usage configurations optimized for various device types:
// For desktop devices (default) - balanced memory and performance
LiveTalkAPI.Instance.Initialize(memoryUsage: MemoryUsage.Balanced);
// For performance-critical applications - loads all models upfront
LiveTalkAPI.Instance.Initialize(memoryUsage: MemoryUsage.Performance);
// For mobile devices - minimal memory footprint
LiveTalkAPI.Instance.Initialize(memoryUsage: MemoryUsage.Optimal);
// Automatic selection based on platform
MemoryUsage memoryUsage = Application.isMobilePlatform
? MemoryUsage.Optimal
: MemoryUsage.Balanced;
LiveTalkAPI.Instance.Initialize(memoryUsage: memoryUsage);

using UnityEngine;
using LiveTalk.API;
using System.Collections;
public class CharacterCreation : MonoBehaviour
{
[SerializeField] private Texture2D characterImage;
IEnumerator Start()
{
// Initialize API
LiveTalkAPI.Instance.Initialize();
// Create a new character with all expressions
yield return LiveTalkAPI.Instance.CreateCharacterAsync(
name: "MyCharacter",
gender: Gender.Female,
image: characterImage,
pitch: Pitch.Moderate,
speed: Speed.Moderate,
intro: "Hello, I am your virtual assistant!",
voicePromptPath: null, // Generate voice from style parameters
onComplete: (character) => {
Debug.Log($"Character created: {character.Name}");
},
onError: (error) => {
Debug.LogError($"Character creation failed: {error.Message}");
}
);
}
}

LiveTalk supports different creation modes for flexibility:
// Voice Only - fastest, no visual expressions generated
yield return LiveTalkAPI.Instance.CreateCharacterAsync(
name: "VoiceOnlyCharacter",
gender: Gender.Male,
image: null, // No image needed for voice-only
pitch: Pitch.Low,
speed: Speed.Moderate,
intro: "Hello!",
voicePromptPath: null,
onComplete: OnCharacterCreated,
onError: OnError,
creationMode: CreationMode.VoiceOnly
);
// Single Expression - voice + talk-neutral expression only
yield return LiveTalkAPI.Instance.CreateCharacterAsync(
name: "QuickCharacter",
gender: Gender.Female,
image: characterImage,
pitch: Pitch.Moderate,
speed: Speed.Moderate,
intro: "Hello!",
voicePromptPath: null,
onComplete: OnCharacterCreated,
onError: OnError,
creationMode: CreationMode.SingleExpression
);
// All Expressions - full character with all 7 expressions (default)
yield return LiveTalkAPI.Instance.CreateCharacterAsync(
name: "FullCharacter",
gender: Gender.Female,
image: characterImage,
pitch: Pitch.High,
speed: Speed.High,
intro: "Hello!",
voicePromptPath: null,
onComplete: OnCharacterCreated,
onError: OnError,
creationMode: CreationMode.AllExpressions
);

You can create a character voice from an existing audio reference instead of generating from style parameters:
// Create character with voice cloned from reference audio
yield return LiveTalkAPI.Instance.CreateCharacterAsync(
name: "ClonedVoiceCharacter",
gender: Gender.Female,
image: characterImage,
pitch: Pitch.Moderate, // These are ignored when voicePromptPath is provided
speed: Speed.Moderate,
intro: "Hello!",
voicePromptPath: "/path/to/reference_voice.wav", // Clone voice from this audio
onComplete: OnCharacterCreated,
onError: OnError
);

using UnityEngine;
using LiveTalk.API;
using System.Collections;
public class CharacterSpeech : MonoBehaviour
{
private Character loadedCharacter;
IEnumerator Start()
{
// Initialize API
LiveTalkAPI.Instance.Initialize();
// Load an existing character by ID
string characterId = "your-character-id";
yield return LiveTalkAPI.Instance.LoadCharacterAsyncFromId(
characterId,
onComplete: (character) => {
loadedCharacter = character;
Debug.Log($"Character loaded: {character.Name}");
// Make the character speak
StartCoroutine(MakeCharacterSpeak());
},
onError: (error) => {
Debug.LogError($"Character loading failed: {error.Message}");
}
);
}
IEnumerator MakeCharacterSpeak()
{
if (loadedCharacter == null) yield break;
// Speak with lip sync animation
// Two callbacks: onAudioReady (when audio is ready), onAnimationComplete (when all frames are done)
// Audio and animation generation are pipelined - audio for next segment can generate while current animates
yield return loadedCharacter.StartSpeakWithCallbacks(
text: "Hello! I can speak with realistic lip sync!",
expressionIndex: 0, // Use talk-neutral expression
onAudioReady: (frameStream, audioClip) => {
// Called as soon as audio is ready - you can schedule next speech here
// frameStream will receive animation frames as they're generated
Debug.Log($"Audio ready, expecting {frameStream.TotalExpectedFrames} frames");
StartCoroutine(PlayGeneratedVideo(frameStream, audioClip));
},
onAnimationComplete: () => {
// Called when all animation frames have been generated
Debug.Log("Animation generation complete");
},
onError: (error) => {
Debug.LogError($"Speech generation failed: {error.Message}");
}
);
}
IEnumerator PlayGeneratedVideo(FrameStream frameStream, AudioClip audioClip)
{
// Play the audio
GetComponent<AudioSource>().clip = audioClip;
GetComponent<AudioSource>().Play();
// Process video frames as they arrive
while (frameStream.HasMoreFrames)
{
var frameAwaiter = frameStream.WaitForNext();
yield return frameAwaiter;
if (frameAwaiter.Texture != null)
{
// Display the frame (e.g., on a RawImage component)
GetComponent<UnityEngine.UI.RawImage>().texture = frameAwaiter.Texture;
}
}
}
}

For scenarios where you only need audio without video frames:
// Generate voice only (no lip sync animation)
yield return loadedCharacter.StartSpeakWithCallbacks(
text: "This is voice-only output!",
expressionIndex: -1, // -1 means voice only, no video frames
onAudioReady: (frameStream, audioClip) => {
// frameStream will be empty, only audioClip is populated
GetComponent<AudioSource>().clip = audioClip;
GetComponent<AudioSource>().Play();
},
onAnimationComplete: null, // Optional - not needed for voice-only
onError: (error) => {
Debug.LogError($"Speech generation failed: {error.Message}");
}
);

The StartSpeakWithCallbacks method enables pipelined audio and animation processing:
// Start first speech
yield return character.StartSpeakWithCallbacks(
text: "First sentence.",
expressionIndex: 0,
onAudioReady: (stream1, audio1) => {
PlayAudioAndAnimation(stream1, audio1);
// As soon as audio is ready, schedule next speech
// Audio for segment 2 generates while segment 1 animates
StartCoroutine(character.StartSpeakWithCallbacks(
text: "Second sentence immediately after.",
expressionIndex: 0,
onAudioReady: (stream2, audio2) => {
EnqueueAudioAndAnimation(stream2, audio2);
},
onAnimationComplete: null,
onError: OnError
));
},
onAnimationComplete: null,
onError: OnError
);

This pipelining significantly improves responsiveness by overlapping audio generation for the next segment with animation generation for the current segment.
You can also load a character directly from a file path:
// Load character from a specific path
yield return LiveTalkAPI.Instance.LoadCharacterAsyncFromPath(
"/path/to/character/folder",
onComplete: (character) => {
Debug.Log($"Character loaded: {character.Name}");
},
onError: (error) => {
Debug.LogError($"Loading failed: {error.Message}");
}
);

using UnityEngine;
using LiveTalk.API;
using System.Collections;
using UnityEngine.Video;
public class FacialAnimation : MonoBehaviour
{
[SerializeField] private Texture2D sourceImage;
[SerializeField] private VideoPlayer drivingVideo;
IEnumerator Start()
{
// Initialize API
LiveTalkAPI.Instance.Initialize();
// Generate animated textures using LivePortrait
var animationStream = LiveTalkAPI.Instance.GenerateAnimatedTexturesAsync(
sourceImage,
drivingVideo,
maxFrames: -1 // Process all frames
);
// Process the animated frames
while (animationStream.HasMoreFrames)
{
var frameAwaiter = animationStream.WaitForNext();
yield return frameAwaiter;
if (frameAwaiter.Texture != null)
{
// Display the animated frame
GetComponent<UnityEngine.UI.RawImage>().texture = frameAwaiter.Texture;
}
}
}
}

Characters support 7 built-in expressions, each with its own index:
- 0: talk-neutral (default speaking)
- 1: approve (nodding, positive)
- 2: disapprove (negative reaction)
- 3: smile (happy expression)
- 4: sad (sorrowful expression)
- 5: surprised (shocked reaction)
- 6: confused (puzzled expression)
Use expressionIndex: -1 in SpeakAsync() or StartSpeakWithCallbacks() to generate voice-only output without video frames.
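As an illustration, here is a minimal sketch (assuming a character loaded as in the earlier speech example, and reusing the PlayGeneratedVideo helper shown above) that selects a non-default expression by index:

```csharp
// Illustrative sketch only: speak one line with the "smile" expression (index 3).
// Assumes `character` was loaded via LoadCharacterAsyncFromId and that this runs
// inside a coroutine on a MonoBehaviour that also defines PlayGeneratedVideo.
yield return character.StartSpeakWithCallbacks(
    text: "Great to see you again!",
    expressionIndex: 3, // 3 = smile (see the index list above)
    onAudioReady: (frameStream, audioClip) => StartCoroutine(PlayGeneratedVideo(frameStream, audioClip)),
    onAnimationComplete: () => Debug.Log("Smile line finished animating"),
    onError: (error) => Debug.LogError($"Speech generation failed: {error.Message}")
);
```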
Characters support two storage formats:
- Character data stored in a .bundle directory
- Appears as a single file in macOS Finder
- Contains Info.plist for proper macOS package metadata
- Automatically used on macOS platforms
- Character data stored in a regular directory
- Works on all platforms (Windows, macOS, Linux)
- Used on non-macOS platforms or when explicitly requested
Each character contains:
- character.json: Character configuration (name, gender, pitch, speed, intro)
- image.png: Character portrait image
- drivingFrames/: Expression data for each expression index
  - expression-N/: Folder for expression N
    - XXXXX.png: Generated driving frames
    - latents.bin: Precomputed latent representations
    - faces.json: Face detection and processing data
    - textures/: Precomputed texture data
- voice/: Voice model and configuration
  - sample.wav: Reference voice sample
  - voice_config.json: Voice generation parameters
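Regardless of format, saved characters can be discovered through the API. The sketch below is illustrative only (it assumes Initialize() has already been called and that CanUseBundle(), listed in the API reference as a static method, is exposed on LiveTalkAPI):

```csharp
// Illustrative sketch: inspect the save location, check bundle support,
// and enumerate saved character IDs.
Debug.Log($"Characters are stored at: {LiveTalkAPI.Instance.CharacterSaveLocation}");
Debug.Log($"Bundle format supported on this platform: {LiveTalkAPI.CanUseBundle()}");

foreach (string id in LiveTalkAPI.Instance.GetAvailableCharacterIds())
{
    // Each ID can be passed to LoadCharacterAsyncFromId, as shown earlier.
    Debug.Log($"Found character: {id}");
}
```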
LiveTalkAPI.Instance.Initialize(
LogLevel logLevel = LogLevel.INFO,
string characterSaveLocation = "",
string parentModelPath = "",
MemoryUsage memoryUsage = MemoryUsage.Balanced
)

// Create character with full options
IEnumerator CreateCharacterAsync(
string name,
Gender gender,
Texture2D image,
Pitch pitch,
Speed speed,
string intro,
string voicePromptPath,
Action<Character> onComplete,
Action<Exception> onError,
CreationMode creationMode = CreationMode.AllExpressions,
bool useBundle = true
)
// Load character by ID (from save location)
IEnumerator LoadCharacterAsyncFromId(
string characterId,
Action<Character> onComplete,
Action<Exception> onError
)
// Load character from specific path
IEnumerator LoadCharacterAsyncFromPath(
string characterPath,
Action<Character> onComplete,
Action<Exception> onError
)
// Get available characters
string[] GetAvailableCharacterIds()
string CharacterSaveLocation { get; }
// Check if bundle format is supported on current platform
static bool CanUseBundle()

// LivePortrait animation from texture list
FrameStream GenerateAnimatedTexturesAsync(Texture2D sourceImage, List<Texture2D> drivingFrames)
// LivePortrait animation from video
FrameStream GenerateAnimatedTexturesAsync(Texture2D sourceImage, VideoPlayer videoPlayer, int maxFrames = -1)
// LivePortrait animation from directory
FrameStream GenerateAnimatedTexturesAsync(Texture2D sourceImage, string drivingFramesPath, int maxFrames = -1)
// MuseTalk lip sync
FrameStream GenerateTalkingHeadAsync(Texture2D avatarTexture, string talkingHeadFolderPath, AudioClip audioClip)

string Name { get; }
string CharacterId { get; }
Gender Gender { get; }
Texture2D Image { get; }
Pitch Pitch { get; }
Speed Speed { get; }
string Intro { get; }
bool IsDataLoaded { get; }

// Make character speak with animation (pipelined audio/animation)
IEnumerator StartSpeakWithCallbacks(
string text,
int expressionIndex = 0, // Use -1 for voice-only
Action<FrameStream, AudioClip> onAudioReady = null, // Called when audio is ready
Action onAnimationComplete = null, // Called when all frames are generated
Action<Exception> onError = null
)
// Legacy method (still supported but StartSpeakWithCallbacks is preferred)
IEnumerator SpeakAsync(
string text,
int expressionIndex = 0, // Use -1 for voice-only
Action<FrameStream, AudioClip> onComplete = null,
Action<Exception> onError = null
)
// Static methods for loading
static IEnumerator LoadCharacterAsyncFromPath(
string characterPath,
Action<Character> onComplete,
Action<Exception> onError
)
static IEnumerator LoadCharacterAsyncFromId(
string characterId,
Action<Character> onComplete,
Action<Exception> onError
)

A reusable MonoBehaviour that handles character loading, idle animation, speech playback, and smooth transitions. This is the recommended way to integrate LiveTalk characters into your Unity scenes.
- Auto-loads character when assigned
- Idle animation playback (expression 0) at 25 FPS with ping-pong cycling
- Speech queueing with automatic playback
- Smooth transitions between idle and speech states
- Event-driven architecture for UI integration
- Audio-only mode for characters without avatars (narrators, phone voices, etc.)
- Static image fallback for characters without idle animations
using LiveTalk.API;
using UnityEngine;
using UnityEngine.UI;
public class CharacterPlayerExample : MonoBehaviour
{
[SerializeField] private Character myCharacter;
void Start()
{
// CharacterPlayer is automatically created by the Character
var player = myCharacter.CharacterPlayer;
// Subscribe to events
player.OnFrameUpdate += (frame) => {
// Display frame in your UI (e.g., RawImage)
GetComponent<RawImage>().texture = frame;
};
player.OnSpeechStarted += () => Debug.Log("Character started speaking");
player.OnSpeechEnded += () => Debug.Log("Character finished speaking");
player.OnCharacterLoaded += () => Debug.Log("Character loaded and ready");
// Queue speech (automatically plays idle animation between speeches)
player.QueueSpeech("Hello! I'm ready to talk.", expressionIndex: 0);
player.QueueSpeech("This is my second line.", expressionIndex: 3); // smile
}
}

PlaybackState State { get; } // Uninitialized, Loading, Idle, Speaking, Paused
Character Character { get; } // The LiveTalk Character
Texture DisplayImage { get; } // Current display frame
bool HasQueuedSpeech { get; } // True if speech is in queue
int QueuedSpeechCount { get; } // Number of queued speech items

// Queue a single speech line
void QueueSpeech(string text, int expressionIndex = 0, bool withAnimation = true)
// Queue multiple speech lines at once
void QueueSpeechBatch(List<string> textLines, int expressionIndex = 0, bool withAnimation = true)
// Control playback
void Pause()
void Resume()
void Stop() // Stops all speech and returns to idle
void ClearQueue() // Removes queued speech

event Action<Texture> OnFrameUpdate; // Fired for each new frame (idle or speech)
event Action OnSpeechStarted; // Speech playback started
event Action OnSpeechEnded; // Speech playback ended
event Action<Exception> OnError; // Error occurred
event Action OnCharacterLoaded; // Character finished loading
event Action OnIdleStarted; // Idle animation started

For characters without avatars (narrators, phone voices, etc.):
// CharacterPlayer automatically detects if character has no idle frames
// and switches to audio-only mode
player.QueueSpeech("This is narrator voice without animation.", withAnimation: false);

CharacterPlayer works seamlessly with Unity UI Toolkit by firing OnFrameUpdate events that you can display in VisualElements:
// In your UI Controller
private VisualElement _avatarDisplay;
private CharacterPlayer _player;
void SetupCharacter(Character character)
{
_player = character.CharacterPlayer;
// Display frames in UI Toolkit VisualElement
_player.OnFrameUpdate += (frame) => {
if (_avatarDisplay != null)
{
_avatarDisplay.style.backgroundImage = new StyleBackground(frame);
}
};
}

Orchestrates multi-character turn-based dialogue, handling speaker switching, audio coordination, and visual display management. Perfect for conversations, cutscenes, and interactive dialogues.
- Multi-character support with automatic speaker switching
- Turn-based dialogue with automatic queuing
- Visual switching - shows speaking character's animation
- Audio coordination - ensures only one character speaks at a time
- Event-driven - integrates easily with UI systems
- Supports audio-only characters (narrators)
using LiveTalk.API;
using UnityEngine;
using UnityEngine.UI;
public class DialogueExample : MonoBehaviour
{
[SerializeField] private Character detective;
[SerializeField] private Character suspect;
[SerializeField] private RawImage dialogueDisplay;
private DialogueOrchestrator _orchestrator;
void Start()
{
// Create orchestrator
var orchestratorObj = new GameObject("DialogueOrchestrator");
_orchestrator = orchestratorObj.AddComponent<DialogueOrchestrator>();
// Register characters
_orchestrator.RegisterCharacter("detective", detective.CharacterPlayer);
_orchestrator.RegisterCharacter("suspect", suspect.CharacterPlayer);
// Subscribe to events
_orchestrator.OnFrameUpdate += (frame) => {
dialogueDisplay.texture = frame;
};
_orchestrator.OnSpeakerChanged += (speakerId) => {
Debug.Log($"Now speaking: {speakerId}");
};
// Queue a conversation
PlayConversation();
}
void PlayConversation()
{
_orchestrator.QueueDialogue("detective", "Where were you on the night of the murder?", expressionIndex: 0);
_orchestrator.QueueDialogue("suspect", "I was at home, I swear!", expressionIndex: 5); // surprised
_orchestrator.QueueDialogue("detective", "Can anyone confirm that?", expressionIndex: 0);
_orchestrator.QueueDialogue("suspect", "My... my wife can.", expressionIndex: 4); // sad
}
}

Queue multiple dialogue segments at once:
var conversation = new List<DialogueOrchestrator.DialogueSegment>
{
new() { CharacterId = "detective", Text = "Let's review the evidence.", ExpressionIndex = 0 },
new() { CharacterId = "suspect", Text = "I didn't do it!", ExpressionIndex = 5 },
new() { CharacterId = "detective", Text = "Then explain this.", ExpressionIndex = 0 }
};
_orchestrator.QueueDialogueBatch(conversation);

bool IsPlaying { get; } // True if dialogue is currently playing
int QueuedDialogueCount { get; } // Number of queued dialogue segments
string CurrentSpeakerId { get; } // ID of currently speaking character

// Register/unregister characters
void RegisterCharacter(string characterId, CharacterPlayer player)
void UnregisterCharacter(string characterId)
// Queue dialogue
void QueueDialogue(string characterId, string text, int expressionIndex = 0, bool withAnimation = true)
void QueueDialogueBatch(List<DialogueSegment> segments)
// Control
void Stop() // Stop all dialogue
void ClearQueue() // Remove queued dialogue

event Action<string> OnSpeakerChanged; // Fired when active speaker changes (characterId)
event Action<Texture> OnFrameUpdate; // Fired for each frame of current speaker
event Action OnDialogueStarted; // Dialogue sequence started
event Action OnDialogueEnded; // All dialogue completed
event Action<Exception> OnError; // Error occurred

// Register narrator as audio-only character
_orchestrator.RegisterCharacter("narrator", narratorCharacter.CharacterPlayer);
// Queue narrator dialogue without animation
_orchestrator.QueueDialogue("narrator", "Meanwhile, in another part of town...",
expressionIndex: 0, withAnimation: false);

// In your UI Controller
private DialogueOrchestrator _orchestrator;
private VisualElement _speakerDisplay;
private Label _speakerNameLabel;
void SetupDialogue()
{
// Subscribe to frame updates
_orchestrator.OnFrameUpdate += (frame) => {
_speakerDisplay.style.backgroundImage = new StyleBackground(frame);
};
// Update UI when speaker changes
_orchestrator.OnSpeakerChanged += (speakerId) => {
_speakerNameLabel.text = speakerId;
};
_orchestrator.OnDialogueEnded += () => {
Debug.Log("Conversation finished!");
};
}

int TotalExpectedFrames { get; set; }
bool HasMoreFrames { get; }

FrameAwaiter WaitForNext() // For use in coroutines
bool TryGetNext(out Texture2D texture) // Non-blocking retrieval

- VERBOSE: Detailed debugging information
- INFO: General information messages (default)
- WARNING: Warning messages only
- ERROR: Error messages only
- Quality: Uses FP32 models for maximum quality (not recommended - minimal quality improvement)
- Performance: Loads all models upfront for faster first-time inference (desktop)
- Balanced: Loads models on-demand, recommended for desktop devices (default)
- Optimal: Minimal memory footprint, recommended for mobile devices
- VoiceOnly: Only generates voice, no visual expressions
- SingleExpression: Generates voice and talk-neutral expression only
- AllExpressions: Generates voice and all 7 expressions (default)
- Gender: Male, Female
- Pitch: VeryLow, Low, Moderate, High, VeryHigh
- Speed: VeryLow, Low, Moderate, High, VeryHigh
- Unity 6000.0.46f1 or later
- Platforms: macOS (CPU/CoreML), Windows (Not tested)
- Minimum 32 GB RAM recommended for character creation
- Storage space for models (~10GB total: ~7GB LiveTalk + ~3GB SparkTTS)
MacBook Pro M4 Max (ONNX Runtime with CoreML Execution Provider):
- Speech with lip sync generation: 10-11 FPS
- Character creation: 10 minutes per character (all expressions)
- Character creation: ~2 minutes per character (single expression)
LivePortrait Pipeline - 4 FPS:
- motion_extractor (FP32): 30-60ms
- warping_spade (FP16): 180-250ms
- landmark_runner (FP32): 2-3ms
MuseTalk Pipeline - 11-12 FPS:
- vae_encoder (FP16): 20-30ms
- unet (FP16): 30-40ms
- vae_decoder (FP16): 30-50ms
This project is licensed under the MIT License, following the licensing of the underlying technologies:
- LivePortrait: Licensed under the MIT License
- MuseTalk: Licensed under the MIT License
- SparkTTS: Licensed under the Apache License 2.0
- Other dependencies: Licensed under their respective open-source licenses
See the LICENSE file for details.
This project incorporates code and models from several open-source projects:
- LivePortrait - Portrait animation technology
- MuseTalk - Real-time lip synchronization
- SparkTTS - Text-to-speech synthesis
- ONNX Runtime - Cross-platform ML inference
Contributions are welcome! Please read our contributing guidelines and submit pull requests for any improvements.
- LivePortrait Team at KwaiVGI for portrait animation technology
- MuseTalk Team at TMElyralab for lip synchronization technology
- SparkTTS Team for text-to-speech synthesis
- ONNX Runtime team for cross-platform ML inference
See CHANGELOG.md for a detailed history of changes.