
Conversation


@BuffMcBigHuge commented Dec 30, 2025

This PR implements full VACE (Video All-In-One Creation and Editing) support for the KREA Realtime Video pipeline. It addresses significant architectural challenges in adapting the VACE system—originally designed for the 1.3B "Longlive" model—to work with the much larger Wan 2.1 14B model used by KREA.

@BuffMcBigHuge force-pushed the marco/feat/krea-vace-14b branch from 4bc5216 to 6d40951 on December 30, 2025 at 22:53
.cursorindexingignore

.cursor/
.specstory/
@BuffMcBigHuge (Author) replied:

@yondonfu I can remove this if you like. SpecStory is a Cursor plugin I use to export chat history.

@BuffMcBigHuge marked this pull request as ready for review on December 30, 2025 at 23:36

@leszko commented Jan 2, 2026

FWIW, I've played with this PR and I'm seeing this weird loop. @ryanontheinside mentioned it's probably that we're sending the reference image to each chunk.

demo_krea.mp4

# - Only start skipping once the cache is in steady-state:
# current_start_frame >= kv_cache_num_frames
# - Recompute every N "blocks" (a block produces num_frame_per_block frames).
every_n = int(os.getenv("SCOPE_KV_CACHE_RECOMPUTE_EVERY", "1") or "1")
Review comment:

I'm not sure if we use env variables in other blocks, but I think we may want to try to use input params instead. This would help someone wanting to reuse the block or the pipeline to interact with it.
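
For illustration, a minimal sketch of the input-param route, assuming the block exposes an InputParam and reads it from self.parameters the way vace_enabled is read elsewhere in this PR; the name kv_cache_recompute_every and the exact InputParam signature are placeholders, not the final API:

```python
# Sketch only: expose the knob as an input param instead of an env var.
# In the block's input params list (name/signature are illustrative):
InputParam(
    "kv_cache_recompute_every",
    description=(
        "Recompute the KV cache every N blocks once the cache is in "
        "steady-state; 1 (the current default) recomputes on every block."
    ),
),

# ...and inside the block body, replacing the os.getenv lookup:
every_n = int(self.parameters.get("kv_cache_recompute_every", 1) or 1)
```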

Review comment:

This is interesting - if you do move to input params, it would be good to see some documentation explaining what the different values can be used for.

@ryanontheinside commented:

> FWIW, I've played with this PR and I'm seeing this weird loop. @ryanontheinside mentioned it's probably that we're sending the reference image to each chunk.
>
> demo_krea.mp4

After looking at another example, it could also be a problem with the VAE. I had Claude replicate the test_vace.py script from LongLive and tested KREA offline. The demarcation at chunk boundaries is familiar and makes me think it's a VAE issue. "She jumps":

output_r2v.mp4

I also tested LongLive; there is no regression there.

WARMUP_PROMPT = [{"text": "a majestic sunset", "weight": 1.0}]


class KreaRealtimeVideoPipeline(Pipeline, LoRAEnabledPipeline):
Review comment:

Can we use VACEEnabledPipeline here instead of the lazy loading? We switched to the mixin recently
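
For reference, roughly what that could look like, assuming VACEEnabledPipeline is a mixin used the same way as LoRAEnabledPipeline; the import path below is a guess, not the actual one:

```python
# Sketch only; the real import path and mixin hooks may differ.
from scope.core.pipelines.vace import VACEEnabledPipeline  # hypothetical path


class KreaRealtimeVideoPipeline(Pipeline, LoRAEnabledPipeline, VACEEnabledPipeline):
    # VACE wiring comes from the mixin rather than being lazily imported
    # inside the pipeline body.
    ...
```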

# This fixes the chicken-and-egg problem where VACE isn't enabled until vace_input_frames arrives
vace_enabled = self.parameters.get("vace_enabled", False)

# Robustness fallback: If vace_enabled is missing (e.g. old frontend build),
Review comment:

We could expect the frontend to match the build here instead of adding fallbacks, wdyt?
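
If we go that route, the stricter version could be as small as the sketch below, assuming the frontend build always sends the flag; the error message is illustrative:

```python
# Sketch: require vace_enabled rather than falling back when it is missing.
try:
    vace_enabled = self.parameters["vace_enabled"]
except KeyError as exc:
    raise ValueError(
        "vace_enabled missing from parameters; frontend and backend builds are out of sync"
    ) from exc
```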


from scope.core.pipelines.wan2_1.utils import initialize_kv_cache


def _get_block_mask_model(model: torch.nn.Module) -> torch.nn.Module:
Review comment:

could this be at the class level as opposed to the module level?
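
For illustration, moving it to class scope could look like the sketch below; the class name is a placeholder for whichever class owns the block-mask logic, and the body would stay the same as the current module-level helper:

```python
import torch


class KreaVaceBlocks:  # placeholder name, not the actual class
    @staticmethod
    def _get_block_mask_model(model: torch.nn.Module) -> torch.nn.Module:
        # Same body as the existing module-level _get_block_mask_model,
        # just scoped to the class so it isn't a free function in the module.
        ...
```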

description="Input frames for VACE conditioning (if present, indicates video input is enabled)",
),
InputParam(
"soft_transition_active",
Review comment:

Assuming this is vestigial from experimentation - AFAIK this is never set to true
