Age | Commit message (Collapse) | Author |
|
Modules can now use the til_module_t.finish_frame() return value
to trigger re-rendering by returning 1, returning 0 finishes the
frame.
A smattering of til_module_t.finish_frame() implementations were
largely mechanically updated to match this change by returning 0,
since nothing actually uses multi-pass rendering yet.
The impetus for this is experimenting with the flow module doing
two passes of threaded rendering per frame. A first pass to
sample the flow field and update the elements, per-cpu, but
drawing nothing. Then a second pass to render the elements in a
tiled manner.
|
|
The repro is:
--seed=0x64f6820b '--module=compose,layers=blank\,pixbounce\\\,pixmap_size\\\=0.8\\\,pixmap\\\=err\,pixbounce\\\,pixmap_size\\\=0.4\\\,pixmap\\\=ignignokt,texture=voronoi\,cells\=512\,randomize\=on' '--video=mem,size=3840x2160'
The major culprit seems to be the combination of high resolution,
and small number of voronoi cells (cells=512), with randomize=on
which exercises jumpfill every frame.
The way jumpfill is implemented currently is racy by design to
allow threading, and mostly works fine despite not really being
how the algorithm is intended to work.
The assumption has been, something like:
"the seeds are already placed before the threaded phase, so the
threaded jumpfill should at least find stable seed cells in the
face of racing against other tiles being jumpfilled
simultaneously"
But it appears that assumption isn't always true, in that we
won't necessarily find one of the seed cells at the start of the
jumpfill when there aren't that many cells (512) compared to the
area of the voronoi (3840x2160).
By noticing when we've finished a tile's jumpfill with remaining
unassigned cells, we can just repeat the jumpfill, with time
passed, and the other tiles will have made progress on their work
propagating more knowledge of where cells are... so the
subsequent pass will probably leave nothing unassigned.
This approach sucks, but stops the crashing.
It'd also be possible to just change the way cells are looked up
so there's no potential for a NULL pointer dereference, just have
some uninitialized cell color which gets shown erroneously in the
output. That avoids the computational cost of repeating the
tile's jumpfill, and likely nobody would notice the likely single
pixel of error for a single frame.
I'm just doing this quick and dirty fix to prevent the crashing
for now, and would like to just revisit voronoi more thoroughly
with an eye towards decoupling the voronoi cost from the
resolution. It's a cheap hack the way there's a distance entry
per pixel, done just to simplify the implementation when I
slapped it together on a Zephyr train ride.
|
|
More setup_func conversion to returning the failed setting on
errors during res_setup baking.
|
|
Particularly with nested modules it's annoying to have to stow
the module separate from the setup during the setup process.
If the baked setup included the module pointer in the
non-module-specific-setup part of the setup, then nested settings
could finalize using the generic module setup wrapper and just
rely on this til_setup_t.creator pointer to contain the
appropriate module. Which should enable tossing out a bunch of
copy-n-pasta surrounding nested modules setup.
Note this has to be a void* since til_setup_t is a generic thing
used equally by both the fb code and the module code. Hence why
this is called "creator" and not "module", as well as the void*
as opposed to til_module_t*.
Also if rototiller ever grows a sound backend, the setup
machinery will be reused there as well, and it'll be yet another
creator handle that isn't an til_fb_ops_t or a til_module_t.
It's assumed that the callers producing these setups won't be
trying to pass them to the wrong places i.e. a module setup
getting passed to an fb backend and vice versa.
I'm mildly annoyed about having to move the various til_module_t
blocks to above the module's foo_setup(), but it seemed like the
least annoying option. This may be revisited.
|
|
I thought the build was already using -Wall but that seems to not
be the case, maybe got lost somewhere along the line or messed up
in configure.ac
After forcing a build with -Wall -Werror, these showed up.
Fixed up in the obvious way, nothing too scary.
|
|
This changes til_setup_t* from optional to required for
til_module_context_t creation, while dropping the separate path
parameter construction and passing throughout.
|
|
This commit adds passing the settings instance to til_setup_new()
which is used for deriving a path for the setup via
til_settings_print_path() on the supplied settings.
That path gets an allocated copy left in the returned
til_setup_t at til_setup_t.path
This path will exist for the lifetime of the til_setup_t, to be
freed along with the rest of the baked setup instance when the
refcount reaches 0.
The incoming til_settings_t is only read @ til_setup_new() in
constructing the path, no reference is kept. Basically the
til_settings_t* is just passed in for convenience reasons, since
constructing the path needs memory and may fail, this approach
lets the existing til_setup_new() call error handling also
capture the path allocation failures as-is turning
til_setup_new() into a bit more of a convenience helper.
Note that now all code may assume a til_setup_t has a set and
valid til_setup_t.path, which should be useful for context
creates when a setup is available.
|
|
Commit 7c8086020 switched the til_setup_new() api to support NULL
free_func for free().
This mechanical change pivots to that instead of the awkwardly
cast free() parameters.
|
|
When voronoi is overlayed the colors get sampled, so randomizing
them is pointless work every frame.
|
|
This seems to work on my 2c/4t laptop and is certainly faster.
But I'm not being careful about using atomics loading/storing the
d->cell pointer, which seems problematic. Surprisingly things
aren't crashing here despite that, maybe on a non-x86 smp box
it'd be a different story.
|
|
Preparatory commit for doing the voronoi distance calculations in
parallel when possible, as part of render_fragment() instead of
all in prepare_frame().
Not all of the distance calculation work can easily be threaded,
but it should be possible to compute the post-seed distances
concurrently within the spatial bounds of the tiled fragment.
This commit doesn't actually change anything functionally, as
it's just splitting the old voronoi_calculate_distances() into
two and calling them both in succession still from
voronoi_prepare_frame().
Subsequent commits will work towards making the render_pass()
fragment-aware then ultimately moved to voronoi_render_fragment()
|
|
This variant is kind of a broken hack, and its brokenness becomes
more apparent in a threaded voronoi world.
So just drop it for now.
I am interested in more voronoi variants, but they can't
compromise correctness/introduce instabilities or significantly
interfere with performance improvements like threaded rendering.
The dithered-ish look of dirty=on was an interesting variant
though... bummer.
|
|
trivial change; voronoi_sample_colors() only reads from the fragment
|
|
With setup refcounting and a reference bound to the context, we
should just dereference the single instance. The way setups are
used it just as a read-only thing to affect context behavior...
Note I've left the module-type-specific setup pointer despite it
duplicating the setup pointer in the module_context. This is
just a convenience thing so the accessors don't have to cast the
general til_setup_t* to my_module_setup_t* everywhere.
|
|
This just does the obvious pulling in of til_setup_t, holding the
reference throughout the lifetime of the module context.
|
|
There was a time when it made sense for context creates needing
setups but not receiving them to still be functional with some
sane defaults.
But with recursive settings, we really shouldn't ever have
orphaned nested module uses unreachable by a proper setup.
So let's just get rid of this fallback, and exclusively rely on
the baked setups provided by the .setup() methods. They still
have preferred defaults, and the proper setup production
machinery is what should be responsible for applying those
at runtime where they may also be overridden or otherwise
influenced.
|
|
For recursive settings the individual setting being described
needs to get added to a potentially different settings instance
than the one being operated on at the top of the current
setup_func phase.
The settings instance being passed around for a setup_func to
operate on is constified, mainly to try ensure modules don't
start directly mucking with the settings. They're supposed to
just describe what they want next and iterate back and forth,
with the front-end creating the settings from the returned descs
however is appropriate, eventually building up the settings to
completion.
But since it's the setup_func that decides which settings
instance is appropriate for containing the setting.. at some
point it must associate a settings instance with the desc it's
producing, one that is going to be necessarily written to.
So here I'm just turning the existing til_setting_desc_t to a
"spec", unchanged. And introducing a new til_setting_desc_t
embedding the spec, accompanied by a non-const til_settings_t*
"container".
Now what setup_funcs use to express settings are a spec,
otherwise identically to before. Instead of cloning a desc to
allocate it for returning to the front-end, the desc is created
from a spec with the target settings instance passed in.
This turns the desc step where we take a constified settings
instance and cast it into a non-const a more formal act of going
from spec->desc, binding the spec to a specific settings
instance. It will also serve to isolate that hacky cast to a
til_settings function, and all the accessors of
til_setting_desc_t needing to operate on the containing settings
instance can just do so.
As of this commit, the container pointer is just sitting in the
desc_t but isn't being made use of or even assigned yet. This is
just to minimize the amount of churn happening in this otherwise
mostly mechanical and sprawling commit.
There's also been some small changes surrounding the desc
generators and plumbing of the settings instance where there
previously wasn't any. It's unclear to me if desc generators
will stay desc generators or turn into spec generators. For now
those are mostly just used by the drm_fb stuff anyways, modules
haven't made use of them, so they can stay a little crufty
harmlessly for now.
|
|
Let's make it so til_module_context_t as returned from
til_module_context_new() can immediately be freed via
til_module_context_free().
Previously it was only after the context propagated out to
til_module_context_create() that it could be freed that way, as
that was where the module member was being assigned.
With this change, and wiring up the module pointer into
til_module_t.create_context() as well for convenient providing to
til_module_context_new(), til_module_t.create_context() error
paths can easily cleanup via `return til_module_context_free()`
But this does require the til_module_t.destroy_context() be able
to safely handle partially constructed contexts, since the
mid-create failure freeing won't necessarily have all the members
initialized. There will probably be some NULL derefs to fix up,
but at least the contexts are zero-initialized @ new.
|
|
This was mostly done out of convenience at the expense of turning
the fragment struct into more of a junk drawer.
But properly cleaning up owned stream pipes on context destroy
makes the inappropriateness of being part of til_fb_fragment_t
glaringly apparent.
Now the stream is just a separate thing passed to context create,
with a reference kept in the context for use throughout. Cleanup
of the owned pipes on the stream supplied to context create is
automagic when the context gets destroyed.
Note that despite there being a stream in the module context, the
stream to use is still supplied to all the rendering family
functions (prepare/render/finish) and it's the passed-in stream
which should be used by these functions. This is done to support
the possibility of switching out the stream frame-to-frame, which
may be interesting. Imagine doing things like a latent stream
and a future stream and switching between them on the fly for
instance. If there's a sequencing composite module, it could
flip between multiple sets of tracks or jump around multiple
streams with the visuals immediately flipping accordingly.
This should fix the --print-pipes crashing issues caused by lack
of cleanup when contexts were removed (like rtv does so often).
|
|
There needs to be a way to address module context instances
by name externally, in a manner complementary to settings and
taps.
This commit adds a string-based path to til_module_context_t, and
modifies til_module_create_context() to accept a parent path
which is then concatenated with the name of the module to produce
the module instance's new path.
The name separator used in the paths is '/' just like filesystem
paths, but these paths have no relationship to filesystems or
files.
The root module context creation in rototiller's main simply
passes "" as the parent path, resulting in a "/" root as one
would expect.
There are some obvious complications introduced here however:
- checkers in particular creates a context per cpu, simply using
the same seed and setup to try make the contexts identical at
the same ticks value. With this commit I'm simply passing the
incoming path as the parent for creating those contexts, but
it's unclear to me if that will work OK. With an eye towards
taps deriving their parent path from the context path, I guess
these taps would all get the same parent and hash to the same
value despite being duplicated. Maybe it Just Works, but one
thing is clear - there won't be any way to address the per-cpu
taps as-is. Maybe that's desirable though, there's probably
not much use in trying to control the taps at the CPU
granularity.
- when the recursive settings stuff lands, it should bring along
the ability to explicitly name settings blocks. Those names
should override the module name in constructing the path.
I've noted as such in the code.
- these paths probably need to be hashed @ initialization time
so there needs to be a hash function added to til, and a hash
value accompanying the name in the module context. It'd be
dumb to keep recomputing the hash when these paths get used
for hash table lookups multiple times per frame...
there's probably more I'm forgetting right now, but this seems
like a good first step.
fixup root path
|
|
Preparatory commit for enabling cloneable/swappable fragments
There's an outstanding issue with the til_fb_page_t submission,
see comments. Doesn't matter for now since cloning doesn't happen
yet, but will need to be addressed before they do.
|
|
modules/checkers w/fill_module=$module requires a consistent
mapping of cpu to fragnum since it creates a per-cpu
til_module_context_t for the fill_module.
The existing implementation for threaded rendering maximizes
performance by letting *any* scheduled to run thread advance
fragnum atomically and render the acquired fragnum
indiscriminately. A side effect of this is any given frame, even
rendered by the same module, will have a random mapping of
cpus/threads to fragnums.
With this change, the simple til_module_t.prepare_frame() API of
returning a bare fragmenter function is changed to instead return
a "frame plan" in til_frame_plan_t. Right now til_frame_plan_t
just contains the same fragmenter as before, but also has a
.cpu_affinity member for setting if the frame requires a stable
relationship of cpu/thread to fragnum.
Setting .cpu_affinity should be avoided if unnecessary, and that
is the default if you don't mention .cpu_affinity at all when
initializing the plan in the ergonomic manner w/designated
initializers. This is because the way .cpu_affinity is
implemented will leave threads spinning while they poll for
*their* next fragnum using atomic intrinsics. There's probably
some room for improvement here, but this is good enough for now
to get things working and correct.
|
|
Also wire this up to the til_module_context_new() helper and
all its callers.
This is in preparation for modules doing more correct delta-T
derived animation.
|
|
- modules now allocate their contexts using
til_module_context_new() instead of [cm]alloc().
- modules simply embed til_module_context_t at the start of their
respective private context structs, if they do anything with
contexts
- modules that do nothing with contexts (lack a create_context()
method), will now *always* get a til_module_context_t supplied
to their other methods regardless of their create_context()
presence. So even if you don't have a create_context(), your
prepare_frame() and/or render_fragment() methods can still
access seed and n_cpus from within the til_module_context_t
passed in as context, *always*.
- modules that *do* have a create_context() method, implying they
have their own private context type, will have to cast the
til_module_context_t supplied to the other methods to their
private context type. By embedding the til_module_context_t at
the *start* of their private context struct, a simple cast is
all that's needed. If it's placed somewhere else, more
annoying container_of() style macros are needed - this is
strongly discouraged, just put it at the start of struct.
- til_module_create_context() now takes n_cpus, which may be set
to 0 for automatically assigning the number of threads in its
place. Any non-zero value is treated as an explicit n_cpus,
primarily intended for setting it to 1 for single-threaded
contexts necessary when embedded within an already-threaded
composite module.
- modules like montage which open-coded a single-threaded render
are now using the same til_module_render_fragment() as
everything else, since til_module_create_context() is accepting
n_cpus.
- til_module_create_context() now produces a real type, not void
*, that is til_module_context_t *. All the other module
context functions now operate on this type, and since
til_module_context_t.module tracks the module this context
relates to, those functions no longer require both the module
and context be passed in. This is especially helpful for
compositing modules which do a lot of module context creation
and destruction; the module handle is now only needed to create
the contexts. Everything else operating on that context only
needs the single context pointer, not module+context pairs,
which was unnecessarily annoying.
- if your module's context can be destroyed with a simple free(),
without any deeper knowledge or freeing of nested pointers, you
can now simply omit destroy_context() altogether. When
destroy_context() is missing, til_module_context_free() will
automatically use libc's free() on the pointer returned from
your create_context() (or on the pointer that was automatically
created if you omitted create_context() too, for the
bare til_module_context_t that got created on your behalf
anyways).
For the most part, these changes don't affect module creation.
In some ways this eases module creation by making it more
convenient access seed and n_cpus if you had no further
requirement for a context struct.
In other ways it's slightly annoying to have to do type-casts
when you're working with your own context type, since before it
was all void* and didn't require casts when assigning to your
typed context variables.
The elimination for requiring a destroy_context() method in
simple free() of private context scenarios removes some
boilerplate in simple cases.
I think it's a wash for module writers, or maybe a slight win for
the simple cases.
|
|
This is a mostly mechanical change of using rand_r() in place of
rand(), using the provided seed as the seed state.
There's some outstanding rand()s outside of create_context()
which should probably get switched over, with the seed being
stowed in the context struct. I didn't bother going deeper on
this at the moment in the interests of getting to sleep soon.
|
|
n_cells is a size_t, use %zu
clang complained, gcc doesn't, huh
|
|
In the recent surge of ADD-style rtv+compose focused development,
a bunch of modules were changed to randomize initial states at
context_create() so they wouldn't be so repetitive.
But the way this was done in a way that made it impossible to
suppress the randomized initial state, which sometimes may be
desirable in compositions. Imagine for instance something like
the checkers module, rendering one module in the odd cells, and
another module into the even cells. Imagine if these modules are
actually the same, but if checkers used one seed for all the odd
cells and another seed for all the even cells. If the modules
used actually utilized the seed provided, checkers would be able
to differentiate the odd from even by seeding them differently
even when the modules are the same.
This commit is a step in that direction, but rototiller and all
the composite modules (rtv,compose,montage) are simply passing
rand() as the seeds. Also none of the modules have yet been
modified to actually make use of these seeds.
Subsequent commits will update modules to seed their
pseudo-randomized initial state from the seed value rather than
always calling things like rand() themselves.
|
|
We don't actually want to produce indices 0-width and 0-height
|
|
Just one case, modules/submit, was using 32x32 tiles and is now
using 64x64. I don't expect it to make any difference.
While here I fixed up the num_cpus/n_cpus naming inconsistencies,
normalizing on n_cpus.
|
|
Fragmenting is often dimensioned according to the number of cpus,
and by not supplying this to the fragmenter it was made rather
common for module contexts to plumb this themselves - in some
cases incorporating a context type/create/destroy rigamarole
for the n_cpus circuit alone.
So just plumb it in libtil, and the prepare_frame functions can
choose to ignore it if they have something more desirable onhand.
Future commits will remove a bunch of n_cpus from module contexts
in favor of this.
|
|
This adds a voronoi diagram module, which when used as an overlay
produces a mosaic effect.
Some settings:
cells=N number of voronoi cells
randomize={on,off} randomizes the cell locations every frame
dirty={on,off} uses a faster sloppy/dithery-looking method
Some TODO items:
- use a more space efficient representation of the distance
buffer, maybe use uint16_t relative offsets into the cells
rather than pointers - capping their quantity to 64KiB
- anti-alias edges between cells
|