Age | Commit message (Collapse) | Author |
|
Snapshotting made page alloc/free more frequent, and
modules/mixer really hammers on snapshots in fade mode... drizzle
as well, but mixer snapshots both module outputs per frame.
So by keeping these around the reclaimed page snapshots can be
held onto at within the fb as spares for quicker allocation. In
practice it should just hit a high water mark of a working set
for spare pages, once in a steady state, assuming snapshots
aren't leaking.
Future commits will replicate this change in {drm,sdl}_fb
|
|
Slight improvement of CPU utilization for fragmenters using this
strategy...
I noticed tile64 would give better FPS in some scenarios where it
seemed obvious slice_per_cpu() was the appropriate option. And
that turned out to just be by virtue of being able to give idle
threads something to do while busy ones finished what was on
their plate.
So just make the slices a bit more granular than n_cpus... this
may have to be revisited in the future to find the sweet spot,
and may need to be more sophisticated than just multiplying by a
constant factor.
|
|
I had benchmarked this change and it showed no difference at all
on my 2c/4c i7 X230.
But having just tried it on an RPi4B where it moved the test case
from 54FPS to 60FPS, a +10% improvement, it's worth the
readability loss.
It's interesting how Intel's cleverness discourages optimizing in
ways that benefit probably *all* the competition... even when the
optimization is such a minor change in terms of effort.
|
|
Major gain comes from eliminating the cosf() from the inner loop...
There's still a bunch left on the table for moire but even just
these changes turn 19FPS into 81FPS over here for:
'--module=compose,layers=moire\\\,centers\\\=2\,moire\\\,centers\\\=2\,moire\\\,centers\\\=2\,moire\\\,centers\\\=2,texture=none' '--video=mem,size=1366x768'
|
|
This was written when module names were going to have an
@/path/to/context "handle" syntax. But instead I went the "ref"
builtin module route, with path=/path/to/context as a setting.
While it's more verbose in the settings, it "just works"
everywhere that can take a module+settings because the ref
builtin is just another module like any other.
So this TODO is referring to something that won't happen in a
"ref" builtin world.
|
|
This conditionally will end the stream on scene 99999 if
connect=off (playback mode)
When connect=on it'll just make it show an "EXIT SCENE"
diagnostic instead of the "NO SCENE" message.
Now you just stick 99999 in the rkt:scene track to end the show.
It's assumed 99999 scenes will never be needed...
|
|
I can't see this staying so simple for very long, but for now
this at least enables making rototiller-based rkt-sequenced demos
that exit gracefully when they're finished.
In a future where rtv may play embedded rkt configs+tracks, it
needs a way to detect the end of stream without making main exit.
But I'll cross that bridge when I get there...
|
|
rkt needs a way to signal the end of a sequence, this will
probably get more work in the future but something simple is fine
for now
|
|
This probably needs more work... it seems wrong to be bypassing
the taps altogether when dt is 0
|
|
And just maintain it as the last ticks value after rendering with
the context...
A couple modules were already doing this manually in an ad-hoc
fashion, just make it a general thing.
Updated those modules to reflect the new situation
Especially in a rkt world with modules::mixer doing fades, it
becomes common to render the same context twice in the same frame
for the blending. We need to prevent accelerated animations in
such situations. For now let's just rely on ticks in a delta-T
fashion to prevent animating the context when ticks is the same.
modules::stars in particular needs this fixed up, upcoming commit
|
|
I'm not sure if this will stay long-term, but it seems not be
disrupting linux builds and with open_memstream all gone now
mingw is succeeding with this present.
It would really make sense to simply replace all rand_r() usage
with a libtil rand_r equivalent. That way it can guaranteen
consistent behavior across platforms...
|
|
This moves the tap updating to a function shared by rendering and
context create... so we can have a valid externally-driven tap
value before rendering a single frame if possible. (this is
important for not having spurious frames/flickers in rkt
sequences)
|
|
It's problematic getting this stuff online at render time,
because the modules end up rendering with uninitialized tap
values in that first frame.
With this change modules can get their taps on-stream at context
create, and we'll notice and do the initial rocket update just
before returning the rkt context to prime everyone on-stream.
|
|
This really needs SIMD to fly on-cpu, but this improves things
some.
Using `--module=mixer,style=fade,a_module=roto,b_module=roto\
--video=mem,size=1366x768 --defaults --go` to test:
Before FPS: 92-95 floating mostly around 94-95
After FPS: 107-111 floating mostly around 108-109
so +14.8% FPS
(2c/4t i7 X230)
|
|
It's problematic/racy to always be dereferencing the tap pointer
to access it's current value.
Running checkers:fill_module=mixer on a 2c/4t (n_cpus=4) machine
was quite effective at crashing in mixer::render_fragment()
due to dereferencing a NULL snapshot fragment.
The "tween" T value is being used to indicate when interpolation
of the snapshots is necessary. But repeatedly re-reading the T
value via the tap pointer would race with the driving tap. In
the case of checkers, which is a threaded module, the mixer
contexts are all rendering in parallel at the same path so
they're sharing a single pipe for their T taps.
This situation of sets of contexts sharing a single path,
resulting in their taps all landing on a single pipe, is still
largely up in the air and might be actively prevented in the
future. But until then, it can be made far less crashy and
problematic by just being more careful about dereferencing the
tap to access its latest value just once at the prepare frame
stage, storing it in the local T variable in the context. Then
all the render_fragment() accesses can at least find a stable
value in the context from prepare-to-render, so at least the
snapshots are there when they should be according to the T value
etc.
Something like this will probably need to be done regardless of
what happens with the context sets sharing the same path. Since
when a given tap isn't the driver, it still has to take care to
just grab the updated value once for the frame... The tap API
can't really automagically do that single update of the local
variable when passenger though, since it's been deliberately kept
devoid of all the type-specific accessor junk (ptr/elems etc are
all void **/void* in the tap api side). Hence why this commit is
just ad-hoc updating the local variable in the else branch; it's
best positioned to do so directly. But there still needs to be
more consideration for thread-safety.
|
|
this was breaking mixer as checkers::fill_module when the
checkers were centered via shifting on the Y axis
|
|
Maybe earlier versions used the absolute coordinates in the
frame, but the current code doesn't make use of this and simply
needs to confine itself into the WxH of the fragment.
|
|
This should have been in there already
|
|
This is not optimized at all and tends to hurt the FPS
significantly. This is one of those things that would hugely
benefit from SIMD, but even without SIMD it could be done better.
I just slapped together something obvious for now, as I'd like to
focus more on the rkt side but need a better fader for scene
transitions than style=flicker.
Also changed {a,b}_module= preferred values to blank,compose
so you see something happen if you just run --defaults.
Otherwise, the compose,compose would just fade between two
identical compositions invisibly.
|
|
Originally I was thinking a variadic inputs= like compose::layers
would be desirable, with the T value's integer serving as an
index into the inputs, and the fraction between the integers as
the mixing T.
But I changed my mind and am instead constraining mixer
explicitly to two distinctly named modules; a_module= and
b_module= with the T value 0-1 mapping to a-b.
|
|
Not sure why this was -ENOMEM, module lookups don't allocate
anything...
|
|
Commit 24911280 made til_setup_t required, and most setup code
has already adopted til_module_setup_finalize() added in
c94e4683 for this purpose.
But til_module_setup()'s fall-through case when there's no
til_module_t.setup() on the module being setup was missed. This
broke running simple modules without a .setup() directly like
--module=roto, oops.
Fix is simple.
|
|
I thought the build was already using -Wall but that seems to not
be the case, maybe got lost somewhere along the line or messed up
in configure.ac
After forcing a build with -Wall -Werror, these showed up.
Fixed up in the obvious way, nothing too scary.
|
|
Similar to fill= and fill_color=, these influence how to treat
the "clear" cells.
Until now "clear" cells would unconditionally just be cleared via
til_fb_fragment_clear(). Now they can also be filled
w/color,sampled,textured, and maybe in the future a clear_module=
will also be introduced.
Note that for now I've left the clear_color randomizer disabled,
see code comments.
Having this setting also makes me wonder if the "filled" cells
should be optionally cleared using the "clear" setting first.
Imagine a scenario where you have fill_module=shapes, and you
want the color around the shape to match the clear color, for
instance... Especially once you can pass a color down to the
fill_module, controlling these things becomes more critical.
There's definitely more work to do here.
|
|
Preparatory commit for introducing equivalents for the "clear"
cells.
This renamed the color= setting to fill_color=, in addition to
the internal naming.
|
|
open_memstream() be gone
|
|
Module contexts are now discoverable via the stream a la
til_stream_find_module_contexts().
The current architecture has the stream adding a reference to all
the contexts registered. So even if they get unreferenced by
their creators, they will linger on-stream.
There's a gc mechanism til_stream_gc_module_contexts() which can
be used to trigger a cleanup of contexts _only_ referenced by the
stream.
It's unclear as of yet if this is the way to go long-term, but it
lets things work for now and allows some iterating and
experimentation to see where to go next.
|
|
This probably needs more work, but this at a minimum should
prevent us from leaking contexts in the stream at the myriad
paths we construct them at.
Context registration replaces what's at the existing path, but
rtv produces all sorts of setup paths, and I haven't added any
explicit unregistration of contexts at this time. That might
change, but for now let's just use this gc mechanism even if it's
a temporary hack.
|
|
Some rudimentary instrumentation for monitoring the active module
contexts alongside the pipes
You probably want to redirect stderr to a file when using
--print-pipes and/or --print-module-contexts...
e.g.
```
rototiller --defaults --go --print-pipes --print-module-contexts 2>/dev/null
```
or, if you still want to monitor FPS or log_channels=on in rtv,
2>/file/to/tail then tail -F /file/to/tail in another terminal.
|
|
Just more open_memstream() elimination
|
|
This opens up the possibility of referencing contexts on-stream
by-path anywhere a module may be used:
ref,path=/path/to/existing/context
Currently the path lookup is performed @ render time, and a
reference is taken on the context found there on the first time
it's located. That reference is then held in the ref's context
until its context is destroyed.
There are more details to flesh out, and what probably needs to
happen is some til_stream API for maintaining the reference in
the event that the context at that path gets replaced. Due to
refcounting, the referenced context will persist even if the
stream-resident instance goes away, but there should probably be
a way for that replacement to invalidate the existing references
so they replace theirs with a fresh lookup at the path...
Either way this is a good start. With this addition you can
effectively implement scene transitions in rkt, via the mixer
module, something like (sans proper escaping):
mixer,a_module=ref\,path=/foo/previous/ctxt,b_module=ref\,path=/foo/next/ctxt
|
|
In order for modules like rkt to be able to do integrated
transitions, there must be a way for discovering existing module
contexts by path and using them for rendering in new
compositions.
This is a first stab at something along those lines. The whole
multiple contexts at the same path pattern has been partially
handled in this implementation, but I think it will just be going
away and checkers::fill_module refactored to not do that.
|
|
This eliminates more open_memstream() usage in favor of the new
til_str stuff.
While here I've split up the path construction API exposing the
til_str oriented variants, rather than only providing a
FILE*-oriented API. This way you just use the FILE* stuff when
convenient (like printing to stdout/stderr) and go til_str_t when
you're building up a buffer.
Such is life in a sans-open_memstream-world.
Also while here the FILE*-oriented settings path printers were
renamed s/print/fprint/g hence touching setup.c
|
|
This becomes necessary in a world with
externall-discoverable-on-stream-contexts that can be arbitrarily
referenced by other contexts.
There will probably be more complexity necessary for invalidating
references, but this is a start.
|
|
First elimination of open_memstream() usage...
|
|
Preparation for eliminating open_memstream() usage...
|
|
open_memstream() isn't implemented by windows (not even mingw)
|
|
drm_fb.c did this locally for asprintf() but since then rand_r
and open_memstream have entered the picture as well...
this hasn't been a problem on gnu/linux builds but let's just
globally go _GNU_SOURCE for now. In mingw land I suspect
open_memstream in particular is going to be a PITA and require
replacement, judging from search results anyways
|
|
this is very early/unfinished, hence experimental flag
|
|
The modules don't have to defend against this, vestigial from
simpler times
|
|
compose must have been derived from rtv originally, which uses
txt.h.
|
|
a390e82 stopped using this, but didn't remove it.
As it was initialized to NULL, it was deffectively all a NOOP.
|
|
When all the stream encapsulated were pipes/taps, naming was less
precise. With module contexts in the process of being registered
in the stream, there's a need to distinguish things more.
This is a largely mechanical naming change...
|
|
The return value was just being ignored previously, and that
really starts mattering in a world with contexts finding others
by user-supplied paths making such failures far more likely.
|
|
These are already reality as of late
|
|
This is already in the header
|
|
Just a boring replacement of the ad-hoc n_cpus context creates
for the fill_module contexts with the newly added libtil
equivalent function.
Future commits will expand the libtil side to get module contexts
registered for discovery purposes on-stream. This change moves a
bit closer towards that goal...
|
|
checkers::fill_module is presently implemented via creating
n_cpus identical module contexts, each having n_cpus=1.
Establishing a set of cloned per-cpu contexts, allowing threaded
rendering using any fill_module, regardless of if that module
internally implements threaded rendering.
This works fine, but creates some awkwardness for a future where
contexts are registered and discoverable within the stream at
their respective setup's path. In the checkers::fill_module
case, there would be n_cpus contexts at the same path since they
share the same til_setup_t. Those need to be dealt with
gracefully, ideally making the clones available to another
potentially clone-needing module (imagine a checkers::fill_module
transitionining to another checkers::fill_module situation, where
the transitioned-to fill_module refers to the context by path, it
should get all the contexts out of the stream as-is)
In this commit I'm adding a more formalized method of creating
multiple contexts from the same set of parameters, in preparation
for a future where module contexts get registered on the stream
at their til_setup_t.path. By putting the cloned creates behind
the til API it should at least be relatively trivial to get the
on-stream context registration to capture the multiple contexts.
|
|
This makes TIL_MAXCPUS a proper thing
|
|
The win32 and mach builds were free to violate this, as it's
always just been a local define for putting some kind of bound on
the linux "/sys/devices/system/cpu/cpuN" probing loop.
But in preparation for moving TIL_MAXCPUS out to til_util.h for
public consumption in sizing per-cpu arrays, it needs to actually
be enforced consistently as an upper bound cap. That way
everything can safely assume n_cpus won't ever exceed
TIL_MAXCPUS.
|