Age | Commit message (Collapse) | Author |
|
The channel module name was being used as the settings instance
name, which is redundant to the entries[0] "value as label"
behavior with the module name also @ entries[0].
The resulting paths resembled:
/module/rtv/compose/compose/layers/[N]/...
/module/rtv/compose/compose/texture/...
Now they are:
/module/rtv/channel/compose/layers/[N]/...
/module/rtv/channel/compose/texture/...
There's still work to be done in this area of rtv. It's unclear
if even a static "/module/rtv/channel" is correct, or if it
should show the subscript of the channels[] entry currently being
used when that becomes a proper nested settings situation, i.e.
/module/rtv/channels/[0]/compose/layers/[N]/...
...
/module/rtv/channels/[1]/spiro
...
/module/rtv/channels/[2]/roto
...
for a "channels=compose,spiro,roto" kind of hypothetical
another option which might make sense is to have the channel path
more like an auto-increment identifier so the channel contexts
have unique paths and don't collide on the stream. This could be
important for scenarios utilizing the "ref" module and trying to
reuse parts of the context from one channel to the next. When
the paths collide things tend to break in weird ways presently.
this area needs more thought
|
|
Silly and unnecessarily racy to do this in render_fragment now
that blinds is threaded
|
|
This is a first stab at threading blinds, while here I got rid of
the reliance on _checked() put_pixel for clipping.
This could be better, things like using a block put/copy instead
of put_pixel would be a huge advantage for blinds due to its
naturally contiguous blocks per-blind.
While this module when non-threaded wasn't especially slow, any
module leaving cores idle is depriving other potentially threaded
modules of utilizing them for the duration of the non-threaded
render. So now that rototiller is being used for compositions
and increasingly becoming something of a meta-demo, it's become
important to make everything as threaded as possible.
|
|
Especially since taps are established on first use, it's
desirable to get them online immediately so they show in things
like --print-pipes and modules::rkt/RocketEditor and not
spuriously once the context becomes actively rendered...
This kind of thing will be done for all modules eventually
|
|
computing the angle for every pixel coordinate from the origin is
costly, even with the approximate method added by c690303.
An easy speedup is to only do this once for a given frame
dimensions, and cache those results. In the form of a 32-bit
float, it's equivalent to caching a full page of pixel data.
This is slightly complicated by needing to be an effectively
global cache and the potential for multiple shapes contexts
rendering concurrently when part of a composition.
I think this particular situation highlights a need for something
equivalent generalized on-stream where modules can register
discoverable caches of costly to compute information, having a
high probability of being useful to others.
In this particular case it was alphazed's use of shapes in two
layers that made it an obvious win, even without any other
modules needing atan2() per-pixel with a centered origin.
With this commit:
Configured settings as flags:
--seed=0x64adabae '--module=compose,layers=blank\,shapes\\\,type\\\=pinwheel\\\,scale\\\=.9\\\,pinch\\\=.25\\\,pinch_spin\\\=.1\\\,pinches\\\=7\\\,points\\\=19\\\,spin\\\=.01\,shapes\\\,type\\\=pinwheel\\\,scale\\\=.9\\\,pinch\\\=1\\\,pinch_spin\\\=-.25\\\,pinches\\\=8\\\,points\\\=5\\\,spin\\\=0,texture=moire\,centers\=3' '--video=mem,size=1366x768'
FPS: 73
FPS: 74
FPS: 73
Without:
Configured settings as flags:
--seed=0x64adb857 '--module=compose,layers=blank\,shapes\\\,type\\\=pinwheel\\\,scale\\\=.9\\\,pinch\\\=.25\\\,pinch_spin\\\=.1\\\,pinches\\\=7\\\,points\\\=19\\\,spin\\\=.01\,shapes\\\,type\\\=pinwheel\\\,scale\\\=.9\\\,pinch\\\=1\\\,pinch_spin\\\=-.25\\\,pinches\\\=8\\\,points\\\=5\\\,spin\\\=0,texture=moire\,centers\=3' '--video=mem,size=1366x768'
FPS: 55
FPS: 54
FPS: 54
So it's significant, and in alphazed there's also a transition
from one scene with two full-screen shapes layers into a
checkered scene with shapes as the fill_module. Further
amplifying the payoff.. infact whenever shapes is used for a
fill_module in checkers, there's n_cpus shapes contexts created
because checkers is threaded. All of those would be benefitting
from the cache.
|
|
The color stuff as-is isn't cheap and doesn't even get used if
there's a texture present, so don't bother with it at all.
This is especially significant since this module isn't threaded,
so it ties up all the cores leaving most of them idle when part
of a composition.
Also since spiro doesn't clamp its coordinates to the fragment
dimensions, only considering the frame dimensions, it must
continue using the "checked" put_pixel variant...
|
|
This seems to make things work well enough for mingw+wine
Will probably revisit in the future. Adding an ewouldblock
helper rather than duplicating the ifdeffery seems likely
Let's just leave it like this for now and find out if a real
windows test succeeds
|
|
This is available in win32, unlike inet_aton()
|
|
In the early days of checkers when I introduced fill_module= with
the per-cpu contexts to still allow threaded rendering, the whole
seed passing to contexts thing wasn't as well sorted out. This
meant the contexts often produced vastly different outputs
despite being the same module, same seed, and same settings.
The consequence of that was that w/fill_module checkers would
produce crazy randomized output when you expected the same output
in the filled cells. But by using .cpu_affinity (which I had to
implement just for this use case actually) at least the different
outputs would become stable. It was a band-aid over a different
problem that still needed sorting out.
Nowadays, it seems like this is improved enough at least for
alphazed to look correct without the affinity hack, so I'm
removing it because it really kills checkers threaded
performance.
Whatever modules remain uncooperative WRT seed reproducibility,
they'll just need to be fixed up.
|
|
While here also did more minor optimizations moving things out of
the inner loops that were only there out of laziness...
|
|
Taken from this excellent post:
https://mazzo.li/posts/vectorized-atan2.html
While I haven't gone full vectorized, just getting rid of the
regular atan2f() call will be a big improvement.
This just adds the functionality, nothing is calling it yet.
|
|
This is just a first stab at threading shapes... whenever shapes
finds itself in a scene it easily becomes a significant
bottleneck, threading is a trivial path to improving that
somewhat.
While at it I also got rid of the need for _checked() variants
which also helps a bit.
But nothing here is proper optimization of the routines.. there's
too much math happening per pixel in a naive fashion.
|
|
|
|
trivial simplification
|
|
Silly typo, one of those fun C instances where it's surprising
how silently mostly-working such a blatant mistake can be.
For posterity:
The way this was even observed as having an affect is while
verifying graceful handling of connections broken while in the
listen backlog.
With an active scener session idle at the prompt, start another
telnet, connecting without receiving any banner (queued via
backlog), ^]cl that backlogged telnet. Then start another
telnet in the same way. Now go to the idle scener session and
quit. The latest telnet would just sit there, seemingly blocked
behind the broken-while-backlogged connection.
But what was really happening was the banner send got the error
on the broken connection after accepting, as you'd expect. This
bug in the errno tests prevented detecting the genuine error
though, leaving the broken session connected indefinitely.
Fun!
|
|
After putting the recv() in a for(;;) to not have to render a
frame per byte received, I completely dropped the ball on moving
the return and adding the continue to actually finish the change.
This makes creating new scenes via pasting long settings strings
far less laggy. A future improvement would be to not recv() a
byte at a time, but this really isn't a perf-sensitive thing.
|
|
This changes things so rkt won't exit with an error @ startup if
RocketEditor isn't already listening. It also tolerates
RocketEditor going away, and will show a "OFFLINE" overlay status
text should that happen w/connect=on.
Some status text has also been added to the "EXIT SCENE" 99999
scene for both the RocketEditor connection and the scener
enabled/disabled status. No indicator yet for if scener has a
connection though, only if it's listening or not via listen=on.
|
|
Remove spurious space
|
|
|
|
Trivial related indentation adjustment too
|
|
This adds a BBS-style interface for creating new scenes in a live
rkt session.
It listens on tcp port 54321 on localhost by default, just use
telnet to connect, the rest is fairly self-explanatory.
This is still early days, but it's a whole lot more than nothing.
|
|
With the "ref" builtin module established and seeming to work
well enough, it looks unlikely we'll need access to unresolvable
module names in the contexts.
The thinking originally was that these names might have special
syntax making them more generic. E.g something like
"@/module/rkt/scenes/[1]/drizzle" for a module_name would have
been supported, which would get resolved either at context create
or even later (as in the ref builtin) at render time.
But the ref builtin is using a path setting, so module names just
stay module names.
Maybe in the future a special syntax will be added for brevity
reasons, but it does make the code more complicated vs. module
names just being names and resolving them entirely at setup time.
Anyhow, this commit does away with the module_name in the
context's scenes. You can still access it via
til_module_context_t.module.name anyways... it's basically just a
resolution-of-name-to-context time constraint that's being
codified now.
|
|
In the interests of improving error handling of interactive
setups, return the setting that was invalid when setup returns
-EINVAL.
For now this is only supported for non-finalized/non-baking setup
calls, i.e. (!res_setup).
In the future I want this also supported for finalizing when
res_setup is set. A lot of the -EINVAL returns are actually
during that stage (or need to be added) due to that being when
the more sensitive parsing occurs going from strings to native
numeric types and such.
The main reason it's not already being done in this commit is
it'll churn quite a bit to get there, since the setup funcs don't
generally have the setting pointers onhand at that phase. It'll
require changing the string-value-centric local variables to
instead all be the til_setting_t* holding those values.
And if setup funcs aren't going to be so value-centric, the
settings API will likely change to not even return the values
directly anymore and only return the full-blown til_settings_t as
the main return pointer, perhaps rid of their res_setting use
even.
Anyway, while here I did some random little cleanups too.
|
|
Make this a distinct heap allocation so it can be enlarged when
editing the scenes... (preparatory commit for scenes editing)
|
|
Preparatory commit for adding an interactive scene editing server
of sorts. It'll go in a separate listing, but needs these types
as it'll operate on rkt_context_t->scenes[].
|
|
I had benchmarked this change and it showed no difference at all
on my 2c/4c i7 X230.
But having just tried it on an RPi4B where it moved the test case
from 54FPS to 60FPS, a +10% improvement, it's worth the
readability loss.
It's interesting how Intel's cleverness discourages optimizing in
ways that benefit probably *all* the competition... even when the
optimization is such a minor change in terms of effort.
|
|
Major gain comes from eliminating the cosf() from the inner loop...
There's still a bunch left on the table for moire but even just
these changes turn 19FPS into 81FPS over here for:
'--module=compose,layers=moire\\\,centers\\\=2\,moire\\\,centers\\\=2\,moire\\\,centers\\\=2\,moire\\\,centers\\\=2,texture=none' '--video=mem,size=1366x768'
|
|
This was written when module names were going to have an
@/path/to/context "handle" syntax. But instead I went the "ref"
builtin module route, with path=/path/to/context as a setting.
While it's more verbose in the settings, it "just works"
everywhere that can take a module+settings because the ref
builtin is just another module like any other.
So this TODO is referring to something that won't happen in a
"ref" builtin world.
|
|
This conditionally will end the stream on scene 99999 if
connect=off (playback mode)
When connect=on it'll just make it show an "EXIT SCENE"
diagnostic instead of the "NO SCENE" message.
Now you just stick 99999 in the rkt:scene track to end the show.
It's assumed 99999 scenes will never be needed...
|
|
This probably needs more work... it seems wrong to be bypassing
the taps altogether when dt is 0
|
|
And just maintain it as the last ticks value after rendering with
the context...
A couple modules were already doing this manually in an ad-hoc
fashion, just make it a general thing.
Updated those modules to reflect the new situation
Especially in a rkt world with modules::mixer doing fades, it
becomes common to render the same context twice in the same frame
for the blending. We need to prevent accelerated animations in
such situations. For now let's just rely on ticks in a delta-T
fashion to prevent animating the context when ticks is the same.
modules::stars in particular needs this fixed up, upcoming commit
|
|
This moves the tap updating to a function shared by rendering and
context create... so we can have a valid externally-driven tap
value before rendering a single frame if possible. (this is
important for not having spurious frames/flickers in rkt
sequences)
|
|
It's problematic getting this stuff online at render time,
because the modules end up rendering with uninitialized tap
values in that first frame.
With this change modules can get their taps on-stream at context
create, and we'll notice and do the initial rocket update just
before returning the rkt context to prime everyone on-stream.
|
|
This really needs SIMD to fly on-cpu, but this improves things
some.
Using `--module=mixer,style=fade,a_module=roto,b_module=roto\
--video=mem,size=1366x768 --defaults --go` to test:
Before FPS: 92-95 floating mostly around 94-95
After FPS: 107-111 floating mostly around 108-109
so +14.8% FPS
(2c/4t i7 X230)
|
|
It's problematic/racy to always be dereferencing the tap pointer
to access it's current value.
Running checkers:fill_module=mixer on a 2c/4t (n_cpus=4) machine
was quite effective at crashing in mixer::render_fragment()
due to dereferencing a NULL snapshot fragment.
The "tween" T value is being used to indicate when interpolation
of the snapshots is necessary. But repeatedly re-reading the T
value via the tap pointer would race with the driving tap. In
the case of checkers, which is a threaded module, the mixer
contexts are all rendering in parallel at the same path so
they're sharing a single pipe for their T taps.
This situation of sets of contexts sharing a single path,
resulting in their taps all landing on a single pipe, is still
largely up in the air and might be actively prevented in the
future. But until then, it can be made far less crashy and
problematic by just being more careful about dereferencing the
tap to access its latest value just once at the prepare frame
stage, storing it in the local T variable in the context. Then
all the render_fragment() accesses can at least find a stable
value in the context from prepare-to-render, so at least the
snapshots are there when they should be according to the T value
etc.
Something like this will probably need to be done regardless of
what happens with the context sets sharing the same path. Since
when a given tap isn't the driver, it still has to take care to
just grab the updated value once for the frame... The tap API
can't really automagically do that single update of the local
variable when passenger though, since it's been deliberately kept
devoid of all the type-specific accessor junk (ptr/elems etc are
all void **/void* in the tap api side). Hence why this commit is
just ad-hoc updating the local variable in the else branch; it's
best positioned to do so directly. But there still needs to be
more consideration for thread-safety.
|
|
Maybe earlier versions used the absolute coordinates in the
frame, but the current code doesn't make use of this and simply
needs to confine itself into the WxH of the fragment.
|
|
This is not optimized at all and tends to hurt the FPS
significantly. This is one of those things that would hugely
benefit from SIMD, but even without SIMD it could be done better.
I just slapped together something obvious for now, as I'd like to
focus more on the rkt side but need a better fader for scene
transitions than style=flicker.
Also changed {a,b}_module= preferred values to blank,compose
so you see something happen if you just run --defaults.
Otherwise, the compose,compose would just fade between two
identical compositions invisibly.
|
|
Originally I was thinking a variadic inputs= like compose::layers
would be desirable, with the T value's integer serving as an
index into the inputs, and the fraction between the integers as
the mixing T.
But I changed my mind and am instead constraining mixer
explicitly to two distinctly named modules; a_module= and
b_module= with the T value 0-1 mapping to a-b.
|
|
Not sure why this was -ENOMEM, module lookups don't allocate
anything...
|
|
I thought the build was already using -Wall but that seems to not
be the case, maybe got lost somewhere along the line or messed up
in configure.ac
After forcing a build with -Wall -Werror, these showed up.
Fixed up in the obvious way, nothing too scary.
|
|
Similar to fill= and fill_color=, these influence how to treat
the "clear" cells.
Until now "clear" cells would unconditionally just be cleared via
til_fb_fragment_clear(). Now they can also be filled
w/color,sampled,textured, and maybe in the future a clear_module=
will also be introduced.
Note that for now I've left the clear_color randomizer disabled,
see code comments.
Having this setting also makes me wonder if the "filled" cells
should be optionally cleared using the "clear" setting first.
Imagine a scenario where you have fill_module=shapes, and you
want the color around the shape to match the clear color, for
instance... Especially once you can pass a color down to the
fill_module, controlling these things becomes more critical.
There's definitely more work to do here.
|
|
Preparatory commit for introducing equivalents for the "clear"
cells.
This renamed the color= setting to fill_color=, in addition to
the internal naming.
|
|
This probably needs more work, but this at a minimum should
prevent us from leaking contexts in the stream at the myriad
paths we construct them at.
Context registration replaces what's at the existing path, but
rtv produces all sorts of setup paths, and I haven't added any
explicit unregistration of contexts at this time. That might
change, but for now let's just use this gc mechanism even if it's
a temporary hack.
|
|
this is very early/unfinished, hence experimental flag
|
|
The modules don't have to defend against this, vestigial from
simpler times
|
|
compose must have been derived from rtv originally, which uses
txt.h.
|
|
a390e82 stopped using this, but didn't remove it.
As it was initialized to NULL, it was deffectively all a NOOP.
|
|
The return value was just being ignored previously, and that
really starts mattering in a world with contexts finding others
by user-supplied paths making such failures far more likely.
|
|
These are already reality as of late
|
|
Just a boring replacement of the ad-hoc n_cpus context creates
for the fill_module contexts with the newly added libtil
equivalent function.
Future commits will expand the libtil side to get module contexts
registered for discovery purposes on-stream. This change moves a
bit closer towards that goal...
|