Most modules find themselves wanting some kind of "t" value increasing
with time or frames rendered. It's common for them to create and
maintain this variable locally, incrementing it with every frame.
It may be interesting to introduce a global notion of ticks since
rototiller started, and have all modules derive their "t" value from
this instead of having their own private versions of it.
In future modules and general innovations, it seems likely that playing
with time, like jumping it forwards and backwards to achieve visual
effects, will be desirable. This isn't applicable to all modules, but
for many their entire visible state is derived from their "t" value,
making them entirely reversible.
This commit doesn't change any modules functionally; it only adds the
plumbing to pull a ticks value down to the modules from the core.
A ticks offset has also been introduced in preparation for supporting
dynamic shifting of the ticks value, though no API is added for doing
so yet.
It also seems likely an API will be needed for disabling the
time-based ticks advancement, with functions for explicitly setting
its value. If modules are created for incorporating external
sequencers and music coordination, they will almost certainly need to
manage the ticks value explicitly. When a sequencer jumps
forwards/backwards in the creative process, the module glue
responsible will need to keep ticks synchronized with the
sequencer/editor tool.
Before any of this can happen, we need ticks as a first-class core
thing shared by all modules.
Future commits will have to modify existing modules to use the ticks
appropriately, replacing their bespoke variants.
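As a rough sketch of what this plumbing could look like (everything
here is illustrative; start_ts, ticks_offset and get_ticks() are
hypothetical names, not the actual API):

    #include <stdint.h>
    #include <time.h>

    /* Core-owned ticks derived from a monotonic clock, with an
     * offset reserved for future time-shifting. */
    static struct timespec start_ts;   /* set once at rototiller startup */
    static int64_t ticks_offset;       /* adjusted when ticks are jumped */

    /* Milliseconds elapsed since startup, shifted by ticks_offset. */
    static unsigned get_ticks(void)
    {
        struct timespec now;

        clock_gettime(CLOCK_MONOTONIC, &now);

        return (unsigned)((now.tv_sec - start_ts.tv_sec) * 1000 +
                          (now.tv_nsec - start_ts.tv_nsec) / 1000000 +
                          ticks_offset);
    }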
Mostly mechanical change, though threads.c needed some jiggering to
make the logical cpu id available to the worker threads.
Now render_fragment() can easily address per-cpu data created by
create_context().
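A hypothetical sketch of the pattern this enables; all names and
signatures here (example_context_t etc.) are placeholders, not the
real module API:

    #include <stdlib.h>

    typedef struct example_context_t {
        unsigned n_cpus;
        double   *per_cpu;    /* one slot per logical cpu */
    } example_context_t;

    static void * example_create_context(unsigned n_cpus)
    {
        example_context_t *ctxt;

        ctxt = calloc(1, sizeof(*ctxt));
        if (!ctxt)
            return NULL;

        ctxt->n_cpus = n_cpus;
        ctxt->per_cpu = calloc(n_cpus, sizeof(*ctxt->per_cpu));

        return ctxt;
    }

    static void example_render_fragment(void *context, unsigned cpu)
    {
        example_context_t *ctxt = context;

        /* cpu-private slot: no locking needed across worker threads */
        ctxt->per_cpu[cpu] += 1.0;
    }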
Depending on where cancellation happened, locks could be left held,
which could deadlock the next thread to be cancelled.
In deferred cancellation, only cancellation points realize the
cancellation.
So one worker thread could realize the cancellation on entering, say,
pthread_cond_wait, and exit with the associated mutex still held.
Another thread could be in the process of returning from
pthread_cond_wait, already past the cancellation point, and get stuck
trying to reacquire the mutex (as pthread_cond_wait does before
returning) because the lock was left held by the cancelled thread.
Instead, use the cleanup handlers to unlock the mutexes, and enable
asynchronous cancellation.
This seems to eliminate the observed occasional deadlocks on destroy.
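A minimal sketch of the pattern, assuming the usual POSIX primitives;
the mutex/condition/worker names are illustrative, not threads.c's
actual symbols:

    #include <pthread.h>
    #include <stdbool.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
    static bool            work_available;

    static void unlock_mutex(void *mutex)
    {
        pthread_mutex_unlock(mutex);
    }

    static void * worker(void *arg)
    {
        int oldtype;

        /* asynchronous cancellation: the thread may be cancelled even
         * between cancellation points, with the cleanup handler below
         * releasing the mutex */
        pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &oldtype);

        for (;;) {
            pthread_mutex_lock(&lock);
            pthread_cleanup_push(unlock_mutex, &lock);

            while (!work_available)
                pthread_cond_wait(&cond, &lock);    /* cancellation point */

            work_available = false;                 /* consume the work */

            pthread_cleanup_pop(1);                 /* runs unlock_mutex(&lock) */
        }

        return NULL;
    }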
Rather than laying out all fragments in a frame up-front in
ray_module_t.prepare_frame(), return a fragment generator
(rototiller_fragmenter_t) which produces the numbered fragment
as needed.
This removes complexity from the serially-executed
prepare_frame() and allows the individual fragments to be
computed in parallel by the different threads. It also
eliminates the need for a fragments array in the
rototiller_frame_t; indeed, rototiller_frame_t is eliminated
altogether.
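A sketch of the generator idea with stand-in types (frag_t here
approximates the fb fragment type, and the real rototiller_fragmenter_t
signature may differ):

    /* frag_t stands in for the fb fragment type. */
    typedef struct frag_t {
        unsigned y, height;    /* region of the page this fragment covers */
    } frag_t;

    typedef int (*fragmenter_t)(void *context, const frag_t *page,
                                unsigned number, frag_t *res);

    /* Produce fragment `number` on demand as one of 32 horizontal
     * slices (assumes page->height >= 32); returning 0 signals the
     * generator is exhausted and the frame is fully fragmented. */
    static int slice_fragmenter(void *context, const frag_t *page,
                                unsigned number, frag_t *res)
    {
        unsigned n = 32, h = page->height / n;

        if (number >= n)
            return 0;

        res->y = page->y + number * h;
        res->height = (number == n - 1) ? page->height - number * h : h;

        return 1;
    }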
Instead of creating fragment lists striped across available threads
uniformly in a round-robin fashion, just have the render threads iterate
across the shared fragments array using atomics.
This way non-uniform cost of rendering can be adapted to, provided the
module prepares the frame with sufficient fragment granularity.
In the ray tracer for example, it is quite common for some areas of the
screen to have lower complexity/cost than others. The previous model
distributed the fragments uniformly across the threads with no ability for
underutilized threads to steal work from overutilized threads in the event
of non-uniform cost distributions.
Now no attempt to schedule work is made. The render threads simply race
with each other on a per-frame basis, atomically incrementing a shared
index into the frame's prepared fragments. The fragment size itself
represents the atomic work unit.
A later commit will change the various renderers to prepare more/smaller
fragments where appropriate. The ray tracer in particular needs more and
would probably further benefit from a tiling strategy, especially when
an acceleration data structure is introduced.
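A minimal sketch of the shared-index scheme using C11 atomics; the
types and counts are stand-ins, not the actual threads.c code:

    #include <stdatomic.h>

    typedef struct frag_t { unsigned y, height; } frag_t;    /* stand-in */

    static frag_t           fragments[64];     /* prepared per-frame */
    static unsigned         n_fragments;
    static _Atomic unsigned next_fragment;     /* reset to 0 each frame */

    /* Per-frame body of every render thread: race to claim fragments. */
    static void render_thread_frame(void)
    {
        for (;;) {
            unsigned n = atomic_fetch_add(&next_fragment, 1);

            if (n >= n_fragments)
                break;    /* frame fully claimed */

            /* render fragments[n]; a thread finishing a cheap fragment
             * immediately claims another, absorbing non-uniform cost */
        }
    }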
Introduces create_context() and destroy_context() methods, and adds a
'void *context' first parameter to the module methods.
If a module doesn't supply create_context() then NULL is simply passed
around as the context, so trivial modules can continue to only implement
render_fragment().
A subsequent commit will update the modules to encapsulate their global
state in module-specific contexts.
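Approximately, the module interface takes this shape; field names and
signatures are guesses for illustration, and the real module struct
may differ:

    typedef struct fb_fragment_t fb_fragment_t;    /* opaque here */

    typedef struct module_t {
        void * (*create_context)(void);              /* optional */
        void   (*destroy_context)(void *context);    /* optional */
        void   (*render_fragment)(void *context, fb_fragment_t *fragment);
    } module_t;

    /* Core-side glue: modules lacking create_context() just see NULL.
     * (In practice the context would persist across frames; condensed
     * here for brevity.) */
    static void render_with_context(const module_t *module, fb_fragment_t *fragment)
    {
        void *context = NULL;

        if (module->create_context)
            context = module->create_context();

        module->render_fragment(context, fragment);

        if (module->destroy_context)
            module->destroy_context(context);
    }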
This is a simple worker thread implementation derived from the ray_threads
code in the ray module. The ray_threads code should be discarded in a
future commit now that rototiller can render fragments using threads.
If a module supplies a prepare_frame() method, then it is called
per-frame to prepare a rototiller_frame_t which specifies how to divvy
up the page into fragments. Those fragments are then dispatched to one
thread per CPU, and the threads call the module's rendering function in
parallel.
There is no coupling of the number of fragments in a frame to the number of
threads/CPUs. Some modules may benefit from the locality of tile-based
rendering, so the fragments are simply dispatched across the available CPUs
in a striped fashion.
Helpers will be added later to the fb interface for tiling fragments, which
modules desiring tiled rendering may utilize in their prepare_frame()
methods.
This commit does not modify any modules to become threaded, it only adds
the scaffolding.
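A sketch of the striped dispatch, with stand-in types for the frame and
fragment (not the actual rototiller_frame_t):

    typedef struct frag_t { unsigned y, height; } frag_t;    /* stand-in */

    typedef struct frame_t {
        unsigned n_fragments;
        frag_t   fragments[64];
    } frame_t;

    /* Worker body for logical cpu `cpu` of `n_cpus`: render fragments
     * cpu, cpu + n_cpus, cpu + 2 * n_cpus, ... striping the frame
     * across the available CPUs. */
    static void render_stripe(frame_t *frame, unsigned cpu, unsigned n_cpus,
                              void (*render_fragment)(frag_t *fragment))
    {
        for (unsigned i = cpu; i < frame->n_fragments; i += n_cpus)
            render_fragment(&frame->fragments[i]);
    }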