path: root/src
Age  Commit message  Author
6 days  vmon: introduce -L,--mem-locked flag  Vito Caputo
This can be necessary for headless mode on cramped embedded devices to prevent vmon from being delayed excessively due to page faults/thrashing. Even with Adherence visually indicated, it complicates comparisons across snapshots to have this be inconsistent/thrashing-severity-dependent. Use this flag in combination with something like SCHED_RR to make vmon more immune to memory and scheduling contention. If you use SCHED_RR, it's wise to also use LimitRTTIME to prevent potential bugs from bogarting the system.
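A minimal sketch of what memory locking plus SCHED_RR looks like at the syscall level, assuming Linux. The helper names `lock_all_memory()` and `use_sched_rr()` are hypothetical and are not taken from vmon's source. Both calls commonly fail without privileges or sufficient rlimits, so the return values need checking.

```c
#include <sched.h>
#include <sys/mman.h>

/* Hypothetical helper, not vmon's actual code: pin all current and future
 * pages in RAM so sampling isn't delayed by page faults.
 * Returns 0 on success, -1 on failure with errno set (e.g. EPERM, ENOMEM). */
int lock_all_memory(void)
{
	return mlockall(MCL_CURRENT | MCL_FUTURE);
}

/* Hypothetical helper: switch the calling process to SCHED_RR at the lowest
 * realtime priority. Returns 0 on success, -1 on failure with errno set. */
int use_sched_rr(void)
{
	struct sched_param param = {
		.sched_priority = sched_get_priority_min(SCHED_RR),
	};

	return sched_setscheduler(0, SCHED_RR, &param);
}
```

When run under systemd, pairing this with LimitRTTIME as the commit message suggests bounds how long a runaway realtime process can hold the CPU.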
11 days  libvmon: public vmon_proc_monitor() shouldn't accept a parent  Vito Caputo
The public callers of vmon_proc_monitor() shouldn't ever be passing a non-NULL parent as that's a pretty intimate and messy internal aspect of libvmon. So let's remove that altogether from the public function for monitoring a process, and turn the existing one supporting parents into a libvmon-private function. Trivial cleanup, there's so much more needed in libvmon since it's the epitome of an organically evolved crusty thing.
11 days  libvmon: ignore orphaned followed children still having parents  Vito Caputo
Now that vmon is a real thing with PID1 monitoring, there's this detail of orphans getting inherited which libvmon has so far largely been able to ignore because vwm doesn't care about such things, always monitoring subtrees attached to X windows. This fixes the vmon asserts when monitoring the PID1 hierarchy where vcr_shift_below_row_up_one() would abort at:
    assert(*(vcr->hierarchy_end_ptr) >= row);
It was being triggered when orphans were found by the PID1 children following while they were still children of their exited-from-kernel's-perspective-but-not-yet-libvmon's-perspective parent process. There's more work to be done to improve this situation, but it's likely going to be quite invasive. For now this is a simple solution that should prevent the asserts. We just might not see a process being orphaned for a sample while it gets ignored for still having its vestigial parent around as is_stale=1.
11 days  vmon: wire up charts_vmon_dump_procs() to -D/--dump-procs  Vito Caputo
This is useful for debugging purposes.
11 days  charts: expose vmon_dump_procs() in a trivial wrapper  Vito Caputo
libvmon isn't really exposed to the front-end code beyond charts, so it's this or a vwm_charts_t.vmon accessor and callers including libvmon/vmon.h (yuck).
11 days  libvmon: add vmon_dump_procs() debugging aid  Vito Caputo
It's handy to see what the libvmon hash table state of the world is when trying to understand brokenness, but also as a tool for verifying things aren't out of sync with what the hierarchical view contains.
2024-11-30  vcr: make vcr_shift_below_row_up_one() assert unconditional  Vito Caputo
This assert has proven interesting, but sticking it in the mem backend limits its exercising to headless. I mostly run this on Xlib in vwm/vmon, and it's proving annoying to trigger this assert outside of embedded headless scenarios.
2024-11-30  charts: assert that descendents of stale are stale  Vito Caputo
This is really how things are implemented today, which may actually be incorrect in some edge case scenarios... but let's assert it holds true currently to aid debugging some spurious asserts in vcr_shift_below_row_up_one() about row vs. hierarchy_end. The potential issue I see with this assumption as-is is it's entirely possible to have descendants survive a parent's demise, grandchildren don't have to exit when a parent does. But it might be OK to treat it that way, as they'll be rediscovered as children of PID 1, and there's no strict need to preserve continuity of their associated charts state across that transition. It's rare enough that I don't think it's worth worrying about, but maybe this is what's happening with the asserts during startup specifically; when things are daemonizing / double forking etc.
2024-11-30  libvmon: assume stale threads in a stale process  Vito Caputo
The charts code in vwm/vmon assumes stale descendants of a stale node, but there seems to be some theoretical potential for that to not hold true. Let's be more aggressive about ensuring that's the case. The current code already does this style propagation for stale processes, but not for threads. It seems like it'd be weird to have this happen, but maybe there's edge cases where you have a parent process exit with grandchildren threads surviving until a later sample, creating opportunity for an inconsistency.
2024-11-20  vcr: fix us->timespec conversion math typo  Vito Caputo
An extra 0 snuck in here and got copy and pasted too, oof.
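For reference, a sketch of a microseconds to `struct timespec` conversion, where a stray zero in either constant would be exactly this kind of bug. The function name `us_to_timespec()` is hypothetical; this is not the code from vcr.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: convert a microseconds count into a timespec.
 * 1 second = 1,000,000 microseconds; 1 microsecond = 1,000 nanoseconds. */
struct timespec us_to_timespec(uint64_t us)
{
	struct timespec ts = {
		.tv_sec = (time_t)(us / 1000000),
		.tv_nsec = (long)(us % 1000000) * 1000,
	};

	return ts;
}
```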
2024-11-19  libvmon: increase VMON_HTAB_SIZE  Vito Caputo
The current value of 128 doesn't really accommodate most of the systems I'm dealing with these days... bump to 1024.
2024-11-13  charts: fix skipped overlay render in stalls when deferred  Vito Caputo
The deferred pass only enters draw_chart() once regardless of this_sample_duration, with the idx always 0. So when this_sample_duration > 1 (stalls/repeated samples), the conditional draw_overlay_row() would only get entered in the non-deferred passes in deferred mode, which are short-circuited within draw_overlay_row() because we don't want to render that stuff in those passes in deferred mode. The fix is trivial; always enter draw_overlay_row() for the deferred pass. This fixes a cosmetic artifact where you'd see stale / missing overlays in the hierarchy rows of output, when sample durations were falling behind schedule enough for this_sample_duration to be greater than 1.
2024-10-19  vcr: implement markers for vcr_present_mem_to_png()  Vito Caputo
This is a stupid simple naive implementation, but it does add markers when enabled for mem->png (headless mode), e.g.: `vmon --hertz 1 --markers 60 --headless --snapshots 3600` would give you minute markers in the borders of png snapshots created every hour with 1hz samples (every column of pixels represents a second, hence markers 60 gives you per-minute markers). The current color used for the markers is not quite full intensity yellow: {c0,c0,0}
2024-10-19  vcr: pass vwm_charts_t.marker_distance ref to vcr_new()  Vito Caputo
Since vcr_t implements rendering of borders and backgrounds, to such an extent that when serializing mem->png for headless mode it produces the background and border on the fly on a per-row basis, let's just give it the ability to access the marker distance in vwm_charts_t and draw the markers as needed. It feels hacky to be passing pointers to these values but I really despise repeating setters across abstractions to plumb things through, so I'm doing the stupid simple thing here.
2024-10-19  vmon: wire up the new markers API to vmon's CLI  Vito Caputo
This plumbs the charts marker distance down to vmon's CLI flags in the form of -m/--marker, which takes the distance in pixels to put between markers as its argument.
2024-10-19  charts: stub out marker API  Vito Caputo
This introduces the concept of border markers intended to serve as timeline references/milestones. Here only the minimal API for setting their distance is added, nothing is actually implemented yet.
2024-10-19  charts: remove shadow_row() wrapper  Vito Caputo
Now that this just wraps vcr_shadow_row() in a post-vcr world it's pointless, so let's get rid of it. No functional difference.
2024-10-19  charts: use named define for default interval  Vito Caputo
Purely cosmetic, no functional change.
2024-10-19  charts/vcr: s/double/float/  Vito Caputo
Double precision is unnecessary for this, use floats throughout, at least for everything vmon related.
2024-10-19  charts: use multiplicative inverse in some places  Vito Caputo
This is a minor trivial optimization turning some frequent divides into multiplies which are generally less costly to compute.
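To illustrate the general technique (not the actual charts code), here is a sketch that hoists one divide out of a loop and multiplies by the reciprocal instead. The function name `scale_values()` and its parameters are hypothetical.

```c
/* Hypothetical example: normalize values against a total using one divide
 * and n multiplies, rather than n divides. */
void scale_values(const unsigned *values, float *scaled, int n, unsigned total)
{
	float inv_total = 1.0f / (float)total; /* the only divide */

	for (int i = 0; i < n; i++)
		scaled[i] = (float)values[i] * inv_total;
}
```

Multiplying by a reciprocal can differ from a true divide in the last bit of floating-point precision, which is generally irrelevant when the result ends up as a bar height in pixels.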
2024-10-18  vcr: clip vcr_shift_below_row_up_one() for mem backend  Vito Caputo
There was an unintentional assumption that hierarchy_end wouldn't extend beyond the bottom, which just isn't always the case. I'm sure there's more of this kind of thing in the headless code since the original Xlib backend could lean on XRender clipping everything.
2024-10-16  vcr: prevent row overflow in mem vcr_draw_text()  Vito Caputo
Another row clipping check off by one, it'd be nice to make this draw text into partial rows... but this as-is may just scribble.
2024-10-12  libvmon: sprinkling of asserts  Vito Caputo
No functional change, just firming up some assumptions.
2024-10-12  vcr: more asserts  Vito Caputo
2024-10-08  charts: use clock_gettime(CLOCK_MONOTONIC_RAW)  Vito Caputo
This really needs to be a clock unaffected by ntp adjustments, which CLOCK_MONOTONIC_RAW seems to provide.
2024-10-08  charts: move vwm logo / $name / Hz back to top  Vito Caputo
I prefer this be on its own in the upper right corner.
2024-10-08  charts: introduce a scheduling "adherence" row  Vito Caputo
This draws the new scheduling "adherence" metric in a row below the top IOWait/Idle% row. The headings have moved down one to cover "adherence" instead, which I think should help make the important IOWait/Idle% row more visible as well as improving headings readability. The adherence row should generally be either black or red, rarely cyan. Red indicates %age of sampling interval behind schedule for the given sample, Cyan indicates same but ahead of schedule which should be unusual/almost never happen. In fact I think that with the current sharing of the "close enough" epsilon as the adherence truncating threshold, the ahead-of-schedule "close enough" situations will always get truncated to zero. So it might be impossible to see any cyan adherence as-is right now. A future commit will move the '\/\/\ # %name @ Hz' heading up to the IOWait/Idle% putting it back in the upper right corner, but only that one.
2024-10-08  charts: move fixed pre-hierarchy rows to a define  Vito Caputo
This will likely be made more dynamic in the future, but for now there's a need to shift "rest" down another row to make room for the "adherence" row. This is a simple way to accommodate that, another preparatory commit.
2024-10-08  charts: compute an "adherence" value  Vito Caputo
This is an attempt to add a schedule adherence metric which a subsequent commit will plot in a row below the top IOWait/Idle% row. Ideally the adherence metric's value would always be 0, because we're always exactly on-time with our samples. But what tends to happen is falling behind, or rarely being slightly ahead of schedule (particularly with the epsilon introduction). This metric can serve as a sort of proxy for userspace's ability to get scheduled on time, which is a useful thing to see.
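One way such a metric could be computed is sketched below. This is an illustration under assumptions, not the charts implementation: the function name, the sign convention (positive means late, negative means early), and the use of an epsilon to truncate small deviations to zero are all hypothetical choices here.

```c
#include <stdint.h>

/* Hypothetical sketch: deviation from the scheduled interval, expressed as a
 * signed fraction of the interval. 0 means on time, 0.5 means half an
 * interval late, negative means early. Deviations smaller than epsilon_us
 * are truncated to 0. */
float adherence(int64_t delta_us, int64_t interval_us, int64_t epsilon_us)
{
	int64_t deviation_us = delta_us - interval_us;

	if (deviation_us > -epsilon_us && deviation_us < epsilon_us)
		return 0.f;

	return (float)deviation_us / (float)interval_us;
}
```

With a 1 second interval and a 1 ms epsilon, a sample arriving 1.5 seconds after the previous one yields 0.5, while one arriving 0.5 ms late yields 0.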
2024-10-08  charts: try compute a real desired_delay_us  Vito Caputo
This introduces a concept of a "close enough" epsilon value. If the attempted update's current time is within a very small temporal distance from the precisely scheduled time dictated by the interval, the update will still take a sample, rather than try to introduce a tiny delay the host/kernel/ppoll will likely fail to adhere to without being tardy. Previously the desired delay was just a third of the interval, with no consideration for how long sampling took. This was dead simple, but made no attempt to schedule the poll timeout to align with the next sampling deadline, and would either cause excessive wakeups, or excessive tardiness, depending on the host's speed. I think this technically also fixed a bug where this_delta wouldn't get assigned if one of the earlier conditions short-circuited the later condition where it was being assigned.
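The deadline-aligned delay with a "close enough" epsilon can be sketched roughly as follows. The function signature and parameter names are hypothetical; the real charts code tracks more state than this.

```c
#include <stdint.h>

/* Hypothetical sketch: how long to sleep before the next sample.
 * Returns 0 when the deadline has passed or is within epsilon_us of now,
 * meaning "sample immediately"; otherwise the time remaining until the
 * next deadline. All values are in microseconds. */
int64_t desired_delay_us(int64_t now_us, int64_t last_sample_us,
			 int64_t interval_us, int64_t epsilon_us)
{
	int64_t remaining_us = (last_sample_us + interval_us) - now_us;

	if (remaining_us <= epsilon_us)
		return 0;

	return remaining_us;
}
```

Unlike a fixed third-of-the-interval delay, the returned timeout shrinks as sampling itself takes longer, so the wakeup lands near the deadline regardless of host speed.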
2024-10-08  vcr: switch to ppoll() and return microseconds delay  Vito Caputo
This is mostly preparatory for having more precision in a computed delay, but is also arguably just finishing what was started when adding the _us suffixes throughout. A future commit should also rework signal stuff to only unblock signals in ppoll().
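A sketch of wrapping ppoll() behind a microseconds timeout, assuming Linux/glibc. The name `poll_us()` is hypothetical and this is not vcr_backend_poll(); the signal mask is left as NULL, so it behaves like poll() with a finer-grained timeout.

```c
#define _GNU_SOURCE /* ppoll() is a GNU/Linux extension */
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical wrapper: like poll(), but the timeout is in microseconds.
 * A negative delay_us blocks indefinitely, mirroring poll()'s convention. */
int poll_us(struct pollfd *fds, nfds_t nfds, long delay_us)
{
	struct timespec ts;

	assert(fds || !nfds);

	if (delay_us < 0)
		return ppoll(fds, nfds, NULL, NULL);

	ts.tv_sec = delay_us / 1000000;
	ts.tv_nsec = (delay_us % 1000000) * 1000;

	return ppoll(fds, nfds, &ts, NULL);
}
```

Passing a real signal mask as the last argument is what would allow signals to be unblocked only for the duration of the wait, as the commit message anticipates.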
2024-10-08  vwm: ms->us poll timeout transition  Vito Caputo
Switch to using vcr_backend_poll() like everything else which will do the right thing about handling delay_us.
2024-10-08  charts: {depth,row}=0 in draw_chart() not maintain_chart()  Vito Caputo
This should have been done when draw_chart() and draw_chart_rest() were split apart making draw_chart() non-recursive. But it becomes much more glaringly obvious in a world where maintain_chart() is calling draw_chart() in multiple places.
2024-10-08  charts: parameterize heading toggle in draw_columns()  Vito Caputo
This is preparatory for shifting heading off row 0 which until now has been the safe assumption, but I'm intending to add an "adherence" row below the IOWait/Idle top row. The headings will be moving down to that.
2024-10-08  charts: some cosmetic fixups  Vito Caputo
primarily s/sampling_interval/sampling_interval_secs/ units clarification
2024-10-08  charts: repeat samples on missed deadlines  Vito Caputo
This applies charts->this_sample_duration by advancing and drawing the graph bars this_sample_duration times. It's a bit crufty with conditionals especially where it overlaps with deferred_pass handling... but seems to work ok in initial tests. Future work will have to add a row indicating how far we've deviated from the scheduled sample time... Maybe cyan would show how premature we were, and red how late we were. Where 100% would be the entire sample interval was exceeded, but < 100% would show our still more or less on-schedule scheduling deviations.
2024-10-05  charts: compute a "sample duration" for current sample  Vito Caputo
This turns the time passed since the last sample taken into a "sample duration". Ideally this would always be 1, and up until now in the main use case, vwm, it's been assumed to generally be 1 and drops in the timeline treated as benign/fleeting because of the live viewing. But with the introduction of --headless and increasing use on my servers / embedded interests, this has become more problematic. In this commit the duration is only being maintained, but not applied. Subsequent commits will have to repeat the current sample in the graphs (this_sample_duration - 1) times.
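As a rough illustration of the idea, the elapsed time can be expressed as a whole number of sampling intervals. The function name and the round-to-nearest policy are assumptions made for this sketch; the commit message does not say how the charts code rounds.

```c
#include <stdint.h>

/* Hypothetical sketch: how many sampling intervals the time since the last
 * sample covers, rounded to nearest and never less than 1. */
unsigned sample_duration(int64_t delta_us, int64_t interval_us)
{
	int64_t n = (delta_us + interval_us / 2) / interval_us;

	return n < 1 ? 1 : (unsigned)n;
}
```

A duration of 3 would then mean the current sample gets drawn three times in total, i.e. repeated (3 - 1) extra times, to keep the timeline's columns aligned with wall time.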
2024-10-05  charts: s/monitor/proc/  Vito Caputo
Mechanical renaming of this vestigial name choice from when vmon_proc_t was below the "monitor". Now it's just the vmon_proc_t pointed at from the chart, so let's name accordingly. No functional change.
2024-09-29  vwm: s/delay/delay_us/  Vito Caputo
More clarification of delay/timeout units in naming.
2024-09-29  vmon: s/delay/delay_us/  Vito Caputo
More clarification of delay/timeout units in naming
2024-09-29  charts: clarify timeout units as microseconds  Vito Caputo
This API is targeting poll() usage which implies microseconds, but let's better clarify it in naming.
2024-09-29  vcr: clarify timeout units in naming  Vito Caputo
vcr_backend_poll() mirrors the poll() api, but let's clarify the timeout units as microseconds.
2024-09-21  charts: scale % bars by num_cpus when appropriate  Vito Caputo
For rows reflecting threads and single/non-threaded processes, let's scale the bar % by the number of cpus, so they can use the full height of the row. These tasks can't scale to multiple CPUs, so it's pointless to leave vertical space for the other cores' capacity, if present. For multi-threaded process rows, the vertical space continues to accommodate all cores. I've been on the fence about this change for a while because it increases the cognitive load of reading the graphs, now the scales are inconsistent. But when you've got 16 cores like on my AMD P14s thinkpad, combined with a row height of 16 pixels, you start wishing these rows used the full height of the row for their single-core-constrained %ages.
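The scaling rule can be sketched as below, assuming the input is a fraction of total system CPU capacity. The function name `bar_fraction()` and its parameters are hypothetical rather than taken from the charts code.

```c
/* Hypothetical sketch: fraction of the row height a CPU bar should fill.
 * cpu_fraction is utilization as a fraction of ALL cpus' capacity.
 * Threads and non-threaded processes can use at most one cpu, so their
 * fraction is multiplied by num_cpus; threaded process rows are aggregate
 * and left unscaled. The result is clamped to the full row. */
float bar_fraction(float cpu_fraction, int num_cpus, int is_threaded)
{
	float f = is_threaded ? cpu_fraction : cpu_fraction * (float)num_cpus;

	return f > 1.f ? 1.f : f;
}
```

On a 16-cpu machine, a single thread saturating one core is 1/16 of total capacity; scaled, it fills the whole row instead of one pixel of a 16-pixel row.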
2024-09-21  libvmon: maintain a flag indicating if a process is threaded  Vito Caputo
Preparatory commit for enabling charts to apply % scaling to non-threaded processes, to make better use of the row's available space. A non-threaded process can't use more than a single core, so it should be able to scale its %age out to the full row height. The same will be applied to individual thread rows, as those can at most use a single core. The exception is a threaded process - its CPU %ages are aggregate, and must represent up to the number of CPUs in the system within their row.
2024-09-21  libvmon: get the number of cpus  Vito Caputo
Preparatory commit for enabling charts that scale per-thread and per-non-threaded-process CPU utilization levels by number of cpus, so they can utilize the whole row.
2024-09-21  vcr: use 4-bit/16-color PNG in mem_to_png  Vito Caputo
I planned to use the 256-colors for differentiating graphs in other row types, but it's becoming clear saving space is more important. So this cuts the file size down further quite a bit:
  -rw-r--r-- 1 vc vc 219323 Sep 19 01:37 09.19.24-01:36:43-2.png
Becomes:
  -rw-r--r-- 1 vc vc 152674 Sep 21 14:25 09.21.24-14:25:16-2.png
2024-09-19  vcr: densify+deduplicate mem_to_png palette  Vito Caputo
This introduces a LUT indirection table for mapping the raw layer values to a dense, deduplicated palette used with the PNG. This should help with compression ratios at basically no cost.
Before `1800x8000 --headless --snapshot 10 --hertz 60`:
  -rw-r--r-- 1 vc vc 87873 Sep 19 01:36 09.19.24-01:36:43-0.png
  -rw-r--r-- 1 vc vc 215719 Sep 19 01:36 09.19.24-01:36:43-1.png
  -rw-r--r-- 1 vc vc 219323 Sep 19 01:37 09.19.24-01:36:43-2.png
  -rw-r--r-- 1 vc vc 221979 Sep 19 01:37 09.19.24-01:36:43-3.png
After:
  -rw-r--r-- 1 vc vc 72303 Sep 19 01:37 09.19.24-01:37:30-0.png
  -rw-r--r-- 1 vc vc 174100 Sep 19 01:37 09.19.24-01:37:30-1.png
  -rw-r--r-- 1 vc vc 177430 Sep 19 01:37 09.19.24-01:37:30-2.png
  -rw-r--r-- 1 vc vc 178711 Sep 19 01:38 09.19.24-01:37:30-3.png
This is without any increase in the compression level used, which would improve ratios further but substantially increases CPU cost.
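The LUT indirection described here can be illustrated with a small sketch that collapses a sparse table of colors, possibly containing duplicates, into a dense palette plus an index map. The function name, the linear search, and the 32-bit color representation are assumptions for illustration only, not vcr's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: build a dense, deduplicated palette from n_colors raw
 * colors (n_colors <= 256). lut[i] receives the palette index for raw value
 * i; palette must have room for n_colors entries. Returns the number of
 * distinct palette entries produced. */
size_t build_dense_palette(const uint32_t *colors, size_t n_colors,
			   uint32_t *palette, uint8_t *lut)
{
	size_t n_palette = 0;

	for (size_t i = 0; i < n_colors; i++) {
		size_t j;

		/* linear search is fine at palette sizes this small */
		for (j = 0; j < n_palette; j++) {
			if (palette[j] == colors[i])
				break;
		}

		if (j == n_palette)
			palette[n_palette++] = colors[i];

		lut[i] = (uint8_t)j;
	}

	return n_palette;
}
```

With fewer distinct indices in the pixel data, the same visual output uses a smaller symbol alphabet, which is one plausible reason for the improved compression reported above.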
2024-09-16  libvmon: Revert "libvmon: assume short positive reads found EOF"  Vito Caputo
This reverts commit 9f564cf8df6ef5fcba37082ba8013d6175955125. Experimenting with smaller initial seq_file buffers in the kernel has exposed this actually breaks, which contradicts my expectations for proc files established back in the /proc/mdstat racy incremental parsing corrupting the output days. I'm seeing /proc/$pid/task/$pid/children spit out short reads when the seq_file size is smaller than the amount of output. Userspace's read() call can provide a large buffer, and if seq_file's is smaller than the children output, it'll split the children output instead of enlarging the seq_file buf to the read() buffer's bounds.
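The safe pattern implied by this revert is to keep reading until read() returns 0, rather than treating a short positive read as end of file. A generic sketch follows; the helper name `read_until_eof()` is hypothetical and this is not libvmon's code.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read from fd until EOF or until buf is full.
 * A short positive read is NOT treated as EOF; only read() == 0 is.
 * Returns the number of bytes read, or -1 on error with errno set. */
ssize_t read_until_eof(int fd, char *buf, size_t len)
{
	size_t total = 0;

	while (total < len) {
		ssize_t n = read(fd, buf + total, len - total);

		if (n == 0)
			break; /* real EOF */

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}

		total += (size_t)n;
	}

	return (ssize_t)total;
}
```

If the return value equals `len`, the buffer filled before EOF was observed, so the caller can't assume it has the complete contents.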
2024-09-15  vcr: bump row clipping to not attempt partial row operations  Vito Caputo
It'd be nice to allow the last partial row to be rendered, but as-is it can result in a segfault for some operations that aren't clipping at that granularity. Another option is to always round up the bits allocation to the ROW_HEIGHT boundaries, then not worry about this, and do the partial clip at serialization to png time.
2024-09-09  vcr: lighten grays used in PNG output rows  Vito Caputo
Increasing contrast a bit for odd rows / separators