Diffstat (limited to 'README')
-rw-r--r-- README 56
1 files changed, 53 insertions, 3 deletions
diff --git a/README b/README
index e57b6c3..21a6bc0 100644
--- a/README
+++ b/README
@@ -2,11 +2,61 @@
Usage is like `tee`: wye [FILE]...
-wye always consumes from its stdin, the optional files specified are
+Wye always consumes from its stdin, the optional files specified are
read-multiplexed as alternative inputs to stdin. All reads are
performed using PIPE_BUF sized buffers which are the atomic units for
unix pipes, any read data is immediately written to stdout.
-Someone should try get this command added upstream in GNU Coreutils
-upstream, and give me credit for the name. This implementation is a
+Someone should try to get this command added upstream in GNU Coreutils,
+and I'd appreciate credit for the name. This implementation is a
quick and dirty hack and not particularly robust.
+
+The primary correctness and robustness problem has to do with how
+UNIX pipes are implemented. The intended use of this utility is to
+supply pipes as wye's various inputs, and pipes normally buffer writes
+such that the read side may get any number of bytes, including a
+fraction of what was an atomic write. The default atomicity
+guarantees WRT pipes and PIPE_BUF pertain only to concurrent writers
+to the same pipe. They have zero relevance to semantics at the read
+side.
+
+Wye could read every ready fd until exhausting what's immediately
+available (EAGAIN/EWOULDBLOCK) in an attempt to combat this, but a
+write that comes up short at the writer side because the pipe's
+internal buffers are full can still arrive at the read side only
+partially.
+
+When a partial record arrives at the read side, wye will naively
+propagate that partial record in its output as if it were whole. Then
+other input streams may be interleaved with that partial record, and
+the aggregated stream becomes potentially incoherent. Nor can wye
+ensure that only a single record passes through from each input when
+multiple inputs are ready for reading simultaneously, not without
+becoming content-aware and parsing the data, which would render wye
+specialized for a specific content type rather than a generalized
+aggregator treating contents as opaque records.
+
+In Plan 9, pipes are implemented differently [0], enabling a tool
+like wye to be more naturally robust and correctly implemented.
+
+Since Linux 3.4, the O_DIRECT flag has been implemented for the
+pipe2() syscall [1], enabling Plan 9-like semantics for pipes. If the
+shells available on Linux supported a means for conveniently enabling
+O_DIRECT "packetized pipes", a wye-like tool could arguably be
+provided in a generalized fashion, as it could have significant
+utility.
+
+I suspect that landing a tool like this upstream somewhere like GNU
+Coreutils will first require making "packetized pipes" generally
+accessible through shells like GNU bash. I've made a first attempt at
+doing so [2], but as one would expect it was mostly met by resistance
+in what little attention it received. These things take time, and
+require significant buy-in from the community for movement to occur.
+If this feature interests you, show your support on bug-bash and help
+work towards exposing the "packetized pipe" capability via popular
+Linux shells like GNU bash.
+
+
+[0] http://man.cat-v.org/plan_9/2/pipe
+[1] https://www.man7.org/linux/man-pages/man2/pipe.2.html
+[2] https://mail.gnu.org/archive/html/bug-bash/2020-09/msg00076.html
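
The read-side behaviour the new README text describes can be observed
with a small experiment. The sketch below is not part of wye; the helper
name `read_chunks` and the sample records are made up for illustration,
and it assumes Linux 3.4 or later with Python 3.3+ (for `os.pipe2` and
`os.O_DIRECT`). It writes three short records into a pipe with separate
write() calls, then reports what each read() hands back, first on a
default pipe and then on an O_DIRECT "packetized" pipe.

```python
import os


def read_chunks(flags, records):
    """Write each record with its own write(), then drain the pipe.

    Returns the list of chunks produced by successive read() calls.
    `flags` is passed straight to pipe2(); 0 gives a default pipe.
    """
    r, w = os.pipe2(flags)
    try:
        for rec in records:
            os.write(w, rec)  # one write() per record
    finally:
        os.close(w)  # closing the writer lets the reader reach EOF

    chunks = []
    try:
        while True:
            # 4096 is PIPE_BUF on Linux, so a whole packet always fits.
            chunk = os.read(r, 4096)
            if not chunk:
                break
            chunks.append(chunk)
    finally:
        os.close(r)
    return chunks


if __name__ == "__main__":
    records = [b"alpha\n", b"beta\n", b"gamma\n"]
    print("default: ", read_chunks(0, records))
    print("O_DIRECT:", read_chunks(os.O_DIRECT, records))
```

On a default pipe the three records come back merged into a single
chunk, so the record boundaries are lost. With O_DIRECT each read()
returns exactly one of the original writes, which is the property a
wye-like aggregator needs in order to treat its inputs as opaque
records. Kernels or sandboxes without packet-mode pipes are expected to
reject the O_DIRECT flag with an OSError.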