Here is a very simplified example (though I'm not sure it is fully reproducible in every environment). First, there is a socket pair:
func SocketPair() (*os.File, *os.File, error) {
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return nil, nil, err
	}
	f0 := os.NewFile(uintptr(fds[0]), "socket-0")
	f1 := os.NewFile(uintptr(fds[1]), "socket-1")
	return f0, f1, nil
}
And a simple cmd call
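func main() {
	ctx := context.Background() // assumed; ctx is not defined in the original snippet
	f0, f1, err := utils.SocketPair()
	if err != nil {
		panic(err)
	}
	cmd := exec.CommandContext(ctx, "cat")
	cmd.Stdin = f0
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	// pipe routine: forward our stdin to the write end of the socket pair
	go func() {
		size, err := io.Copy(f1, os.Stdin)
		fmt.Printf("------res %d, %v", size, err)
		f1.Close()
	}()
	// err is already declared above, so assign with = rather than :=
	err = cmd.Run()
	if err != nil {
		panic(err)
	}
}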
Calling this with something like
echo "abc\ndef\nghi" | app
produces output similar to
------res 11, <nil>
abc
def
ghi
and then hangs. This shows that the pipe routine successfully delivered the stdin data to the socket pair, yet the command's input still never sees EOF.
For this exact simple example, the issue (in my environment) can be solved in two ways:
- either move the pipe routine to just before the cmd := exec.CommandContext line,
- or keep the pipe routine in its original position but make main wait until the goroutine has started, as follows:
started := make(chan byte)
go func() {
	close(started)
	size, err := io.Copy(f1, os.Stdin)
	fmt.Printf("------res %d, %v", size, err)
	f1.Close()
}()
<-started
Both of these solutions resolve the issue, and the application exits gracefully.
In more complex cases with deeper goroutine chains, however, even this doesn't help. There, a simple call to time.Sleep(time.Second) just before cmd.Run() works.
It looks very much like a race condition: the moment at which reading starts in io.Copy relative to cmd.Run matters a lot.
I don't want to solve this by playing with time.Sleep and searching for the optimal interval (which is a bad idea if this really is a race condition).
My crucial question is: what is the root cause of this behavior? Why does it matter who starts reading first?
Thanks
asked Jan 31 at 8:37 by 404

Comment: One guess is that your copy and input goroutines end up on different OS threads depending on timing. On POSIX systems a close call on one pthread is not guaranteed to close an FD used by another pthread, so maybe this is affecting something in the socketpair. – Mr_Pink, commented Jan 31 at 8:58
2 Answers
When searching for more details on syscall.Socketpair, I stumbled on this gist:
func Socketpair() (net.Conn, net.Conn, error) {
	fds, err := syscall.Socketpair(syscall.AF_LOCAL, syscall.SOCK_STREAM, 0)
	if err != nil {
		return nil, nil, err
	}
	c1, err := fdToFileConn(fds[0])
	if err != nil {
		return nil, nil, err
	}
	c2, err := fdToFileConn(fds[1])
	if err != nil {
		c1.Close()
		return nil, nil, err
	}
	return c1, c2, err
}

func fdToFileConn(fd int) (net.Conn, error) {
	f := os.NewFile(uintptr(fd), "")
	defer f.Close()
	return net.FileConn(f)
}
Plugging this into your code sample fixes the issue on my Linux system.
complete playground sample: https://go.dev/play/p/B24cowycU1G
Note: running it on the playground does not give the same behavior as on my machine (either the time package is tweaked in a way that hinders the timeouts, or interprocess signalling is simply forbidden). If you copy and paste the code into a Go file on your machine, you should get:
$ go run foo.go
===== net.Conn pair
Hello World!
===== *os.File pair
Hello World!
panic: signal: killed # <- timeout triggered
goroutine 1 [running]:
main.main()
/tmp/foo.go:105 +0xdb
exit status 2
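For readers who cannot open the playground link, here is a minimal sketch of how a net.Conn pair might be wired into a cat child process. It is not the playground program: connPair, fdToConn and catViaConnPair are hypothetical names, the input comes from a string instead of os.Stdin, and cat is assumed to be on PATH on a Unix-like system.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
	"os"
	"os/exec"
	"strings"
	"syscall"
)

// connPair follows the gist's approach: each raw descriptor is handed to
// net.FileConn, which keeps its own duplicate of the descriptor.
func connPair() (net.Conn, net.Conn, error) {
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return nil, nil, err
	}
	c0, err := fdToConn(fds[0])
	if err != nil {
		syscall.Close(fds[1])
		return nil, nil, err
	}
	c1, err := fdToConn(fds[1])
	if err != nil {
		c0.Close()
		return nil, nil, err
	}
	return c0, c1, nil
}

// fdToConn wraps fd in a temporary *os.File and closes it again once
// net.FileConn has made its duplicate.
func fdToConn(fd int) (net.Conn, error) {
	f := os.NewFile(uintptr(fd), "")
	defer f.Close()
	return net.FileConn(f)
}

// catViaConnPair (hypothetical stand-in for the question's main) feeds input
// to cat through the pair and returns what cat printed.
func catViaConnPair(input string) (string, error) {
	c0, c1, err := connPair()
	if err != nil {
		return "", err
	}
	defer c0.Close()

	var out bytes.Buffer
	cmd := exec.Command("cat")
	cmd.Stdin = c0
	cmd.Stdout = &out

	copyErr := make(chan error, 1)
	go func() {
		_, err := io.Copy(c1, strings.NewReader(input))
		c1.Close() // closing the write side is what lets the reader see EOF
		copyErr <- err
	}()

	if err := cmd.Run(); err != nil {
		return "", err
	}
	if err := <-copyErr; err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	out, err := catViaConnPair("Hello World!\n")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```

Because cmd.Stdin is not an *os.File in this sketch, os/exec creates its own pipe for the child and copies from the connection until it reports EOF, which happens once c1 is closed.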
I haven't looked in complete detail at the differences between os.NewFile() and net.FileConn(). The first obvious difference I spotted is that os.NewFile() wraps the file descriptor using os.newFile(), while net.FileConn() uses net.newFileFD(), and the two functions have very different initialization sequences.
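One concrete difference between the two paths, offered here as an assumption to test rather than a confirmed diagnosis: net.FileConn works on a duplicate of the descriptor that has the close-on-exec flag set, while descriptors returned by a plain syscall.Socketpair call carry no such flag. If the child inherits the write end of the pair, the read end cannot see EOF until the child's copy is closed too, and whether the child inherits it would depend on whether f1.Close() ran before the fork. The sketch below keeps *os.File but marks both ends close-on-exec; cloexecSocketPair and catThroughPair are hypothetical names, and cat is assumed to be on PATH on a Unix-like system.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"os/exec"
	"strings"
	"syscall"
)

// cloexecSocketPair (hypothetical helper) creates the pair and marks both
// descriptors close-on-exec. syscall.ForkLock is held for reading so that no
// child can be forked between creating the pair and setting the flag.
func cloexecSocketPair() (*os.File, *os.File, error) {
	syscall.ForkLock.RLock()
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err == nil {
		syscall.CloseOnExec(fds[0])
		syscall.CloseOnExec(fds[1])
	}
	syscall.ForkLock.RUnlock()
	if err != nil {
		return nil, nil, err
	}
	f0 := os.NewFile(uintptr(fds[0]), "socket-0")
	f1 := os.NewFile(uintptr(fds[1]), "socket-1")
	return f0, f1, nil
}

// catThroughPair (hypothetical) feeds input to cat through the pair and
// returns what cat printed.
func catThroughPair(input string) (string, error) {
	f0, f1, err := cloexecSocketPair()
	if err != nil {
		return "", err
	}
	defer f0.Close()

	var out bytes.Buffer
	cmd := exec.Command("cat")
	cmd.Stdin = f0 // passed to the child as its fd 0
	cmd.Stdout = &out

	copyErr := make(chan error, 1)
	go func() {
		_, err := io.Copy(f1, strings.NewReader(input))
		f1.Close() // with close-on-exec set, this is the only open copy of the write end
		copyErr <- err
	}()

	if err := cmd.Run(); err != nil {
		return "", err
	}
	if err := <-copyErr; err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	out, err := catThroughPair("abc\ndef\nghi\n")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```

If this version exits cleanly regardless of when the goroutine starts, that would point to descriptor inheritance rather than goroutine scheduling as the variable that matters.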
I guess the hang you're seeing is a race between when the reading side and the writing side of the pipe get set up. It's tricky because sometimes it works and sometimes it doesn't!
Here's why both of your solutions work:
- Moving the pipe routine earlier: this gives the goroutine time to start up before cmd.Run() tries to read from it.
- Using the channel sync:
started := make(chan byte)
go func() {
	close(started)
	// your copy code
}()
<-started
This makes sure your goroutine is actually running before moving on.
Instead of using time.Sleep(), you can coordinate with a sync.WaitGroup, for example:
// Set up a WaitGroup to coordinate everything
var wg sync.WaitGroup
wg.Add(1)
go func() {
	defer wg.Done()
	defer f1.Close()
	size, err := io.Copy(f1, os.Stdin)
	fmt.Printf("------res %d, %v", size, err)
}()
err := cmd.Run()
wg.Wait()
The root issue is that we need to make sure the goroutine handling the pipe is ready before the command starts trying to read from it.