    communicate: check return status of msgrcv() · 16b032c6
    Yann E. MORIN authored
    msgrcv can return with -1 to indicate an error condition.
    One such error is to have been interrupted by a signal.
    
    Being interrupted by a signal is very rare in this code, except in a
    very special condition: a highly parallel job, like a makefile with
    hundreds of jobs, which all compete to read extended attributes from
    different files.
    
    For example, see https://bugs.busybox.net/show_bug.cgi?id=10141, where
    a highly parallel (1000 threads!) mksquashfs run on a filesystem with
    extended attributes fails with errors like the following (those are
    mksquashfs errors):
        llistxattr for titi/603/883 failed in read_attrs, because Unknown
        error 1716527536
    
    However, when a signal is delivered, the content of the message (aka
    buf) is not changed, so the various fields will contain whatever they
    contained before the call. One field is especially critical,
    buf->xattr.flags_rc that can contain either the flags when passed
    
    One way to solve this would be to simply return the error condition to
    the caller. However, not all the calls we wrap may return EINTR, so some
    callers would be very disappointed if we were to return EINTR to them.
    
    Instead, it is guaranteed that no message is delivered in case msgrcv()
    is interrupted by a signal, so we simply attempt to receive again.
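
    The retry described above can be sketched as follows. This is a
    minimal illustration, not the actual patch: msgrcv_retry() is a
    hypothetical helper name, and it assumes the caller passes the same
    arguments it would pass to msgrcv().

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical helper (not the actual fakeroot code): receive one message,
 * calling msgrcv() again for as long as it is interrupted by a signal.
 * No message is consumed by an interrupted call, so retrying is safe. */
static ssize_t msgrcv_retry(int msqid, void *msgp, size_t msgsz,
                            long msgtyp, int msgflg)
{
    ssize_t len;

    do {
        len = msgrcv(msqid, msgp, msgsz, msgtyp, msgflg);
    } while (len == -1 && errno == EINTR);

    /* Either the payload length, or -1 with errno set to an error
     * other than EINTR, which the caller has to handle. */
    return len;
}
```

    The loop only swallows EINTR; every other error still reaches the
    caller with errno untouched.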
    
    As for the other error conditions, there is not much we can do either:
    none are restartable. If any did occur, it would mean our own setup was
    incorrect, and nothing would work to begin with.
    
    Note: the other error conditions in msgrcv() are:
    
      - E2BIG:  the message is too big: nothing would work at all anyway, as
                we would not be able to exchange any message with faked;
    
      - EACCES: we would not own the IPC we created ourselves;
    
      - EFAULT: the pointer we allocated ourselves would not be in our
                own address range;
    
      - EIDRM:  someone killed faked, or called ipcrm... We're doomed, but
                we can't exit(), because we still want to allow the caller
                to recover and take action, like saving its state to storage
                (although that may backfire with more calls to wrapped
                functions);
    
      - EINVAL: incorrect message ID, which would mean the msg_get variable
                got overwritten by something, like memory corruption, which
                could very well come from somewhere else in the process
                (i.e. not in our code);
    
      - ENOMSG: can't occur, since we never specify IPC_NOWAIT; and if it
                did occur, we could not do much either;
    
      - ENOSYS: can't occur, we don't use MSG_COPY; ditto, we could not do
                much if it occurred anyway.
    
    So, basically any other error condition would mean the internal state
    would be completely fsked-up, so we should just return that error
    condition to the caller, in the hope that it can cope. Yet, we can't
    always do that, because we don't have a way to convey that error to the
    internal caller: we don't know if it is interested in the stat field or
    the xattr one, so even though we could well put the errno in the xattr
    field, it might still get missed by the other callers.
    
    Yet, that's what we do, since we can't do much more, and we also print
    a warning message, and hope for the best, or at least not the worst...
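
    As a rough sketch of that fallback, under loud assumptions: struct
    fake_reply is a simplified stand-in for the real message (only the
    xattr.flags_rc field is taken from the text above), and both the
    helper name report_recv_failure() and the warning wording are
    hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Simplified, hypothetical stand-in for the real message structure;
 * only xattr.flags_rc is named in the commit message. */
struct fake_reply {
    struct {
        int flags_rc; /* flags on the way in, return code on the way out */
    } xattr;
};

/* Hypothetical helper: record a non-restartable msgrcv() error in the
 * xattr return-code field and print a warning on stderr. Callers that
 * only look at other fields will still miss the error. */
static void report_recv_failure(struct fake_reply *buf, int err)
{
    buf->xattr.flags_rc = err;
    fprintf(stderr, "warning: msgrcv() failed: %s\n", strerror(err));
}
```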
    
    Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>