Can’t call method “stat” on an undefined value

I use Rexify to manage my servers. (I’m learning to treat my servers like cattle, not pets.) I have it distributing standardized config files, running updates, and more.
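For context, a file-distribution task looks roughly like this in a Rexfile (this is an illustrative sketch, not my actual setup; the group and paths are made up):

use Rex -feature => ['1.4'];

user "root";

desc "Distribute a standardized config file";
task "standard", group => "all", sub {
    file "/etc/motd",
        source => "files/motd",
        owner  => "root",
        mode   => 644;
};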

The Problem

After adding a Raspberry Pi-based system to manage my backups, I was plagued with this error:

[2022-12-10 23:41:13] INFO - Running task Config:System:standard on backup
[2022-12-10 23:41:15] ERROR - Error executing task:
[2022-12-10 23:41:15] ERROR - Can't call method "stat" on an undefined value at /usr/lib64/perl5/vendor_perl/5.34/Rex/Interface/Fs/OpenSSH.pm line 70, <> line 144.
[2022-12-10 23:41:15] ERROR - 1 out of 1 task(s) failed: 
[2022-12-10 23:41:15] ERROR - Config:System:standard failed on backup
[2022-12-10 23:41:15] ERROR - Can't call method "stat" on an undefined value at /usr/lib64/perl5/vendor_perl/5.34/Rex/Interface/Fs/OpenSSH.pm line 70, <> line 144.

The error didn’t make much sense, and wasn’t consistent. Not all Rex commands triggered it. When I did run into it, the same command executed directly on the server worked just fine.

Just to be clear: the problem had nothing to do with the system architecture. The fact that it was a Raspberry Pi, or an ARM device generally, was a red herring. But this was the first RPi I had managed with Rex, and the only system with this problem, so I didn’t rule it out right away.

The Diagnosis

At first glance it appeared to be a bug in Rex, because that was the only thing that made sense. Why would it be complaining that it can’t call “stat” on an undefined file path, when the files are all clearly defined in my Rexfile configuration?

The line number referred to a place where Rex calls stat using SFTP. The actual value of $path was unknown to me, as I didn’t want to immediately jump into someone else’s code and start outputting debug info, but it seemed reasonable that it would be either the local or the remote path, and those were clearly defined.

my $attr = $sftp->stat($path);

That seems pretty straightforward. If this were a bug in Rex, other people should be encountering it.

Googling the error was rather unhelpful. The specific error, quoted to search for exact matches, returned no hits, and an inexact match returned nothing useful: nothing about Rex, just a bunch of other applications doing other things. Let’s set the bug assumption aside and look elsewhere.

I could run simple commands on the server using Rex, like ‘uptime’, but more complicated things were a problem. The only common thread so far involved putting files. Could it be an issue with sending and receiving files?

Rex is configured to connect as root, there aren’t other users on the system (besides default and service accounts) to accidentally connect as, and the paths are all regular filesystem paths (not /dev or similar), so it’s almost certainly not a permissions issue.

Maybe a second SSH connection couldn’t be established; SFTP establishes an SSH connection and tunnels across it, after all.

I don’t set a common ControlMaster and ControlPath setting in my config, but I know from experience that if you do, and you force the first (“master”) to close, you’ll knock out other connections to the same host.
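For reference, a typical multiplexing setup in ~/.ssh/config looks something like this (not what I run, just an illustration):

Host *
    ControlMaster  auto
    ControlPath    ~/.ssh/cm-%r@%h:%p
    ControlPersist 10m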

Well, I could connect via SSH just fine, using multiple simultaneous connections and opening / closing independently, so this clearly wasn’t an SSH connection problem. Time to look somewhere else.

Except I was wrong.

By ruling out SSH I skipped a critical, basic troubleshooting step: connecting to my new server via SFTP and issuing a couple of commands. When I finally tried it, after beating around the bush for far too long, I had my “duh!” moment.

Lo and behold, SFTP failed. I didn’t expect that could happen if SSH was fine. (In hindsight, this also explains the error message: the undefined value was the $sftp object itself, since no SFTP session was ever established.)

Of course, it failed with an inscrutable error in batch mode:

$ echo "ls /" | sftp -b - backup
Connection closed.
Connection closed

Using the -v option was hardly more informative:

$ echo "ls /" | sftp -v -b - backup
OpenSSH_9.0p1, OpenSSL 1.1.1q  5 Jul 2022
debug1: Reading configuration data ...
<snip>
subsystem request failed on channel 0
Connection closed.
Connection closed

Getting out of batch mode and trying to get a prompt gave me a little bit more information, but only just:

$ sftp -v backup
OpenSSH_9.0p1, OpenSSL 1.1.1q  5 Jul 2022
debug1: Reading configuration data ...
<snip>
debug1: Sending subsystem: sftp
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0
debug1: channel 0: free: client-session, nchannels 1
Transferred: sent 4528, received 4060 bytes, in 0.2 seconds
Bytes per second: sent 22612.3, received 20275.2
debug1: Exit status 127
debug1: compress outgoing: raw data 211, compressed 171, factor 0.81
debug1: compress incoming: raw data 918, compressed 764, factor 0.83
Connection closed.  
Connection closed

Most disconcerting. Fortunately, my Google-fu still had some juice today, and I learned something new from the search results.

The best clue was the exit status 127, which is the shell’s “command not found” status: https://serverfault.com/questions/770541/can-connect-via-ssh-but-not-via-sftp-exit-status-127

The Solution

sshd has an easily overlooked option for configuring SFTP: the SFTP server subsystem.

Subsystem sftp <subsystem>

You have three options:

  • internal-sftp, which uses code built into sshd
  • a user-defined external binary, e.g. /usr/lib64/misc/sftp-server
  • omitting the directive entirely, which leaves no subsystem (effectively denying SFTP on your server)

https://serverfault.com/questions/660160/openssh-difference-between-internal-sftp-and-sftp-server

The base Raspberry Pi image I used included the second option in its default sshd_config file, pointing at a non-existent binary. Weird choice, but OK.

Subsystem      sftp    /usr/lib64/misc/sftp-server

The internal subsystem is perfectly acceptable for many use-cases, and enabling it fixed my issue.

Subsystem      sftp    internal-sftp

SFTP now works, and testing shows that Rex can distribute files to my Pi server.
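If you edit sshd_config by hand like this, it’s worth validating the file before restarting sshd, then re-running the earlier batch-mode test (sshd -t checks the config and prints nothing on success; the service name may differ on your distro):

$ sudo sshd -t
$ sudo systemctl restart sshd
$ echo "ls /" | sftp -b - backup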

One could argue that Rex should catch an error like this and present something friendlier than a stack trace, but one could also argue that people should know what they’re doing when they stand up servers. OpenSSH doesn’t emit a useful error here either; something verbose (“SFTP server not active” or some such) would be a lot more helpful.

Perl: Spaces in a function or method name

I accidentally stumbled over an interesting ability/quirk in Perl: a subroutine / function / method name may contain spaces. Since I couldn’t find any info about it in the perlsub man page or on Google I decided to write it down.

It should be obvious that you can’t create such a subroutine by defining it the traditional way, but in case it isn’t: you can’t. Perl will consider the first word to be your subroutine identifier, and the following word(s) to be invalid keywords.

use strict;

sub name With Spaces
{
    print "works!\n"; # liar, doesn't work
}

Illegal declaration of subroutine main::name line 4.

NOTE: the following examples were tested in Perl versions 5.8.8 (circa 2006), 5.14.2 (circa 2011), and 5.28.2 (circa 2019).

To create a method name with a space, you have to manipulate the symbol table directly. (Indeed, I figured it out by accident thanks to an AUTOLOADed method that did that.)

our $AUTOLOAD; # populated by Perl with the fully-qualified method name

sub AUTOLOAD
{
    my $self = shift;

    ( my $method = $AUTOLOAD ) =~ s{.*::}{};

    if ( exists $self->{_attr}->{ $method } ) {
        my $accessor = sub { return shift->{_attr}->{ $method } };

        # installing the sub by name and jumping to it by name are
        # both symbolic references, so strict 'refs' is relaxed here
        no strict 'refs';
        *$AUTOLOAD = $accessor;

        unshift @_ => $self;
        goto &$AUTOLOAD;
    }

    return;
}

Stated more simply:

no strict 'refs';

my $name = "name With Space";
*$name = sub { "works!" }; # install directly into the symbol table

Utilities like Test::Deep “just work” if there’s a space:

cmp_methods( $obj,
             [ 'name With Space' => 'value' ], # not a peep!
             'Basic methods'
            );
ok 1 - Basic methods

The obvious question, though, is how to call it directly.

You can access a method using a variable, which is a pretty common thing to do on its own. (In my experience, anyway; YMMV.)

my $name = 'name With Space';
my $value = $obj->$name; # works!

You can also create a reference to a string and immediately dereference it.

my $value = $obj->${ \'name With Space' }; # works!

This trick works for regular functions as well, though calling a function by name is a symbolic reference, so strict ‘refs’ has to be relaxed there too. Here’s a stand-alone example:

use strict;

{
    no strict "refs";
    my $name = "name With Space";
    *$name = sub { "works!" };
}

{
    no strict "refs";
    print &{"name With Space"}(), "\n"; # prints "works!"
}

I can’t recommend creating subroutines with spaces in the name as good style, but it’s helpful to know that it can happen and how to work with it when it does.

Perl’s Open3, Re-Explained

I recently went spelunking into a core Perl module that I previously knew nothing about, IPC::Open3.  After fifteen years of developing in Perl I finally had a reason to use it.

If you’re reading this, it’s probably because you went looking for help with open3 after finding that the module’s documentation is bad.  I mean it’s really, really terrible.

Not only will you not know how to use open3 after reading the docs, you may become convinced that open3 isn’t the module you need, or that it would work but you’d be better off looking for something else because it’s too damn hard to use.

Fear not, intrepid reader, because if I can figure it out so can you.  But I will try to save you some of the legwork I went through.  There’s precious little information scattered online, because this isn’t a popular package.  My loss is your gain; hopefully this helps you.

Why IPC::Open3?

When Would I Use IPC::Open3?

open3 is used when you need to open three pipes to another process.  That might be obvious from the name as well as the package’s synopsis:

$pid = open3( \*CHLD_IN,
              \*CHLD_OUT,
              \*CHLD_ERR,
              'some cmd and args',
              'optarg', ...
            );

Why would you do that?  The most obvious situation is when you want to control STDIN, STDOUT, and STDERR simultaneously.  The example I provide below, which is not contrived by the way but adapted from real production code, does exactly that.

There Are Lots Of Modules To Make This Easier, Why Should I Use IPC::Open3?

IPC::Open3 is part of the Perl core.  There’s a lot to be said for using a library that’s already installed and doesn’t have external dependencies vs. pulling in someone’s write-once-read-never Summer of Code academic project.

In addition, the modules that I found only served to hide the complexity of Open3, but they did it badly and didn’t really remove much code compared to what I came up with.

What Else Do I Need?

One of the things that’s not obvious from the Open3 docs is that you’re not going to use IPC::Open3 by itself.  You need a couple of other packages (also core modules), IO::Handle and IO::Select, in order to use it effectively.

How I Used IPC::Open3

In our example, we’re going to fork a separate process (using open3) to encrypt a file stream using gpg.  gpg will accept a stream of data, encrypt it, and output to a stream.  We also want to capture errors sent to STDERR.

In a terminal, using bash, this would be really easy: gpg --encrypt < some_file > some_file.pgp 2> err.out

We could do all of this in Perl by writing temporary files, passing special file handle references to gpg as arguments, and capturing STDERR the old-fashioned way, all using a normal open().  But where’s the fun in that?
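For contrast, a rough sketch of that low-tech approach, leaning on the shell for redirection (paths and recipient are illustrative; you lose incremental streaming):

# the shell does the plumbing; STDERR lands in a file
open( my $gpg, '-|',
      'gpg --encrypt --recipient someone@example.com'
    . ' < some_file 2> err.out' )
    or die "can't run gpg: $!";

my $ciphertext = do { local $/; <$gpg> }; # slurp everything
close $gpg or warn "gpg exited non-zero: $?";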

First, let’s use the packages we’ll need:

use IO::Handle;
use IO::Select;
use IPC::Open3;

IO::Handle allows us to operate on handles using object methods.  I don’t typically use it, but this code really benefits from it.  IO::Select does the same for select(), and it helps even more than IO::Handle here.

use constant INPUT_BUF_SZ  => 2**12;
use constant OUTPUT_BUF_SZ => 2**20;

You might want to experiment to find the best buffer sizes.  The input buffer should not be larger than the pipe buffer on your particular system, or else you’ll block trying to put two pounds of bytes into a one-pound buffer.

Now, using IO::Handle we’ll create file handles for the stdin, stdout, and stderr that our forked process will read and write to:

my ( $in,
     $out,
     $err,
   ) = ( IO::Handle->new,
         IO::Handle->new,
         IO::Handle->new
       );
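I’ll assume @gpg_options holds whatever arguments you’d pass on the command line; for streaming encryption it might be something like this (the recipient is illustrative):

my @gpg_options = ( qw( --batch --yes --encrypt ),
                    '--recipient', 'someone@example.com',
                  );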

Call open3, which (like fork) gives us the PID of our new process.

Note: If we don’t call waitpid later on we’ll create a zombie after we’re done.

my $pid = open3( $in, $out, $err, '/usr/bin/gpg', @gpg_options );

if ( !$pid ) {
    die "failed to open pipe to gpg";
}

One of the features of IO::Select is that it lets us find out when a handle is ready, i.e. when reading or writing won’t block. This is important when the output stream is dependent on the input stream, and each stream depends on a pipe of limited size.

We’re going to repeatedly loop over the handles, looking for a stream that is active, and read/write a little bit before continuing to loop.  We do this until both our input and output are exhausted.  It’s pretty likely that they’ll be exhausted at different times; i.e. we’ll be done with the input sometime before we’re done with the output.

As we exhaust each handle we remove it from the selection of possible handles, so that the main loop terminates naturally.

The value passed to can_write and can_read is the number of seconds to wait for a handle to become ready.  Non-zero timeouts cause a noticeable delay, while omitting the value entirely blocks until a handle is ready, so for now we’ll leave it at zero.

# $unencrypted_fh and $encrypted_fh should be defined as
# handles to real files
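#
# for illustration they could be plain files, e.g.:
#
#   open( my $unencrypted_fh, '<:raw', 'some_file' )
#       or die "can't open source: $!";
#   open( my $encrypted_fh, '>:raw', 'some_file.pgp' )
#       or die "can't open destination: $!";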

my $sel = IO::Select->new;

$sel->add( $in, $out, $err );

# loop until we don't have any handles left

while ( my @handles = ( $sel->handles ) ) {
    # read until there's nothing left
    #
    # write in small chunks so we don't overfill the buffer
    # and accidentally cause the pipe to block, which will
    # block us
    while ( my @ready = ( $sel->can_write(0) ) ) {
        for my $fh ( @ready ) {
            if ( $fh == $in ) {
                # read a small chunk from your source data
                my $read = read( $unencrypted_fh,
                                 my $bytes,
                                 INPUT_BUF_SZ,
                               );

                # and write it to our forked process
                #
                # if we're out of bytes to read, close the
                # handle
                if ( !$read ) {
                    $sel->remove( $fh );
                    $fh->close;
                }
                else {
                    syswrite( $fh, $bytes );
                }
            }
            else {
                die "unexpected filehandle for input";
            }
        }
    }

    while ( my @ready = ( $sel->can_read(0) ) ) {
        # fetch the contents of STDOUT and send it to the
        # destination
        for my $fh ( @ready ) {
            # this buffer can be much larger, though in the
            # case of gpg it will generally be much smaller
            # than the input was. The process will block if
            # the output pipe is full, so you want to pull as
            # much out as you can.

            my $read = sysread( $fh, my $bytes, OUTPUT_BUF_SZ );

            if ( !$read ) {
                $sel->remove( $fh );
                $fh->close;
            }
            elsif ( $fh == $out ) {
                # $encrypted_fh is whatever we're throwing output
                # into

                syswrite( $encrypted_fh, $bytes ) if $read;
            }
            elsif ( $fh == $err ) {
                print STDERR $bytes;
            }
            else {
                die "unexpected filehandle for output";
            }
        }
    }

    # IO::Handle won't complain if we close a handle that's
    # already closed
    $sel->remove( $in ); $in->close;
    $sel->remove( $out ); $out->close;
    $sel->remove( $err ); $err->close;

    waitpid( $pid, 0 );
}

That’s actually about it.

I keep my buffer for input small, as pipe buffers tend to be small.  If you overload your pipe your program will hang indefinitely (or until an alarm goes off, if you set one).  4096 bytes seems to be the limit, though your own limit may be different.  When in doubt, be conservative and go smaller.

The output buffer can afford to be bigger, up to the limit of available memory (but don’t do that).  In our encryption example gpg will consume much more than it produces, so a larger buffer doesn’t really buy you anything; but if we were decrypting it would be the reverse, and a larger buffer would help immensely.