Transferring Large Files

Linux has an impressive tool set, if you know how to use it.  The philosophy of using simple tools that each do one job (but do it well), with the ability to chain commands together using pipes, creates a powerful system.

Everyone has to transfer large files across the network on occasion.  scp is an easy choice most of the time, but if you’re working with small or old machines the CPU will be a bottleneck due to encryption.

There are several alternatives to scp, if you don’t need encryption.  These aren’t safe on the open internet but should be acceptable on private networks.  TFTP and rsync come to mind, but they have their limitations.

  • tftp is generally limited to 4 gig files
  • rsync either requires setting up an rsync service, or piping through ssh

My new personal favorite is netcat-as-a-server.  It’s a little more complicated to set up than scp or ftp but wins for overall simplicity and speed of transfer.

netcat doesn’t provide much output, so we’ll put it together with pv (Pipe Viewer) to tattle on bytes read and written.

First, on the sending machine (the machine with the file), we’ll set up netcat to listen on port 4200, and pv will give us progress updates:
pv -pet really.big.file | nc -q 1 -l -p 4200

  • pv -p prints a progress bar, -e displays the ETA, -t shows the elapsed time
  • nc -q 1 quits 1 second after EOF, -l -p 4200 listens on port 4200

Without the -q switch, the sender will have to be killed with control-c or similar.

On the receiver (the machine that wants the file) netcat will read all bytes until the sender disconnects:
nc file.server.net 4200 | pv -b > really.big.file

  • nc will stream all bytes from file.server.net, port 4200
  • -b turns on the byte counter

Once the file is done transferring, both sides will shut down.
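
One thing netcat won’t do is tell you whether the file arrived intact.  Here is a minimal sketch of a sanity check, assuming sha256sum (from GNU coreutils) is installed on both machines.  The digest helper name is made up for illustration.  Run it on each side and compare the two hashes by eye:

```shell
# digest FILE: print only the SHA-256 hash of FILE (hypothetical helper).
digest() {
    # Reading from stdin keeps the file name out of sha256sum's output.
    sha256sum < "$1" | cut -d' ' -f1
}

# Example: make a small scratch file and print its hash.
sample=$(mktemp)
printf 'hello\n' > "$sample"
digest "$sample"
```

If the two hashes differ, the transfer was probably truncated or corrupted and is worth repeating.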

Dovecot woes

So after an upgrade, Dovecot failed to start:

Error: socket() failed: Address family not supported by protocol
Error: service(imap-login): listen(::, 143) failed: Address family not supported by protocol
Error: socket() failed: Address family not supported by protocol
Error: service(imap-login): listen(::, 993) failed: Address family not supported by protocol
Fatal: Failed to start listeners
* start-stop-daemon: failed to start `/usr/sbin/dovecot'
* ERROR: dovecot failed to start

How irritating.

A Google search wasn’t particularly conclusive, but “listen(::, 143) Address family not supported by protocol” gave me some idea that it might be complaining about IPv6 support.  I removed IPv6 support from the kernel and libraries some time ago.  (I don’t have a problem with IPv6 per se, but why have it when my ISP doesn’t support it?)  I’ve had Dovecot running well for ages without it, so what changed?

I found the culprit in /etc/dovecot/dovecot.conf:

# A comma separated list of IPs or hosts where to listen in for connections. 
# "*" listens in all IPv4 interfaces, "::" listens in all IPv6 interfaces.
# If you want to specify non-default ports or anything more complex,
# edit conf.d/master.conf.
#listen = *, ::

So the default configuration now enables IPv6. At least the fix is easy:

listen = *
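
If you’re not sure whether a machine has IPv6 at all, a quick check before editing the config can save some guessing.  This is a rough sketch, assuming a Linux system where /proc/net/if_inet6 exists only when the kernel’s IPv6 stack is available.  has_ipv6 is a made-up helper, and its optional argument is there only so the check can be pointed at a different path:

```shell
# has_ipv6 [PATH]: succeed if the IPv6 marker file exists (hypothetical helper).
# Assumption: on Linux, /proc/net/if_inet6 is present only with IPv6 support.
has_ipv6() {
    [ -e "${1:-/proc/net/if_inet6}" ]
}

if has_ipv6; then
    echo "IPv6 looks available: the default 'listen = *, ::' should work"
else
    echo "no IPv6 here: use 'listen = *' in dovecot.conf"
fi
```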

Nerd Poetry

< > ! * ' ' #
^ " ` $ $ -
! * = @ $ _
% * < > ~ # 4
& [ ] . . /
| { , , SYSTEM HALTED

Translation:

Waka waka bang splat tick tick hash,
Caret quote back-tick dollar dollar dash,
Bang splat equal at dollar under-score,
Percent splat waka waka tilde number four,
Ampersand bracket bracket dot dot slash,
Vertical-bar curly-bracket comma comma CRASH.

Vim and tabs

At work, we use vim for our editing needs – which, as programmers, means we spend our day in vim.

We have a lot of mixed-format code – sometimes it has tabs, sometimes it has spaces.  Personally, I prefer spaces for my indenting, but I’m a convert to the church of make-your-code-match-the-existing-code.  As opposed to reformatting the existing code, which is soooo irresistible until you’re faced with a mountain.

So, I have to switch back and forth a lot.  Here, for the sake of posterity, is how to switch from spaces to tabs on-the-fly in vim:

:set noexpandtab
:set copyindent
:set preserveindent
:set softtabstop=0
:set shiftwidth=4
:set tabstop=4
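
Before typing those in, it helps to know which style a file mostly uses.  Here’s a rough sketch from the shell side, counting lines that start with a tab versus lines that start with a space.  indent_style is a made-up helper name, and the heuristic is deliberately crude:

```shell
# indent_style FILE: print "tabs", "spaces", or "unclear" (hypothetical helper).
indent_style() {
    [ -r "$1" ] || return 2
    tab=$(printf '\t')
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    tabs=$(grep -c "^$tab" "$1" || true)
    spaces=$(grep -c '^ ' "$1" || true)
    if [ "$tabs" -gt "$spaces" ]; then
        echo tabs
    elif [ "$spaces" -gt "$tabs" ]; then
        echo spaces
    else
        echo unclear
    fi
}

# Example: two tab-indented lines and one space-indented line.
sample=$(mktemp)
printf '\tfoo\n\tbar\n  baz\n' > "$sample"
indent_style "$sample"    # prints: tabs
```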

Working with hidden files and directories

I had a problem.  My home directory is huge – 9 gigs – but I don’t know what’s taking up all that room. My porn stash is on another partition where my wife doesn’t know to look, so something is taking up a lot of room and I want to know what and why.

Oh yeah, I know how to check the size of a directory – use du ('du -sh .') for the usage of the current directory (including all subdirectories).  And, to see the size of every individual directory in the current directory, 'du -sh *'.  Easy peasy.

But that didn’t tell me what I needed to know, since the total size of all visible directories was less than a quarter of the used space.  That’s where hidden directories come into play.

Now, in the unix world, there isn’t a special file permission to hide a file or directory.  You just name it with a leading dot, like '.my_hidden_stuff', and most utilities won’t display it.  There’s nothing intrinsically hidden about it, though.  You can view them easily enough, e.g. 'ls -a' will show everything, including the “hidden” stuff.

What if you want to see just the hidden stuff?  It’s not as simple as saying 'ls -a .*', since that includes '.' (the current directory) and '..' (the parent directory), too.  Some utilities, like du, will then combine arguments with a common root, which means you get the summary for the current directory, but none of the hidden files broken out.

Solution

In bash, at least, you can use glob patterns (shell wildcards, which look a bit like simple regular expressions but aren’t) on the command line.  (Remember, in unix, your command line is pre-processed by the shell (bash, csh, tcsh, etc.) and the expanded items are given to the program.  In DOS/Windows, by contrast, the expansion and processing are the responsibility of the program, and command.com (or cmd.exe) does little processing itself.)

The pattern for all dot files, minus ‘.’ and ‘..’, is '.[^.]*' (which basically says, “start with a dot, and the next character must exist but cannot be a dot, and then anything goes after that”).

So, my command to see how much space each of my hidden directories is using is:

du -shx .[^.]*
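
To see what the pattern actually expands to without touching real files, here is a small sketch using a scratch directory.  It uses '.[!.]*', the POSIX spelling of the same pattern, so it also works in shells other than bash.  list_hidden and the file names are made up for the demo.  Note that the pattern skips names beginning with two dots, such as '..foo'.

```shell
# Build a scratch directory so the demo does not touch real files.
demo_dir=$(mktemp -d)
touch "$demo_dir/.alpha" "$demo_dir/.beta" "$demo_dir/visible"

# list_hidden DIR: print hidden entries in DIR, one per line,
# skipping "." and ".." (hypothetical helper; runs in a subshell).
list_hidden() (
    cd "$1" || exit 1
    for entry in .[!.]*; do
        # If nothing matched, the shell leaves the pattern as-is; skip it.
        if [ -e "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
)

list_hidden "$demo_dir"
# prints:
# .alpha
# .beta
```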

Five cents

I have ‘fortune’ run at login on all of my machines.  These are the fortunes I saw, one right after the other, while hopping from one machine to another:

Our country has plenty of good five-cent cigars, but the trouble is they charge fifteen cents for them.

Then, when I connected to the next machine…

What this country needs is a good five cent ANYTHING!

My computers are getting better at coordination.  I had best keep them happy.

Device is being exclusively used by the host computer

I say:

$ VBoxManage controlvm <vbox uuid> usbattach <device uuid>

VirtualBox says:

VBoxManage: error: USB device '<device>' with UUID <uuid> is being exclusively used by the host computer

I say:

$ gpasswd -a <vbox user> plugdev

and log out the vbox user completely (no vms running, no logged-in shell).

And now the device is available and add-able.
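
Group changes only take effect in new login sessions, which is why the full logout matters.  To confirm the gpasswd step took, here is a minimal sketch that asks the group database directly.  in_group is a made-up helper, and the example checks the current user rather than a real vbox account:

```shell
# in_group USER GROUP: succeed if USER is listed in GROUP
# according to the group database (hypothetical helper).
in_group() {
    for g in $(id -nG "$1"); do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Example: check the current user against plugdev.
user=$(id -un)
if in_group "$user" plugdev; then
    echo "$user is in plugdev"
else
    echo "$user is not in plugdev"
fi
```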

“device is busy…”

I’m working on a fun little project to set up a custom-made bootable usb key. But I ran into a little trouble after using a chroot, due to /dev and mount --rbind.

# mount -t proc none /mnt/gentoo/proc
# mount --rbind /sys /mnt/gentoo/sys
# mount --rbind /dev /mnt/gentoo/dev
# mount -t tmpfs tmpfs /mnt/gentoo/tmp
# mount -t tmpfs tmpfs /mnt/gentoo/var/tmp
# mount -t tmpfs tmpfs /mnt/gentoo/usr/src
# mount | grep gentoo
/dev/sde1 on /mnt/gentoo type ext2 (rw)
none on /mnt/gentoo/proc type proc (rw)
/sys on /mnt/gentoo/sys type none (rw,bind,rbind)
/dev on /mnt/gentoo/dev type none (rw,bind,rbind)
tmpfs on /mnt/gentoo/tmp type tmpfs (rw)
tmpfs on /mnt/gentoo/var/tmp type tmpfs (rw)
tmpfs on /mnt/gentoo/usr/src type tmpfs (rw)
# chroot /mnt/gentoo

All’s well, until it’s time to exit the chroot and unmount everything…

# umount /mnt/gentoo/usr/src /mnt/gentoo/var/tmp \
/mnt/gentoo/tmp /mnt/gentoo/sys /mnt/gentoo/proc \
/mnt/gentoo/dev /mnt/gentoo
umount: /mnt/gentoo/dev: device is busy.
       (In some cases useful info about processes that use
        the device is found by lsof(8) or fuser(1))

I don’t see anything mounted under there. Fuuuu…

# mount | grep gentoo
/dev/sde1 on /mnt/gentoo type ext2 (rw)
# lsof|grep gentoo
# fuser -m /mnt/gentoo
/mnt/gentoo:
#

Rebooting at this stage is inconvenient, but will certainly solve the problem. But what is left using /mnt/gentoo/dev?
Googling around, and seeing some people with similar problems, finally lit a light bulb above my head:

# cat /proc/mounts | awk '{print $2}' | grep gentoo
/mnt/gentoo
/mnt/gentoo/dev
/mnt/gentoo/dev/pts
/mnt/gentoo/dev/shm

So mount --rbind worked as advertised and recursively mounted /dev along with everything mounted beneath it.  And I didn’t realize that udev had mounted other things under /dev without updating /etc/mtab, which is why plain mount didn’t show them.  Sigh.

# umount /mnt/gentoo/dev/shm /mnt/gentoo/dev/pts /mnt/gentoo/dev /mnt/gentoo
#

And now I can get on with my life.
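
For next time, here is a rough sketch of a helper that lists everything mounted at or below a directory, deepest first, so the output is already in a safe order for umount.  mounts_under is a made-up name.  It reads /proc/mounts by default, and the optional second argument is there only so it can be tried against a saved copy.  It assumes mount points without spaces, since /proc/mounts escapes those.

```shell
# mounts_under ROOT [MOUNTS_FILE]: print mount points at or below ROOT,
# children before parents (hypothetical helper).
mounts_under() {
    awk -v root="$1" \
        '$2 == root || index($2, root "/") == 1 { print $2 }' \
        "${2:-/proc/mounts}" |
        LC_ALL=C sort -r   # reverse byte order puts /a/b before /a
}

# Example: list whatever is mounted under /mnt/gentoo (possibly nothing).
mounts_under /mnt/gentoo
```

With the four entries from the /proc/mounts listing above, this would print /mnt/gentoo/dev/shm, /mnt/gentoo/dev/pts, /mnt/gentoo/dev and /mnt/gentoo, in that order.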