The framework ‘Microsoft.NETCore.App’, version ‘6.0.0’ was not found.

New Linux box, old home directory.

Attempting to execute dotnet ef database update repeatedly failed with the error, “The framework ‘Microsoft.NETCore.App’, version ‘6.0.0’ was not found.” Individual dotnet commands (dotnet --version, dotnet build, etc.) were working, which made things very confusing.

Google was not my friend today: the error as a search term produced lots of noise, a few GitHub issues about old bugs, and red herrings.

I finally stumbled across the problem: the value of the environment variable DOTNET_ROOT was wrong. It was set to /opt/dotnet-sdk-bin-5.0, but the installed version was 6.0.

Version 5.0 had been installed initially, but I upgraded it during the same login session. While /etc/env.d/90dotnet-sdk-bin-6.0 was installed properly and contained the correct value, it would not take effect until I logged out or rebooted.
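If you hit the same thing, a quick check and a session-local fix look something like this (the SDK path is whatever your distribution actually installed; mine lives under /opt):

# what does the current shell think, and what is actually installed?
echo "$DOTNET_ROOT"
dotnet --list-sdks

# point this session at the right SDK until the next login picks up /etc/env.d
export DOTNET_ROOT=/opt/dotnet-sdk-bin-6.0
dotnet ef database update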

Shame on Microsoft for their terrible, uninformative errors.

Can’t call method “stat” on an undefined value

I use Rexify to manage my servers. (I’m learning to treat my servers like cattle, not pets.) I have it distributing standardized config files, running updates, and more.

The Problem

After adding a Raspberry Pi-based system to manage my backups, I was plagued with this error:

[2022-12-10 23:41:13] INFO - Running task Config:System:standard on backup
[2022-12-10 23:41:15] ERROR - Error executing task:
[2022-12-10 23:41:15] ERROR - Can't call method "stat" on an undefined value at /usr/lib64/perl5/vendor_perl/5.34/Rex/Interface/Fs/OpenSSH.pm line 70, <> line 144.
[2022-12-10 23:41:15] ERROR - 1 out of 1 task(s) failed: 
[2022-12-10 23:41:15] ERROR - Config:System:standard failed on backup
[2022-12-10 23:41:15] ERROR - Can't call method "stat" on an undefined value at /usr/lib64/perl5/vendor_perl/5.34/Rex/Interface/Fs/OpenSSH.pm line 70, <> line 144.

The error didn’t make much sense, and wasn’t consistent. Not all Rex commands triggered it. When I did run into it, the same command executed directly on the server worked just fine.

Just to be clear: the problem had nothing to do with the system architecture. The fact that it was a Raspberry Pi, or an ARM device at all, was a red herring. But this was the first RPi I had managed with Rex, and the only system with this problem, so I didn’t rule it out right away.

The Diagnosis

At first glance it appeared to be a bug in Rex, because that was the only thing that made sense. Why would it be complaining that it can’t call “stat” on an undefined file path, when the files are all clearly defined in my Rexfile configuration?

The line number referred to a place where Rex calls stat over SFTP. The actual value of $path was unknown to me, as I didn’t want to immediately jump into someone else’s code and start outputting debug info, but it seemed reasonable that it would be either the local or the remote file path, and both were clearly defined.

my $attr = $sftp->stat($path);

That seems pretty straightforward. If this were a bug in Rex, other people should be encountering it.

Googling the error was rather unhelpful. That specific error, with quotes to search exact matches, returned no hits. An inexact match returned nothing helpful. Nothing about Rex, just a bunch of other applications doing other things. Let’s set that assumption aside and look elsewhere.

I could run simple commands on the server using Rex, like ‘uptime’, but more complicated things were a problem. The only common thread so far involved putting files. Could it be an issue with sending and receiving files?

Rex is configured to connect as root, there aren’t other users on the system (besides default and service accounts) to accidentally connect as, and the paths are all regular filesystem paths (not /dev or similar), so it’s almost certainly not a permissions issue.

Maybe a second SSH connection couldn’t be established; SFTP establishes an SSH connection and tunnels across it, after all.

I don’t set a common ControlMaster and ControlPath setting in my config, but I know from experience that if you do, and you force the first (“master”) to close, you’ll knock out other connections to the same host.
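For reference, that kind of connection sharing is configured in ~/.ssh/config with something like the following (not what I run here; just to illustrate the options involved):

Host *
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h:%p
    ControlPersist 10m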

Well, I could connect via SSH just fine, using multiple simultaneous connections and opening / closing independently, so this clearly wasn’t an SSH connection problem. Time to look somewhere else.

Except I was wrong.

By ruling out SSH I had skipped a critical, basic troubleshooting step: connect to my new server via SFTP and issue a couple of commands. When I finally tried it, after beating around the bush for far too long, I had my “duh!” moment.

Lo and behold, SFTP failed. I didn’t expect that could happen if SSH was fine.

Of course, it failed with an inscrutable error in batch mode:

$ echo "ls /" | sftp -b - backup
Connection closed.
Connection closed

Using the -v option was hardly more informative:

$ echo "ls /" | sftp -v -b - backup
OpenSSH_9.0p1, OpenSSL 1.1.1q  5 Jul 2022
debug1: Reading configuration data ...
<snip>
subsystem request failed on channel 0
Connection closed.
Connection closed

Getting out of batch mode and trying to get a prompt gave me a little bit more information, but only just:

$ sftp -v backup
OpenSSH_9.0p1, OpenSSL 1.1.1q  5 Jul 2022
debug1: Reading configuration data ...
<snip>
debug1: Sending subsystem: sftp
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0
debug1: channel 0: free: client-session, nchannels 1
Transferred: sent 4528, received 4060 bytes, in 0.2 seconds
Bytes per second: sent 22612.3, received 20275.2
debug1: Exit status 127
debug1: compress outgoing: raw data 211, compressed 171, factor 0.81
debug1: compress incoming: raw data 918, compressed 764, factor 0.83
Connection closed.  
Connection closed

Most disconcerting. Fortunately, my Google-fu still had some juice today, and I learned something new from the search results.

The best clue was the exit status 127: https://serverfault.com/questions/770541/can-connect-via-ssh-but-not-via-sftp-exit-status-127

The Solution

SSH has something that appears to be a “legacy” option for configuring SFTP: the SFTP server subsystem.

Subsystem sftp <subsystem>

You have three options:

  • internal-sftp, which uses code built into SSH
  • a user-defined external binary, e.g. /usr/lib64/misc/sftp-server
  • Omitting the option entirely, choosing no subsystem (effectively disabling SFTP on your server)

https://serverfault.com/questions/660160/openssh-difference-between-internal-sftp-and-sftp-server

The base Raspberry Pi image I used included the second option in the default sshd_config file, pointing to a non-existent binary. Weird choice, but OK.

Subsystem      sftp    /usr/lib64/misc/sftp-server

The internal subsystem is perfectly acceptable for many use-cases, and enabling it fixed my issue.

Subsystem      sftp    internal-sftp

SFTP now works. Testing shows that Rex can distribute files to my Pi server.
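For completeness, the whole fix amounts to editing sshd_config on the Pi, reloading sshd, and re-running the earlier batch test (the service name and init commands vary by distribution):

# on the Pi, after changing the Subsystem line
rc-service sshd reload      # or: systemctl reload ssh

# back on my workstation: the batch test now returns a listing instead of dying
echo "ls /" | sftp -b - backup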

One could argue that Rex should catch an error like this and present something friendlier than a stack trace, but one could also argue that people should know what they’re doing when they stand servers up. SSH/SFTP doesn’t emit a useful error here either; something more verbose (“SFTP server not active” or some such) would be a lot more helpful.

Recursively delete directories unless a specific file is present

There are several ways to do this, but my Google-fu may be weak because it took me much too long to figure this out.

I want to recursively delete directories with a specific name (or names) within a directory structure, UNLESS the matched directory contains a sentinel file.

In my case I want to make a C# directory structure “cleaner-than-clean” by removing all ‘bin’ and ‘obj’ directories, leaving just the user-generated files behind.  This is pretty easy to achieve:

#!/usr/bin/env bash

dir=/path/to/project

find "$dir" -type d \
    \( -name 'bin' -o -name 'obj' \) \
    -print

This says “find things under $dir that are directories (-type d) and are named either ‘bin’ (-name 'bin') or (-o) ‘obj’ (-name 'obj')”. The parentheses force the two -name tests to be considered as a single condition, so the effect is to return true if either name matches. If the final result is true, print the path.

Notice that I’ve escaped (\) the parentheses because I’m using bash. Most UNIX shells require these to be escaped, but yours may not. I’ve also escaped the newline at the end of each line; a single command may be spread over several lines this way, making it easier to read.

‘bin’ is also the conventional name for a directory of non-build executables, like helper scripts.  I do have some, including this cleaner-than-clean cleaning script that I’m working out, and don’t want to delete those by accident. The above command would find them, if they were in the directory tree.

find allows you to prune (-prune) the search tree, ignoring selected directories according to certain criteria, but it doesn’t support peeking into a sub-directory before deciding. Bummer.

You may, however, execute independent commands (-exec) and use the results of those commands to affect find‘s parameters, including -prune. We can exec the test command, which can tell us if our sentinel file exists.

#!/usr/bin/env bash

dir=/path/to/project
sentinel=.keep

find "$dir" \
    -type d \
    \( -name bin -o -name obj \) \
    ! -exec test -e "{}/$sentinel" ';' \
    -print

The new line executes test to see if the current path ({}) contains a file called $sentinel (I’ve defined $sentinel to be .keep, but any filename will do), which returns true if it exists. The test is negated (!), so if the sentinel is found the remaining actions are skipped.

The final step is to actually delete the directory. We call rm -Rf (-R = recursive, -f = force) because we just want the whole thing gone, no questions asked. The trailing plus (+) tells find that rm can accept multiple paths in a single call, rather than calling rm once for each path.

#!/usr/bin/env bash

dir=/path/to/project
sentinel=.keep

find "$dir" \
    -type d \
    \( -name bin -o -name obj \) \
    ! -exec test -e "{}/$sentinel" ';' \
    -print \
    -exec rm -Rf '{}' \+
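As a usage sketch: drop the sentinel into any ‘bin’ you want to keep, then run the script (the script name and paths here are just examples):

# protect my helper-script bin, then clean everything else
touch /path/to/project/tools/bin/.keep
bash clean-harder.sh    # hypothetical name; each doomed directory is printed before removal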

CNAMEs in Samba

I’m documenting something that wasn’t easy to uncover.

TL;DR – if you want to create a CNAME in Samba to replace an existing DNS record, you must delete the A record first.

Background

I have an Active Directory domain running on Samba.  I’ve had an underpowered file server, simply called ‘files’, for a while.  I finally had a chance to upgrade it to some newer hardware with a rather large SSD.

Since this, like all my home projects, is a side-project that takes several days to complete, I chose to build the new server (‘concord’) and get it running while leaving ‘files’ in place.

I like to have servers named after their roles, because it makes things easy, but we have a lot more computers than formal roles in the house.  We’ve finally settled on a naming convention: Windows names are places in Washington, Apple products are from California, and Linux products are from Massachusetts.  (I am aware that Unix was birthed in New Jersey but… Ew.  At least X came from MIT, that’s good enough for me.)

I also have a number of dependencies on the name ‘files’ including, most crucially, my own brain.  Muscle memory is hard to overcome (“ls /net/files/… damn ^H/net/concord/…”) and I don’t want to relearn a server name.

That left me with three problems to solve: follow the naming standard, use a “taken” name for the server, and build said server while the needed name is still available on the network.

The obvious answer is to use CNAMEs.  I planned to set up ‘files’ as an alias to ‘concord’.  Similar practice would carry us forward through an indefinite number of role-swaps in the future.

After copying all of our data from ‘files’ to ‘concord’ I confidently shut ‘files’ down and added my CNAME.  This is where things went wrong.

The Problem

After shutting ‘files’ down, I started by creating the CNAME:

dc1 # samba-tool dns add 192.168.1.2 ad.jonesling.us files CNAME concord.ad.jonesling.us -U administrator
Password for [AD\administrator]: ******
Record added successfully

That’s all well and good.  Let’s test it out from another computer:

natick $ nslookup
> files
Server:     dc1
Address:    2001:470:1f07:583:44a:52ff:fe4a:8cee#53

Name:   files.ad.jonesling.us
Address: 192.168.1.153
files.ad.jonesling.us   canonical name = concord.ad.jonesling.us.

Crap.  That’s the correct canonical name, but the wrong IP address – it’s the old address of ‘files’.

Some googling uncovered someone with a similar issue back in 2012, but they “solved” it by creating static A records instead.  That’s not a great solution, certainly not what I want.

I thought about it for a few minutes.  I got a success message, but was the record actually created?  How can I tell?  What happens if I insert it again?

dc1 # samba-tool dns add 192.168.1.2 ad.jonesling.us files CNAME concord.ad.jonesling.us -U administrator
Password for [AD\administrator]: ******

ERROR(runtime): uncaught exception - (9711, 'WERR_DNS_ERROR_RECORD_ALREADY_EXISTS')
  File "/usr/lib/python3.7/site-packages/samba/netcmd/__init__.py", line 186, in _run
    return self.run(*args, **kwargs)
  File "/usr/lib/python3.7/site-packages/samba/netcmd/dns.py", line 945, in run
    raise e
  File "/usr/lib/python3.7/site-packages/samba/netcmd/dns.py", line 941, in run
    0, server, zone, name, add_rec_buf, None)

Well, it was inserted somewhere, that much is clear.
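In hindsight, samba-tool can also show what is stored for the name directly, which would have revealed both records at once. Roughly, against the same server and zone as above:

dc1 # samba-tool dns query 192.168.1.2 ad.jonesling.us files ALL -U administrator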

What happens if I dig it?  nslookup gave us a canonical name, but I want to see the actual DNS records.  Maybe they contain a clue.

First, let’s dig the CNAME:

dc1 # dig @dc1 files.ad.jonesling.us IN CNAME

; <<>> DiG 9.14.8 <<>> @dc1 files.ad.jonesling.us IN CNAME
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10370
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 7a0aa65a623d5d3bdbdc39075f2eff9d5b81dbd9ed05c9d0 (good)
;; QUESTION SECTION:
;files.ad.jonesling.us. IN CNAME

;; ANSWER SECTION:
files.ad.jonesling.us. 900 IN CNAME concord.ad.jonesling.us.

;; Query time: 8 msec
;; SERVER: 192.168.1.2#53(192.168.1.2)
;; WHEN: Sat Aug 08 15:40:13 EDT 2020
;; MSG SIZE rcvd: 100

The line in the ANSWER SECTION shows the alias.  That looks right.

But what about ‘files’?

dc1 # dig @dc1 files.ad.jonesling.us

; <<>> DiG 9.14.8 <<>> @dc1 files.ad.jonesling.us
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42296
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 0352365b5c07ecdace1ebf3c5f2effa6da5d32bfe9002b32 (good)
;; QUESTION SECTION:
;files.ad.jonesling.us. IN A

;; ANSWER SECTION:
files.ad.jonesling.us. 3600 IN A 192.168.1.153

;; Query time: 8 msec
;; SERVER: 192.168.1.2#53(192.168.1.2)
;; WHEN: Sat Aug 08 15:40:22 EDT 2020
;; MSG SIZE rcvd: 94

Ah.  That looks like a conflict.  Both records exist, and one has primacy over the other.

‘files’ was assigned an address via DHCP (I never gave it a static address), so I didn’t expect that I would need to delete anything.  Thinking about it, though, I realized that Samba doesn’t know that ‘files’ isn’t coming back.  (That makes me wonder what kind of graveyard DNS becomes, with friends’ phones and laptops popping in from time to time.)

So, can we delete the old A record, and what happens if we do?

The Solution

We delete the address.  It looks like it’s working:

dc1 # samba-tool dns delete 192.168.1.2 ad.jonesling.us files A 192.168.1.153 -U administrator
Password for [AD\administrator]:
Record deleted successfully

Was that the problem all along?

dc1 # dig @dc1 files.ad.jonesling.us

; <<>> DiG 9.14.8 <<>> @dc1 files.ad.jonesling.us
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38286
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 1610fb8ec07db8e3a43976ed5f2effdffeb142b30ca93848 (good)
;; QUESTION SECTION:
;files.ad.jonesling.us. IN A

;; ANSWER SECTION:
files.ad.jonesling.us. 900 IN CNAME concord.ad.jonesling.us.
concord.ad.jonesling.us. 3600 IN A 192.168.1.82

;; Query time: 15 msec
;; SERVER: 192.168.1.2#53(192.168.1.2)
;; WHEN: Sat Aug 08 15:41:20 EDT 2020
;; MSG SIZE rcvd: 116

That looks pretty good!

Securing WordPress: The Basics

This is the first in an occasional series of documents on WordPress.


WordPress is ubiquitous but fragile.  There are few alternatives that provide the easy posting, wealth of plugins, and integration of themes, while also being (basically) free to use.

It’s also a nerve-wracking exercise in keeping bots and bad actors out.  Some of the historical security holes are legendary.  It doesn’t take long to find someone who experienced a site where the comments section was bombed by a spammer, or even outright defacement.  (I will reluctantly raise my own hand, having experienced both in years past.)

Most people that use WordPress nowadays rely on 3rd parties to host it.  This document isn’t for them; hosted security is mostly outside of your control.  That’s generally a good thing: professionals are keeping you up to date and covered by best practices.

The rest of us muddle through security and updates in piece-meal fashion, occasionally stumbling over documents like this one.

Things To Look Out For

As a rule, good server hygiene demands that you keep an eye on your logs.  Tools like goaccess help you analyze usage, but nothing beats a peek at the raw logs for noticing issues cropping up.

The Good Bots

Sleepy websites like mine show a high proportion of “good” bots like Googlebot, compared to human traffic.  They’re doing good things like crawling (indexing) your site.

In my case they are the primary visitor base to my site, generating hundreds or even thousands of individual requests per day.  Hopefully your own WordPress site has a better visitor-to-bot ratio than mine.

We don’t want to block these guys from their work; they’re actually helpful.

The Bad Bots

You’ll also see bad bots, possibly lots of them.  Most are attempting to guess user credentials so they can post things on your WordPress site.

Some are fairly up-front about it:

...
132.232.47.138 [07:51:14] "POST /xmlrpc.php HTTP/1.1"
132.232.47.138 [07:51:14] "POST /xmlrpc.php HTTP/1.1"
132.232.47.138 [07:51:15] "POST /xmlrpc.php HTTP/1.1"
132.232.47.138 [07:51:16] "POST /xmlrpc.php HTTP/1.1"
132.232.47.138 [07:51:16] "POST /xmlrpc.php HTTP/1.1"
132.232.47.138 [07:51:18] "POST /xmlrpc.php HTTP/1.1"
...

They’ll hammer your server like that for hours.

Blocking their individual IP addresses at the firewall is devastatingly effective… for about five minutes.  Another bot from another IP will pop up soon.  Blocking individual IPs is a game of whack-a-mole.

Some are part of a “slow” botnet, hitting the same page from a unique IP address each time.  These are part of the large botnets you read about.

83.149.124.238 [05:01:06] "GET /wp-login.php HTTP/1.1" 200
83.149.124.238 [05:01:06] "POST /wp-login.php HTTP/1.1" 200
188.163.45.140 [05:03:38] "GET /wp-login.php HTTP/1.1" 200
188.163.45.140 [05:03:39] "POST /wp-login.php HTTP/1.1" 200
90.150.96.222 [05:04:30] "GET /wp-login.php HTTP/1.1" 200
90.150.96.222 [05:04:32] "POST /wp-login.php HTTP/1.1" 200
178.89.251.56 [05:04:42] "GET /wp-login.php HTTP/1.1" 200
178.89.251.56 [05:04:43] "POST /wp-login.php HTTP/1.1" 200

These are more insidious: patient and hard to spot on a heavily-trafficked blog.

Keeping WordPress Secure

You (hopefully) installed WordPress to a location outside of your “htdocs” document tree.  If not, you should fix that right away!  (Consider this “security tip #0” because without this you’re basically screwed.)

Security tip #1 is to make sure auto updates are enabled.  The slight risk of a botched release being automatically applied is much lower than the risk of a critical security patch being applied too late.

Running old software is like putting a medieval lock on your front door: it offers little real security.

Once an exploit is patched, the prior releases become targets as people deconstruct the patch and reverse-engineer the exploit(s), assuming an exploit wasn’t published before the patch was released.

Locking WordPress Down

Your Apache configuration probably contains a section similar to this:

<Directory "/path/to/wordpress">
    ...
    Require all granted
    ...
</Directory>

We’re going to add some items between the <Directory></Directory> tags to restrict access to the most vulnerable pieces.

You Can’t Attack Things You Can’t Reach

We’ll start by invoking the Principle of Least Privilege: people should only be able to do the things they must do, and nothing more.

xmlrpc.php is an API for applications to talk to WordPress.  Unfortunately it doesn’t carry any extra security, so if you’re a bot it’s a great target to hammer with password guesses: you won’t be blocked, and no one will be alerted.

Most people don’t need it.  Unless you know you need it, you should disable it completely.

<Directory "/path/to/wordpress">
    ...
    <Files xmlrpc.php>
        <RequireAll>
            Require all denied
        </RequireAll>
    </Files>
</Directory>

There are WordPress plugins that purport to “disable” xmlrpc.php, but they deny access from within WordPress.  That means that you’ve still paid a computational price for executing xmlrpc.php, which can be steeper than you expect, and you’re still at risk of exploitable bugs within it.  Denying access to it at the server level is much safer.

You Can’t Log In If You Can’t Reach the Login Page

This next change will block anyone from outside your LAN from logging in.  That means that if you’re away from home you won’t be able to log in, either, without tunneling back home.

<Directory "/path/to/wordpress">
    ...
    <Files wp-login.php>
        <RequireAll>
            Require all granted
            # remember that X-Forwarded-For may contain multiple
            # addresses, don't just search for ^192...
            Require expr %{HTTP:X-Forwarded-For} =~ /\b192\.168\.1\./
        </RequireAll>
    </Files>
</Directory>

If you’re not using a public-facing proxy, and don’t need to look at X-Forwarded-For, you can simplify this a little:

<Directory "/path/to/wordpress">
    ...
    <Files wp-login.php>
        <RequireAll>
            Require all granted
            Require ip 192.168.1
        </RequireAll>
    </Files>
</Directory>

This will also prevent third parties from signing up on your blog and submitting comments, which may be important to you.

Restart Apache

After inserting these blocks, you should execute Apache’s ‘configtest’ followed by reload:

$ sudo apache2ctl configtest
apache2      | * Checking apache2 configuration ...     [ ok ]
$ sudo apache2ctl reload
apache2      | * Gracefully restarting apache2 ...      [ ok ]

Now test your changes from outside your network:

[Screenshot: the browser now reports xmlrpc.php as forbidden.]
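A quick spot-check from an outside machine should now come back with a 403 (substitute your own domain):

$ curl -sI https://blog.example.com/xmlrpc.php | head -n 1
HTTP/1.1 403 Forbidden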

Apache’s access log should show a ‘403’ (Forbidden) status:

... "GET /xmlrpc.php HTTP/1.1" 403 ...

And just like that, you’ve made your WordPress blog a lot more secure.

Interestingly, by making just these changes on my own site the attacks immediately dropped off by 90%.  I guess that the better-written bots realized that I’m not a good target anymore and stopped wasting their time, preferring lower-hanging fruit.

Bypassing a Tunnel-Broker IPv6 Address For Netflix

Surprisingly, it worked beautifully… that is, until I discovered an unintended side effect

My ISP is pretty terrible, but living in the United States, as I do, internet service is effectively a regional monopoly.  In my case, not only do I pay too much for service, but certain websites (cough google.com cough) are incredibly slow for no reason other than my ISP is a dick and won’t peer with them properly.

This particular ISP, despite being very large, has so far refused to roll out IPv6.  This was annoying until I figured out that I could use this to my advantage.  If they won’t peer properly over IPv4, maybe I can go through a tunnel broker to get IPv6 and route around them.  Surprisingly, it worked beautifully.  GMail has never loaded so fast at home.

It was beautiful, that is, until I discovered an unintended side effect: Netflix stopped working.

[Screenshot: Netflix error, “You seem to be using an unblocker or proxy.”]  Despite my brokered tunnel terminating inside the United States, Netflix suspects me of coming from outside the United States.

A quick Google search confirmed my suspicion.  Netflix denies access to known proxies, VPNs, and, sadly, IPv6 tunnel brokers.  My brave new world was about to become somewhat less entertaining if I couldn’t fix this.

Background

Normally a DNS lookup returns both A (IPv4) and AAAA (IPv6) records together:

$ nslookup google.com
Server:     192.168.1.2
Address:    192.168.1.2#53

Non-authoritative answer:
Name:   google.com
Address: 172.217.12.142
Name:   google.com
Address: 2607:f8b0:4006:819::200e

Some services will choose to provide multiple addresses for redundancy; if the first address doesn’t answer then your computer will automatically try the next in line.

Netflix in particular will return a large number of addresses:

$ nslookup netflix.com 8.8.8.8
Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
Name: netflix.com
Address: 54.152.239.3
Name: netflix.com
Address: 52.206.122.138
Name: netflix.com
Address: 35.168.183.177
Name: netflix.com
Address: 54.210.113.65
Name: netflix.com
Address: 52.54.154.226
Name: netflix.com
Address: 54.164.254.216
Name: netflix.com
Address: 54.165.157.123
Name: netflix.com
Address: 107.23.222.64
Name: netflix.com
Address: 2406:da00:ff00::3436:9ae2
Name: netflix.com
Address: 2406:da00:ff00::6b17:de40
Name: netflix.com
Address: 2406:da00:ff00::34ce:7a8a
Name: netflix.com
Address: 2406:da00:ff00::36a5:f668
Name: netflix.com
Address: 2406:da00:ff00::36a5:9d7b
Name: netflix.com
Address: 2406:da00:ff00::23a8:b7b1
Name: netflix.com
Address: 2406:da00:ff00::36d2:7141
Name: netflix.com
Address: 2406:da00:ff00::36a4:fed8

The Solution

The key is to have your local DNS resolver return A records, but not AAAA, if (and only if) it’s one of Netflix’s hostnames.

Before I document the solution, it helps to know my particular setup and assumptions:

  • IPv6 via a tunnel broker
  • BIND’s named v9.14.8

Earlier versions of BIND are configured somewhat differently: you may have different options, or (if it’s a really old build) you may need to run two separate named instances.  YMMV.

Step 0: Break Out Your Zone Info (optional but recommended)

If your zone info is part of named.conf you really should put it into its own file for easier maintenance and re-usability. The remaining instructions won’t work, without modification, if you don’t.

# /etc/bind/local.conf
zone "." in {
        type hint;
        file "/var/bind/named.cache";
};

zone "localhost" IN {
        type master;
        file "pri/localhost.zone";
        notify no;
};

# 127.0.0. zone.
zone "0.0.127.in-addr.arpa" {
        type master;
        file "pri/0.0.127.zone";
};

Step 1: Add a New IP Address

You can run a single instance of named but you’ll need at least two IP addresses to handle responses.

In this example the DNS server’s “main” IP address is 192.168.1.2 and the new IP address will be 192.168.1.3.

How you do this depends on your distribution. If you’re using openrc and netifrc then you only need to modify /etc/conf.d/net:

# Gentoo and other netifrc-using distributions
config_eth0="192.168.1.2/24 192.168.1.3/24"
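If you just want to try it out before making the change permanent, the address can also be added on the fly (non-persistent; it disappears on reboot, and the interface name here is an example):

# temporary second address for testing
ip addr add 192.168.1.3/24 dev eth0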

Step 2: Listen To Your New Address

Add your new IP address to your listen-on directive, which is probably in /etc/bind/named.conf:

listen-on port 53 { 127.0.0.1; 192.168.1.2; 192.168.1.3; };

It’s possible that your directive doesn’t specify the IP address(es) and/or you don’t even have a listen-on directive – and that’s ok. From the manual:

The server will listen on all interfaces allowed by the address match list. If a port is not specified, port 53 will be used… If no listen-on is specified, the server will listen on port 53 on all IPv4 interfaces.

https://downloads.isc.org/isc/bind9/9.14.8/doc/arm/Bv9ARM.ch05.html

Everything I just said also applies to listen-on-v6.

Step 3: Filter Query Responses

Create a new file called /etc/bind/limited-ipv6.conf and add the following at the top:

view "internal-ipv4only" {
        match-destinations { 192.168.1.3; };
        plugin query "filter-aaaa.so" {
                # don't return ipv6 addresses
                filter-aaaa-on-v4 yes;
                filter-aaaa-on-v6 yes;
        };
};

This block says: if a request comes in on the new address, pass it through the filter-aaaa plugin.

We’re configuring the plugin to filter AAAA records out of replies to IPv4 clients (filter-aaaa-on-v4) and IPv6 clients (filter-aaaa-on-v6).

Now add a new block after the first block, or modify your existing default view:

# forward certain domains back to the ipv4-only view
view "internal" {
        include "/etc/bind/local.conf";

        # AAAA zones to ignore
        zone "netflix.com" {
                type forward;
                forward only;
                forwarders { 192.168.1.3; };
        };
};

This is the default view for internal clients. Requests that don’t match preceding views fall through here.

We’re importing the local zone from step 0 (so we don’t have to maintain two copies of the same information), then forwarding all netflix.com look-ups to the new IP address, which will be handled by the internal-ipv4only view.

Step 4: Include the New Configuration File

Modify /etc/bind/named.conf again, so we’re loading the new configuration file (which includes local.conf).

#include "/etc/bind/local.conf";
include "/etc/bind/limited-ipv6.conf";

Restart named after you make this change.
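The exact commands depend on your init system; something along these lines works, and named-checkconf catches syntax errors before you take your resolver down:

# sanity-check the configuration, then restart
named-checkconf
rc-service named restart    # or: systemctl restart named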

Testing

nslookup can help you test and troubleshoot.

In the example below we call the “normal” service and get both A and AAAA records, but when we call the ipv4-only service we only get A records:

$ nslookup google.com 192.168.1.2
Server:         192.168.1.2
Address:        192.168.1.2#53

Non-authoritative answer:
Name:   google.com
Address: 172.217.3.110
Name:   google.com
Address: 2607:f8b0:4006:803::200e

$ nslookup google.com 192.168.1.3
Server:         192.168.1.3
Address:        192.168.1.3#53

Non-authoritative answer:
Name:   google.com
Address: 172.217.3.110

 

Failed to retrieve directory listing

[Screenshot: FileZilla connection log ending in the opaque “Failed to retrieve directory listing” error.]

I occasionally run a local vsftpd daemon on my development machine for testing.  I don’t connect to it directly; it exists to support unit tests that need an FTP connection.  No person connects to it, least of all me, and the scripts that do connect are looking at small, single-use directories.

I needed to test a new feature: FTPS, aka FTP with SSL (not to be confused with SFTP, a very different beast).  Several of our vendors will be requiring it soon; frankly, I’m surprised they haven’t required it sooner.  But I digress.

To start this phase of the project I needed to make sure that my local vsftp daemon supports FTPS so that I can run tests against it.  So I edit /etc/vsftpd/vsftpd.conf to add some lines to my config, and restart:

rsa_cert_file=/etc/ssl/private/vsftpd.pem
rsa_private_key_file=/etc/ssl/private/vsftpd.pem
ssl_enable=YES

But Filezilla bombs with an opaque error message:

Status: Resolving address of localhost
Status: Connecting to 127.0.0.1:21...
Status: Connection established, waiting for welcome message...
Status: Initializing TLS...
Status: Verifying certificate...
Status: TLS connection established.
Status: Logged in
Status: Retrieving directory listing...
Command: PWD
Response: 257 "/home/dad" is the current directory
Command: TYPE I
Response: 200 Switching to Binary mode.
Command: PASV
Response: 227 Entering Passive Mode (127,0,0,1,249,239).
Command: LIST
Response: 150 Here comes the directory listing.
Error: GnuTLS error -15: An unexpected TLS packet was received.
Error: Disconnected from server: ECONNABORTED - Connection aborted
Error: Failed to retrieve directory listing

I clue in pretty quickly that “GnuTLS error -15: An unexpected TLS packet was received” is actually a red herring, so I drop the SSL from the connection and get a different error:

Response: 150 Here comes the directory listing.
Error: Connection closed by server
Error: Failed to retrieve directory listing

Huh, that’s not particularly helpful; shame on you, Filezilla.  I drop down further to a command-line FTP client to get the real error:

$ ftp localhost
Connected to localhost.
220 (vsFTPd 3.0.3)
Name (localhost:dad): 
530 Please login with USER and PASS.
530 Please login with USER and PASS.
SSL not available
331 Please specify the password.
Password:
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> ls
200 PORT command successful. Consider using PASV.
150 Here comes the directory listing.
421 Service not available, remote server has closed connection
ftp> quit

Ah.  Now we’re getting somewhere.

A quick perusal turned up a stackexchange answer with the assertion that “the directory causing this behaviour had too many files in it (2,666).”  My own directory is much smaller, about a hundred files.  According to this bug report, however, the real maximum may be as few as 32 files.  It’s not clear to me whether this is a kernel bug, a vsftpd bug, or just a bad interaction between recent kernels and vsftpd.

Happily, there is a work-around: add “seccomp_sandbox=NO” to vsftpd.conf.

Since vsftpd’s documentation is spare, and actual examples are hard to come by, here’s my working config:

listen=YES
local_enable=YES
write_enable=YES
chroot_local_user=YES
allow_writeable_chroot=YES
seccomp_sandbox=NO
ssl_enable=YES
rsa_cert_file=/etc/ssl/private/vsftpd.pem
rsa_private_key_file=/etc/ssl/private/vsftpd.pem
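With that config in place, a quick FTPS smoke test from the command line looks something like this (‘dad’ is just my local user, and -k accepts the self-signed certificate):

# explicit FTPS: require TLS on a plain FTP URL, prompt for the password
curl -k --ssl-reqd -u dad ftp://localhost/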

vim, screen, and bracketed paste mode

A little while back an update was introduced, somewhere, that has been driving me nuts.  I didn’t record exactly when it happened or what changed.  I suppose it doesn’t matter now.

The behavior wasn’t easy to pin down at first since it was the confluence of several things: 1) pasting 2) into vim while 3) using a non-xterm terminal like mate-terminal and 4) inside a screen session.

The behavior exhibits in several ways:

  • Pastes appear to be incomplete, or (more correctly) some number of characters at the beginning of the paste go “missing” and actually become commands to vim
  • Pastes are complete but they’re bracketed with \e[200~content\e[201~
    • some people report 0~content1~ instead, but it appears to be the same phenomenon

What’s going on?  It’s a feature called “bracketed paste mode”.  You can google it and read up on it; it has some utility.  As far as I can tell it’s related to readline.  But more importantly, there is a fix.

Add this to your ~/.vimrc:

" fix bracketed paste mode
if &term =~ "screen"
  let &t_BE = "\e[?2004h"
  let &t_BD = "\e[?2004l"
  exec "set t_PS=\e[200~"
  exec "set t_PE=\e[201~"
endif

source: https://vimhelp.appspot.com/term.txt.html#xterm-bracketed-paste

WordPress Error: cURL error 6: Couldn’t resolve host ‘dashboard.wordpress.com’

Background:

I maintain a WordPress blog that uses Jetpack’s Stats package.

Issue:

We started getting this error message when opening the ‘Stats’ page:

We were unable to get your stats just now. Please reload this page to try again. If this error persists, please contact support. In your report please include the information below.

User Agent: 'Mozilla/5.0 (X11; Linux x86_64; rv:54.0) Gecko/20100101 Firefox/54.0'
Page URL: 'https://blog.server.tld/wp-admin/admin.php?page=stats&noheader'
API URL: 'https://dashboard.wordpress.com/wp-admin/index.php?noheader=true&proxy&page=stats&blog=XXX&charset=UTF-8&color=fresh&ssl=1&j=1:5.0&main_chart_only'
http_request_failed: 'cURL error 6: Couldn't resolve host 'dashboard.wordpress.com''

The entire Stats block in the Dashboard was empty, and the little graph that shows up in the Admin bar on the site was empty as well.

Other errors noticed:

RSS Error: WP HTTP Error: cURL error 6: Couldn't resolve host 'wordpress.org'
RSS Error: WP HTTP Error: cURL error 6: Couldn't resolve host 'planet.wordpress.org'

These errors were in the WordPress Events and News section, which was also otherwise empty.

This whole thing was ridiculous on its face, as the hosts could all be pinged successfully from said server.

I checked with Jetpack’s support, per the instructions above, and got a non-response of “check with your host.”  Well, this isn’t being run on a hosting service so you’re telling me to ask myself.  Thanks for the help anyway.

Resolution:

The machine in question had just upgraded PHP, but Apache had not been restarted yet. The curl errors don’t make much sense, but since when does anything in PHP make sense?

It was kind of a “duh!” moment when I realized that could be the problem.  Restarting Apache seems to have solved it.

NiFi HTTP Service

I’m attempting to set up an HTTP server in NiFi to accept uploads and process them on-demand.  This gets tricky because I want to submit the files using an existing web application that will not be served from NiFi, which leads to trouble with XSS (Cross-Site Scripting) and setting up CORS (Cross Origin Resource Sharing [1]).

The trouble starts with just trying to PUT or POST a simple file.  The error in Firefox reads:

Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource (Reason: CORS header 'Access-Control-Allow-Origin' missing).

You can serve up the Javascript that actually performs the upload from NiFi and side-step XSS, but you may still run into trouble with CORS.  You’ll have trouble even if NiFi and your other web server live on the same host (using different ports, of course), as they’re considered different origins for the purposes of the browser’s same-origin policy.

[Screenshot: HandleHttpResponse processor configuration.]

To make this work, you’ll need to enable specific headers in the HandleHttpResponse processor.  Neither the need to set some headers, nor the headers that need to be set, are documented by NiFi at this time (so far as I can tell).

  1. Open the configuration of the HandleHttpResponse processor.
  2. Add the following headers and values as properties, but see below for notes regarding the values:
    • Access-Control-Allow-Origin: *
    • Access-Control-Allow-Methods: PUT, POST, GET, OPTIONS
    • Access-Control-Allow-Headers: Accept, Accept-Encoding, Accept-Language, Connection, Content-Length, Content-Type, DNT, Host, Referer, User-Agent, Origin, X-Forwarded-For

You may want to review the value for Access-Control-Allow-Origin, as the wildcard may allow access to unexpected hosts.  If your server is public-facing (why would you do that with NiFi?) then you certainly don’t want a wildcard here.  The wildcard makes configuration much simpler if NiFi is strictly interior-facing, though.

The specific values to set for Access-Control-Allow-Methods depend on what you’re doing.  You’ll probably need OPTIONS for most cases.  I’m serving up static files so I need GET, and I’m receiving uploads that may or may not be chunked, so I need POST and PUT.

The actual headers needed for Access-Control-Allow-Headers are a bit variable.  A wildcard is not an acceptable value here, so you’ll have to list every header you need separately, and there are a bunch of possible headers.  See [3] for an explanation and a fairly comprehensive list.  Our list contains a small subset that covers our basic test cases; your mileage may vary.
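One way to sanity-check the result is to send the browser’s preflight request by hand and inspect the headers that come back; the host, port, and path below are placeholders for wherever your HandleHttpRequest listener actually lives:

$ curl -s -D - -o /dev/null -X OPTIONS \
    -H "Origin: http://app.example.com" \
    -H "Access-Control-Request-Method: POST" \
    http://nifi-host:8081/upload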

You may also want to set up a RouteOnAttribute processor to ignore OPTIONS requests (${http.method:equals('OPTIONS')}), otherwise you might see a bunch of zero-byte files in your flow.

References:

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS

[2] http://stackoverflow.com/questions/24371734/firefox-cors-request-giving-cross-origin-request-blocked-despite-headers

[3] http://stackoverflow.com/questions/13146892/cors-access-control-allow-headers-wildcard-being-ignored