File modes in C++20
I was looking at some code that sets file permissions (file modes)
by calling chmod(2). The command-line chmod(1) has a bunch of
possibilities for setting permissions, but the underlying system call
needs an int, and the C headers for it are annoying to use.
So I fired up some C++ machinery to “make it nicer to use”.
tl;dr: scroll to the bottom for a compile-time, compact way of writing file permissions so that
chmod(filename, "rwx--x--x"_mode)
does something plausible, with good error messages in case of typos.
The manpage for chmod(1) is fairly extensive. There are two ways to specify a file mode,
either in octal or symbolically. Octal is just a number;
the symbolic form is much more complicated.
On the C API side, the manpage for chmod(2)
shows you need a pathname and a mode, and there are a whole bunch
of symbolic constants to use, like S_IRUSR.
How I think about file modes
It turns out I nearly always think about file modes in octal.
I have somehow internalized things like “755 for executables”
and “666 for crap” and “600 for files in ~/.ssh” but I
don’t really have names for these things. If I think about
it, I can use the symbolic manipulations like ugo+rw for crap.
But I don’t see permissions in this symbolic form, and
the octal form is persnickety in C source, probably because
I don’t expect everyone to know “leading 0 means octal”.
But there is a form in which I see file modes every day:
the output from ls -l, where permissions are shown with
10 characters, the first 10 on this line:
-rw-r--r-- 1 adridg users 0 Apr 30 11:46 example.txt
The first - says something about the type of file and is
- for regular files, d for directories, l for symbolic links,
and there are others, too. That’s not really the mode, though, while the next
9 characters are: each group of three shows r, w, and x
in that order, or a - in each position, indicating
the read, write, or execute bit for the corresponding class
of logins. The first three are for the file’s owner (the user), the next
three for the group, and the last three for others.
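To make the layout concrete, here is a minimal sketch that goes the other way, from a mode_t to the nine characters that ls -l shows. The name to_readable_mode is made up for this illustration, and the sketch ignores the file-type character and the setuid, setgid, and sticky bits.

```cpp
#include <array>
#include <string_view>
#include <sys/types.h>  // mode_t

// Hypothetical helper: render the low 9 permission bits the way ls -l does.
constexpr std::array<char, 10> to_readable_mode(mode_t mode)
{
    std::array<char, 10> text{};  // zero-filled, so text[9] stays NUL
    constexpr std::string_view letters = "rwxrwxrwx";
    for (int position = 0; position < 9; ++position)
    {
        // Position 0 is the owner's read bit (0400), position 8 is others' execute (0001).
        const bool bit_set = (mode & (mode_t{1} << (8 - position))) != 0;
        text[position] = bit_set ? letters[position] : '-';
    }
    return text;
}

static_assert(std::string_view(to_readable_mode(0644).data()) == "rw-r--r--");
static_assert(std::string_view(to_readable_mode(0755).data()) == "rwxr-xr-x");
```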
The C code to call chmod with this mode looks like
chmod("example.txt", 0644);
chmod("example.txt", S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
One is octal-arcane and the other is just arcane and hard-to-read.
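As a sanity check that the two spellings really are the same number, the compiler can compare them. This is a small sketch, assuming a POSIX system where sys/stat.h defines the S_I* constants with their usual values.

```cpp
#include <sys/stat.h>  // S_IRUSR and friends

// rw-r--r--
static_assert((S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH) == 0644);
// rwxr-xr-x
static_assert((S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH) == 0755);
// rw-------
static_assert((S_IRUSR | S_IWUSR) == 0600);
```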
Turning readable modes into numbers
So I thought to myself: I can write a C++ function that
turns the 9 characters of the mode in text form into
an actual mode_t value (that’s an integer type).
That’s a trivial exercise, really, although it gets a bit persnickety in error handling. I would just throw on an incorrect length, or any incorrect character, and leave it at that.
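A minimal sketch of that run-time version might look like this. The name mode_from_text and the choice of std::invalid_argument are assumptions for illustration. The function is marked constexpr only so that the example can check itself with static_assert; it behaves the same when called at run time.

```cpp
#include <cstddef>      // std::size_t
#include <stdexcept>    // std::invalid_argument
#include <string_view>
#include <sys/types.h>  // mode_t

// Hypothetical run-time parser: "rw-r--r--" becomes 0644.
constexpr mode_t mode_from_text(std::string_view text)
{
    if (text.size() != 9)
    {
        throw std::invalid_argument("file mode must be exactly 9 characters");
    }
    constexpr std::string_view expected = "rwxrwxrwx";
    mode_t mode = 0;
    for (std::size_t i = 0; i < 9; ++i)
    {
        if (text[i] == expected[i])
        {
            mode |= mode_t{1} << (8 - i);  // leftmost character is the highest bit
        }
        else if (text[i] != '-')
        {
            throw std::invalid_argument("unexpected character in file mode");
        }
    }
    return mode;
}

static_assert(mode_from_text("rw-r--r--") == 0644);
```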
From there, though, I went on to: I can write a consteval C++
function that does the computation at compile-time (guaranteed
to be only at compile-time, because consteval). This is a function
that can only be called with a compile-time string literal,
so the function signature is potentially:
consteval mode_t from_readable_mode(const char (&permission_string)[10])
The array-reference is to 10 characters because there are 9 mode characters,
and then a trailing NUL byte (zero) in a string literal.
This function can be called (in the context, still, of chmod()) like so:
chmod("example.txt", from_readable_mode("rw-r--r--"));
and at compile-time that is already computed to be 420 (that’s 0644 octal).
The last step of make-it-cool (for an appropriate value of “cool”) is to turn the whole thing into a user-defined literal. In my source code I can now write
chmod("example.txt", "rw-r--r--"_mode);
Which really satisfies my own desire for “readable, compact, compile-time”.
Don’t get me started on Qt
For some reason, the values of QFileDevice::Permission are written in hexadecimal,
but the hex digits look like the corresponding octal values.
So in Qt code, if you don’t use the symbolic representation of the
flags and just ram an integer in there, you get
0x644 meaning the same as 0644 in the call to chmod()
and everything becomes just that much more confusing and fraught.
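To spell out the confusion, here is a sketch with a hypothetical helper, hex_digits_as_octal. It is not part of Qt, and the sketch does not use Qt at all. It assumes the layout where user, group, and other permissions sit in the hex digits 0x0700, 0x0070, and 0x0007, and re-reads those three digits as octal digits.

```cpp
// Hypothetical helper: take the three low hex digits and treat them as octal digits.
constexpr unsigned hex_digits_as_octal(unsigned hex_style)
{
    const unsigned user  = (hex_style >> 8) & 0x7;
    const unsigned group = (hex_style >> 4) & 0x7;
    const unsigned other = hex_style & 0x7;
    return (user << 6) | (group << 3) | other;
}

// Same digits, same meaning, different number.
static_assert(hex_digits_as_octal(0x644) == 0644);
// Passing the hex value straight to chmod() would mean something else entirely.
static_assert(0x644 == 03104);
```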
Meaningful error messages
C++ and “helpful, friendly, easy-to-read error messages” go together like peaches and .. battery acid? Like things that don’t go well together at all. In recent GCC versions there has been a marked improvement in the look of error messages.
With some judicious use of templates and naming of types, existing error messages can be improved. You still get a veritable torrent of messages, but if, somewhere in the middle, the error message (here, from Clang 17) says:
mode.h:26:9: note: subexpression not valid in a constant expression
26 | throw invalid_permission_character{};
mode.h:41:9: note: in call to 'expected_character_at_position(&"rwxbadbug"[0])'
41 | detail::expected_character_at_position<3, 'r'>(s) |
Then it’s a lot easier to decide that the character at position 3 (the letter b)
is not expected, and maybe an r was expected there instead.
User-defined file mode literals
Here is the definition of my _mode literal. String-based
literals get a pointer and a size, which is inconvenient
because they don’t turn back into fixed-size arrays.
consteval mode_t operator""_mode(const char *s, size_t len)
{
    // Exactly 9 permission characters; len does not count the trailing NUL.
    if (len != 9)
    {
        throw detail::invalid_permission_string_length{};
    }
    return detail::from_readable_mode_string(s);
}
Anyway, if the string is the wrong size then
an exception with a meaningful name is thrown. Throwing isn’t allowed
during constant evaluation, so compilation fails and the name shows up
in the error message:
mode.h:65:9: note: subexpression not valid in a constant expression
65 | throw detail::invalid_permission_string_length{};
main.cc:44:24: note: in call to 'operator""_mode(&"birb"[0], 4)'
44 | std::cout << "birb"_mode;
Here is the implementation of the function that does
the actual work, turning the string into a mode_t:
consteval mode_t from_readable_mode_string(const char * const s)
{
    return
        detail::expected_character_at_position<0, 'r'>(s) |
        detail::expected_character_at_position<1, 'w'>(s) |
        detail::expected_character_at_position<2, 'x'>(s) |
        detail::expected_character_at_position<3, 'r'>(s) |
        detail::expected_character_at_position<4, 'w'>(s) |
        detail::expected_character_at_position<5, 'x'>(s) |
        detail::expected_character_at_position<6, 'r'>(s) |
        detail::expected_character_at_position<7, 'w'>(s) |
        detail::expected_character_at_position<8, 'x'>(s);
}
It’s really wordy, which is unfortunate, but by writing it like this, the error message – at least with Clang 17 – repeats the template parameters and mentions the specific subexpression that is problematic.
Here is a Clang 17 error message when using an inappropriate permission character:
mode.h:26:9: note: subexpression not valid in a constant expression
26 | throw invalid_permission_character{};
mode.h:44:9: note: in call to 'expected_character_at_position(&"------uwu"[0])'
44 | detail::expected_character_at_position<6, 'r'>(s) |
mode.h:67:12: note: in call to 'from_readable_mode_string(&"------uwu"[0])'
67 | return detail::from_readable_mode_string(s);
Huh, I guess you can’t give uWu permission to others. The same error message from GCC 13 looks similar:
main.cc:44:18: in 'constexpr' expansion of 'operator""_mode(((const char*)"------uwu"), 9)'
mode.h:67:45: in 'constexpr' expansion of 'detail::from_readable_mode_string(s)'
mode.h:44:55: in 'constexpr' expansion of 'detail::expected_character_at_position<6, 'r'>(((const char*)s))'
mode.h:26:9: error: expression '<throw-expression>' is not a constant expression
26 | throw invalid_permission_character{};
And here’s the implementation that turns a single character into a bit in a file mode value:
template<int position, char accept>
consteval mode_t expected_character_at_position(const char * const permission_string)
{
    const char c = permission_string[position];
    // Position 0 maps to the highest of the 9 bits (0400), position 8 to the lowest (0001).
    if(c == accept) { return 1 << (8-position); }
    if(c == '-') { return 0; }
    throw invalid_permission_character{};
}
This is a bit wordy, but it ensures that the
position and the acceptable (expected) char are front-and-center
in error messages, and that the expected character and -
are the only characters for which this is a constant
expression. Everything else will fail because
throwing an exception isn’t allowed in a constant expression.
So there you have it, a compact constexpr representation of file
modes with meaningful error messages.
You can find the code in my personal GitHub repository playground,
in the subdirectory mode/. I won’t link it here because, frankly, it
is time I get my ass in gear and migrate to some other forge.