bug#10363: /etc/mtab -> /proc/mounts symlink affects df(1) output for /
j***@jidanni.org
2011-12-25 00:40:42 UTC
The new symlink on Debian,
$ ls -og /etc/mtab
lrwxrwxrwx 1 12 12-23 22:00 /etc/mtab -> /proc/mounts
Has caused
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
rootfs 1071468 287940 729100 29% /
udev 248048 0 248048 0% /dev
tmpfs 50564 372 50192 1% /run
/dev/disk/by-uuid/551e44e1-2cad-42cf-a716-f2e6caf9dc78 1071468 287940 729100 29% /
tmpfs 101128 712 100416 1% /tmp
tmpfs 101128 0 101128 0% /run/shm
/dev/sda6 4270273 3711316 341987 92% /home
/dev/sda7 5341549 4336289 733858 86% /var
/dev/sda8 6406856 3024600 3056800 50% /usr
output to 1) repeat / twice, 2) give the long name for /.
This should be reproducible for anyone who has used standard grub and thus has
$ grep -h UUID /boot/grub/grub.cfg /proc/cmdline
matches. More details in http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=653073 .
Alan Curry
2011-12-26 23:27:05 UTC
Post by j***@jidanni.org
Filesystem 1K-blocks Used Available Use% Mounted on
rootfs 1071468 287940 729100 29% /
/dev/disk/by-uuid/551e44e1-2cad-42cf-a716-f2e6caf9dc78 1071468 287940 729100 29% /
(I'm replying only on the issue of the duplicate mount point. Someone else
can tackle the long ugly name.)

The one with "rootfs" as its device is the initramfs which you automatically
get with all recent kernels. Even if you aren't using an initramfs, there's
an empty one built into the kernel which gets mounted as the first root
filesystem. The real root gets mounted on top of that.

So this is a special case of a general problem with no easy solution: What
should df do when 2 filesystems are mounted at the same location? It can't
easily give correct information for both of them, since the later mount
obscures the earlier mount from view.

If there's a way for df to get the correct information for the lower mount, I
don't know what it would be. If you have a process with a leftover cwd or
open fd in the obscured filesystem, you can use that. But generally you
won't.

But maybe we could do better than reporting incorrectly that the lower mount
has size and usage identical to the upper mount! At least df could print a
warning at the end if it has seen any duplicate entries. Perhaps there is
some way it could figure out which one is on top, and print a bunch of
question marks as the lower mount's statistics.
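The warning suggested above could be driven by a simple duplicate check over the mount list. The following is only a minimal sketch: the function name `duplicate_mount_points` and the (device, mount point) pair representation are assumptions made for illustration, not anything df actually does.

```python
from collections import Counter


def duplicate_mount_points(entries):
    """Return mount points listed more than once, in first-seen order.

    `entries` is a list of (device, mount_point) pairs in the order they
    appear in the mount table (hypothetical representation).
    """
    counts = Counter(mount_point for _device, mount_point in entries)
    duplicates = []
    for _device, mount_point in entries:
        if counts[mount_point] > 1 and mount_point not in duplicates:
            duplicates.append(mount_point)
    return duplicates


# Sample data taken from the df output quoted in this thread.
entries = [
    ("rootfs", "/"),
    ("udev", "/dev"),
    ("/dev/disk/by-uuid/551e44e1-2cad-42cf-a716-f2e6caf9dc78", "/"),
    ("/dev/sda6", "/home"),
]

for mount_point in duplicate_mount_points(entries):
    print(f"warning: more than one filesystem is mounted on {mount_point}")
```

This only detects the duplication; it does not say which of the entries is on top.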

If df is running as root, it might be able to unshare(2) the mount namespace,
unmount the upper level, and then statfs the mount point again to get the
correct results for the lower level. That won't work in all cases (even in a
private namespace you can't unmount the filesystem containing your own cwd)
and it does nothing for you if you're not root, but still... it would be a
cool bonus in the cases where it does work.

As a special case, "rootfs" should probably be excluded from the default
listing, since the initramfs is not very interesting most of the time. It
could still be shown with the -a option, although it would always have the
wrong statistics. Or if you really want to be impressive, default to showing
the initramfs if and only if it is the only thing mounted on "/" - so you can
run df within the initramfs before the real root is mounted and get the right
result.
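The "rootfs" special case described above can be written down as a small filter. This is a hedged sketch, not df's behaviour: `default_listing` and its `show_all` flag (standing in for -a) are hypothetical names, and entries are assumed to be (device, mount point) pairs.

```python
def default_listing(entries, show_all=False):
    """Drop a "rootfs" entry when something else is mounted on the same point.

    With show_all=True (a stand-in for df -a) nothing is filtered.
    A rootfs entry that is the only thing mounted on its mount point is kept,
    so the listing is still useful inside the initramfs.
    """
    if show_all:
        return list(entries)
    listing = []
    for device, mount_point in entries:
        covered = any(
            other_device != "rootfs" and other_point == mount_point
            for other_device, other_point in entries
        )
        if device == "rootfs" and covered:
            continue
        listing.append((device, mount_point))
    return listing


sample = [("rootfs", "/"), ("udev", "/dev"), ("/dev/sda1", "/")]
print(default_listing(sample))
print(default_listing([("rootfs", "/")]))
```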

Or... (brace yourself for the most bold idea yet)... can you imagine a kernel
interface that would *cleanly* give access to obscured mount points?

Comments on any of the above? Do the BSDs have any bright ideas we can steal,
or is their df as embarrassingly bad at handling obscured mount points as
ours?
--
Alan Curry
Olaf Titz
2011-12-29 18:28:51 UTC
Post by Alan Curry
So this is a special case of a general problem with no easy solution: What
should df do when 2 filesystems are mounted at the same location? It can't
easily give correct information for both of them, since the later mount
obscures the earlier mount from view.
It is a special case of an even more general problem, that mtab or
/proc/self/mounts and therefore mount(8), df(1) etc. only represent the
linear path where a filesystem was mounted at the time it was mounted,
not the underlying tree structure.

What happens with the following sequences, assuming / is the only
mounted filesystem:

mkdir /mnt/p1
mount /dev/sde1 /mnt/p1
mkdir /mnt/p1/p2
mount /dev/sdh1 /mnt/p1/p2

versus

mkdir -p /mnt/p1/p2
mount /dev/sdh1 /mnt/p1/p2
mount /dev/sde1 /mnt/p1

not that that would be very useful, but in general it is possible. In
the second case the filesystem on sdh1 is completely invisible, yet
mtab and /proc/mounts in both cases contain something like

/dev/sde1 /mnt/p1 ...
/dev/sdh1 /mnt/p1/p2 ...

only in different order: the last mounted filesystem comes last.

This way df(1) should already be able to just hide any obscured
filesystem: it could make two passes over the mount list, remembering
every mount point and if a later mount point is equal or a parent of
an earlier one (which can be determined by a simple string compare),
mark the earlier one as invisible. Then in the second pass over the
list output the remaining mounts.
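The two-pass idea can be sketched as follows. It assumes, as the paragraph above does, that the order of the list reflects the order of mounting; the names `is_same_or_parent` and `visible_mounts` are made up for this illustration. The "simple string compare" needs one refinement: the parent test has to respect path components, so that /mnt/p1 is not treated as a parent of /mnt/p10.

```python
def is_same_or_parent(candidate, path):
    """True if `candidate` equals `path` or is an ancestor directory of it."""
    if candidate == path:
        return True
    prefix = candidate if candidate.endswith("/") else candidate + "/"
    return path.startswith(prefix)


def visible_mounts(entries):
    """Return the entries not obscured by a later mount.

    `entries` is a list of (device, mount_point) pairs in mount order.
    """
    hidden = set()
    # First pass: a later mount point that equals, or is a parent of, an
    # earlier one makes the earlier entry invisible.
    for later_index, (_device, later_point) in enumerate(entries):
        for earlier_index in range(later_index):
            if is_same_or_parent(later_point, entries[earlier_index][1]):
                hidden.add(earlier_index)
    # Second pass: output what remains.
    return [entry for index, entry in enumerate(entries) if index not in hidden]


first_case = [("/dev/sde1", "/mnt/p1"), ("/dev/sdh1", "/mnt/p1/p2")]
second_case = [("/dev/sdh1", "/mnt/p1/p2"), ("/dev/sde1", "/mnt/p1")]
print(visible_mounts(first_case))
print(visible_mounts(second_case))
```

In the first sequence both filesystems stay visible; in the second only /dev/sde1 survives, because /mnt/p1 was mounted after, and on top of, the directory that contains /mnt/p1/p2.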

Remains the question whether this is correct in all cases and actually
desirable behaviour. I think the latter is true, because df(1) output
is just a snapshot of what the system looks like to a newly created
process, and a newly created process can't access the obscured
filesystems at all. (The fact that /proc/mounts is a symlink to
/proc/self/mounts hints in the same direction.)

If what's really wanted is the status of all mounted filesystems
whether visible or not, I fear this can't be done without kernel help,
because exactly by the "snapshot as seen by a new process" nature you
don't get a handle to statfs() from the obscured parts. They can be
found by looking in /sys/block or /proc/diskstats but there doesn't
seem to be useful info, perhaps just another sysfs file containing the
statfs() output would already suffice.

Or perhaps just propose that one of the three nearly-identical
/proc/self/mount* files get two additional columns with the info df(1)
needs...

Olaf
Goswin von Brederlow
2012-01-18 14:25:05 UTC
Post by Alan Curry
Post by j***@jidanni.org
Filesystem 1K-blocks Used Available Use% Mounted on
rootfs 1071468 287940 729100 29% /
/dev/disk/by-uuid/551e44e1-2cad-42cf-a716-f2e6caf9dc78 1071468 287940 729100 29% /
(I'm replying only on the issue of the duplicate mount point. Someone else
can tackle the long ugly name.)
The one with "rootfs" as its device is the initramfs which you automatically
get with all recent kernels. Even if you aren't using an initramfs, there's
an empty one built into the kernel which gets mounted as the first root
filesystem. The real root gets mounted on top of that.
So this is a special case of a general problem with no easy solution: What
should df do when 2 filesystems are mounted at the same location? It can't
easily give correct information for both of them, since the later mount
obscures the earlier mount from view.
The problem also exists to a larger extent with chroots. There will be
lots of entries from outside the chroot that are inaccessible to a df
running inside the chroot.

What df should do is automatically skip the entries that are obscured or
generally inaccessible. Unfortunately the kernel does not (re)sort the
entries correctly following a mount --move call:

rootfs / rootfs rw 0 0
none /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
none /proc proc rw,nosuid,nodev,noexec,relatime 0 0
none /dev devtmpfs rw,relatime,size=491516k,nr_inodes=122879,mode=755 0 0
none /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
/dev/mapper/s-root / ext3 ro,relatime,errors=remount-ro,data=ordered 0 0
tmpfs /lib/init/rw tmpfs rw,nosuid,relatime,mode=755 0 0
...

Going by that list the /dev/mapper/s-root filesystem obscures the
rootfs, /sys, /proc, /dev and /dev/pts. In reality though only the
rootfs is obscured because the rest was moved prior to the initramfs
switching / around. What the kernel should do is move the relevant
entries so they appear below the filesystem they are moved to (and
before any that do obscure them, moving them to the bottom isn't always
the right solution).

So at the moment it is a bit of a guess which entries are real and which
are obscured. The best you can do is check that each entry is actually a
mountpoint and guess that the last of identical mountpoints is the right
one.
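That guess can be sketched like this. It is only an illustration of the heuristic described above: `guess_real_entries` is a hypothetical name, and the mount point test is passed in as a parameter (defaulting to Python's `os.path.ismount`) so the logic can be tried on sample data without touching the real filesystem.

```python
import os


def guess_real_entries(entries, is_mount_point=os.path.ismount):
    """Keep the last entry for each mount point, if it really is a mount point.

    `entries` is a list of (device, mount_point) pairs in /proc/mounts order.
    """
    last_index = {}
    for index, (_device, mount_point) in enumerate(entries):
        last_index[mount_point] = index  # later duplicates overwrite earlier ones
    return [
        entry
        for index, entry in enumerate(entries)
        if last_index[entry[1]] == index and is_mount_point(entry[1])
    ]


# The /proc/mounts excerpt shown earlier in this message, reduced to two columns.
entries = [
    ("rootfs", "/"),
    ("none", "/sys"),
    ("none", "/proc"),
    ("none", "/dev"),
    ("none", "/dev/pts"),
    ("/dev/mapper/s-root", "/"),
    ("tmpfs", "/lib/init/rw"),
]
```

Unlike an order-based prefix comparison, this keeps /sys, /proc, /dev and /dev/pts and drops only the rootfs entry. It remains a guess: when two identical mount points are listed out of order, the last one is not necessarily the one on top.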
Post by Alan Curry
If there's a way for df to get the correct information for the lower mount, I
don't know what it would be. If you have a process with a leftover cwd or
open fd in the obscured filesystem, you can use that. But generally you
won't.
There afaik isn't and there should not be a way to do so.
Post by Alan Curry
But maybe we could do better than reporting incorrectly that the lower mount
has size and usage identical to the upper mount! At least df could print a
warning at the end if it has seen any duplicate entries. Perhaps there is
some way it could figure out which one is on top, and print a bunch of
question marks as the lower mount's statistics.
Maybe compare the major/minor of the device node with statfs() output.
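A rough sketch of that comparison follows. It makes two assumptions that are not in the message above: it uses stat() on the mount point (whose st_dev carries the device number) rather than statfs(), and the helper names are invented for illustration. It returns None for names such as "rootfs" or "tmpfs" that are not block device nodes, and some filesystems report a device number that does not match their block device, so a mismatch is only a hint.

```python
import os
import stat


def device_numbers_match(node_rdev, mount_dev):
    """Compare the major/minor pair of a device node with a filesystem's st_dev."""
    return (os.major(node_rdev), os.minor(node_rdev)) == (
        os.major(mount_dev),
        os.minor(mount_dev),
    )


def entry_is_on_top(device_path, mount_point):
    """Guess whether `device_path` is the filesystem currently visible at `mount_point`.

    Returns True or False when a comparison is possible, None otherwise.
    """
    try:
        node = os.stat(device_path)
        mounted = os.stat(mount_point)
    except OSError:
        return None  # e.g. "rootfs" or "tmpfs": no device node to inspect
    if not stat.S_ISBLK(node.st_mode):
        return None
    return device_numbers_match(node.st_rdev, mounted.st_dev)
```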
Post by Alan Curry
If df is running as root, it might be able to unshare(2) the mount namespace,
unmount the upper level, and then statfs the mount point again to get the
correct results for the lower level. That won't work in all cases (even in a
private namespace you can't unmount the filesystem containing your own cwd)
and it does nothing for you if you're not root, but still... it would be a
cool bonus in the cases where it does work.
As a special case, "rootfs" should probably be excluded from the default
listing, since the initramfs is not very interesting most of the time. It
could still be shown with the -a option, although it would always have the
wrong statistics. Or if you really want to be impressive, default to showing
the initramfs if and only if it is the only thing mounted on "/" - so you can
run df within the initramfs before the real root is mounted and get the right
result.
What if you only have a rootfs?

Imho the /proc/mounts file should only contain entries visible in the
process's mount namespace. So for normal systems the rootfs shouldn't
appear and in chroots the list should be even shorter.
Post by Alan Curry
Or... (brace yourself for the most bold idea yet)... can you imagine a kernel
interface that would *cleanly* give access to obscured mount points?
I fear that would let too much information escape from/into the mount
namespaces.

But there could be a /proc/global-mounts or something that is only
readable from the root namespace.
Post by Alan Curry
Comments on any of the above? Do the BSDs have any bright ideas we can steal,
or is their df as embarrassingly bad at handling obscured mount points as
ours?
MfG
Goswin
Paul Eggert
2012-01-18 18:12:36 UTC
Post by Goswin von Brederlow
What df should do is automatically skip the entries that are obscured or
generally inaccessible.
Isn't this missing some of the larger context? df is just doing what
lots of other programs do: finding out what file systems one has,
and reporting statistics on them. It sounds suboptimal to require
the maintainers of all these programs (coreutils, nautilus, etc.)
to rewrite their apps to deal with obscured entries. Surely it would
be better to have the kernel ordinarily return just the ordinary entries,
and to return obscured entries only when they are specially requested.
That way, this issue would be isolated to the few bits of code that really
want to see obscured entries.
Goswin von Brederlow
2012-01-19 09:57:22 UTC
Post by Paul Eggert
Post by Goswin von Brederlow
What df should do is automatically skip the entries that are obscured or
generally inaccessible.
Isn't this missing some of the larger context? df is just doing what
lots of other programs do: finding out what file systems one has,
and reporting statistics on them. It sounds suboptimal to require
the maintainers of all these programs (coreutils, nautilus, etc.)
to rewrite their apps to deal with obscured entries. Surely it would
be better to have the kernel ordinarily return just the ordinary entries,
and to return obscured entries only when they are specially requested.
That way, this issue would be isolated to the few bits of code that really
want to see obscured entries.
+1. Kernel knows best anyway.

MfG
Goswin
Henrique de Moraes Holschuh
2012-01-19 11:23:00 UTC
Post by Goswin von Brederlow
Post by Paul Eggert
Post by Goswin von Brederlow
What df should do is automatically skip the entries that are obscured or
generally inaccessible.
Isn't this missing some of the larger context? df is just doing what
lots of other programs do: finding out what file systems one has,
and reporting statistics on them. It sounds suboptimal to require
the maintainers of all these programs (coreutils, nautilus, etc.)
to rewrite their apps to deal with obscured entries. Surely it would
be better to have the kernel ordinarily return just the ordinary entries,
and to return obscured entries only when they are specially requested.
That way, this issue would be isolated to the few bits of code that really
want to see obscured entries.
+1. Kernel knows best anyway.
The kernel has to return all entries that are visible to the current
namespace, otherwise you pretty much cannot know about the existence of
shadowed entries in the first place, and that has all sort of nasty
implications for security and troubleshooting.

The kernel should NOT include entries that are out of reach due to
namespaces or chrooting, but I don't think this is quite correct right now.

If you don't want to show to the user shadowed entries, fix it in the
UI, maybe write a nice LGPL lib and get the various GNU utils to use it
to avoid duplicated effort... or fix it in glibc, if applicable. But
/proc/mounts really has to return complete information.
--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh
Henrique de Moraes Holschuh
2012-01-19 15:29:24 UTC
Post by Henrique de Moraes Holschuh
Post by Goswin von Brederlow
Post by Paul Eggert
Post by Goswin von Brederlow
What df should do is automatically skip the entries that are obscured or
generally inaccessible.
Isn't this missing some of the larger context? df is just doing what
lots of other programs do: finding out what file systems one has,
and reporting statistics on them. It sounds suboptimal to require
the maintainers of all these programs (coreutils, nautilus, etc.)
to rewrite their apps to deal with obscured entries. Surely it would
be better to have the kernel ordinarily return just the ordinary entries,
and to return obscured entries only when they are specially requested.
That way, this issue would be isolated to the few bits of code that really
want to see obscured entries.
+1. Kernel knows best anyway.
The kernel has to return all entries that are visible to the current
namespace, otherwise you pretty much cannot know about the existence of
shadowed entries in the first place, and that has all sort of nasty
implications for security and troubleshooting.
The kernel should NOT include entries that are out of reach due to
namespaces or chrooting, but I don't think this is quite correct right now.
If you don't want to show to the user shadowed entries, fix it in the
UI, maybe write a nice LGPL lib and get the various GNU utils to use it
to avoid duplicated effort... or fix it in glibc, if applicable. But
/proc/mounts really has to return complete information.
Note: there is no reason why the kernel could not return the mount
information with shadowed paths removed in a separate procfs node, as
that would cause no security/troubleshooting problems. I do think
kernel people will tell you to fix that in userspace, though.
Henrique Holschuh
Henrique de Moraes Holschuh
2012-01-19 16:30:06 UTC
Post by Henrique de Moraes Holschuh
Note: there is no reason why the kernel could not return the mount
information with shadowed paths removed in a separate procfs node, as
that would cause no security/troubleshooting problems.
That's what I was thinking of, and it'd be a much better fix,
as it would fix things for all applications.
The current approach expects all app developers to modify
their applications in order to deal with a feature that app
developers typically don't know about and don't understand;
this isn't a good way to introduce a new feature.
On the app side, I will tell you what you're likely to get back from the
crowd on LKML: write a proper BSD/MIT/LGPL library so that people don't
have to reinvent the wheel, and fix it in userspace. It gets worse: such
library interface already exists, in the form of getmntent, setmntent,
addmntent, endmntent, hasmntopt, getmntent_r. So they will tell you to fix
it in glibc.

AFAIK, the kernel is not in any better position to remove shadowed paths
than userspace, both are perfectly capable of doing it. Now, removing
paths that are outside of the current process scope (due to namespaces or
chroot or whatever), THAT is something only the kernel can do correctly...
Henrique Holschuh
Paul Eggert
2012-01-19 22:17:11 UTC
Post by Henrique de Moraes Holschuh
On the app side, I will tell you what you're likely to get back from the
crowd on LKML: write a proper BSD/MIT/LGPL library
This argument would have stronger force if there were real code in
a real application, code that solved the overall problem -- code
that we could read and run. I don't know of any such code.
Post by Henrique de Moraes Holschuh
the kernel is not in any better position to remove shadowed paths
than userspace, both are perfectly capable of doing it.
This seems to contradict an earlier comment made by someone else,
"So at the moment it is a bit of a guess which entries are real and which
are obscured." <http://debbugs.gnu.org/cgi/bugreport.cgi?bug=10363#53>

I don't know who's right, nor do I understand what all the underlying
issues are. I expect most other app developers are in a similar boat.
It's not a good situation to be in.
Goswin von Brederlow
2012-01-20 07:50:55 UTC
Post by Henrique de Moraes Holschuh
Post by Henrique de Moraes Holschuh
Note: there is no reason why the kernel could not return the mount
information with shadowed paths removed in a separate procfs node, as
that would cause no security/troubleshooting problems.
That's what I was thinking of, and it'd be a much better fix,
as it would fix things for all applications.
The current approach expects all app developers to modify
their applications in order to deal with a feature that app
developers typically don't know about and don't understand;
this isn't a good way to introduce a new feature.
On the app side, I will tell you what you're likely to get back from the
crowd on LKML: write a proper BSD/MIT/LGPL library so that people don't
have to reinvent the wheel, and fix it in userspace. It gets worse: such
library interface already exists, in the form of getmntent, setmntent,
addmntent, endmntent, hasmntopt, getmntent_r. So they will tell you to fix
it in glibc.
How do you decide which of two conflicting entries is real? Since mount
--move does not change the order of entries you can not just pick the
last one.

For example which entry is the right one with an output like this:

tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=357828k,mode=755 0 0
tmpfs /run tmpfs rw,suid,exec,relatime,size=357828k,mode=755 0 0

I don't think this can be fixed in userspace alone. At a minimum the
kernel has to keep entries in order of visibility, i.e. the later
entries always shadow earlier entries. Which means that on mount --move
the kernel has to move the entry in /proc/mounts up or down as needed.

MfG
Goswin

PS: I think you can also mount something below an already shadowed entry
(if you have a shell with cwd in the shadowed one) and it would show up
in the wrong spot in /proc/mounts.
Henrique de Moraes Holschuh
2012-01-20 17:28:19 UTC
Post by Goswin von Brederlow
Post by Henrique de Moraes Holschuh
Post by Henrique de Moraes Holschuh
Note: there is no reason why the kernel could not return the mount
information with shadowed paths removed in a separate procfs node, as
that would cause no security/troubleshooting problems.
That's what I was thinking of, and it'd be a much better fix,
as it would fix things for all applications.
The current approach expects all app developers to modify
their applications in order to deal with a feature that app
developers typically don't know about and don't understand;
this isn't a good way to introduce a new feature.
On the app side, I will tell you what you're likely to get back from the
crowd on LKML: write a proper BSD/MIT/LGPL library so that people don't
have to reinvent the wheel, and fix it in userspace. It gets worse: such
library interface already exists, in the form of getmntent, setmntent,
addmntent, endmntent, hasmntopt, getmntent_r. So they will tell you to fix
it in glibc.
How do you decide which of two conflicting entries is real? Since mount
--move does not change the order of entries you can not just pick the
last one.
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=357828k,mode=755 0 0
tmpfs /run tmpfs rw,suid,exec,relatime,size=357828k,mode=755 0 0
I don't think this can be fixed in userspace alone. At a minimum the
kernel has to keep entries in order of visibility, i.e. the later
entries always shadow earlier entries. Which means that on mount --move
the kernel has to move the entry in /proc/mounts up or down as needed.
Yes, it would have to order in that way.
Post by Goswin von Brederlow
PS: I think you can also mount something below an already shadowed entry
(if you have a shell with cwd in the shadowed one) and it would show up
in the wrong spot in /proc/mounts.
I believe that's correct, and should be fixed.
Henrique Holschuh
Paul Eggert
2012-01-19 15:48:37 UTC
Post by Henrique de Moraes Holschuh
Note: there is no reason why the kernel could not return the mount
information with shadowed paths removed in a separate procfs node, as
that would cause no security/troubleshooting problems.
That's what I was thinking of, and it'd be a much better fix,
as it would fix things for all applications.

The current approach expects all app developers to modify
their applications in order to deal with a feature that app
developers typically don't know about and don't understand;
this isn't a good way to introduce a new feature.
Goswin von Brederlow
2012-01-20 07:42:26 UTC
Post by Henrique de Moraes Holschuh
Post by Goswin von Brederlow
Post by Paul Eggert
Post by Goswin von Brederlow
What df should do is automatically skip the entries that are obscured or
generally inaccessible.
Isn't this missing some of the larger context? df is just doing what
lots of other programs do: finding out what file systems one has,
and reporting statistics on them. It sounds suboptimal to require
the maintainers of all these programs (coreutils, nautilus, etc.)
to rewrite their apps to deal with obscured entries. Surely it would
be better to have the kernel ordinarily return just the ordinary entries,
and to return obscured entries only when they are specially requested.
That way, this issue would be isolated to the few bits of code that really
want to see obscured entries.
+1. Kernel knows best anyway.
The kernel has to return all entries that are visible to the current
namespace, otherwise you pretty much cannot know about the existence of
shadowed entries in the first place, and that has all sort of nasty
implications for security and troubleshooting.
The kernel should NOT include entries that are out of reach due to
namespaces or chrooting, but I don't think this is quite correct right now.
If you don't want to show to the user shadowed entries, fix it in the
UI, maybe write a nice LGPL lib and get the various GNU utils to use it
to avoid duplicated effort... or fix it in glibc, if applicable. But
/proc/mounts really has to return complete information.
But isn't the rootfs out of reach because the initramfs "chroots" to the
real root and starts /sbin/init? Back when pivot_root was used that
was combined with an actual call to chroot. Before run-init combined the
two.

I'm not really disagreeing with you, but I would argue that the duplicate rootfs
entry is not visible to the namespace.

Same with later chroots:

***@frosties:~/chroot% sudo chroot . df
df: `/sys': No such file or directory
df: `/dev': No such file or directory
df: `/dev/pts': No such file or directory
df: `/run': No such file or directory
df: `/tmp': No such file or directory
df: `/usr': No such file or directory
df: `/var': No such file or directory
df: `/home': No such file or directory
df: `/var/lib/nfs/rpc_pipefs': No such file or directory
df: `/sys/fs/fuse/connections': No such file or directory
Filesystem 1K-blocks Used Available Use% Mounted on
rootfs 1789128 1808 1787320 1% /
/dev/mapper/r-root 1789128 1808 1787320 1% /
tmpfs 1789128 1808 1787320 1% /

What it should show is only the last entry, the tmpfs the chroot is
on. All other entries are not visible to the processes inside the
chroot.

Note that in a chroot any mountpoints inside the chroot have their
prefix removed (/home/mrvn/chroot becomes /) while others are left as
is. That is wrong too IMHO. The filesystem the chroot's / is on should
become / even if the chroot is a directory instead of a mountpoint and
entries outside the chroot should not be listed at all.

MfG
Goswin
Henrique de Moraes Holschuh
2012-01-20 17:41:28 UTC
Post by Goswin von Brederlow
Post by Henrique de Moraes Holschuh
The kernel has to return all entries that are visible to the current
namespace, otherwise you pretty much cannot know about the existence of
shadowed entries in the first place, and that has all sort of nasty
implications for security and troubleshooting.
The kernel should NOT include entries that are out of reach due to
namespaces or chrooting, but I don't think this is quite correct right now.
...
Post by Goswin von Brederlow
But isn't the rootfs out of reach because the initramfs "chroots" to the
real root and starts /sbin/init? Back when pivot_root was used that
was combined with an actual call to chroot. Before run-init combined the
two.
That's what I meant with "I don't think this is quite correct right
now".
Post by Goswin von Brederlow
I'm not really disagreeing with you, but I would argue that the duplicate rootfs
entry is not visible to the namespace.
I am not sure how /proc/mounts and friends should play with chroot(). I
suppose it depends on whether one can actually reach that path somehow.
If it is forever inaccessible, IMO it is effectively outside the
namespace and I believe it should not be visible. But that's where I
reach the limits of my knowledge, and I can't really argue about it.
Post by Goswin von Brederlow
What it should show is only the last entry, the tmpfs the chroot is
on. All other entries are not visible to the processes inside the
chroot.
I think you're correct in this.
Post by Goswin von Brederlow
Note that in a chroot any mountpoints inside the chroot have their
prefix removed (/home/mrvn/chroot becomes /) while others are left as
is. That is wrong too IMHO. The filesystem the chroot's / is on should
become / even if the chroot is a directory instead of a mountpoint and
entries outside the chroot should not be listed at all.
I also think you're correct here, but as I said, chroot() is tricky, and
I am wary of arguing too much about it without strong knowledge about
the nuances, which I don't have.

Maybe this thread really ought to move to linux-fsdevel or LKML?
Henrique Holschuh
Assaf Gordon
2018-10-15 15:15:38 UTC
tags 10363 fixed
close 10363
stop

(triaging old bugs)

Fix for 'long ugly UUID names' committed in
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=1e18d8416f9ef43bf08982cabe54220587061a08

With no further follow-ups about other issues raised
in this thread in 6 years, I'm closing this bug.

regards,
- assaf

Andreas Schwab
2012-01-20 11:10:23 UTC
Post by Goswin von Brederlow
Note that in a chroot any mountpoints inside the chroot have their
prefix removed (/home/mrvn/chroot becomes /) while others are left as
is. That is wrong too IMHO. The filesystem the chroot's / is on should
become / even if the chroot is a directory instead of a mountpoint and
entries outside the chroot should not be listed at all.
You can get such a view from /proc/self/mountinfo.
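For readers who have not looked at that file, here is a minimal parsing sketch. The field layout follows the description in proc(5): mount ID, parent ID, major:minor, root, mount point, mount options, zero or more optional fields, a "-" separator, then filesystem type, source and superblock options. The function names are hypothetical, and the stacking rule used in `covered_mount_ids` (a mount placed directly on top of another lists the lower one as its parent and shows the same mount point) is an assumption about how the kernel reports such mounts, not something stated in this thread.

```python
def parse_mountinfo_line(line):
    """Split one /proc/self/mountinfo line into named fields."""
    fields = line.split()
    # Optional fields start at index 6 and end at the "-" separator.
    separator = fields.index("-", 6)
    tail = fields[separator + 1:]
    return {
        "mount_id": int(fields[0]),
        "parent_id": int(fields[1]),
        "device": fields[2],
        "root": fields[3],
        "mount_point": fields[4],
        "options": fields[5],
        "optional": fields[6:separator],
        "fstype": tail[0],
        "source": tail[1],
        "super_options": tail[2] if len(tail) > 2 else "",
    }


def covered_mount_ids(mounts):
    """Return IDs of mounts that have another mount stacked on their root."""
    covered = set()
    for lower in mounts:
        for upper in mounts:
            if (
                upper["mount_id"] != lower["mount_id"]
                and upper["parent_id"] == lower["mount_id"]
                and upper["mount_point"] == lower["mount_point"]
            ):
                covered.add(lower["mount_id"])
    return covered


# Example line in the format documented in proc(5).
example = "36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue"
print(parse_mountinfo_line(example)["mount_point"])
```

To read the live table, each line of /proc/self/mountinfo can be passed through `parse_mountinfo_line`; paths containing spaces appear with octal escapes such as \040, which this sketch leaves undecoded.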

Andreas.
--
Andreas Schwab, ***@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Jim Meyering
2011-12-29 14:09:03 UTC
On systems with recent kernel/tools, a symlink from /etc/mtab to
/proc/mounts, and a by-UUID mount (i.e., soon, nearly everyone),
you will see something like the following when running "df -hT":

Filesystem Type Size Used Avail Use% Mounted on
rootfs rootfs 11G 1.9G 8.0G 19% /
udev devtmpfs 3.8G 0 3.8G 0% /dev
tmpfs tmpfs 774M 376K 774M 1% /run
/dev/disk/by-uuid/828fc648-9f30-43d8-a0b1-f7096a2edb66 ext4 11G 1.9G 8.0G 19% /
tmpfs tmpfs 1.6G 8.0K 1.6G 1% /run/shm
/dev/sda2 ext3 494M 78M 392M 17% /boot
/dev/sda5 ext4 12G 7.6G 3.7G 68% /usr
/dev/sda6 ext4 9.9G 6.6G 2.8G 71% /var

Contrast that with what we're used to seeing (modulo the
two entries mounted on "/", which is a separate problem):

Filesystem Type Size Used Avail Use% Mounted on
rootfs rootfs 11G 1.9G 8.0G 19% /
udev devtmpfs 3.8G 0 3.8G 0% /dev
tmpfs tmpfs 774M 376K 774M 1% /run
/dev/sda3 ext4 11G 1.9G 8.0G 19% /
tmpfs tmpfs 1.6G 8.0K 1.6G 1% /run/shm
/dev/sda2 ext3 494M 78M 392M 17% /boot
/dev/sda5 ext4 12G 7.6G 3.7G 68% /usr
/dev/sda6 ext4 9.9G 6.6G 2.8G 71% /var

When that long /dev/disk/by-*/... name is merely a symlink
to a much shorter (and often more useful) device name like
"/dev/sda3", and when it's part of a listing of all file systems,
I would much prefer to see only the latter.

I.e., if I explicitly run
"df -hT /dev/disk/by-uuid/828fc648-9f30-43d8-a0b1-f7096a2edb66",
*then*, it's fine -- and expected -- to print the long name.
It was explicitly given. However, with no non-option argument,
df should print the shorter name. Note that performing this
translation at a lower level (via a change to gnulib's mountlist.c)
would make it impossible to distinguish those two cases.

Here's a patch to do that.
It's a little larger than you might expect because of two factors:
- it modifies each of five get_dev call-points
- to avoid introducing a leak, it allocates space for dev_name earlier

I hesitated to make this change, because it feels like a kludge that's
working around a misfeature that should be fixed elsewhere. I'd rather
that the long root device name not appear in the mount listing
(/proc/mounts) in the first place.

If anyone can assure us that this will soon be "fixed" so that we
don't get such entries in /proc/mounts quite so easily, I'll be
happy to defer or drop the patch.
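
The behaviour described in this message can be illustrated with a short sketch. This is not the coreutils patch: `display_name` and its `explicitly_given` flag are hypothetical, and the rule "resolve only when the name is a symlink and the target is shorter" is an assumption chosen to match the description above.

```python
import os


def display_name(device, explicitly_given=False):
    """Pick the device name to show in a listing.

    A name the user typed on the command line is printed unchanged.
    Otherwise a symlink such as /dev/disk/by-uuid/... is replaced by its
    target when the target is shorter.
    """
    if explicitly_given or not os.path.islink(device):
        return device
    resolved = os.path.realpath(device)
    return resolved if len(resolved) < len(device) else device
```

Keeping the decision in the caller, rather than in the code that reads the mount table, is what lets the two cases (name given explicitly versus full listing) be told apart.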
Pádraig Brady
2011-12-29 16:02:08 UTC
Permalink
Post by Jim Meyering
On systems with recent kernel/tools, a symlink from /etc/mtab to
/proc/mounts, and a by-UUID mount (i.e., soon, nearly everyone),
Filesystem Type Size Used Avail Use% Mounted on
rootfs rootfs 11G 1.9G 8.0G 19% /
/dev/disk/by-uuid/828fc648-9f30-43d8-a0b1-f7096a2edb66 ext4 11G 1.9G 8.0G 19% /
Ouch.

BTW, this is the first mail I've gotten about 10363?

I guess this is the right thing to do.
I.E. have higher level paths in /proc/mounts, allowing
tools that require lower level paths to traverse the links
and pick the appropriate one.

The highest level path needs to be available to do this,
so I suppose this is appropriate in /proc/mounts,
even though it might not be the most appropriate for
human consumption, as can be seen above.

The patch looks good.
I guess "9" is the only questionable bit.
On my Fedora 16 system I have in /proc/mounts:

/dev/mapper/vg_tp1-lv_root ...... /

That's a fairly informative name, whereas the links further
resolve to a fairly generic:

/dev/dm-2

Hmm, I was contemplating using the old wrap limit of 19,
but apart from not handling the above, using a width
is inconsistent. Perhaps it's better to always resolve?
I.E. always print the base device. It seems that one
can work back from this anyway:

$ findmnt --source /dev/dm-2
TARGET SOURCE FSTYPE OPTIONS
/ /dev/mapper/vg_tp1-lv_root ext4 rw,relatime,...

cheers,
Pádraig.
Jim Meyering
2011-12-30 14:00:20 UTC
Permalink
Post by Pádraig Brady
Post by Jim Meyering
On systems with recent kernel/tools, a symlink from /etc/mtab to
/proc/mounts, and a by-UUID mount (i.e., soon, nearly everyone),
Filesystem Type Size Used Avail Use% Mounted on
rootfs rootfs 11G 1.9G 8.0G 19% /
/dev/disk/by-uuid/828fc648-9f30-43d8-a0b1-f7096a2edb66 ext4 11G 1.9G 8.0G 19% /
Ouch.
BTW, this is the first mail I've gotten about 10363?
The first message I received had these headers:

X-Debbugs-Original-Cc: ***@bugs.debian.org, bug-***@gnu.org
From: ***@jidanni.org
References: <***@codelibre.net>
Date: Sun, 25 Dec 2011 08:40:42 +0800
Post by Pádraig Brady
I guess this is the right thing to do.
I.E. have higher level paths in /proc/mounts, allowing
tools that require lower level paths to traverse the links
and pick the appropriate one.
The highest level path needs to be available to do this,
so I suppose this is appropriate in /proc/mounts,
even though it might not be the most appropriate for
human consumption, as can be seen above.
The patch looks good.
I guess "9" is the only questionable bit.
/dev/mapper/vg_tp1-lv_root ...... /
That's a fairly informative name, whereas the links further
Good point.
I've just posted a revision that replaces only /dev/disk/by-uuid/... symlinks.
Post by Pádraig Brady
/dev/dm-2
Hmm, I was contemplating using the old wrap limit of 19,
but apart from not handling the above, using a width
is inconsistent. Perhaps it's better to always resolve?
I'd rather keep this minimal, since it changes the default.
Dereferencing all device symlinks seems like a job for a new option,
assuming there's sufficient justification.
Post by Pádraig Brady
I.E. always print the base device. It seems that one
$ findmnt --source /dev/dm-2
TARGET SOURCE FSTYPE OPTIONS
/ /dev/mapper/vg_tp1-lv_root ext4 rw,relatime,...
Paul Eggert
2011-12-29 16:53:06 UTC
Permalink
+ /* If dev_name is a long-named symlink like
+ /dev/disk/by-uuid/828fc648-9f30-43d8-a0b1-f7196a2edb66 and its
+ canonical name is shorter, use the shorter name. But don't bother
+ checking when DEV_NAME is no longer than e.g., "/dev/sda1" */
+ if (resolve_device_symlink && 9 < orig_len
+ && (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))
+ {
+ if (strlen (resolved_dev) < orig_len)
+ {
+ free (dev_name);
+ dev_name = resolved_dev;
+ }
+ else
+ {
+ free (resolved_dev);
+ }
+ }
I have some qualms about that "is shorter" part;
couldn't that lead to confusing results, on systems
where the canonical name is sometimes a bit shorter and sometimes
a bit longer?

Also, that "9 < orig_len" could also cause confusion.

The flag "resolve_device_symlink" suggests that
the name should always be resolved, at any rate.

In short, how about a simpler approach that always resolves
symlinks? Something like this:

/* If dev_name is a symlink use the resolved name.
On recent GNU/Linux systems we often see a symlink from, e.g.,
/dev/disk/by-uuid/828fc648-9f30-43d8-a0b1-f7196a2edb66
to /dev/sda3 and it's better to output /dev/sda3. */
if (resolve_device_symlink
&& (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))
{
free (dev_name);
dev_name = resolved_dev;
}
Jim Meyering
2011-12-30 13:52:13 UTC
Permalink
Post by Paul Eggert
+ /* If dev_name is a long-named symlink like
+ /dev/disk/by-uuid/828fc648-9f30-43d8-a0b1-f7196a2edb66 and its
+ canonical name is shorter, use the shorter name. But don't bother
+ checking when DEV_NAME is no longer than e.g., "/dev/sda1" */
+ if (resolve_device_symlink && 9 < orig_len
+ && (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))
+ {
+ if (strlen (resolved_dev) < orig_len)
+ {
+ free (dev_name);
+ dev_name = resolved_dev;
+ }
+ else
+ {
+ free (resolved_dev);
+ }
+ }
I have some qualms about that "is shorter" part;
couldn't that lead to confusing results, on systems
where the canonical name is sometimes a bit shorter and sometimes
a bit longer?
Also, that "9 < orig_len" could also cause confusion.
The flag "resolve_device_symlink" suggests that
the name should always be resolved, at any rate.
In short, how about a simpler approach that always resolves
/* If dev_name is a symlink use the resolved name.
On recent GNU/Linux systems we often see a symlink from, e.g.,
/dev/disk/by-uuid/828fc648-9f30-43d8-a0b1-f7196a2edb66
to /dev/sda3 and it's better to output /dev/sda3. */
if (resolve_device_symlink
&& (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))
{
free (dev_name);
dev_name = resolved_dev;
}
Thanks for the suggestion. I've dropped the 9 < ... part, but since this
is (at least planned) to be the default, I want to limit the scope of the
change to those very long by-uuid symlinks, so I've adjusted it like this:

/* On some systems, dev_name is a long-named symlink like
/dev/disk/by-uuid/828fc648-9f30-43d8-a0b1-f7196a2edb66 pointing
to a much shorter and more useful name like /dev/sda1.
When resolve_device_symlink is true and dev_name is a symlink whose
name starts with /dev/disk/by-uuid/ use the resolved name instead. */
if (resolve_device_symlink
&& STRPREFIX (dev_name, "/dev/disk/by-uuid/")
&& (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))
{
free (dev_name);
dev_name = resolved_dev;
}

I considered matching only "/dev/disk/by-", in case some initrd
uses by-id, by-label or by-path, but I doubt that will happen often
enough, so will keep this change very precisely targeted.
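
To make the scope of that prefix test concrete, here is a small standalone sketch. The STRPREFIX definition below is a local stand-in (assumed to be a plain strncmp-based prefix comparison), and is_by_uuid_name is a hypothetical helper that does not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Local stand-in for the STRPREFIX macro used in the patch; assumed
   to be a simple strncmp-based prefix comparison.  */
#define STRPREFIX(a, b) (strncmp (a, b, strlen (b)) == 0)

/* Hypothetical helper: true only when NAME is spelled as a
   /dev/disk/by-uuid/ path.  It inspects the spelling only and does
   not check that NAME exists or is a symlink.  */
static bool
is_by_uuid_name (char const *name)
{
  assert (name != NULL);
  return STRPREFIX (name, "/dev/disk/by-uuid/");
}
```

With this predicate, names such as /dev/disk/by-label/... or /dev/mapper/... are left exactly as /proc/mounts reports them, which is the narrow targeting described above.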

I've also changed the parameter name from resolve_device_symlink to
process_all. I don't particularly like that name either.
Suggestions welcome.
Jim Meyering
2012-01-02 17:34:20 UTC
Permalink
Jim Meyering wrote:
...
+ char *dev_name = xstrdup (disk);
+ char *resolved_dev;
+
+ /* On some systems, dev_name is a long-named symlink like
+ /dev/disk/by-uuid/828fc648-9f30-43d8-a0b1-f7196a2edb66 pointing
+ to a much shorter and more useful name like /dev/sda1.
+ When process_all is true and dev_name is a symlink whose name starts
+ with /dev/disk/by-uuid/ use the resolved name instead. */
+ if (process_all
+ && STRPREFIX (dev_name, "/dev/disk/by-uuid/")
+ && (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))
+ {
+ free (dev_name);
+ dev_name = resolved_dev;
+ }
I noticed that there is a similar problem on any recent
system with an encrypted root partition.
In that case, the kernel is booted with an argument like this

root=/dev/mapper/luks-88888888-8888-8888-8888-888888888888

and that same name appears in /proc/mounts and thus, even with
the proposed patch, in df's "Filesystem" column. The knee-jerk
reaction is to do this:

if (process_all
&& (STRPREFIX (dev_name, "/dev/disk/by-uuid/")
|| STRPREFIX (dev_name, "/dev/mapper/luks-"))
&& (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))

but will the prefix always be spelled that way?

Now, I'm thinking about making this a little more future-proof by
matching the UUID part /[0-9a-fA-F-]{36}$/ instead.
I.e., testing something like this:

if (process_all
&& has_uuid_suffix (dev_name)
&& (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))

using this new function:

static bool
has_uuid_suffix (char const *s)
{
size_t len = strlen (s);
return (36 < len
&& strspn (s + len - 36, "-0123456789abcdefABCDEF") == 36);
}
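
For experimenting with this check outside of df.c, here is a self-contained sketch. has_uuid_suffix is the function as posted, with the headers it needs. has_strict_uuid_suffix is a hypothetical stricter variant, not part of the patch, shown only to illustrate what the loose test does and does not reject.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* The check as posted: S is longer than 36 bytes and its last 36
   bytes are all hex digits or hyphens.  */
static bool
has_uuid_suffix (char const *s)
{
  size_t len = strlen (s);
  return (36 < len
          && strspn (s + len - 36, "-0123456789abcdefABCDEF") == 36);
}

/* Hypothetical stricter variant (NOT in the patch): additionally
   require hyphens at offsets 8, 13, 18 and 23 of the suffix, i.e.
   the usual 8-4-4-4-12 layout, and hex digits everywhere else.  */
static bool
has_strict_uuid_suffix (char const *s)
{
  assert (s != NULL);
  if (!has_uuid_suffix (s))
    return false;

  char const *u = s + strlen (s) - 36;
  for (int i = 0; i < 36; i++)
    {
      bool want_hyphen = (i == 8 || i == 13 || i == 18 || i == 23);
      if (want_hyphen ? u[i] != '-' : !isxdigit ((unsigned char) u[i]))
        return false;
    }
  return true;
}
```

Two properties of the posted check are worth noting. Because the test is 36 < len, a bare 36-byte UUID with no directory prefix does not match. And because hyphens and hex digits are accepted in any position, a name ending in 36 hex digits with no hyphens at all also matches; the stricter variant rejects that case at the cost of a little more code.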
Pádraig Brady
2012-01-02 19:27:20 UTC
Permalink
Post by Jim Meyering
...
+ char *dev_name = xstrdup (disk);
+ char *resolved_dev;
+
+ /* On some systems, dev_name is a long-named symlink like
+ /dev/disk/by-uuid/828fc648-9f30-43d8-a0b1-f7196a2edb66 pointing
+ to a much shorter and more useful name like /dev/sda1.
+ When process_all is true and dev_name is a symlink whose name starts
+ with /dev/disk/by-uuid/ use the resolved name instead. */
+ if (process_all
+ && STRPREFIX (dev_name, "/dev/disk/by-uuid/")
+ && (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))
+ {
+ free (dev_name);
+ dev_name = resolved_dev;
+ }
I noticed that there is a similar problem on any recent
system with an encrypted root partition.
In that case, the kernel is booted with an argument like this
root=/dev/mapper/luks-88888888-8888-8888-8888-888888888888
and that same name appears in /proc/mounts and thus, even with
the proposed patch, in df's "Filesystem" column. The knee-jerk
if (process_all
&& (STRPREFIX (dev_name, "/dev/disk/by-uuid/")
|| STRPREFIX (dev_name, "/dev/mapper/luks-"))
&& (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))
but will the prefix always be spelled that way?
Now, I'm thinking about making this a little more future-proof by
matching the UUID part /[0-9a-fA-F-]{36}$/ instead.
if (process_all
&& has_uuid_suffix (dev_name)
&& (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))
static bool
has_uuid_suffix (char const *s)
{
size_t len = strlen (s);
return (36 < len
&& strspn (s + len - 36, "-0123456789abcdefABCDEF") == 36);
}
Yes that's awkward but warranted.
The logic looks correct.

This is assuming of course that UUIDs are the only "high level" form
presented in /proc/mounts that one doesn't want displayed.
I can't think of anything else worth avoiding at the moment.

cheers,
Pádraig.
Jim Meyering
2012-01-02 21:12:32 UTC
Permalink
Pádraig Brady wrote:
...
Post by Pádraig Brady
Post by Jim Meyering
Now, I'm thinking about making this a little more future-proof by
matching the UUID part /[0-9a-fA-F-]{36}$/ instead.
if (process_all
&& has_uuid_suffix (dev_name)
&& (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))
static bool
has_uuid_suffix (char const *s)
{
size_t len = strlen (s);
return (36 < len
&& strspn (s + len - 36, "-0123456789abcdefABCDEF") == 36);
}
Yes that's awkward but warranted.
The logic looks correct.
This is assuming of course that UUIDs are the only "high level" form
presented in /proc/mounts that one doesn't want displayed.
I can't think of anything else worth avoiding at the moment.
Thanks for the review.
Here's an updated patch.
Changes:
- added comment for new function
- added _GL_ATTRIBUTE_PURE, too
- updated commit log and comments
Jim Meyering
2012-01-03 15:58:11 UTC
Permalink
Post by Jim Meyering
...
Post by Pádraig Brady
Post by Jim Meyering
Now, I'm thinking about making this a little more future-proof by
matching the UUID part /[0-9a-fA-F-]{36}$/ instead.
if (process_all
&& has_uuid_suffix (dev_name)
&& (resolved_dev = canonicalize_filename_mode (dev_name, CAN_EXISTING)))
static bool
has_uuid_suffix (char const *s)
{
size_t len = strlen (s);
return (36 < len
&& strspn (s + len - 36, "-0123456789abcdefABCDEF") == 36);
}
Yes that's awkward but warranted.
The logic looks correct.
This is assuming of course that UUIDs are the only "high level" form
presented in /proc/mounts that one doesn't want displayed.
I can't think of anything else worth avoiding at the moment.
Thanks for the review.
Here's an updated patch.
- added comment for new function
- added _GL_ATTRIBUTE_PURE, too
- updated commit log and comments
Pádraig Brady
2012-01-03 16:23:50 UTC
Permalink
diff --git a/NEWS b/NEWS
+ df, with no non-option argument and recent enough kernel/tools, would
+ print a long UUID-including file system name, pushing second and subsequent
+ columns far to the right. Now, when that long name refers to a symlink,
+ df prints the usually-short referent instead.
I would change this:

s/would print a long UUID-including file system name/
could print long UUID-including file system names/

I usually try to make the first line of the NEWS entry a summary, like:

df avoids long UUID-including file system names, in the default listing.
On recent enough kernel/tools, these long names can be used, pushing second
and subsequent columns far to the right. Now, when a long name refers
to a symlink, and no file systems are specified, df prints the
usually-short referent instead.

cheers,
Pádraig.
Jim Meyering
2012-01-03 16:26:18 UTC
Permalink
Post by Pádraig Brady
diff --git a/NEWS b/NEWS
+ df, with no non-option argument and recent enough kernel/tools, would
+ print a long UUID-including file system name, pushing second and subsequent
+ columns far to the right. Now, when that long name refers to a symlink,
+ df prints the usually-short referent instead.
s/would print a long UUID-including file system name/
could print long UUID-including file system names/
df avoids long UUID-including file system names, in the default listing.
On recent enough kernel/tools, these long names can be used, pushing second
and subsequent columns far to the right. Now, when a long name refers
to a symlink, and no file systems are specified, df prints the
usually-short referent instead.
Thanks. I prefer your wording, too.
Will adjust.
Jim Meyering
2012-01-03 16:36:10 UTC
Permalink
Post by Pádraig Brady
diff --git a/NEWS b/NEWS
+ df, with no non-option argument and recent enough kernel/tools, would
+ print a long UUID-including file system name, pushing second and subsequent
+ columns far to the right. Now, when that long name refers to a symlink,
+ df prints the usually-short referent instead.
s/would print a long UUID-including file system name/
could print long UUID-including file system names/
df avoids long UUID-including file system names, in the default listing.
On recent enough kernel/tools, these long names can be used, pushing second
and subsequent columns far to the right. Now, when a long name refers
to a symlink, and no file systems are specified, df prints the
usually-short referent instead.
Thanks, modulo minor rewording:
Bob Proulx
2011-12-29 17:39:34 UTC
Permalink
Post by Jim Meyering
On systems with recent kernel/tools, a symlink from /etc/mtab to
/proc/mounts, and a by-UUID mount (i.e., soon, nearly everyone),
Very unpleasant output.
Post by Jim Meyering
Here's a patch to do that.
...
I hesitated to make this change, because it feels like a kludge that's
working around a misfeature that should be fixed elsewhere. I'd rather
that the long root device name not appear in the mount listing
(/proc/mounts) in the first place.
If anyone can assure us that this will soon be "fixed" so that we
don't get such entries in /proc/mounts quite so easily, I'll be
happy to defer or drop the patch.
+1 vote for doing something, either this or similar, to address the
issue in the coreutils df. Otherwise there will be an endless stream
of annoyed people knocking down the door of the messenger.

Bob