-
Notifications
You must be signed in to change notification settings - Fork 535
Lsof improvements (show deleted as in lsof output) + files_only argument #1827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this, a couple of things fell out of it. Firstly, messing with the core linux extensions can have knock on effects, which is why we version everything. There is a way of updating it so that not everything breaks, but that one's a show stopper.
The other is just about appending to filenames to represent information about the file. If we're going to do that, it has to be unambiguously separable from the filename (ie, you could always tell what was the tag and what was the filename).
@@ -204,6 +207,9 @@ def do_get_path(cls, rdentry, rmnt, dentry, vfsmnt) -> Union[None, str]: | |||
# path would be /foo/bar/baz, but bar is missing due to smear the results | |||
# returned here will show /foo//baz. Note the // for the missing dname. | |||
return f"<potentially smeared> {path}" | |||
|
|||
if inode and inode.is_readable() and inode.is_valid() and inode.i_nlink == 0: | |||
path = f"{LinuxUtilities.deleted} {path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tend to put constants into the constants.linux
module, as all caps? Again, I'm not sure it's worthwhile for a tag like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if the tag value will change in the future? I can see other modules rely on this tag to check specifics...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if it fits there, I have only seen constants from linux kernel header files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really not keen on tacking flags at the start of the path. It feels more and more like there should be a structure or a method that can determine this, that's separate to the filename. I don't really like it for smear, but at least the filename can be smeared. It's not the filename that's deleted, it's the file, this should be a property of the file object, or something you can pass the file to which tells you whether it's been deleted or not. That then also solves the need for string based flags to say meta things about the file that shouldn't actually be part of the filename.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about the logic in 8f60299?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? It is returned as a tuple now of (path, is_deleted) that can then be added to the path exactly like GNU utilities does (e.g lsof), also deleted flag is not really part of FD or INODE...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "(deleted)" part does need to be part of the file name, and to match Linux systems it needs to be after the name. You can grep across the maps
file of processes to see how its displayed (shown last in this comment). The addition of "(deleted)" comes directly from the kernel in d_path
, which is the function we replicate and is the one that generates file paths. It is 100% a bug if we don't show "(deleted)" on these file names, this is the kernel line of code in question:
https://github.com/torvalds/linux/blob/d7b8f8e20813f0179d8ef519541a3527e7661d3a/fs/d_path.c#L287
$ sudo grep delete /proc/*/maps | head -10
/proc/1198626/maps:7fa0ef35c000-7fa2fc600000 rw-s 00000000 00:01 8154 /dev/zero (deleted)
/proc/1198626/maps:7fa30085c000-7fa30085d000 rw-s 00000000 00:01 7 /SYSV043c0003 (deleted)
/proc/1242475/maps:7fa0ef35c000-7fa2fc600000 rw-s 00000000 00:01 8154 /dev/zero (deleted)
/proc/1242475/maps:7fa30085c000-7fa30085d000 rw-s 00000000 00:01 7 /SYSV043c0003 (deleted)
/proc/1242476/maps:7fa0ef35c000-7fa2fc600000 rw-s 00000000 00:01 8154 /dev/zero (deleted)
/proc/1242476/maps:7fa30085c000-7fa30085d000 rw-s 00000000 00:01 7 /SYSV043c0003 (deleted)
/proc/1246761/maps:7fa0ef35c000-7fa2fc600000 rw-s 00000000 00:01 8154 /dev/zero (deleted)
/proc/1246761/maps:7fa30085c000-7fa30085d000 rw-s 00000000 00:01 7 /SYSV043c0003 (deleted)
/proc/1246762/maps:7fa0ef35c000-7fa2fc600000 rw-s 00000000 00:01 8154 /dev/zero (deleted)
/proc/1246762/maps:7fa30085c000-7fa30085d000 rw-s 00000000 00:01 7 /SYSV043c0003 (deleted)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a follow up to my own comment, we already show (deleted)
correctly, which was fixed in the parity push, in our d_path
replication, so whatever new PRs replace/enhance the existing code need to keep it. Link here:
return "/" + pre_name + " (deleted)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so my bad, apparently this is how the kernel does it (and it does it at the end). Sorry for messing you about @SolitudePy . I do still need @atcuno to do a review of it (because clearly I don't know linux that deeply), but he's very busy in the run up to blackhat/defcon so this may have to just wait a while I'm afraid. In the interim, feel free to move the string back to the end of the filename, sorry for the confusion I caused! 5:S
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well I absolutely agree that it shouldnt be in the end for the problems you have mentioned, but thats convention. were there other issues I should fix except that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much improved, last little bits and then it should be good to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not happy with tagging filesnames with a string to identify them. It feels like a quick solution that could be setting us up for a whole heap of trouble in the future. Hopefully @atcuno can give this a look and suggest a better means of handling this?
@@ -204,6 +207,9 @@ def do_get_path(cls, rdentry, rmnt, dentry, vfsmnt) -> Union[None, str]: | |||
# path would be /foo/bar/baz, but bar is missing due to smear the results | |||
# returned here will show /foo//baz. Note the // for the missing dname. | |||
return f"<potentially smeared> {path}" | |||
|
|||
if inode and inode.is_readable() and inode.is_valid() and inode.i_nlink == 0: | |||
path = f"{LinuxUtilities.deleted} {path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really not keen on tacking flags at the start of the path. It feels more and more like there should be a structure or a method that can determine this, that's separate to the filename. I don't really like it for smear, but at least the filename can be smeared. It's not the filename that's deleted, it's the file, this should be a property of the file object, or something you can pass the file to which tells you whether it's been deleted or not. That then also solves the need for string based flags to say meta things about the file that shouldn't actually be part of the filename.
@@ -167,23 +174,28 @@ def list_fds( | |||
linuxutils_symbol_table = task.vol.type_name.split(constants.BANG)[0] | |||
|
|||
fd_generator = linux.LinuxUtilities.files_descriptors_for_process( | |||
context, linuxutils_symbol_table, task | |||
context, linuxutils_symbol_table, task, files_only=include_files_only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure this can be determined from the file desriptor, and therefore doesn't require tampering with the files_descriptors_for_process
method? I really can't believe that we get this structure back, but we can only separate out files from not files within that method? I'd also prefer that we just add it to the file descriptor to say "type is file" rather than this overly specific additional parameter added to this method...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean the method files_descriptors_for_process
should return file type as well that will then be used in linux.lsof
to filter out non-files? right now linux.lsof
uses files_descriptors_for_process
directly, which uses LinuxUtilities.path_for_file
- and that's where the filtering happens via dname_is_valid
, so both methods need to have this argument. or I can also reuse the dname_is_valid
logic in linux.lsof
which seems less elegant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant it feels like it should be one of the fd_fields
returned in the generator and then handled by FDInternal
. It might be a lot of work to thread it in that way, and I haven't even considered the version bumps that might be required to achieve it, plus I've no idea whether @atcuno would be happy doing that, but there's no indication that there aren't more fields (requiring more boolean flags to be added to the method) still to come. As such I'd like to find an extensible way of adding those flags, and given the entire method is about returning a set of flags, it feels as though this should be added to that existing set of flags...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, my view on these hasn't really changed. Rather than designing the data flow properly, we're trying to carry the information by cramming it into other fields which is then extremely difficult to separate at a later date. It's the wrong way of doing it, and though I don't know how much work the right way would be, this is not it.
@@ -167,23 +174,28 @@ def list_fds( | |||
linuxutils_symbol_table = task.vol.type_name.split(constants.BANG)[0] | |||
|
|||
fd_generator = linux.LinuxUtilities.files_descriptors_for_process( | |||
context, linuxutils_symbol_table, task | |||
context, linuxutils_symbol_table, task, files_only=include_files_only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant it feels like it should be one of the fd_fields
returned in the generator and then handled by FDInternal
. It might be a lot of work to thread it in that way, and I haven't even considered the version bumps that might be required to achieve it, plus I've no idea whether @atcuno would be happy doing that, but there's no indication that there aren't more fields (requiring more boolean flags to be added to the method) still to come. As such I'd like to find an extensible way of adding those flags, and given the entire method is about returning a set of flags, it feels as though this should be added to that existing set of flags...
@@ -204,6 +207,9 @@ def do_get_path(cls, rdentry, rmnt, dentry, vfsmnt) -> Union[None, str]: | |||
# path would be /foo/bar/baz, but bar is missing due to smear the results | |||
# returned here will show /foo//baz. Note the // for the missing dname. | |||
return f"<potentially smeared> {path}" | |||
|
|||
if inode and inode.is_readable() and inode.is_valid() and inode.i_nlink == 0: | |||
path = f"{LinuxUtilities.deleted} {path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's still fundamentally trying to cram to pieces of information into a single field. Whether the file has been deleted or not is distinct from the filename it has. We should be using a separate piece of information to track that, not overloading the filename field with two pieces of information that we'll later have difficult separating.
Improvements for Lsof to show deleted files as in lsof output on live system.
its probably shouldnt be in that function directly as its no longer mimic
prepend_path
kernel functionthis PR breaks commit 68b51e8 but it can easily be fixed if desired.