Lsof improvements (show deleted as in lsof output) + files_only argument #1827

Open: wants to merge 12 commits into base: develop

SolitudePy
Contributor

Improvements for Lsof to show deleted files as in lsof output on a live system.
It probably shouldn't be in that function directly, as it no longer mimics the prepend_path kernel function.
This PR breaks commit 68b51e8, but that can easily be fixed if desired.

(venv) ubuntu@ubuntuPC:~/Dev/volatility3$ vol -f ~/dumps/deleted_proc_fd_dump.raw -r json linux.lsof --pid 8321
Volatility 3 Framework 2.26.2
/home/ubuntu/Dev/volatility3/volatility3/framework/deprecation.py:105: FutureWarning: This plugin (PluginRequirement) has been renamed and will be removed in the first release after 2026-06-01. PluginRequirement is to be deprecated. Use VersionRequirement instead.
  warnings.warn(
Progress:  100.00               Stacking attempts finished           
[
...
...
  {
    "Accessed": "2025-06-04T18:18:04.438000+00:00",
    "Changed": "2025-06-04T18:15:11.438000+00:00",
    "Device": "0:23",
    "FD": 2,
    "Inode": 4,
    "Mode": "crw--w----",
    "Modified": "2025-06-04T18:18:04.438000+00:00",
    "PID": 8321,
    "Path": "/dev/pts/1",
    "Process": "copied_bash",
    "Size": 0,
    "TID": 8321,
    "Type": "CHR",
    "__children": []
  },
  {
    "Accessed": "2025-06-04T17:49:01.296000+00:00",
    "Changed": "2025-06-04T18:17:41.693000+00:00",
    "Device": "253:0",
    "FD": 255,
    "Inode": 201866930,
    "Mode": "-rw-r--r--",
    "Modified": "2025-06-04T17:48:56.399000+00:00",
    "PID": 8321,
    "Path": "/tmp/evil.sh (deleted)",
    "Process": "copied_bash",
    "Size": 19,
    "TID": 8321,
    "Type": "REG",
    "__children": []
  }
]
(venv) ubuntu@ubuntuPC:~/Dev/volatility3$ vol -f ~/dumps/deleted_proc_fd_dump.raw linux.lsof --files-only | grep deleted
/home/ubuntu/Dev/volatility3/volatility3/framework/deprecation.py:105: FutureWarning: This plugin (PluginRequirement) has been renamed and will be removed in the first release after 2026-06-01. PluginRequirement is to be deprecated. Use VersionRequirement instead.
  warnings.warn(
799     799     systemd-udevd   8       /var/lib/sss/mc/group (deleted) 253:0   201339099       REG     -rw-rw-r--      2025-06-04 17:46:02.113000 UTC  2025-06-04 17:46:02.113000 UTC   2025-06-04 17:46:01.736000 UTC  6940392
799     799     systemd-udevd   9       /var/lib/sss/mc/passwd (deleted)        253:0   201866915       REG     -rw-rw-r--      2025-06-04 17:46:02.109000 UTC  2025-06-04 17:46:02.109000 UTC   2025-06-04 17:46:01.724000 UTC  9253600
906     906     auditd  4       /var/lib/sss/mc/group (deleted) 253:0   201339099       REG     -rw-rw-r--      2025-06-04 17:46:02.113000 UTC  2025-06-04 17:46:02.113000 UTC  2025-06-04 17:46:01.736000 UTC   6940392
906     907     auditd  4       /var/lib/sss/mc/group (deleted) 253:0   201339099       REG     -rw-rw-r--      2025-06-04 17:46:02.113000 UTC  2025-06-04 17:46:02.113000 UTC  2025-06-04 17:46:01.736000 UTC   6940392
...
960     1218    gmain   9       / (deleted)     0:1     26770   REG     -rwxrwxrwx      2025-06-04 17:46:02.397000 UTC  2025-06-04 17:46:02.397000 UTC  2025-06-04 17:46:02.397000 UTC   4096
961     961     sssd_nss        6       /var/lib/sss/mc/passwd (deleted)        253:0   201866915       REG     -rw-rw-r--      2025-06-04 17:46:02.109000 UTC  2025-06-04 17:46:02.109000 UTC   2025-06-04 17:46:01.724000 UTC  9253600
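
The same filtering can be done on the JSON renderer output instead of with grep. This is a minimal sketch that assumes rows shaped like the `-r json` output above; `deleted_rows` is a hypothetical helper, not part of Volatility.

```python
import json

SUFFIX = " (deleted)"


def deleted_rows(rows):
    """Return the rows whose Path ends with the ' (deleted)' suffix."""
    return [row for row in rows if str(row.get("Path", "")).endswith(SUFFIX)]


# Two rows trimmed from the JSON output shown above.
sample = json.loads("""
[
  {"FD": 2, "PID": 8321, "Path": "/dev/pts/1", "Type": "CHR"},
  {"FD": 255, "PID": 8321, "Path": "/tmp/evil.sh (deleted)", "Type": "REG"}
]
""")

for row in deleted_rows(sample):
    print(row["PID"], row["FD"], row["Path"])
```

Matching on the suffix is only as reliable as the suffix itself, which is the concern raised in the review.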

Member

@ikelos ikelos left a comment

Thanks for this, a couple of things fell out of it. Firstly, messing with the core Linux extensions can have knock-on effects, which is why we version everything. There is a way of updating it so that not everything breaks, but as it stands that one's a show-stopper.

The other is just about appending to filenames to represent information about the file. If we're going to do that, it has to be unambiguously separable from the filename (ie, you could always tell what was the tag and what was the filename).
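
To make the separability concern concrete, here is a small sketch. `tag_path` is a hypothetical stand-in for any code that appends a tag to a path string; it is not the PR's implementation.

```python
def tag_path(path: str, is_deleted: bool) -> str:
    """Append a ' (deleted)' tag to the path string when the file is deleted."""
    return f"{path} (deleted)" if is_deleted else path


# A deleted file called "evil.sh" ...
deleted_file = tag_path("/tmp/evil.sh", is_deleted=True)
# ... and a live file whose name literally ends in " (deleted)".
live_file = tag_path("/tmp/evil.sh (deleted)", is_deleted=False)

# Both produce the same string, so the tag cannot be recovered from it.
print(deleted_file == live_file)
```

Once the two cases collapse into one string, no amount of parsing downstream can tell them apart.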

@SolitudePy SolitudePy requested a review from ikelos June 11, 2025 16:50
@@ -204,6 +207,9 @@ def do_get_path(cls, rdentry, rmnt, dentry, vfsmnt) -> Union[None, str]:
             # path would be /foo/bar/baz, but bar is missing due to smear the results
             # returned here will show /foo//baz. Note the // for the missing dname.
             return f"<potentially smeared> {path}"

+        if inode and inode.is_readable() and inode.is_valid() and inode.i_nlink == 0:
+            path = f"{LinuxUtilities.deleted} {path}"
Member

We tend to put constants into the constants.linux module, as all caps? Again, I'm not sure it's worthwhile for a tag like this?

Contributor Author

What if the tag value changes in the future? I can see other modules relying on this tag to check specifics...

Contributor Author

Not sure if it fits there; I have only seen constants from Linux kernel header files in that module.

Member

I'm really not keen on tacking flags at the start of the path. It feels more and more like there should be a structure or a method that can determine this, that's separate to the filename. I don't really like it for smear, but at least the filename can be smeared. It's not the filename that's deleted, it's the file, this should be a property of the file object, or something you can pass the file to which tells you whether it's been deleted or not. That then also solves the need for string based flags to say meta things about the file that shouldn't actually be part of the filename.
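
One way to picture "a property of the file object" is sketched below. `OpenFile` and `render` are hypothetical names, and the `nlink == 0` test simply mirrors the `i_nlink == 0` check in the diff above; whether that is the right test is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenFile:
    """Hypothetical record that keeps the path and the link count apart."""

    path: str
    nlink: int

    @property
    def is_deleted(self) -> bool:
        # Assumption: a link count of zero means the file was deleted,
        # as in the i_nlink check from the diff.
        return self.nlink == 0


def render(f: OpenFile) -> str:
    """Build the display string only at output time."""
    return f"{f.path} (deleted)" if f.is_deleted else f.path


evil = OpenFile(path="/tmp/evil.sh", nlink=0)
print(evil.path, evil.is_deleted, render(evil))
```

The stored path stays untouched, and consumers that need the flag read `is_deleted` rather than parsing a string.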

Contributor Author

How about the logic in 8f60299?

Contributor Author

Why? It is now returned as a tuple of (path, is_deleted), which can then be appended to the path exactly as utilities such as lsof do. Also, the deleted flag is not really part of the FD or the inode...

Contributor

The "(deleted)" part does need to be part of the file name, and to match Linux systems it needs to come after the name. You can grep across the maps files of processes to see how it's displayed (shown last in this comment). The addition of "(deleted)" comes directly from the kernel's d_path, which is the function we replicate and the one that generates file paths. It is 100% a bug if we don't show "(deleted)" on these file names. This is the kernel line of code in question:

https://github.com/torvalds/linux/blob/d7b8f8e20813f0179d8ef519541a3527e7661d3a/fs/d_path.c#L287

$ sudo grep delete /proc/*/maps | head -10
/proc/1198626/maps:7fa0ef35c000-7fa2fc600000 rw-s 00000000 00:01 8154                       /dev/zero (deleted)
/proc/1198626/maps:7fa30085c000-7fa30085d000 rw-s 00000000 00:01 7                          /SYSV043c0003 (deleted)
/proc/1242475/maps:7fa0ef35c000-7fa2fc600000 rw-s 00000000 00:01 8154                       /dev/zero (deleted)
/proc/1242475/maps:7fa30085c000-7fa30085d000 rw-s 00000000 00:01 7                          /SYSV043c0003 (deleted)
/proc/1242476/maps:7fa0ef35c000-7fa2fc600000 rw-s 00000000 00:01 8154                       /dev/zero (deleted)
/proc/1242476/maps:7fa30085c000-7fa30085d000 rw-s 00000000 00:01 7                          /SYSV043c0003 (deleted)
/proc/1246761/maps:7fa0ef35c000-7fa2fc600000 rw-s 00000000 00:01 8154                       /dev/zero (deleted)
/proc/1246761/maps:7fa30085c000-7fa30085d000 rw-s 00000000 00:01 7                          /SYSV043c0003 (deleted)
/proc/1246762/maps:7fa0ef35c000-7fa2fc600000 rw-s 00000000 00:01 8154                       /dev/zero (deleted)
/proc/1246762/maps:7fa30085c000-7fa30085d000 rw-s 00000000 00:01 7                          /SYSV043c0003 (deleted)
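
For anyone consuming lines like these programmatically, a rough sketch of pulling out the pathname and checking the suffix follows. `parse_maps_line` is a hypothetical helper, and it assumes the six whitespace-separated fields visible in the grep output above.

```python
SUFFIX = " (deleted)"


def parse_maps_line(line: str):
    """Return (pathname, has_deleted_suffix) for one maps line."""
    # Split into at most six fields so spaces inside the pathname survive.
    fields = line.split(None, 5)
    pathname = fields[5].strip() if len(fields) == 6 else ""
    return pathname, pathname.endswith(SUFFIX)


line = (
    "/proc/1198626/maps:7fa0ef35c000-7fa2fc600000 rw-s 00000000 00:01 8154"
    "                       /dev/zero (deleted)"
)
print(parse_maps_line(line))
```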

Contributor

As a follow-up to my own comment: our d_path replication already shows (deleted) correctly (this was fixed in the parity push), so any new PR that replaces or enhances the existing code needs to keep it. The line in question:

return "/" + pre_name + " (deleted)"

Member

OK, so my bad, apparently this is how the kernel does it (and it does it at the end). Sorry for messing you about, @SolitudePy. I do still need @atcuno to do a review of it (because clearly I don't know Linux that deeply), but he's very busy in the run-up to Black Hat/DEF CON, so this may have to wait a while, I'm afraid. In the interim, feel free to move the string back to the end of the filename. Sorry for the confusion I caused! :S

Contributor Author

Well, I absolutely agree that it shouldn't be at the end, for the reasons you mentioned, but that's the convention. Were there other issues I should fix besides that?

Member

@ikelos ikelos left a comment

Much improved, last little bits and then it should be good to go.

@SolitudePy SolitudePy requested a review from ikelos June 12, 2025 18:28
Member

@ikelos ikelos left a comment

Still not happy with tagging filenames with a string to identify them. It feels like a quick solution that could be setting us up for a whole heap of trouble in the future. Hopefully @atcuno can give this a look and suggest a better means of handling this?

@@ -167,23 +174,28 @@ def list_fds(
         linuxutils_symbol_table = task.vol.type_name.split(constants.BANG)[0]

         fd_generator = linux.LinuxUtilities.files_descriptors_for_process(
-            context, linuxutils_symbol_table, task
+            context, linuxutils_symbol_table, task, files_only=include_files_only
Member

Surely this can be determined from the file descriptor, and therefore doesn't require tampering with the files_descriptors_for_process method? I really can't believe that we get this structure back but can only separate files from non-files within that method. I'd also prefer that we just add a field to the file descriptor saying "type is file" rather than adding this overly specific parameter to the method...

Contributor Author

You mean files_descriptors_for_process should also return the file type, which linux.lsof would then use to filter out non-files? Right now linux.lsof calls files_descriptors_for_process directly, which uses LinuxUtilities.path_for_file, and that's where the filtering happens via dname_is_valid, so both methods need this argument. Alternatively, I could reuse the dname_is_valid logic in linux.lsof, which seems less elegant.

Member

I meant it feels like it should be one of the fd_fields returned in the generator and then handled by FDInternal. It might be a lot of work to thread it in that way, and I haven't even considered the version bumps that might be required to achieve it, plus I've no idea whether @atcuno would be happy doing that, but there's no indication that there aren't more fields (requiring more boolean flags to be added to the method) still to come. As such I'd like to find an extensible way of adding those flags, and given the entire method is about returning a set of flags, it feels as though this should be added to that existing set of flags...
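
A rough sketch of that shape: each yielded record carries its own type and flags, and the caller filters, so no extra boolean parameters are needed on the generator. `FDRecord`, `fds_for_process` and `only_regular` are hypothetical names (not Volatility's `fd_fields` or `FDInternal`), and treating "REG" as "is a file" is an assumption based on the Type column in the output above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class FDRecord:
    """Hypothetical per-descriptor record; new fields can be added with defaults."""

    fd: int
    path: str
    file_type: str  # e.g. "REG" or "CHR", as in the Type column
    is_deleted: bool = False


def fds_for_process() -> Iterator[FDRecord]:
    """Stand-in generator that yields every descriptor, unfiltered."""
    yield FDRecord(2, "/dev/pts/1", "CHR")
    yield FDRecord(255, "/tmp/evil.sh", "REG", is_deleted=True)


def only_regular(records: Iterable[FDRecord]) -> Iterator[FDRecord]:
    """Filtering lives in the caller rather than in a files_only argument."""
    return (rec for rec in records if rec.file_type == "REG")


for rec in only_regular(fds_for_process()):
    print(rec.fd, rec.path, rec.is_deleted)
```

Adding another attribute later means adding one defaulted field, which leaves existing callers untouched.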

@ikelos ikelos requested a review from atcuno June 22, 2025 22:21
@SolitudePy SolitudePy requested a review from ikelos June 26, 2025 10:36
Member

@ikelos ikelos left a comment

Sorry, my view on these hasn't really changed. Rather than designing the data flow properly, we're trying to carry the information by cramming it into other fields which is then extremely difficult to separate at a later date. It's the wrong way of doing it, and though I don't know how much work the right way would be, this is not it.

@@ -204,6 +207,9 @@ def do_get_path(cls, rdentry, rmnt, dentry, vfsmnt) -> Union[None, str]:
             # path would be /foo/bar/baz, but bar is missing due to smear the results
             # returned here will show /foo//baz. Note the // for the missing dname.
             return f"<potentially smeared> {path}"

+        if inode and inode.is_readable() and inode.is_valid() and inode.i_nlink == 0:
+            path = f"{LinuxUtilities.deleted} {path}"
Member

That's still fundamentally trying to cram two pieces of information into a single field. Whether the file has been deleted or not is distinct from the filename it has. We should be using a separate piece of information to track that, not overloading the filename field with two pieces of information that we'll later have difficulty separating.

@SolitudePy SolitudePy requested a review from ikelos July 11, 2025 12:29
3 participants