Skip to content

Safe copy to fix issue #888 #889

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Oct 2, 2024

Conversation

jkriegshauser
Copy link
Contributor

Fix for issue #888

Instead of attempting to read potentially unsafe memory that may be in a module that unloaded, this uses Structured Exception Handling on Windows or pipes on POSIX platforms to ensure that the entire memory region can be read.

@wolfpld
Copy link
Owner

wolfpld commented Sep 23, 2024

What exactly is this code doing?

@jkriegshauser
Copy link
Contributor Author

Basically it treats memory requested by HandleSymbolCodeQuery as unsafe to read:

  • On Windows, it does memcpy() inside a structured exception block. If an access violation occurs during the memcpy, the __except portion of the block is triggered and the memory cannot be safely read.
  • On Linux/Mac, it writes the memory to a pipe which causes the kernel to attempt to read the memory. If the memory could not be read, write() sets errno to EFAULT and the memory cannot be safely read.

In both of these instances, if memory cannot be safely read, withSafeCopy() returns false and causes HandleSymbolCodeQuery to reply with AckSymbolCodeNotAvailable(). If memory was safely copied, SendLongString() is invoked with the symbol code.

@jkriegshauser
Copy link
Contributor Author

Looks like there are issues with Vectored exception handling on Windows blocking the frame-based handler. More info here. I’ll address tomorrow.

@wolfpld
Copy link
Owner

wolfpld commented Sep 24, 2024

The pipe trick relying on EFAULT won't work.

EFAULT buf is outside your accessible address space.

Glibc is forwarding the write call to a sys call, which is implemented in fs/read_write.c as:

SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
		size_t, count)
{
	return ksys_write(fd, buf, count);
}

This goes to:

ssize_t ksys_write(unsigned int fd, const char __user *buf, size_t count)
{
	struct fd f = fdget_pos(fd);
	ssize_t ret = -EBADF;

	if (fd_file(f)) {
		loff_t pos, *ppos = file_ppos(fd_file(f));
		if (ppos) {
			pos = *ppos;
			ppos = &pos;
		}
		ret = vfs_write(fd_file(f), buf, count, ppos);
		if (ret >= 0 && ppos)
			fd_file(f)->f_pos = pos;
		fdput_pos(f);
	}

	return ret;
}

Which goes to:

ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
{
	ssize_t ret;

	if (!(file->f_mode & FMODE_WRITE))
		return -EBADF;
	if (!(file->f_mode & FMODE_CAN_WRITE))
		return -EINVAL;
	if (unlikely(!access_ok(buf, count)))
		return -EFAULT;
...

And access_ok is in arch/x86/include/asm/uaccess_64.h:

/*
 * User pointers can have tag bits on x86-64.  This scheme tolerates
 * arbitrary values in those bits rather then masking them off.
 *
 * Enforce two rules:
 * 1. 'ptr' must be in the user half of the address space
 * 2. 'ptr+size' must not overflow into kernel addresses
 *
 * Note that addresses around the sign change are not valid addresses,
 * and will GP-fault even with LAM enabled if the sign bit is set (see
 * "CR3.LAM_SUP" that can narrow the canonicality check if we ever
 * enable it, but not remove it entirely).
 *
 * So the "overflow into kernel addresses" does not imply some sudden
 * exact boundary at the sign bit, and we can allow a lot of slop on the
 * size check.
 *
 * In fact, we could probably remove the size check entirely, since
 * any kernel accesses will be in increasing address order starting
 * at 'ptr', and even if the end might be in kernel space, we'll
 * hit the GP faults for non-canonical accesses before we ever get
 * there.
 *
 * That's a separate optimization, for now just handle the small
 * constant case.
 */
static inline bool __access_ok(const void __user *ptr, unsigned long size)
{
	if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
		return valid_user_address(ptr);
	} else {
		unsigned long sum = size + (__force unsigned long)ptr;

		return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
	}
}

So you will only get EFAULT if you would want to step over into kernel space, not if you would want to access an unmapped memory address.

FWIW, Tracy on Linux is already protected against this problem by overriding dlclose to do nothing, which is conformant with the documented behavior:

A successful return from dlclose() does not guarantee that the symbols associated with handle are removed from the caller's address space. In addition to references resulting from explicit dlopen() calls, a shared object may have been implicitly loaded (and reference counted) because of dependencies in other shared objects. Only when all references have been released can the shared object be removed from the address space.

https://github.com/wolfpld/tracy/blob/master/public/client/TracyOverride.cpp

@jkriegshauser
Copy link
Contributor Author

jkriegshauser commented Sep 24, 2024

We want these large writes to go through the pipe since the kernel can safely validate the memory for us. And yes, as you pointed out it does an initial access check through __access_ok(), but deeper in pipe_write() it will return EFAULT if copying from any page fails:

static ssize_t
pipe_write(struct kiocb *iocb, struct iov_iter *from)
{
    //...
			ret = copy_page_from_iter(buf->page, offset, chars, from);
			if (unlikely(ret < chars)) {
				ret = -EFAULT;
				goto out;
			}
    //...
			copied = copy_page_from_iter(page, 0, PAGE_SIZE, from);
			if (unlikely(copied < PAGE_SIZE && iov_iter_count(from))) {
				if (!ret)
					ret = -EFAULT;
				break;
			}
    //...
}

This is the behavior that we want since all of this is happening in the Tracy Profiler thread, and other threads are off doing whatever--we want to avoid the race condition. In any case, I'm not specifically looking for errno == EFAULT in my change, merely anything that causes a write to fail other than EINTR (though it should be EFAULT for the type of sanitization this is doing).

As to the TracyOverride.cpp behavior, this is never actually getting triggered for us, possibly because we build the Tracy client as a module and it optionally gets loaded. And our plugin system notices when libraries don't unload as this makes hot-reloading impossible.

@jkriegshauser
Copy link
Contributor Author

Also, the Windows fix for SEH was pushed

@jkriegshauser
Copy link
Contributor Author

jkriegshauser commented Sep 24, 2024

I tested with this and discovered that a Linux pipe doesn't actually allow you to use the full amount of memory without blocking, so I now set the O_NONBLOCK flag and handle it appropriately. This also confirms that partially accessible memory produces an EFAULT from write().

    // Test withSafeCopy
    char* p = "test";
    // 0 bytes: valid
    assert(WithSafeCopy(p, 0, [&](char* v, size_t s) { assert(v != p); assert(s == 0); }));
    // sizeof(p) bytes: valid with copy
    assert(WithSafeCopy(p, 5, [&](char* v, size_t s) { assert(v != p); assert(s == 5); assert(strcmp(v, "test") == 0); }));
    // 1 byte at nullptr: invalid
    assert(!WithSafeCopy(nullptr, 1, [](char*, size_t) { assert(false); }));

    // Create a region that is 4096 bytes of valid memory followed by 4096 bytes of unmapped memory.
    // [region+2048, region+4096+2048): invalid
    p = (char*)mmap(NULL, 4096 * 2, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    munmap(p + 4096, 4096);
    assert(!WithSafeCopy(p + 2048, 4096, [](char*, size_t) { assert(false); }));
    munmap(p, 4096);

    // Create a region that is 2MB of valid memory followed by 4096 bytes of unmapped memory.
    p = (char*)mmap(NULL, 2 * 1024 * 1024 + 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    munmap(p + 2 * 1024 * 1024, 4096);
    char const begin = 99;
    for (char* v = p, b = begin; v != (p + 2 * 1024 * 1024); ++v)
        *v = b++;
    // [region, region+2MB): valid (and bytes checked)
    assert(WithSafeCopy(p, 2 * 1024 * 1024, [&](char* v, size_t s) {
        assert(v != p); assert(s == 2 * 1024 * 1024);
        bool match = true;
        for (char* u = v, b = begin; u != (v + s); ++u)
            match |= (*u == b++);
        assert(match);
    }));
    // [region+2048, region+2MB+2048): invalid
    assert(!WithSafeCopy(p + 2048, 2 * 1024 * 1024, [](char*, size_t) { assert(false); }));
    munmap(p, 2 * 1024 * 1024);

// We cannot use Vectored Exception handling because it catches application-wide frame-based SEH blocks. We only
// want to catch unhandled exceptions in the event that there is not already an unhandled exception filter.
if( auto prev = SetUnhandledExceptionFilter( CrashFilter ) )
SetUnhandledExceptionFilter( prev ); // Already had a handler => put it back
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vectored exception handlers are called for every exception, including frame-based exception handlers that handle them. So any Windows application that would use Tracy along with frame-based SEH blocks would inadvertently cause CrashFilter to be called.

Instead, we want CrashFilter to be called only for unhandled exceptions, hence the change to SetUnhandledExceptionFilter.

The answer on this page gives more information, specifically this portion:

VEH handlers added by AddVectoredExceptionHandler will handle exceptions before they reach frame-based handlers.

Filter set by SetUnhandledExceptionFilter will handle exceptions after frame-based handlers failed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, here's a simple program that exemplifies what I'm talking about with vectored exceptions. If Tracy is connected, CrashFilter is called when the memcpy causes an access violation. If Tracy is not connected, the CrashFilter isn't installed and the program runs fine.

int main(int argc, char** argv)
{
    tracy::StartupProfiler();

    std::this_thread::sleep_for(std::chrono::seconds(1)); // give time to connect

    {
        ZoneNamed(temp, true);

        __try
        {
            memcpy(nullptr, &temp, 1); // should crash?
        }
        __except (1)
        {
            // nope, control should end up here instead preventing the crash
        }

        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }

    tracy::ShutdownProfiler();

    return 0;   
}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh. The memcpy above is ub, so any reasoning about what should happen is nonsensical. After #887 I don't have any energy left for Microsoft bullshit right now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this simple example is UB, but the point is to exemplify a scenario where any Structured Exception is triggered and instead of being handled, the CrashFilter is invoked instead.

And yeah, understandable--887 looks rough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, here's another example. Integer divide-by-zero can be intercepted and handled with structured exceptions. If this happens while Tracy is connected with its vectored exception handler however, the profiler/viewer reports a crash and the application hangs. If Tracy is not connected, the vectored exception handler is not installed, and the application prints "Divide-by-zero without crash" as the exception was handled.

        __try
        {
            int i = 1;
            int j = 1;
            --j;
            fprintf(stdout, "%d\n", i / j);
        }
        __except (1)
        {
            fprintf(stdout, "Divide-by-zero without crash\n");
        }

This of course merely illustrates that using vectored exception handlers will override any frame-based exception handlers in an undesirable fashion, when what we really want is to handle unhandled exceptions.

Yet another example is the old method of setting thread names for Visual Studio:

        THREADNAME_INFO info = {};
        info.dwType = 0x1000;
        info.szName = name;
        info.dwThreadID = threadId;
        __try
        {
            RaiseException(MS_VC_EXCEPTION, 0, sizeof(info) / sizeof(ULONG_PTR), (ULONG_PTR*)&info);
        }
        __except (EXCEPTION_EXECUTE_HANDLER)
        {
        }

this too inadvertently results in Tracy's CrashFilter being invoked as currently written, however in this case it is not one of the recognized exceptions and returns EXCEPTION_CONTINUE_SEARCH which allows the frame-based filter to be called.

In all of these cases, my change to use SetUnhandledExceptionFilter causes CrashFilter to be invoked only in the event of an actual unhandled exception.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation, it does make sense.

My initial question was actually for the "Already had a handler => put it back" bit, which seems to deny enabling the crash handler, when something else is already hooked up. The TRACY_NO_CRASH_HANDLER macro is intended to handle this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a change that makes Windows now always set the unhandled exception filter, similar to how Linux always sets the signal handlers.

In a modern modular world the way these handlers work is not great, but such is the OS design that we have to work with. The fact that we have TRACY_NO_CRASH_HANDLER gives a work-around which is good.

I also changed the Linux side so that it matches the Windows removal-side now: if something else overrode our signal/SEH handler we put it back.

@jkriegshauser
Copy link
Contributor Author

Anything else I need to do for this?

@wolfpld
Copy link
Owner

wolfpld commented Sep 30, 2024

The general outlook of the changes is good, but I will still need to take a look into the details before the merge. For now, please add braces to scope where only indentation is used, or use single-liners, as you see fit.

// FreeBSD/XNU don't have F_SETPIPE_SZ, so use the default
m_pipeBufSize = 16384;
# else
m_pipeBufSize = (int)(ptrdiff_t)m_safeSendBufferSize;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is double casting needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally do this when both size and signedness are changing to (a) be explicit and (b) prevent pedantic warnings

assert( !m_inUse.exchange(true) );
#endif

if( size > m_safeSendBufferSize ) buf = (char*)tracy_malloc( size );
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the typical histogram of buffer sizes? How often is a larger temporary buffer allocated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a test on a simple application and got the following:

  1 << ? Count Comment
  [0] 0
  [1] 0
  [2] 8
  [3] 85
  [4] 280
  [5] 685
  [6] 863
  [7] 917
  [8] 856
  [9] 799
  [10] 619
  [11] 449
  [12] 239
  [13] 124
  [14] 58
  [15] 34
  [16] 17
  [17] 2 (exceeds buffer)

So at 64k in my simple test only 2 requests required temporary heap allocation as they exceeded the buffer size

@jkriegshauser jkriegshauser force-pushed the fix-long-string-crash branch from 7a9a7af to e1554a1 Compare October 2, 2024 18:21
@wolfpld wolfpld merged commit 13d9ed7 into wolfpld:master Oct 2, 2024
6 checks passed
@wolfpld
Copy link
Owner

wolfpld commented Oct 2, 2024

Thanks!

@mncat77
Copy link

mncat77 commented Oct 19, 2024

These changes seem to break the build with MinGW

tracy/public/client/TracyProfiler.cpp:1510:48: error: invalid conversion from 'LPTOP_LEVEL_EXCEPTION_FILTER' {aka 'long int (*)(_EXCEPTION_POINTERS*)'} to 'void*' [-fpermissive]
 1510 |     m_prevHandler = SetUnhandledExceptionFilter( CrashFilter );
 
tracy/public/client/TracyProfiler.cpp:3116:5: error: expected 'catch' before '__except'
 3116 |     __except( 1 /*EXCEPTION_EXECUTE_HANDLER*/ )

MSVC-specific code should probably be guarded with _MSC_VER instead of _WIN32

@wolfpld
Copy link
Owner

wolfpld commented Oct 20, 2024

Can you make a PR?

@jkriegshauser
Copy link
Contributor Author

tracy/public/client/TracyProfiler.cpp:1510:48: error: invalid conversion from 'LPTOP_LEVEL_EXCEPTION_FILTER' {aka 'long int (*)(_EXCEPTION_POINTERS*)'} to 'void*' [-fpermissive]
 1510 |     m_prevHandler = SetUnhandledExceptionFilter( CrashFilter );

This is easily fixable with a cast.

tracy/public/client/TracyProfiler.cpp:3116:5: error: expected 'catch' before '__except'
 3116 |     __except( 1 /*EXCEPTION_EXECUTE_HANDLER*/ )

This might be more problematic. You can see if __try1 / __except1 work otherwise it's going to be harder to work around.

@wolfpld
Copy link
Owner

wolfpld commented Oct 21, 2024

See #914.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants