Commit 579d883

Huge page support for composite images loaded on Linux (#37673)
Add support for loading composite R2R images utilizing huge pages on Linux.

Support is broken into 3 major portions:
- Changes to the compiler to add a switch which can compile the composite image with higher than normal alignment
- Changes to the runtime to make some slight tweaks to PE file loading on Linux to support these images correctly
- Documentation on how to tie these various features together to achieve large page loading on Linux
1 parent cd02b06 commit 579d883

9 files changed, +302 -23 lines changed

@@ -0,0 +1,102 @@
+Configuring Huge Pages for loading composite binaries using CoreCLR on Linux
+----
+
+Huge pages can improve performance by reducing the cost of TLB misses when
+executing code. In general, the largest wins may be achieved by enabling huge
+pages for use by the GC, which will dominate the memory use in the process, but in some
+circumstances, if the application is sufficiently large, there may be a benefit to using
+huge pages to map in code.
+
+It is expected that consumers who have these needs have very large applications, and are
+able to tolerate somewhat complex solutions. CoreCLR supports loading composite R2R
+images using the hugetlbfs. Doing so requires several steps.
+
+1. The composite image must be created with a switch such as `--custom-pe-section-alignment=2097152`. This will align the PE sections in the R2R file on 2MB virtual address boundaries, and align the sections in the PE file itself on the same boundaries.
+    - This will increase the size of the image by up to 5 * the specified alignment. Typical increases are closer to 3 * the specified alignment.
+2. The composite image must be copied into a hugetlbfs filesystem which is visible to the .NET process instead of the composite image being loaded from the normal path.
+    - IMPORTANT: The composite image must NOT be located in the normal path next to the application binary, or that file will be used instead of the huge page version.
+    - The environment variable `COMPlus_NativeImageSearchPaths` must be set to point at the location of the hugetlbfs in use. For instance, `COMPlus_NativeImageSearchPaths` might be set to `/var/lib/hugetlbfs/user/USER/pagesize-2MB`.
+    - As the cp command does not support copying into a hugetlbfs due to lack of support for the write syscall in that file system, a custom copy application must be used. A sample application that may be used to perform this task has a source listing in Appendix A.
+3. The machine must be configured to have sufficient huge pages available in the appropriate huge page pool. The memory requirements of huge page PE loading are as follows.
+    - Sufficient pages to hold the unmodified copy of the composite image in the hugetlbfs. These pages will be used by the initial copy which emplaces the composite image into huge pages.
+    - By default the runtime will map each page of the composite image using a MAP_PRIVATE mapping. This will require that the maximum number of huge pages is large enough to hold a completely separate copy of the image as loaded.
+    - To reduce that cost, launch the application with the PAL_MAP_READONLY_PE_HUGE_PAGE_AS_SHARED environment variable set to 1. This environment variable changes how the composite image R2R files are mapped into the process: read only sections are mapped as MAP_SHARED mappings. This reduces the extra huge pages needed to only the sections marked as RW in the PE file. On a Windows machine, use the link tool (`link /dump /header compositeimage.dll`) to determine the number of pages needed for the `.data` section of the PE file.
+    - If PAL_MAP_READONLY_PE_HUGE_PAGE_AS_SHARED is set, the number of huge pages needed is `<Count of huge pages for composite file> + <count of processes to run> * <count of huge pages needed for the .data section of the composite file>`
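
The sizing rule in step 3 can be sketched in C. This is an illustration only, not part of the commit: the helper names `huge_pages_for`, `huge_pages_shared` and `meminfo_value` are hypothetical, and the sketch assumes the text handed to `meminfo_value` has the `Key:   value` line layout that Linux exposes in `/proc/meminfo` (for example the `HugePages_Free` field).

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Number of huge pages needed to hold `bytes`, rounding up.
 * `huge_page_size` must be non-zero, e.g. 2097152 for 2MB pages. */
static uint64_t huge_pages_for(uint64_t bytes, uint64_t huge_page_size)
{
    return (bytes + huge_page_size - 1) / huge_page_size;
}

/* Total huge pages when PAL_MAP_READONLY_PE_HUGE_PAGE_AS_SHARED=1:
 * one copy of the composite file in the hugetlbfs, plus a private copy
 * of the writable (.data) pages for every process. */
static uint64_t huge_pages_shared(uint64_t image_pages,
                                  uint64_t data_pages,
                                  uint64_t processes)
{
    return image_pages + processes * data_pages;
}

/* Returns the integer that follows "<key>:" at the start of a line in
 * /proc/meminfo-style text, or -1 if the key is absent or malformed. */
static long meminfo_value(const char *text, const char *key)
{
    size_t key_len = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0')
    {
        if (strncmp(line, key, key_len) == 0 && line[key_len] == ':')
        {
            long value;
            if (sscanf(line + key_len + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');   /* advance to the next line */
        if (line != NULL)
            line++;
    }
    return -1;
}
```

For example, a composite image occupying 10 huge pages whose `.data` section needs 1 huge page, run by 4 processes, gives `huge_pages_shared(10, 1, 4)`, that is 14 pages. That figure can then be compared against the `HugePages_Free` value parsed from the contents of `/proc/meminfo`.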
+
+Appendix A - Source for a simple copy into hugetlbfs program.
+
+```
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+// See the LICENSE file in the project root for more information.
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <string.h>
+#include <unistd.h>
+
+int main(int argc, char** argv)
+{
+    if (argc != 3)
+    {
+        printf("Incorrect number of arguments specified. Arguments are <src> <dest>\n");
+        return 1;
+    }
+
+    void *addrSrc, *addrDest;
+    int fdSrc, fdDest, ret;
+
+    fdSrc = open(argv[1], O_RDWR);
+    if (fdSrc < 0)
+    {
+        printf("Open src failed\n");
+        return 1;
+    }
+
+    struct stat st;
+    if (fstat(fdSrc, &st) < 0)
+    {
+        printf("fdSrc fstat failed\n");
+        return 1;
+    }
+
+    addrSrc = mmap(0, st.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fdSrc, 0);
+    if (addrSrc == MAP_FAILED)
+    {
+        printf("fdSrc mmap failed\n");
+        return 1;
+    }
+
+    fdDest = open(argv[2], O_CREAT | O_RDWR, 0755);
+    if (fdDest < 0)
+    {
+        printf("Open dest failed\n");
+        return 1;
+    }
+
+    if (ftruncate(fdDest, st.st_size) < 0)
+    {
+        printf("ftruncate failed\n");
+        return 1;
+    }
+
+    addrDest = mmap(0, st.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fdDest, 0);
+    if (addrDest == MAP_FAILED)
+    {
+        printf("fdDest mmap failed\n");
+        return 1;
+    }
+
+    memcpy(addrDest, addrSrc, st.st_size);
+
+    munmap(addrSrc, st.st_size);
+    munmap(addrDest, st.st_size);
+    close(fdSrc);
+    close(fdDest);
+    return 0;
+}
+```
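
Because the runtime only gets huge page backing when the composite image really lives in a hugetlbfs, it can be useful to check the destination directory before running the copy program from Appendix A. The following is a minimal sketch, not part of the commit: `is_on_hugetlbfs` is a hypothetical helper name, and the magic number is the `HUGETLBFS_MAGIC` value from the Linux `linux/magic.h` header, repeated here under a local name so the sketch does not depend on that header.

```c
#include <sys/vfs.h>   /* statfs (Linux-specific) */

/* Value of HUGETLBFS_MAGIC from <linux/magic.h>, copied under a local name. */
#define HUGETLBFS_MAGIC_VALUE 0x958458f6UL

/* Returns 1 if `path` is on a hugetlbfs mount, 0 if it is on some other
 * file system, and -1 if statfs fails (for example, the path is missing). */
static int is_on_hugetlbfs(const char *path)
{
    struct statfs fs;

    if (statfs(path, &fs) != 0)
        return -1;

    return ((unsigned long)fs.f_type == HUGETLBFS_MAGIC_VALUE) ? 1 : 0;
}
```

A caller would typically pass the directory that `COMPlus_NativeImageSearchPaths` points at and refuse to continue unless the result is 1.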

src/coreclr/src/pal/src/map/map.cpp

+34 -6
@@ -2233,6 +2233,10 @@ void * MAPMapPEFile(HANDLE hFile, off_t offset)
     bool forceRelocs = false;
     char* envVar;
 #endif
+    SIZE_T reserveSize = 0;
+    bool forceOveralign = false;
+    int readWriteFlags = MAP_FILE|MAP_PRIVATE|MAP_FIXED;
+    int readOnlyFlags = readWriteFlags;
 
     ENTRY("MAPMapPEFile (hFile=%p offset=%zx)\n", hFile, offset);
 
@@ -2357,13 +2361,20 @@ void * MAPMapPEFile(HANDLE hFile, off_t offset)
     // We're going to start adding mappings to the mapping list, so take the critical section
     InternalEnterCriticalSection(pThread, &mapping_critsec);
 
+    reserveSize = virtualSize;
+    if ((ntHeader.OptionalHeader.SectionAlignment) > GetVirtualPageSize())
+    {
+        reserveSize += ntHeader.OptionalHeader.SectionAlignment;
+        forceOveralign = true;
+    }
+
 #ifdef HOST_64BIT
     // First try to reserve virtual memory using ExecutableAllocator. This allows all PE images to be
     // near each other and close to the coreclr library which also allows the runtime to generate
     // more efficient code (by avoiding usage of jump stubs). Alignment to a 64 KB granularity should
     // not be necessary (alignment to page size should be sufficient), but see
     // ExecutableMemoryAllocator::AllocateMemory() for the reason why it is done.
-    loadedBase = ReserveMemoryFromExecutableAllocator(pThread, ALIGN_UP(virtualSize, VIRTUAL_64KB));
+    loadedBase = ReserveMemoryFromExecutableAllocator(pThread, ALIGN_UP(reserveSize, VIRTUAL_64KB));
 #endif // HOST_64BIT
 
     if (loadedBase == NULL)
@@ -2384,7 +2395,7 @@ void * MAPMapPEFile(HANDLE hFile, off_t offset)
             mapFlags |= MAP_JIT;
         }
 #endif // __APPLE__
-        loadedBase = mmap(usedBaseAddr, virtualSize, PROT_NONE, mapFlags, -1, 0);
+        loadedBase = mmap(usedBaseAddr, reserveSize, PROT_NONE, mapFlags, -1, 0);
     }
 
     if (MAP_FAILED == loadedBase)
@@ -2413,15 +2424,28 @@ void * MAPMapPEFile(HANDLE hFile, off_t offset)
     }
 #endif // _DEBUG
 
+    size_t headerSize;
+    headerSize = GetVirtualPageSize(); // if there are lots of sections, this could be wrong
+
+    if (forceOveralign)
+    {
+        loadedBase = ALIGN_UP(loadedBase, ntHeader.OptionalHeader.SectionAlignment);
+        headerSize = ntHeader.OptionalHeader.SectionAlignment;
+        char *mapAsShared = EnvironGetenv("PAL_MAP_READONLY_PE_HUGE_PAGE_AS_SHARED");
+
+        // If PAL_MAP_READONLY_PE_HUGE_PAGE_AS_SHARED is set to 1. map the readonly sections as shared
+        // which works well with the behavior of the hugetlbfs
+        if (mapAsShared != NULL && (strcmp(mapAsShared, "1") == 0))
+            readOnlyFlags = MAP_FILE|MAP_SHARED|MAP_FIXED;
+    }
+
     //we have now reserved memory (potentially we got rebased). Walk the PE sections and map each part
     //separately.
 
-    size_t headerSize;
-    headerSize = GetVirtualPageSize(); // if there are lots of sections, this could be wrong
 
     //first, map the PE header to the first page in the image. Get pointers to the section headers
     palError = MAPmmapAndRecord(pFileObject, loadedBase,
-                    loadedBase, headerSize, PROT_READ, MAP_FILE|MAP_PRIVATE|MAP_FIXED, fd, offset,
+                    loadedBase, headerSize, PROT_READ, readOnlyFlags, fd, offset,
                     (void**)&loadedHeader);
     if (NO_ERROR != palError)
     {
@@ -2501,18 +2525,22 @@ void * MAPMapPEFile(HANDLE hFile, off_t offset)
         //Don't discard these sections. We need them to verify PE files
         //if (currentHeader.Characteristics & IMAGE_SCN_MEM_DISCARDABLE)
         //    continue;
+        int flags = readOnlyFlags;
         if (currentHeader.Characteristics & IMAGE_SCN_MEM_EXECUTE)
             prot |= PROT_EXEC;
         if (currentHeader.Characteristics & IMAGE_SCN_MEM_READ)
             prot |= PROT_READ;
         if (currentHeader.Characteristics & IMAGE_SCN_MEM_WRITE)
+        {
             prot |= PROT_WRITE;
+            flags = readWriteFlags;
+        }
 
         palError = MAPmmapAndRecord(pFileObject, loadedBase,
                         sectionBase,
                         currentHeader.SizeOfRawData,
                         prot,
-                        MAP_FILE|MAP_PRIVATE|MAP_FIXED,
+                        flags,
                         fd,
                         offset + currentHeader.PointerToRawData,
                         &sectionData);
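
The two ideas in the `map.cpp` change (over-reserving address space so the base can be rounded up to the section alignment, and choosing mapping flags per section) can be sketched in isolation. This is a simplified illustration under stated assumptions, not the PAL code: the names `align_up`, `reserve_size_for` and `section_map_flags` are hypothetical, the alignment is assumed to be a power of two, and `MAP_FILE` (which the diff also ORs in) is left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Round `addr` up to the next multiple of `alignment`.
 * Assumes `alignment` is a power of two. */
static uintptr_t align_up(uintptr_t addr, uintptr_t alignment)
{
    return (addr + alignment - 1) & ~(alignment - 1);
}

/* When the section alignment exceeds the OS page size, reserve one extra
 * alignment unit. Wherever the reservation lands, rounding its base up to
 * the alignment still leaves `virtual_size` bytes inside the reservation. */
static size_t reserve_size_for(size_t virtual_size,
                               size_t section_alignment,
                               size_t page_size)
{
    if (section_alignment > page_size)
        return virtual_size + section_alignment;
    return virtual_size;
}

/* Writable sections always get a private (copy-on-write) mapping.
 * Read-only sections are shared only when `share_read_only` is non-zero,
 * which here stands for "image is over-aligned and
 * PAL_MAP_READONLY_PE_HUGE_PAGE_AS_SHARED is set to 1". */
static int section_map_flags(int section_is_writable, int share_read_only)
{
    if (section_is_writable || !share_read_only)
        return MAP_PRIVATE | MAP_FIXED;
    return MAP_SHARED | MAP_FIXED;
}
```

For a 6MB image with 2MB section alignment and 4KB pages, `reserve_size_for` yields 8MB; if the reservation starts at `0x201000`, the aligned base is `0x400000` and the image ends at `0xA00000`, inside the reservation that ends at `0xA01000`.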

src/coreclr/src/tools/crossgen2/ILCompiler.ReadyToRun/CodeGen/ReadyToRunObjectWriter.cs

+13 -4
@@ -54,6 +54,13 @@ internal class ReadyToRunObjectWriter
         /// </summary>
         private readonly MapFileBuilder _mapFileBuilder;
 
+        /// <summary>
+        /// If non-null, the PE file will be laid out such that it can naturally be mapped with a higher alignment than 4KB
+        /// This is used to support loading via large pages on Linux
+        /// </summary>
+        private readonly int? _customPESectionAlignment;
+
+
 #if DEBUG
         private struct NodeInfo
         {
@@ -72,12 +79,13 @@ public NodeInfo(ISymbolNode node, int nodeIndex, int symbolIndex)
         Dictionary<string, NodeInfo> _previouslyWrittenNodeNames = new Dictionary<string, NodeInfo>();
 #endif
 
-        public ReadyToRunObjectWriter(string objectFilePath, EcmaModule componentModule, IEnumerable<DependencyNode> nodes, NodeFactory factory, bool generateMapFile)
+        public ReadyToRunObjectWriter(string objectFilePath, EcmaModule componentModule, IEnumerable<DependencyNode> nodes, NodeFactory factory, bool generateMapFile, int? customPESectionAlignment)
         {
             _objectFilePath = objectFilePath;
             _componentModule = componentModule;
             _nodes = nodes;
             _nodeFactory = factory;
+            _customPESectionAlignment = customPESectionAlignment;
 
             if (generateMapFile)
             {
@@ -127,7 +135,8 @@ public void EmitPortableExecutable()
                     headerBuilder,
                     r2rHeaderExportSymbol,
                     Path.GetFileName(_objectFilePath),
-                    getRuntimeFunctionsTable);
+                    getRuntimeFunctionsTable,
+                    _customPESectionAlignment);
 
                 NativeDebugDirectoryEntryNode nativeDebugDirectoryEntryNode = null;
 
@@ -270,10 +279,10 @@ private void EmitObjectData(R2RPEBuilder r2rPeBuilder, ObjectData data, int node
             r2rPeBuilder.AddObjectData(data, section, name, mapFileBuilder);
         }
 
-        public static void EmitObject(string objectFilePath, EcmaModule componentModule, IEnumerable<DependencyNode> nodes, NodeFactory factory, bool generateMapFile)
+        public static void EmitObject(string objectFilePath, EcmaModule componentModule, IEnumerable<DependencyNode> nodes, NodeFactory factory, bool generateMapFile, int? customPESectionAlignment)
         {
             Console.WriteLine($@"Emitting R2R PE file: {objectFilePath}");
-            ReadyToRunObjectWriter objectWriter = new ReadyToRunObjectWriter(objectFilePath, componentModule, nodes, factory, generateMapFile);
+            ReadyToRunObjectWriter objectWriter = new ReadyToRunObjectWriter(objectFilePath, componentModule, nodes, factory, generateMapFile, customPESectionAlignment);
             objectWriter.EmitPortableExecutable();
         }
     }

src/coreclr/src/tools/crossgen2/ILCompiler.ReadyToRun/Compiler/ReadyToRunCodegenCompilation.cs

+6 -3
@@ -233,6 +233,7 @@ public sealed class ReadyToRunCodegenCompilation : Compilation
 
         public ReadyToRunSymbolNodeFactory SymbolNodeFactory { get; }
         public ReadyToRunCompilationModuleGroupBase CompilationModuleGroup { get; }
+        private readonly int? _customPESectionAlignment;
 
         internal ReadyToRunCodegenCompilation(
             DependencyAnalyzerBase<NodeFactory> dependencyGraph,
@@ -248,7 +249,8 @@ internal ReadyToRunCodegenCompilation(
             int parallelism,
             ProfileDataManager profileData,
             ReadyToRunMethodLayoutAlgorithm methodLayoutAlgorithm,
-            ReadyToRunFileLayoutAlgorithm fileLayoutAlgorithm)
+            ReadyToRunFileLayoutAlgorithm fileLayoutAlgorithm,
+            int? customPESectionAlignment)
             : base(
                   dependencyGraph,
                   nodeFactory,
@@ -262,6 +264,7 @@ internal ReadyToRunCodegenCompilation(
             _resilient = resilient;
             _parallelism = parallelism;
             _generateMapFile = generateMapFile;
+            _customPESectionAlignment = customPESectionAlignment;
             SymbolNodeFactory = new ReadyToRunSymbolNodeFactory(nodeFactory);
             _corInfoImpls = new ConditionalWeakTable<Thread, CorInfoImpl>();
             _inputFiles = inputFiles;
@@ -290,7 +293,7 @@ public override void Compile(string outputFile)
             using (PerfEventSource.StartStopEvents.EmittingEvents())
             {
                 NodeFactory.SetMarkingComplete();
-                ReadyToRunObjectWriter.EmitObject(outputFile, componentModule: null, nodes, NodeFactory, _generateMapFile);
+                ReadyToRunObjectWriter.EmitObject(outputFile, componentModule: null, nodes, NodeFactory, _generateMapFile, _customPESectionAlignment);
                 CompilationModuleGroup moduleGroup = _nodeFactory.CompilationModuleGroup;
 
                 if (moduleGroup.IsCompositeBuildMode)
@@ -339,7 +342,7 @@ private void RewriteComponentFile(string inputFile, string outputFile, string ow
             }
             componentGraph.ComputeMarkedNodes();
             componentFactory.Header.Add(Internal.Runtime.ReadyToRunSectionType.OwnerCompositeExecutable, ownerExecutableNode, ownerExecutableNode);
-            ReadyToRunObjectWriter.EmitObject(outputFile, componentModule: inputModule, componentGraph.MarkedNodeList, componentFactory, generateMapFile: false);
+            ReadyToRunObjectWriter.EmitObject(outputFile, componentModule: inputModule, componentGraph.MarkedNodeList, componentFactory, generateMapFile: false, customPESectionAlignment: null);
         }
 
         public override void WriteDependencyLog(string outputFileName)

src/coreclr/src/tools/crossgen2/ILCompiler.ReadyToRun/Compiler/ReadyToRunCodegenCompilationBuilder.cs

+9 -1
@@ -28,6 +28,7 @@ public sealed class ReadyToRunCodegenCompilationBuilder : CompilationBuilder
         private ProfileDataManager _profileData;
         private ReadyToRunMethodLayoutAlgorithm _r2rMethodLayoutAlgorithm;
         private ReadyToRunFileLayoutAlgorithm _r2rFileLayoutAlgorithm;
+        private int? _customPESectionAlignment;
 
         private string _jitPath;
         private string _outputFile;
@@ -137,6 +138,12 @@ public ReadyToRunCodegenCompilationBuilder GenerateOutputFile(string outputFile)
             return this;
         }
 
+        public ReadyToRunCodegenCompilationBuilder UseCustomPESectionAlignment(int? customPESectionAlignment)
+        {
+            _customPESectionAlignment = customPESectionAlignment;
+            return this;
+        }
+
         public override ICompilation ToCompilation()
         {
             // TODO: only copy COR headers for single-assembly build and for composite build with embedded MSIL
@@ -223,7 +230,8 @@ public override ICompilation ToCompilation()
                 _parallelism,
                 _profileData,
                 _r2rMethodLayoutAlgorithm,
-                _r2rFileLayoutAlgorithm);
+                _r2rFileLayoutAlgorithm,
+                _customPESectionAlignment);
         }
     }
 }
