path: root/malloc
Age  Commit message  Author  Files  Lines
2016-11-10Updates to trace2wldj/mallocDJ Delorie1-49/+236
* command line option -p to show progress
* command line option -f to use file-based buffers
* reduced memory footprint
* more 32/64-bit fixes
2016-11-08More merge-related tweaksDJ Delorie3-13/+14
* add --enable-experimental-malloc/--disable-experimental-malloc (default: enabled)
* syntax errors related to new lock macros
* add some missing #if USE_TCACHE pairs
* Undo test tweak to environment variable scanner
2016-11-08Merge branch 'master' into dj/mallocDJ Delorie15-299/+1421
2016-10-28malloc: Update comments about chunk layoutFlorian Weimer1-10/+30
2016-10-28sysmalloc: Initialize previous size field of mmaped chunksFlorian Weimer1-0/+1
With different encodings of the header, the previous zero initialization may be insufficient and produce an invalid encoding.
2016-10-28malloc: Use accessors for chunk metadata accessFlorian Weimer3-70/+91
This change allows us to change the encoding of these struct members in a centralized fashion.
2016-10-27Static inline functions for mallopt helpersSiddhesh Poyarekar1-34/+93
Make mallopt helper functions for each mallopt parameter so that they can be called consistently in other areas, like setting tunables.
* malloc/malloc.c (do_set_mallopt_check): New function.
(do_set_mmap_threshold): Likewise.
(do_set_mmaps_max): Likewise.
(do_set_top_pad): Likewise.
(do_set_perturb_byte): Likewise.
(do_set_trim_threshold): Likewise.
(do_set_arena_max): Likewise.
(do_set_arena_test): Likewise.
(__libc_mallopt): Use them.
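The pattern described above (one small helper per parameter, plus a dispatcher that calls them) can be sketched roughly as follows. This is only an illustration: the `struct params` type, the `P_*` constants, the helper names, and the validation rule in `set_arena_max` are hypothetical stand-ins, not the code in malloc/malloc.c.

```c
#include <stddef.h>

/* Hypothetical parameter block; glibc keeps its real state elsewhere. */
struct params {
    size_t top_pad;
    int    perturb_byte;
    size_t arena_max;
};

/* Hypothetical parameter numbers, for illustration only. */
enum { P_TOP_PAD = 1, P_PERTURB = 2, P_ARENA_MAX = 3 };

/* One helper per parameter, so other code (for example a tunables
   layer) can set a value without going through the dispatcher. */
static inline int set_top_pad(struct params *p, size_t value)
{
    p->top_pad = value;
    return 1;
}

static inline int set_perturb_byte(struct params *p, int value)
{
    p->perturb_byte = value;
    return 1;
}

static inline int set_arena_max(struct params *p, size_t value)
{
    if (value == 0)          /* assumed rule: reject a zero limit */
        return 0;
    p->arena_max = value;
    return 1;
}

/* Dispatcher in the style of a mallopt entry point: returns 1 on
   success and 0 for a rejected value or an unknown parameter. */
static int set_param(struct params *p, int param, long value)
{
    switch (param) {
    case P_TOP_PAD:   return set_top_pad(p, (size_t) value);
    case P_PERTURB:   return set_perturb_byte(p, (int) value);
    case P_ARENA_MAX: return set_arena_max(p, (size_t) value);
    default:          return 0;
    }
}
```

Keeping the validation inside each helper means every caller gets the same checks, whether it arrives through the dispatcher or directly.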
2016-10-26malloc: Remove malloc_get_state, malloc_set_state [BZ #19473]Florian Weimer5-100/+480
After the removal of __malloc_initialize_hook, newly compiled Emacs binaries are no longer able to use these interfaces. malloc_get_state is only used during the Emacs build process, so we provide a stub implementation only. Existing Emacs binaries will not call this stub function, but still reference the symbol. The rewritten tst-mallocstate test constructs a dumped heap which should approximate what existing Emacs binaries pass to glibc malloc.
2016-10-26Remove redundant definitions of M_ARENA_* macrosSiddhesh Poyarekar1-5/+0
The M_ARENA_MAX and M_ARENA_TEST macros are defined in malloc.c as well as malloc.h, and the former is unnecessary. This patch removes the duplicate. Tested on x86_64 to verify that the generated code remains unchanged barring changed line numbers to __malloc_assert. * malloc/malloc.c (M_ARENA_TEST, M_ARENA_MAX): Remove.
2016-10-26Document the M_ARENA_* mallopt parametersSiddhesh Poyarekar1-1/+0
The M_ARENA_* mallopt parameters are in wide use in production to control the number of arenas that a long lived process creates and hence there is no point in stating that this interface is non-public. Document this interface and remove the obsolete comment. * manual/memory.texi (M_ARENA_TEST): Add documentation. (M_ARENA_MAX): Likewise. * malloc/malloc.c: Remove obsolete comment.
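A minimal usage sketch for the now-documented parameter, assuming a glibc system where `<malloc.h>` provides `mallopt` and `M_ARENA_MAX`. The wrapper name `limit_arenas` is hypothetical.

```c
#include <malloc.h>

/* Cap the number of arenas the process may create.  Best called
   early, before additional threads start allocating.  Returns the
   mallopt result: 1 on success, 0 on error. */
static int limit_arenas(int max_arenas)
{
    return mallopt(M_ARENA_MAX, max_arenas);
}
```

A long-lived server might call `limit_arenas(2)` at startup to trade some allocation concurrency for a smaller footprint.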
2016-09-21malloc: Manual part of conversion to __libc_lockFlorian Weimer2-4/+4
This removes the old mutex_t-related definitions from malloc-machine.h, too.
2016-09-10Add tests-static to tests in malloc/MakefileSiddhesh Poyarekar1-2/+1
This is a trivial change that adds the static tests only to tests-static and then adds all of tests-static to the tests target, to make it look consistent with some other Makefiles. This avoids having to duplicate the test names across the two make targets. * malloc/Makefile (tests): Remove individual static test names and just add all of tests-static.
2016-09-06malloc: Automated part of conversion to __libc_lockFlorian Weimer3-60/+60
2016-08-26malloc: Simplify static malloc interposition [BZ #20432]Florian Weimer10-1/+648
Existing interposed mallocs do not define the glibc-internal fork callbacks (and they should not), so statically interposed mallocs lead to link failures because the strong reference from fork pulls in glibc's malloc, resulting in multiple definitions of malloc-related symbols.
2016-08-11Merge branch 'master' into dj/mallocDJ Delorie5-69/+94
2016-08-10Various namespace issuesDJ Delorie1-12/+12
2016-08-10Remove debugging; fix trace error handlingDJ Delorie1-9/+8
Comment out _m_printf until it's needed again. Properly unlock the trace mutex when we error out because of file errors; also disable tracing when that happens.
2016-08-09Various minor fixesDJ Delorie1-25/+26
Replace "int" with "size_t" as appropriate. Appease gcc's array-bounds warning. Process tcache after hooks to support MALLOC_CHECK_.
2016-08-08Migrate trace2wl from C++ to CDJ Delorie2-113/+254
Also add posix_memalign support
2016-08-03elf: dl-minimal malloc needs to respect fundamental alignmentFlorian Weimer2-63/+53
The dynamic linker currently uses __libc_memalign for TLS-related allocations. The goal is to switch to malloc instead. If the minimal malloc follows the ABI fundamental alignment, we can assume that malloc provides this alignment, and thus skip explicit alignment in a few cases as an optimization. It was requested on libc-alpha that MALLOC_ALIGNMENT should be used, although this results in wasted space if MALLOC_ALIGNMENT is larger than the fundamental alignment. (The dynamic linker cannot assume that the non-minimal malloc will provide an alignment of MALLOC_ALIGNMENT; the ABI provides _Alignof (max_align_t) only.)
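The alignment guarantee discussed above can be checked with a small C11 sketch. The helper names are hypothetical; the only assumption is that the allocator honours the fundamental alignment, `_Alignof (max_align_t)`, for the requested size.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if ptr is aligned to the ABI fundamental alignment. */
static int has_fundamental_alignment(const void *ptr)
{
    return ((uintptr_t) ptr % alignof(max_align_t)) == 0;
}

/* Allocates size bytes and reports whether the result carries the
   fundamental alignment: 1 yes, 0 no, -1 if the allocation failed. */
static int malloc_is_fundamentally_aligned(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return -1;
    int ok = has_fundamental_alignment(p);
    free(p);
    return ok;
}
```

When this holds, a caller that only needs the fundamental alignment can use plain malloc and skip an explicit aligned allocation.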
2016-08-02malloc: Run tests without calling mallopt [BZ #19469]Florian Weimer1-0/+4
The compiled tests no longer refer to the mallopt symbol from their main functions. (Some tests still call mallopt explicitly, which is fine.)
2016-08-02malloc: Preserve arena free list/thread count invariant [BZ #20370]Florian Weimer1-5/+36
It is necessary to preserve the invariant that if an arena is on the free list, it has thread attach count zero. Otherwise, when arena_thread_freeres sees the zero attach count, it will add it, and without the invariant, an arena could get pushed to the list twice, resulting in a cycle.
One possible execution trace looks like this:
* Thread 1 examines free list and observes it as empty.
* Thread 2 exits and adds its arena to the free list (with attached_threads == 0).
* Thread 1 selects this arena in reused_arena (not from the free list).
* Thread 1 increments attached_threads and attaches itself. (The arena remains on the free list.)
* Thread 1 exits, decrements attached_threads, and adds the arena to the free list.
The final step creates a cycle in the usual way (by overwriting the next_free member with the former list head, while there is another list item pointing to the arena structure).
tst-malloc-thread-exit exhibits this issue, but it was only visible with a debugger because the incorrect fix in bug 19243 removed the assert from get_free_list.
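The invariant can be illustrated with a single-threaded toy model. This is a sketch under simplifying assumptions, not the glibc code: there is no locking, and `struct arena`, `attach`, `detach`, and `free_list_has_cycle` are hypothetical names. The key step is that attaching to an arena first unlinks it from the free list, so a later detach cannot push it a second time.

```c
#include <assert.h>
#include <stddef.h>

struct arena {
    struct arena *next_free;    /* link in the free list */
    size_t attached_threads;    /* threads currently using this arena */
};

static struct arena *free_list;

/* Unlink an arena from the free list if it is currently on it. */
static void remove_from_free_list(struct arena *a)
{
    for (struct arena **p = &free_list; *p != NULL; p = &(*p)->next_free) {
        if (*p == a) {
            *p = a->next_free;
            a->next_free = NULL;
            return;
        }
    }
}

/* Attach a thread.  Unlinking first preserves the invariant:
   on the free list implies attached_threads == 0. */
static void attach(struct arena *a)
{
    remove_from_free_list(a);
    a->attached_threads++;
}

/* Detach a thread; the last user pushes the arena on the free list. */
static void detach(struct arena *a)
{
    assert(a->attached_threads > 0);
    if (--a->attached_threads == 0) {
        a->next_free = free_list;
        free_list = a;
    }
}

/* Floyd cycle check, used here only to observe the failure mode. */
static int free_list_has_cycle(void)
{
    struct arena *slow = free_list, *fast = free_list;
    while (fast != NULL && fast->next_free != NULL) {
        slow = slow->next_free;
        fast = fast->next_free->next_free;
        if (slow == fast)
            return 1;
    }
    return 0;
}
```

If `attach` skipped the unlink step, replaying the trace above would end with the arena's `next_free` pointing back at itself.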
2016-07-22Yet more 32-bit fixes.DJ Delorie1-15/+17
Make sure trace_dump doesn't overflow
2016-07-22Add quick_run compilation mode.Carlos O'Donell1-37/+69
- Add quick_run compilation mode.
- Remove disabling of fast bins.
2016-07-22More 32-bit fixes.DJ Delorie2-21/+25
Various fixes to handle traces and workloads bigger than 2 GB.
2016-07-21Add various bin-related trace path flagsDJ Delorie3-2/+35
2016-07-20Add note about the timing of recording an mremap event.DJ Delorie1-0/+7
2016-07-20Reschedule trace record commits to avoid inversion.DJ Delorie1-17/+87
This change decouples "collecting trace data" from "allocating a trace record" so that the record can be inserted into the trace buffer in the correct sequence with respect to when it "owns" the pointers being recorded (i.e. malloc should record its event after it does its allocation, but free should record its event before it returns the memory to the arena). It separates starting a trace record (function entry) from committing it to the buffer (trace recording) so that path data can be accumulated easily. Trace inversion happens when one thread records a malloc, but before it can actually do the allocation, the kernel schedules a thread that frees a block, which the malloc later returns. The events are free->malloc, but the trace records are malloc->free.
2016-07-19Minor tweaks to trace_run and trace2wlDJ Delorie2-11/+14
trace_run - fix realloc returning NULL behavior
trace2wl - hard stop on multi-level inversion, print number of fixed inversions.
2016-07-19Fix trace window unmapping bugDJ Delorie1-1/+1
We were recording window number, not trace count, resulting in windows not getting unmapped.
2016-07-19Detect single trace inversions and correct them.DJ Delorie1-140/+167
Trace inversion happens when:
* thread A calls malloc, starts a trace record, and then is suspended by the kernel.
* thread B calls free, writes a trace record, and frees address X.
* thread A is scheduled, and returns address X.
The trace would show thread A's malloc returning pointer X before thread B frees it, which is "trace inversion". This patch detects a single inversion (multiple inversions can happen, although rarely) and reschedules the malloc to happen right after the free.
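The detection and rescheduling idea can be sketched on a toy event array. This is not the trace2wl implementation: the `struct event` layout, the linear liveness scan, and the function names are assumptions chosen for brevity. A malloc that returns a pointer still considered live is treated as inverted and is moved to just after the next free of that pointer.

```c
#include <stddef.h>
#include <string.h>

enum ev_type { EV_MALLOC, EV_FREE };

struct event {
    enum ev_type  type;
    int           thread;   /* illustrative thread id */
    unsigned long ptr;      /* pointer returned or freed */
};

/* Returns 1 if ptr is still allocated according to events [0, upto). */
static int is_live(const struct event *ev, size_t upto, unsigned long ptr)
{
    int live = 0;
    for (size_t i = 0; i < upto; i++)
        if (ev[i].ptr == ptr)
            live = (ev[i].type == EV_MALLOC);
    return live;
}

/* Repairs inversions in place.  Returns the number of events moved,
   or -1 if a malloc of a live pointer has no later matching free
   (which this simple scheme cannot repair). */
static int fix_inversions(struct event *ev, size_t n)
{
    int fixed = 0;
    size_t i = 0;
    while (i < n) {
        if (ev[i].type != EV_MALLOC || !is_live(ev, i, ev[i].ptr)) {
            i++;
            continue;
        }
        /* Find the free that must really have happened first. */
        size_t j = i + 1;
        while (j < n && !(ev[j].type == EV_FREE && ev[j].ptr == ev[i].ptr))
            j++;
        if (j == n)
            return -1;
        /* Shift events (i, j] down by one and put the malloc after the free. */
        struct event inverted = ev[i];
        memmove(&ev[i], &ev[i + 1], (j - i) * sizeof ev[0]);
        ev[j] = inverted;
        fixed++;
        /* Do not advance i: the event now at index i is unexamined. */
    }
    return fixed;
}
```

For a trace recorded as malloc(X), malloc(X), free(X), the second malloc is moved behind the free, giving malloc(X), free(X), malloc(X).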
2016-07-18Change trace_run from mmap to readDJ Delorie1-51/+121
To avoid huge memory requirements for huge workloads, and unreliable RSS size due to unmlock'able maps, switch trace_run to a read-as-you-go design. Data is read per-thread in 4k or 64k chunks (based on workload size) into a fixed buffer.
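A rough sketch of the read-as-you-go approach, using standard `fread` into one fixed buffer. It is simplified relative to the description above: it handles a single stream rather than one buffer per thread, and the 1 MiB threshold in `pick_chunk_size`, like all the names here, is a hypothetical choice.

```c
#include <stdio.h>

#define SMALL_CHUNK 4096
#define LARGE_CHUNK 65536

/* Choose a chunk size from the workload size (threshold is assumed). */
static size_t pick_chunk_size(size_t workload_bytes)
{
    return workload_bytes < ((size_t) 1 << 20) ? SMALL_CHUNK : LARGE_CHUNK;
}

/* Streams `in` through one fixed buffer, handing each chunk to
   `consume` (which may be NULL).  Returns the total bytes read.
   Memory use stays bounded by the buffer regardless of file size. */
static size_t read_in_chunks(FILE *in, size_t chunk,
                             void (*consume)(const unsigned char *, size_t, void *),
                             void *ctx)
{
    static unsigned char buf[LARGE_CHUNK];
    size_t total = 0, got;

    if (chunk > sizeof buf)
        chunk = sizeof buf;
    while ((got = fread(buf, 1, chunk, in)) > 0) {
        if (consume != NULL)
            consume(buf, got, ctx);
        total += got;
    }
    return total;
}
```

Compared with mapping the whole file, the resident set no longer grows with the workload, which is the property the change is after.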
2016-07-16Enhance the tracer with new data and fixes.Carlos O'Donell5-71/+299
* Increase trace entry to 64 bytes.
The following patch increases the trace entry to 64 bytes, still a proper multiple of the shared memory window size. While we have doubled the entry size, the on-disk format is still smaller than the ASCII version. In the future we may wish to add variable sized records, but for now the simplicity of this method works well.
With the extra bytes we are going to:
- Record internal size information for incoming (free) and outgoing chunks (malloc, calloc, realloc, etc).
  - Simplifies accounting of RSS usage and provides an extra cross check between malloc<->free based on internal chunk sizes.
- Record alignment information for memalign and posix_memalign.
  - Continues to extend the tracer to the full API.
- Leave 128 bits of padding for future path uses.
  - Useful for more path information.
Additionally, __MTB_TYPE_POSIX_MEMALIGN is added for the sole purpose of recording the trace, so that we can hard-fail in the workload converter when we see such an entry. Lastly, C_MEMALIGN, C_VALLOC, C_PVALLOC, and C_POSIX_MEMALIGN are added for workload entries for the sake of completeness.
Builds on x86_64, capture looks good and it works.
* Teach trace_dump about the new entries.
The following patch teaches trace_dump about the new posix_memalign entry. It also teaches trace_dump about the new size2 and size3 fields.
Tested by tracing a program that uses malloc, free, and memalign and verifying that the extra fields show the expected chunk sizes and alignments when dumped with trace_dump. Tested on x86_64 with no apparent problems.
* Teach trace2wl and trace_run about new entries
(a) trace2wl changes: The following patch teaches trace2wl how to output entries for valloc and pvalloc; it does so exactly the same way it does for malloc, since from the perspective of the API they are identical. Additionally, trace2wl is taught how to output an event for memalign, storing alignment and size in the event record. Lastly, posix_memalign is detected and the converter aborted if it's seen. It is my opinion that we should not ignore this data during conversion. If we see a need for it we should implement it later.
(b) trace_run changes: Some cosmetic cleanup in printing 'pthread_t', which is always an address of the struct pthread structure in memory, so to make debugging easier we should print the value as a hex pointer. Teach the simulator how to run memalign. With the newly recorded alignment information we double check that the resulting memory is correctly aligned. We do not implement valloc and pvalloc; they will abort the simulator. This is incremental progress.
Tested on x86_64 by converting and running a multithreaded test application that calls calloc, malloc, free, and memalign.
* Disable recursive traces and save new data.
(a) Adds support for disabling recursively recorded traces, e.g. realloc calling malloc no longer produces both a realloc and a malloc trace event. We solve this by using a per-thread variable to disable new trace creation, but allow path bits to be set. This lets us record the code paths taken, but only record one public API event.
(b) Save internal chunk size information into trace events for all APIs. The most important is free, where we record the free size; this allows easier tooling to compute running ideal RSS values.
Tested on x86_64 with some small applications and test programs.
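The per-thread guard against recursive trace events could look roughly like the sketch below. All names (`trace_enter`, `trace_leave`, `traced_malloc`, `traced_realloc`, the `PATH_*` bits) are hypothetical, and the "trace record" is reduced to a counter. The outermost public call owns the single event, while nested calls still contribute path bits.

```c
#include <stddef.h>
#include <stdlib.h>

/* Per-thread state: a guard flag, an event counter, and path bits. */
static _Thread_local int      in_traced_call;
static _Thread_local unsigned recorded_events;
static _Thread_local unsigned path_bits;

enum { PATH_MALLOC = 1u << 0, PATH_REALLOC = 1u << 1 };

/* Always records the path bit; starts a new event only when no
   traced call is already active on this thread.  Returns 1 if the
   caller owns the event and must clear the guard on exit. */
static int trace_enter(unsigned path_bit)
{
    path_bits |= path_bit;
    if (in_traced_call)
        return 0;
    in_traced_call = 1;
    recorded_events++;
    return 1;
}

static void trace_leave(int owner)
{
    if (owner)
        in_traced_call = 0;
}

static void *traced_malloc(size_t n)
{
    int owner = trace_enter(PATH_MALLOC);
    void *p = malloc(n);
    trace_leave(owner);
    return p;
}

/* realloc(NULL, n) is routed through traced_malloc to show nesting. */
static void *traced_realloc(void *old, size_t n)
{
    int owner = trace_enter(PATH_REALLOC);
    void *p = old != NULL ? realloc(old, n) : traced_malloc(n);
    trace_leave(owner);
    return p;
}
```

Calling `traced_realloc(NULL, 32)` records one event but sets both path bits, which mirrors the "one public API event, full path information" goal.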
2016-07-15Add tunables for tcache count and max sizeDJ Delorie3-38/+108
2016-07-15Fix NULL return value handlingDJ Delorie2-5/+24
Decided that a call that returns NULL should be encoded in the workload but that the simulator should just skip those calls, rather than skip them in the converter.
2016-07-15Fix mmap/munmap trace bitsDJ Delorie1-4/+2
2016-07-13Add trace_dump toolDJ Delorie2-2/+203
trace_dump <binary-trace-or-workload> autodetects trace file vs workload, outputs the contents thereof
2016-07-13Fix a 32-bit sign-extension bug.Anton Blanchard1-1/+1
2016-07-13Fix double-padding bugDJ Delorie1-5/+6
The tcache was calling request2size which resulted in double padding. Store tcache's copy in a separate variable to avoid this.
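The double-padding mistake is easy to reproduce in miniature. In this sketch `pad_request` is a hypothetical stand-in for a request-to-chunk-size conversion, with assumed header and alignment values. The buggy variant overwrites the request with its padded size before the main path pads it again, and the fixed variant keeps the cache's copy in a separate variable.

```c
#include <stddef.h>

#define HEADER_BYTES 8u     /* assumed per-chunk overhead */
#define ALIGN_BYTES  16u    /* assumed chunk alignment */

/* Hypothetical conversion: add the header, round up to the alignment. */
static size_t pad_request(size_t req)
{
    return (req + HEADER_BYTES + ALIGN_BYTES - 1) & ~(size_t) (ALIGN_BYTES - 1);
}

/* Buggy: the cache lookup reuses `req` for its padded size, so the
   main path pads an already padded value. */
static size_t chunk_size_buggy(size_t req)
{
    req = pad_request(req);     /* cache's copy clobbers the request */
    return pad_request(req);    /* second padding */
}

/* Fixed: the cache's padded size lives in its own variable and the
   original request reaches the main path unchanged. */
static size_t chunk_size_fixed(size_t req, size_t *cache_bytes)
{
    *cache_bytes = pad_request(req);
    return pad_request(req);
}
```

With these assumed constants, a 24-byte request becomes 32 bytes when padded once but 48 bytes when padded twice.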
2016-07-12Update to new binary file-based trace file.DJ Delorie9-386/+790
In order to not lose records, or need to guess ahead of time how many records you need, this switches to a mmap'd file for the trace buffer, and grows it as needed. The trace2dat perl script is replaced with a trace2wl C++ program that runs a lot faster and can handle the binary format.
2016-07-06Add README for testing copr repo of dj/malloc branchDJ Delorie1-0/+193
Includes some details on tracing and simulating too.
2016-07-06Use __gettid() function for tracing.Carlos O'Donell1-1/+21
Integrate with thread 'tid' cache and use the cached value if present, otherwise update the cache. This should be much faster than a syscall per trace event.
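The caching idea can be sketched with a thread-local variable on Linux. This is a stand-in, not the glibc-internal function: the real change reuses the thread library's existing tid cache, whereas `trace_gettid`, `cached_tid`, and `tid_syscalls` here are hypothetical, and a real version would also need to invalidate the cache after fork.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Per-thread cache; 0 means "not looked up yet". */
static _Thread_local pid_t         cached_tid;
static _Thread_local unsigned long tid_syscalls;   /* for demonstration */

/* Returns the kernel thread id, asking the kernel only on first use. */
static pid_t trace_gettid(void)
{
    if (cached_tid == 0) {
        cached_tid = (pid_t) syscall(SYS_gettid);
        tid_syscalls++;
    }
    return cached_tid;
}
```

After the first call, each trace event pays for a thread-local load instead of a system call.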
2016-07-0532-bit fixes, RSS tracking, Free wiping.DJ Delorie1-20/+82
More 32-bit vs 64-bit fixes. We now track "ideal RSS" and report its maximum vs what the kernel thinks our max RSS is. Memory is filled with a constant when free'd.
2016-07-05Bump up tst-malloc-thread-fail timeout from 20 to 30sChris Metcalf1-1/+1
Right now tilegx is right on the verge of timeout when it runs, so adding a bit of headroom seems like the right thing; we see failures when running tests in parallel.
2016-06-30Merge branch 'master' into dj/mallocDJ Delorie6-148/+163
2016-06-30Build fixes for in-tree and 32/64-bitDJ Delorie5-51/+101
Expand the comments in mtrace-ctl.c to better explain how to use this tracing controller. The new docs assume the SO is built and installed.
Build fixes for trace_run.c: additional build pedantry to let trace_run.c be built with more warnings/errors turned on.
Build/install trace_run and trace2dat: trace2dat takes dump files from mtrace-ctl.so and turns them into mmap'able data files for trace_run, which "plays back" the logged calls.
32-bit compatibility: redesign tcache macros to account for differences between 64 and 32 bit systems.
2016-06-23test-skeleton.c: Add write_message functionFlorian Weimer1-11/+3
2016-06-21malloc: Avoid premature fallback to mmap [BZ #20284]Florian Weimer1-6/+4
Before this change, the while loop in reused_arena which avoids returning a corrupt arena would never execute its body if the selected arena were not corrupt. As a result, result == begin after the loop, and the function returns NULL, triggering fallback to mmap.
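The loop problem can be reproduced on a toy circular list. This sketch is not the glibc function: `struct arena`, `pick_buggy`, and `pick_fixed` are hypothetical, and locking and arena avoidance are left out. The buggy variant uses "result == begin" as the failure signal, which is also true when the loop body never runs.

```c
#include <stddef.h>

struct arena {
    struct arena *next;     /* circular list of arenas */
    int corrupt;
};

/* Buggy: if `start` is healthy the loop body never executes, so
   result == begin and the function wrongly reports failure. */
static struct arena *pick_buggy(struct arena *start)
{
    struct arena *result = start, *begin = start;
    while (result->corrupt) {
        result = result->next;
        if (result == begin)
            break;
    }
    if (result == begin)
        return NULL;
    return result;
}

/* Fixed: failure is reported only when the walk wraps around,
   meaning every arena on the list is corrupt. */
static struct arena *pick_fixed(struct arena *start)
{
    struct arena *result = start, *begin = start;
    while (result->corrupt) {
        result = result->next;
        if (result == begin)
            return NULL;
    }
    return result;
}
```

In the buggy form a perfectly usable first arena yields NULL, which is the case that pushed callers onto the mmap fallback.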
2016-06-20Revert __malloc_initialize_hook symbol poisoningFlorian Weimer5-23/+6
It turns out the Emacs-internal malloc implementation uses __malloc_* symbols. If glibc poisons them in <stdc-pre.h>, Emacs will no longer compile.
2016-06-11malloc_usable_size: Use correct size for dumped fake mapped chunksFlorian Weimer1-1/+6
The adjustment for the size computation in commit 1e8a8875d69e36d2890b223ffe8853a8ff0c9512 is needed in malloc_usable_size, too.