Garbage Collection
Variable Storage and Lifetime
We need to talk about memory first before talking about GC. So where can variables be stored?
- Static (compile-time or load time)
- Stack (runtime) - aka user/runtime/system stack
- Heap (runtime)
For primitive variables, there are 3 categories (given different lifetimes)
- Globals (static storage)
- Variables declared outside of any function or class (outermost scope)
- Scope: accessible to all statements in all functions in the file
- Lifetime: from start ofprogram (loading) to end (unloading)
- Good practice: use sparingly, make constant as often as possible
- Stored in read-only or read-write segments of the process virtual memory space - allocated/fixed before prograrm starts
- Read-only segment holds translated/native code as well if any
- Locals (stack storage)
- Parameters and variables declared within a function
- Scope: accessible to all statements in the function they are defined
- Lifetime: from start to end of the function invocation
- Stored in User/Runtime stack in process virtual memory space
- Allocated/deallocated with functino invocations and returns
- Dynamic variables, aka pointer variables (heap storage)
- Pointer variables that point to variables that are allocated explicitly
- Scope: global or local depending on where they are declared
- Lifetime: from program point at which they are allocated with
new
to the one that at which they are deallocated withdelete
- Pointer variables (the address) are either globals or locals
- The data they point to is stored in the heap.
Here is a graph of a process memory:
As we all know, we only do garbage collection on the heap for implicit memory allocation.
Terminology
- Collector
- Part of the runtime that implements memory management
- Mutator
- User program - change (mutate) program data structures
- Stop-the-world collector - all mutators stop during GC
- Values that a program can manipulate directly
- In processor registers
- On the program stack (includes locals/temporaries)
- In global variables (e.g., array of statics)
- Root set of the computation
- References to heap data held in these locations
- Dynamically allocated data only accessible via roots
- A program should not access random locations in heap
Roots, Liveness, and Reachability
- Individually allocated pieces of data in the heap are
- Nodes, cells, objects (interchangeably)
- Commonly have header that indicates the type (and thus can be used to identify any references within the object)
- AKA boxed
- Live objects on the heap
- Graph of objects that can be “reached” from roots
- Objects that cannot be reached are garbage
- An object in the heap is live if
- Its address is held in a root, or
- There is a pointer to it held in another live heap object
- Graph of objects that can be “reached” from roots
Liveness of Allocated Objects
- Determined indirectly or directly
- Indirectly
- Most common method: tracing
- Regenerate the set of live nodes whenever a request by the user program for more memory fails
- Start from each root and visit all reachable nodes (via pointers)
- Any node not visited is reclaimed
- Directly
- A record is associated with each node in the heap and all references to that node from other heap nodes or roots
- Most common method: reference counting
- Must be kept up to date as the mutator alters the connectivity of the heap graph
Classic GC Algorithms
There are mainly three classic GC algorithms
- Reference counting
- Mark & Sweep
- Copying
For the first two method, we need a thing called Free List which keeps 1+ lists of free chunks that we then fill or break off pieces of to allocated an object
Reference Counting GC
- Each obejct has an additional atomic field in header that holds the humber of pointers to that cell from roots or other objects
- All cells placed in free list initially with count of 0
-
freeList
points to the head of the free list - Each time a pointer is set to refer to this cell, the count is incremented
- Each time a reference is removed, count is decremented
- If the count goes to 0
- There is no way for the program to access this cell
- The cell is returned to the free list
- If the count goes to 0
- When a new cell is allocated
- Reference count is set to 1
- Removed from free list
- Assume, for now, that all cells are the same size and each has 3 fields left and right which are references
Strengths
- Memory management overheads are distributed throughout the computation
- Management of active and garbage cells is interleaved with execution
- Incremental
- Smoother response time
- Locality of reference
- Things related are accessed together (for memory hierarchy performance)
- No worse than program itself
- Short-lived cells can be reused as soon as they are reclaimed
- We don’t have to wait until memory is exhausted to free cells
- Immediate reuse generates fewer page faults for virtual memory
- Update in place is possible
Weaknesses
- High processing cost for each pointer update
- When a pointer is overwritten the reference count for both the old and new target cells must be adjusted
- May cause poor memory performance
- Hence, it is not used much in real systems
- Extra space in each cell to store count (normally
sizeof(int)
) - Cyclic data structures can’t be reclaimed (e.g. doubly linked lists)
Mark & Sweep GC
- Tracing collector
- mark-sweep, mark-scan
- use reachability (indirection) to find live objects
- Object are not reclaimed immediately when they become garbage
- Remain unreachable and undetected until storage is exhausted
- When reclamation happens the program is paused
- Sweep all currently unused cells back into the freeList
- GC performs a global traversal of all live objects to determine which cells are reachable (live or active)
- Trace, starting from roots, marking them as reachable
- Free all unmarked cells
- Each cell contains 1 bit (markBit) of extra info
- Cells in freeList have markBit set to 0
- No
Update(...)
routine necessary
Strengths
- Cycles are handled quite normally
- no overhead placed on pointer manipulations
- Better than (incremental) reference counting
Weaknesses
- Start-stop algorithm (aka stop-the-world)
- Computation is halted while GC happens
- Not practical for real-time systems
- Asymptotic complexity is proportional to the size of the heap not just the live objects for sweep
- Fragments memory (scatters free cells across memory)
- Loss of memory performace (caching/paging)
- Allocation is complicated (need to find a set of cells for the right size)
- Residency - heap ocupancy
- As this increases, the need for garbage collection will become more frequent
- Taking processing cycles away from the applicatino
- Allocation and program performance degrades as residency increases
Copying Collector
- Tracing, stop-the-world collector
- Divide the heap into two semispaces
- One with current data
- The other with obsolete data
- The roles of the two semispaces is continuously flipped
- Collector copies live data from the old semispace
FromSpace
- To the new semispace (
ToSpace
) when visited - Pointers to objects in
ToSpace
are updated - Program is restarted
- Scavengers
FromSpace
is not reclaimed, just abandoned
- Divide the heap into two semispaces
Strengths
- Have lead to its widespread adoption
- Active data is compact (not fragmented as in mark-sweep)
- More efficient allocation, just grab the next group of cells that fits
- The check for space remaining is simply a pointer comparison
- Handles variabl-sized objects naturally
- No overhead on pointer updates
- Allocation is a simple free-space pointer increment
- Fragmentation is eliminated
- Compaction offers improved memory hierarchy performance of the user program
Weaknesses
- Required address space is doubled compared with non-copying collectors
- Primary drawback is the need to divide memory into two
- Performance derades as residency increases (twice as quickly as mark&sweep because half the space)
- Touches every page (VM) of the heap regardless of residency of the user program
- Unless both semispaces can be held in memory simultaneously
The Principle of Locality
- A good GC should not only reclaim memory but improve the locality of the system on the whole
- Principle of locality programs access a relatively small portion of their address space at any particular time (temperal and spacial locality)
- GC should ensure that locality is exploited to improve performance wherever possible
- memory hierarchy was developed to exploit the natural principle of locality in programs
- Different levels of memory each with different speeds/sizes/cost
- Registers, cache, memory, virtual memory
Generational GC
- Observations with previous GCs
- Long-lived objects are hard to deal with
- Young objects (recently allocated) die young
- weak-generational hypothesis: most are young (80-90%)
- Large heaps (that can’t be held in memory) degrade performance
Goal: Make large heaps more efficient by concentrating effort where the reatest payoff is
- Segregate objects by age into two or more heap regions
- Generations
- Keep the young generation separate
- Collected at different frequencies
- The younger the more often
- The oldest, possible never
- Generations
- Can be implemented as an incremental scheme or as a stop-the-world scheme
- Using different algorithms on the different regions
- Promotion
- Move object to older generation if its survives long enough
- Concentrate on youngest generation for reclamation
- This is where most of the recyclable space will be found
- Make this region small so that its collection can be more frequent but with shorter interruption
- A younger generation can be collected without collecting an older generation
- The pause time to collect a younger generation is shorter than if a colection of the heap is performed
- Young objs that survive minor collections are promoted
- Minor collections reclaim shortlived objects
- Tenured garbage: garbage in older generations
- Allocation always from minor
- Except perhaps for large or known-to-be-old objects
- Minor frequent, major very infrequent
- Major/minor collections can be any type
- Mark/sweep, copying, mark/compact, hybrid
- Promotion is copying
- Can have more than 2 generations
- Each requiring collection of those lower/younger
- Minor Collection must be independent of major
- need to remember old-to-young references
- Usually not too many - mutations to old objects are infrequent
- What about young-to-old?
- We don’t need to worry about them if we always collect the young each time we collect the old (major collection)
- Write barriers
- Catching old-to-young pointers
- Code that puts old-generation object into a remembered set
- Traversed as art of root set
- All field assignments aka POINTER UPDATES IN YOUR CODE!
- Alternative to write barriers
- Check all old objeccts to see if they point to a nursery object
- Will negate any benefit we get from generational GC