1、Automatic Storage Management,Patrick Earl Simon Leonard Jack Newton,2,Overview,Terminology Why use Automatic Storage Management? Comparing garbage collection algorithms The “Classic” algorithms Copying garbage collection Incremental Tracing garbage collection Generational garbage collection Conclusi
2、ons,3,Terminology,Stack: a memory area where activation records or frames are pushed onto when a procedure is called and popped off when it returns Heap: a memory area where data structures can be allocated and deallocated in any order.,4,Terminology (Continued),Roots: values that a program can mani
3、pulate directly (i.e. values held in registers, on the program stack, and global variables.) Node/Cell/Object: an individually allocated piece of data in the heap. Children Nodes: the list of pointers that a given node contains. Live Node: a node whose address is held in a root or is the child of a
4、live node.,5,Terminology (Continued),Garbage: nodes that are not live, but are not free either. Garbage collection: the task of recovering (freeing) garbage nodes. Mutator: The program running alongside the garbage collection system.,6,Why Garbage Collect?,Language requirements In some situations it
5、 may be impossible to know when a shared data structure is no longer in use.,7,Why Garbage Collect? (Continued),Software Engineering Garbage collection increases abstraction level of software development. Simplified interfaces and decreases coupling of modules. Studies have shown a significant amoun
6、t of development time is spent on memory management bugs Rovner, 1985.,8,Comparing Garbage Collection Algorithms,Directly comparing garbage collection algorithms is difficult there are many factors to consider. Some factors to consider: Cost of reclaiming cells Cost of allocating cells Storage overh
7、ead How does the algorithm scale with residency? Will user program be suspended during garbage collection? Does an upper bound exist on the pause time? Is locality of data structures maintained (or maybe even improved?),9,Classes of Garbage Collection Algorithms,Direct Garbage Collectors: a record i
8、s associated with each node in the heap. The record for node N indicates how many other nodes or roots point to N. Indirect/Tracing Garbage Collectors: usually invoked when a users request for memory fails because the free list is exhausted. The garbage collector visits all live nodes, and returns a
9、ll other memory to the free list. If sufficient memory has been recovered from this process, the users request for memory is satisfied.,10,Quick Review: Reference Counting,Every cell has an additional field: the reference count. This field represents the number of pointers to that cell from roots or
10、 heap cells. Initially, all cells in the heap are placed in a pool of free cells, the free list.,11,Reference Counting (Continued),When a cell is allocated from the free list, its reference count is set to one. When a pointer is set to reference a cell, the cells reference count is incremented by 1;
11、 if a pointer is to the cell is deleted, its reference count is decremented by 1. When a cells reference count reaches 0, its pointers to its children are deleted and it is returned to the free list.,12,0,1,0,0,0,Reference Counting Example,1,2,1,1,13,2,1,1,1,Reference Counting Example (Continued),0,
12、1,14,2,1,1,1,Reference Counting Example (Continued),0,1,15,2,1,1,1,Reference Counting Example (Continued),0,0,0,1,1,16,Reference Counting: Advantages and Disadvantages,Advantages: Garbage collection overhead is distributed. Locality of reference is no worse than mutator. Free memory is returned to f
13、ree list quickly.,17,Reference Counting: Advantages and Disadvantages (Continued),Disadvantages: High time cost (every time a pointer is changed, reference counts must be updated). Storage overhead for reference counter can be high. Unable to reclaim cyclic data structures. If the reference counter
14、overflows, the object becomes permanent.,18,Reference Counting: Cyclic Data Structure - Before,0,2,0,0,1,2,1,19,Reference Counting: Cyclic Data Structure After,0,1,0,0,1,2,1,20,Deferred Reference Counting,Optimisation Cost can be improved by special treatment of local variables. Only update referenc
15、e counters of objects on the stack at fixed intervals. Reference counts are still affected from pointers from one heap object to another.,21,Quick Review: Mark-Sweep,The first tracing garbage collection algorithm Garbage cells are allowed to build up until heap space is exhausted (i.e. a user progra
16、m requests a memory allocation, but there is insufficient free space on the heap to satisfy the request.) At this point, the mark-sweep algorithm is invoked, and garbage cells are returned to the free list.,22,Mark-Sweep (Continued),Performed in two phases: Mark phase: identifies all live cells by s
17、etting a mark bit. Live cells are cells reachable from a root. Sweep phase: returns garbage cells to the free list.,23,Mark-Sweep Example,24,Mark-Sweep: Advantages and Disadvantages,Advantages: Cyclic data structures can be recovered. Tends to be faster than reference counting.,25,Mark-Sweep: Advant
18、ages and Disadvantages (Continued),Disadvantages: Computation must be halted while garbage collection is being performed Every live cell must be visited in the mark phase, and every cell in the heap must be visited in the sweep phase. Garbage collection becomes more frequent as residency of a progra
19、m increases. May fragment memory.,26,Mark-Sweep: Advantages and Disadvantages (Continued),Disadvantages: Has negative implications for locality of reference. Old objects get surrounded by new ones (not suited for virtual memory applications). However, if objects tend to survive in clusters in memory
20、, as they apparently often do, this can greatly reduce the cost of the sweep phase.,27,Mark-Compact Collection,Remedy the fragmentation and allocation problems of mark-sweep collectors. Two phases: Mark phase: identical to mark sweep. Compaction phase: marked objects are compacted, moving most of th
21、e live objects until all the live objects are contiguous.,28,Mark-Compact: Advantages and Disadvantages (Continued),Advantages: The contiguous free area eliminates fragmentation problem. Allocating objects of various sizes is simple. The garbage space is “squeezed out“, without disturbing the origin
22、al ordering of objects. This ameliorate locality.,29,Mark-Compact: Advantages and Disadvantages (Continued),Disadvantages: Requires several passes over the data are required. “Sliding compactors“ takes two, three or more passes over the live objects. One pass computes the new location Subsequent pas
23、ses update the pointers to refer to new locations, and actually move the objects,30,Copying Garbage Collection,Like mark-compact, copying garbage collection does not really “collect“ garbage. Rather it moves all the live objects into one area and the rest of the heap is know to be available. Copying
24、 collectors integrate the traversal and the copying process, so that objects need only be traversed once. The work needed is proportional to the amount of live date (all of which must be copied).,31,Semispace Collector Using the Cheney Algorithm,The heap is subdivided into two contiguous subspaces (
25、FromSpace and ToSpace). During normal program execution, only one of these semispaces is in use. When the garbage collector is called, all the live data are copied from the current semispace (FromSpace) to the other semispace (ToSpace).,32,Semispace Collector Using the Cheney Algorithm,A,B,C,D,FromS
26、pace,ToSpace,33,Semispace Collector Using the Cheney Algorithm,FromSpace,ToSpace,A,B,C,D,A,B,C,D,34,Semispace Collector Using the Cheney Algorithm (Continued),Once the copying is completed, the ToSpace is made the “current“ semispace. A simple form of copying traversal is the Cheney algorithm. The i
27、mmediately reachable objects from the initial queue of objects for a breadth-first traversal. A scan pointer is advanced through the first object location by location. Each time a pointer into FromSpace is encountered, the referred-to-object is transported to the end of the queue and the pointer to
28、the object is updated.,35,Cheney Algorithm: Example,Root Nodes,A,B,F,E,D,C,A,A,A,B,B,B,C,C,C,D,D,E,A,B,C,D,E,F,B,A,36,Semispace Collector Using the Cheney Algorithm (Continued),Multiple paths must not be copied to tospace multiple times. When an object is transported to tospace, a forwarding pointer
29、 is installed in the old version of the object. The forwarding pointer signifies that the old object is obsolete and indicates where to find the new copy.,37,Copying Garbage Collection: Advantages and Disadvantages,Advantages: Allocation is extremely cheap. Excellent asymptotic complexity. Fragmenta
30、tion is eliminated. Only one pass through the data is required.,38,Copying Garbage Collection: Advantages and Disadvantages (Continued),Disadvantages: The use of two semi-spaces doubles memory requirement needs Poor locality. Using virtual memory will cause excessive paging.,39,Problems with Simple
31、Tracing Collectors,Difficult to achieve high efficiency in a simple garbage collector, because large amounts of memory are expensive. If virtual memory is used, the poor locality of the allocation/reclamation cycle will cause excessive paging. Even as main memory becomes steadily cheaper, locality w
32、ithin cache memory becomes increasingly important.,40,Problems with Simple Tracing Collectors (Continued),With a simple semispace copy collector, locality is likely to be worse than mark-sweep. The memory issue is not unique to copying collectors. Any efficient garbage collection involves a trade-of
33、f between space and time. The problem of locality is an indirect result of the use of garbage collection.,41,Incremental Tracing Collectors Overview,Introduction to Incremental Collectors Coherence and Conservatism Tricolor Marking Write Barrier Algorithms Bakers Read Barrier Algorithm,42,Incrementa
34、l Tracing Collectors,Program (Mutator) and Garbage Collector run concurrently. Can think of system as similar to two threads. One performs collection, and the other represents the regular program in execution. Can be used in systems with real-time requirements. For example, process control systems.,
35、43,Coherence & Conservatism,Coherence: A proper state must be maintained between the mutator and the collector. Conservatism: How aggressive the garbage collector is at finding objects to be deallocated.,44,Tricoloring,White Not yet traversed. A candidate for collection. Black Already traversed and
36、found to be live. Will not be reclaimed. Grey In traversal process. Defining characteristic is that its children have not necessarily been explored.,45,The Tricolor Abstraction,46,Tricoloring Invariant,There must not be a pointer from a black object to a white object.,47,Violation of Coloring Invari
37、ant,Before,After,A,B,C,D,A,B,C,D,48,Steps in Violation,Read a pointer to a white object Assign that pointer to a black object Original pointer must be destroyed without collection system noticing.,49,Read Barrier,Barriers are essentially memory access detection systems. We detect when any pointers t
38、o any white objects are read. If a read to the pointer occurs, we conceptually color that object grey.,50,Write Barrier,When a pointer is written to an object, we record the write somehow. The recorded write is dealt with at a later point. Read vs. Write efficiency considerations.,51,Write Barrier A
39、lgorithms,Snapshot-at-beginning Incremental update,52,Snapshot-at-beginning,Conceptually makes a copy-on-write duplication of the pointer graph. Can be implemented with a simple write barrier that records pointer writes and adds the old addresses to a stack to be traversed later.,53,Snapshot-at-begi
40、nning Example,Before,After,A,B,C,D,A,B,C,D,Stack,Pointer to D is now On stack,54,Comments on Snapshot-at-beginning,Very conservative. All overwritten pointer values are saved and traversed. No objects can be freed while collection process is occurring.,55,Incremental Update Write-Barrier Algorithm,N
41、o copy of tree is made. Catches overwrites of pointers that have been copied. If a pointer is not copied before being written, it will be freed. The object with the overwritten pointer is colored grey and the algorithm must search that node again at the end.,56,Incremental Update Example,Before,Afte
42、r,A,B,C,D,A,B,C,D,57,Comments on Incremental Update,Things that are freed during collection are far more likely to be collected than with the snapshot algorithm. (Less conservative) Although the collector restarts the traversal in some places, it is guaranteed to do a full search and will eventually
43、 terminate.,58,Bakers Read Barrier Algorithms,Incremental Copying Non-copying Algorithm (The Treadmill),59,Incremental Copying,Variation of Copying Collector “Garbage collection cycle begins with an atomic flip.” All objects directly pointed to by the root are copied into tospace.,60,Read Barrier in
44、 Incremental Copying,Whenever an object is read that is not already in ToSpace, the read barrier catches that and copies the object over to ToSpace at that point. Normal “background scavenging” occurs simultaneously to ensure that all objects are traversed and reclamation can occur.,61,Incremental C
45、opying Example,A,B,C,D,FromSpace,ToSpace,Atomic Flip, then a read to D occurs,E,D,A,B,C,62,Comments on Read Barrier,If implemented in software can be quite slow due to numerous reads to heap. Specialized hardware is available on some unique machines that allow this type of tracing to be done quickly
46、.,63,Bakers Incremental Non-Copying Algorithm,Doubly Linked Lists New area for allocations since started collection To/From spaces Free list,64,Example - Allocation,Take an object from the free list and move it to the new list.,65,Example - Scanning,Searching nodes in ToSpace for references to objec
47、ts in FromSpace. When found, object is unlinked in FromSpace and is linked in ToSpace.,66,Treadmill Workings,When starting collection cycle: New list is empty From list contains all New and To objects from last cycle. Collection proceeds and scanning and allocation are performed. When finished: From
48、 list is merged with Free list.,67,Comments on Treadmill,As in Incremental Copying, the garbage found in the FromSpace is reclaimed in constant time. Conservative with new objects Conservative also in that reached objects will not be removed even if they become garbage before scan ends.,68,Incremental Collectors Summary,Incremental Tracing Collectors Tricolor Marking and Invariant Read and Write Barriers Snapshot-at-beginning Incremental Update Bakers Incremental Copying Bakers Non-copying (Treadmill),