Most programmers today very likely do not deal with manual memory management in their day-to-day work, whatever programming language they use.
In the early days of programming, and until not so long ago, the most widely used programming languages didn't provide built-in automatic memory management, so the task had to be done manually.
Not only were the options for automatic memory management less sophisticated at that time, but hardware resources were also limited, which made the overhead of those abstracted memory management techniques harder to afford.
Over the last 3 decades or so, we have seen a major rise of programming languages that take different approaches to removing the burden of manual memory management from the programmer. That manual work was extremely error prone, as shown by the number of critical bugs that have been, and still are, caused by manual memory management.
This rise of memory-managed programming languages coincided with advances in garbage collection research and techniques, and with faster commodity hardware that made the overhead of such languages easier to absorb.
Different programming languages have solved the problem in very different and interesting ways. In this article I will cover the 2 main approaches to automatic memory management that are prevalent in the modern programming languages we use on a daily basis.
I. Tracing Garbage Collection
Tracing garbage collection, often simply referred to as garbage collection, is an automatic memory management technique that deallocates the memory of objects that are no longer reachable from the root objects (in the case of the JVM, the GC roots include local variables on thread stacks, static fields and JNI references).
This means the collector runs in cycles: it marks every object that is reachable from the roots, and whatever is left unmarked is swept to free the memory for later use.
The roots are the starting points of the graph of live objects: the collector follows references from them to discover everything the program can still reach.
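To make the mark and sweep phases concrete, here is a minimal sketch in Python. It is a toy model, not how any real collector is implemented: the `Obj` class, the `heap` list and the `collect` function are names I made up for illustration, and a real collector works on raw memory rather than on a Python list.

```python
class Obj:
    """A toy heap object that can reference other objects (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.refs = []        # outgoing references to other Obj instances
        self.marked = False   # set during the mark phase


def mark(roots):
    """Mark phase: flag every object reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: keep marked objects, drop the rest, reset the marks."""
    survivors = [obj for obj in heap if obj.marked]
    for obj in survivors:
        obj.marked = False    # ready for the next cycle
    return survivors


def collect(heap, roots):
    """One full GC cycle over the toy heap."""
    mark(roots)
    return sweep(heap)


# a -> b is reachable from the root; c and d only reference each other.
a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)
c.refs.append(d)
d.refs.append(c)

heap = collect([a, b, c, d], roots=[a])
print([obj.name for obj in heap])  # ['a', 'b']
```

Note that `c` and `d` are collected even though they reference each other: what matters to a tracing collector is reachability from the roots, not whether something still points at an object.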
Tracing garbage collection has become almost the de facto approach for implementing automatic memory management in modern programming languages.
Lisp is known to be the first programming language to have a built-in garbage collector. Today the majority of programming languages come with one built in: JVM languages (Java, Kotlin, Scala), Ruby, JavaScript, Smalltalk, etc.
II. Automatic Reference Counting (ARC)
Automatic reference counting (ARC) is a form of automatic memory management where objects are deallocated as soon as no reference points to them anymore. This happens when no variable still in scope, and no other live object, refers to the object that is due for deallocation.
At a high level, reference counting is implemented by adding an extra field to each object that holds the number of references to that object. When the counter reaches zero, the object's destructor is called and the memory previously used by the object is freed.
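The counter field described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `RefCounted` class and its `retain`/`release` methods are hypothetical names, and the counting is done by hand here, whereas a real ARC implementation has the compiler or runtime insert these calls for you.

```python
class RefCounted:
    """Toy object carrying its own reference count (hypothetical)."""

    def __init__(self, name, on_destroy):
        self.name = name
        self.count = 0                 # the extra field holding the count
        self.on_destroy = on_destroy   # stands in for the destructor

    def retain(self):
        """A new reference to the object was created."""
        self.count += 1
        return self

    def release(self):
        """A reference went away; destroy the object at zero."""
        self.count -= 1
        if self.count == 0:
            self.on_destroy(self.name)


destroyed = []
obj = RefCounted("image", destroyed.append)

first = obj.retain()     # count is 1
second = obj.retain()    # count is 2
first.release()          # count is 1, the object stays alive
print(destroyed)         # []
second.release()         # count is 0, the "destructor" runs
print(destroyed)         # ['image']
```

The object is destroyed at the exact moment the last reference is released, which is what gives reference counting its predictable timing.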
ARC is pretty common in Apple ecosystem languages: Objective-C and Swift are the standard bearers of this approach to memory management. Other programming languages that use reference counting include Delphi, Perl and Tcl.
III. Comparing the Approaches
While both of these approaches share the same goal of abstracting memory management away from the programmer, they are quite different and each has its own pros and cons.
- Automatic reference counting (ARC) still requires the programmer to think about object relationships to avoid accidental circular references.
E.g.: object A references object B, B references object C, and C holds a reference back to A. When this occurs, none of the three counters can ever drop to zero, so the objects are never deallocated, even once nothing else in the program refers to them.
Luckily, most languages that use ARC have built mechanisms to work around these scenarios (weak and strong references), which still look like leaky abstractions, at least to me.
- Tracing garbage collection runs in cycles, typically pausing application threads while it marks and eventually sweeps unreachable objects. These cycles can cause delays and sporadic performance drops, which makes GC languages almost unusable for hard real-time applications.
- ARC provides much more consistent performance, since it is not affected by the sporadic pauses that can happen with a tracing garbage collector.
But ARC is not perfect: it also pays a performance penalty, since the reference counter has to be incremented or decremented on every object assignment, allocation and deallocation.
Synchronizing reference counters also becomes a very complex problem in a multithreaded environment, where objects are shared between threads and the increments and decrements of the counters have to be synchronized across threads.
- While a tracing garbage collector is not affected by reference counter synchronization, it tends to need a lot more memory than ARC to work well.
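The circular reference scenario from the first point in the list above (A references B, B references C, C references A) can be reproduced with CPython, which also relies on reference counting for most deallocations. This is a sketch under a few assumptions: it is CPython-specific behavior, the `Node` class and `build_cycle` helper are hypothetical names, and Python's `weakref` module stands in for what Swift expresses with the `weak` keyword. CPython additionally ships a tracing cycle detector, which is disabled here so that only the reference counts are at work.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.next = None


def build_cycle(weak_back_edge):
    """Build a -> b -> c -> a and return a weak probe on a."""
    a, b, c = Node("a"), Node("b"), Node("c")
    a.next = b
    b.next = c
    # The back edge is either a strong or a weak reference to a.
    c.next = weakref.ref(a) if weak_back_edge else a
    return weakref.ref(a)   # lets us check later whether a is still alive


gc.disable()   # keep CPython's cycle detector out of the way

probe_strong = build_cycle(weak_back_edge=False)
strong_alive = probe_strong() is not None   # counts never reach zero

probe_weak = build_cycle(weak_back_edge=True)
weak_alive = probe_weak() is not None       # freed when the function returned

gc.collect()   # an explicit tracing pass reclaims the strong cycle
strong_alive_after_gc = probe_strong() is not None
gc.enable()

print(strong_alive, weak_alive, strong_alive_after_gc)  # True False False
```

With a strong back edge the three objects keep each other alive until a tracing pass finds them, while the weak back edge lets plain reference counting free them right away. That is the trade-off in a nutshell: the weak reference works, but the programmer has to know where to put it.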
Well, that's pretty much all I wanted to share in this article. I hope you had a good time reading it.