Allocating on the Stack

A description of some of the recent changes to do allocations on the stack instead of the heap.

7 April 2026 at 04:49 am

In recent years, the Go team has focused on one particular source of slowness in Go programs: heap allocations. Each time a Go program allocates memory from the heap, a substantial amount of code must run to satisfy that allocation. Heap allocations also add load on the garbage collector, which still incurs substantial overhead even with recent enhancements like Green Tea. To address this, the Go team has been working on performing more allocations on the stack instead of the heap.

Stack allocations are significantly cheaper to perform, sometimes even free. They also present no load to the garbage collector, as stack allocations can be collected automatically alongside the stack frame itself. Furthermore, stack allocations enable prompt reuse, which is highly cache-friendly. This makes them a more efficient and faster alternative to heap allocations in many cases.

One of the areas where stack allocations have been particularly beneficial is in the handling of constant-sized slices. Consider the task of building a slice of tasks to process, as shown in the following code snippet:

```go
func process(c chan task) {
	var tasks []task
	for t := range c {
		tasks = append(tasks, t)
	}
	processAll(tasks)
}
```

Let's walk through what happens at runtime when pulling tasks from the channel `c` and adding them to the slice `tasks`. On the first loop iteration, there is no backing store for `tasks`, so the `append` function must allocate one. Since it doesn't know how big the slice will eventually be, it can't be too aggressive. Currently, it allocates a backing store of size 1.

On the second loop iteration, the backing store now exists, but it is full. The `append` function must allocate a new backing store, this time of size 2. The old backing store of size 1 is now garbage. On the third loop iteration, the backing store of size 2 is full. `append` must allocate a new backing store, this time of size 4. The old backing store of size 2 is now garbage. On the fourth loop iteration, the backing store of size 4 holds only 3 elements, so `append` can place the new element in the existing store without allocating. On the fifth iteration the store is full again, and `append` must allocate a new one of size 8.
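The growth pattern described above can be observed directly. The following is a minimal sketch, not part of the original article: `capacitySteps` is a hypothetical helper that appends one element at a time and records each new capacity. The exact capacities are an implementation detail of the Go runtime and compiler and may differ between versions, so the sketch prints them rather than assuming particular values.

```go
package main

import "fmt"

// capacitySteps appends n ints one at a time to a nil slice and records
// the capacity every time append switches to a larger backing store.
// (Hypothetical helper for illustration only.)
func capacitySteps(n int) []int {
	var s []int
	var steps []int
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			// The capacity changed, so append moved to a new backing store.
			steps = append(steps, cap(s))
		}
	}
	return steps
}

func main() {
	steps := capacitySteps(100)
	fmt.Println("capacities seen:", steps)
	fmt.Println("number of new backing stores:", len(steps))
}
```

Every entry in the printed list except the last corresponds to a backing store that was later abandoned in favor of a larger one.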

These repeated allocations, and the garbage they leave behind, can noticeably slow down a program. Because the capacity doubles each time, the cost is amortized once a slice grows large, but it is proportionally highest in the startup phase, while the slice is still small. By allocating slices on the stack, Go can avoid these inefficiencies and improve performance.

The Go team has been optimizing stack allocation of slices, particularly for constant-sized cases. When the size a slice might need is known up front as a constant, that space can be reserved on the stack, provided the slice does not outlive the function. This eliminates the repeated allocations and the resulting garbage, leading to faster and more efficient code.

In addition to slices, the Go team has also been exploring other areas where stack allocations can be beneficial. For example, they have been investigating ways to allocate structs and other data types on the stack to reduce the load on the garbage collector and improve performance.

In conclusion, Go has been improving performance by shifting allocations from the heap to the stack. This reduces the overhead of the allocations themselves, improves cache efficiency, and lightens the load on the garbage collector. As the Go team continues to refine these techniques, developers can expect further gains in future releases.

Source: The Go Blog